---
license: gemma
library_name: transformers
datasets:
- jondurbin/gutenberg-dpo-v0.1
model-index:
- name: gemma-2-Ifable-9B
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 29.84
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ifable/gemma-2-Ifable-9B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 41.03
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ifable/gemma-2-Ifable-9B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 8.91
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ifable/gemma-2-Ifable-9B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 12.19
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ifable/gemma-2-Ifable-9B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 8.52
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ifable/gemma-2-Ifable-9B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 35.85
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ifable/gemma-2-Ifable-9B
      name: Open LLM Leaderboard
---

# ifable/gemma-2-Ifable-9B

This model ranked first on the [Creative Writing Benchmark](https://eqbench.com/creative_writing.html) on September 10, 2024.

## Training and evaluation data

- Gutenberg: https://huggingface.co/datasets/jondurbin/gutenberg-dpo-v0.1
- A carefully curated proprietary creative writing dataset

## Training procedure

Training method: [SimPO](https://github.com/princeton-nlp/SimPO) (Simple Preference Optimization with a Reference-Free Reward); a sketch of the objective is given after the results below.

The model achieves the following results on the evaluation set:

- Loss: 1.0163
- Rewards/chosen: -21.6822
- Rewards/rejected: -47.8754
- Rewards/accuracies: 0.9167
- Rewards/margins: 26.1931
- Logps/rejected: -4.7875
- Logps/chosen: -2.1682
- Logits/rejected: -17.0475
- Logits/chosen: -12.0041
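For context, here is a minimal sketch of the length-normalized, reference-free SimPO objective from the repository linked above. The `beta` and `gamma` values are illustrative placeholders, not hyperparameters reported for this model; note that the logged `Rewards/*` metrics are a constant multiple (roughly 10x) of the corresponding `Logps/*` metrics, consistent with the implicit reward being a scaled average log-probability.

```python
import torch
import torch.nn.functional as F

def simpo_loss(avg_logp_chosen: torch.Tensor,
               avg_logp_rejected: torch.Tensor,
               beta: float = 10.0,
               gamma: float = 1.0) -> torch.Tensor:
    """Reference-free SimPO objective on length-normalized log-probs.

    avg_logp_* are (1/|y|) * log pi_theta(y | x) per sequence, i.e. the
    quantities logged above as Logps/chosen and Logps/rejected.
    beta and gamma are illustrative placeholders, not the values used here.
    """
    # The implicit reward is the scaled average log-prob, so Rewards/chosen
    # and Rewards/rejected are beta times the Logps/* metrics.
    reward_chosen = beta * avg_logp_chosen
    reward_rejected = beta * avg_logp_rejected
    margin = reward_chosen - reward_rejected  # logged as Rewards/margins
    # Logistic (Bradley-Terry style) loss with a target reward margin gamma.
    return -F.logsigmoid(margin - gamma).mean()
```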
### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 8e-07
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- gradient_accumulation_steps: 16
- total_train_batch_size: 128
- total_eval_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1.0

### Training results

| Training Loss | Epoch  | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen | Sft Loss |
|:-------------:|:------:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|:--------:|
| 1.4444        | 0.9807 | 35   | 1.0163          | -21.6822       | -47.8754         | 0.9167             | 26.1931         | -4.7875        | -2.1682      | -17.0475        | -12.0041      | 0.0184   |

### Framework versions

- Transformers 4.43.4
- Pytorch 2.3.0a0+ebedce2
- Datasets 2.20.0
- Tokenizers 0.19.1

We are looking for product managers and operations managers to build applications with our model, as well as AI engineers to join us, and we are open to business cooperation. Contact: contact@ifable.ai

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_ifable__gemma-2-Ifable-9B).

| Metric              | Value |
|---------------------|------:|
| Avg.                | 22.73 |
| IFEval (0-Shot)     | 29.84 |
| BBH (3-Shot)        | 41.03 |
| MATH Lvl 5 (4-Shot) |  8.91 |
| GPQA (0-shot)       | 12.19 |
| MuSR (0-shot)       |  8.52 |
| MMLU-PRO (5-shot)   | 35.85 |
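## Usage

Since the model ships as a standard Transformers causal LM (`library_name: transformers`), it can be loaded with the usual API. A minimal generation sketch follows; the prompt and sampling settings are illustrative only, not recommended defaults.

```python
# Minimal text-generation sketch (assumes a GPU with bf16 support); the
# prompt and sampling settings are illustrative, not recommendations.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ifable/gemma-2-Ifable-9B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Write a short story about a lighthouse keeper."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=512, do_sample=True, temperature=0.9)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```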