--- base_model: unsloth/gemma-2-2b-it-bnb-4bit language: - en license: apache-2.0 tags: - text-generation-inference - transformers - unsloth - gemma2 - trl - dpo --- # Uploaded model - **Developed by:** SameedHussain - **License:** apache-2.0 - **Finetuned from model :** unsloth/gemma-2-2b-it-bnb-4bit This gemma2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library. [](https://github.com/unslothai/unsloth) | Step | Training Loss | Rewards / Chosen | Rewards / Rejected | Rewards / Accuracies | Rewards / Margins | Logps / Rejected | Logps / Chosen | Logits / Rejected | Logits / Chosen | |------|---------------|------------------|--------------------|----------------------|-------------------|------------------|----------------|-------------------|-----------------| | 100 | 0.454700 | 6.241566 | 3.175092 | 0.750000 | 3.066474 | -102.758446 | -53.181263 | -14.580903 | -14.938275 | | 200 | 0.264100 | 6.640531 | 2.823826 | 0.888750 | 3.816705 | -110.525520 | -50.815018 | -14.796252 | -15.198202 | | 300 | 0.110200 | 6.310797 | 1.718347 | 0.985000 | 4.592450 | -118.720840 | -48.524315 | -15.263680 | -15.698647 | | 400 | 0.046900 | 6.744057 | 0.677384 | 0.997500 | 6.066672 | -128.757660 | -48.107479 | -15.710546 | -16.174524 | | 500 | 0.019700 | 6.714230 | -0.529035 | 1.000000 | 7.243264 | -143.408020 | -49.327625 | -16.120342 | -16.611662 | | 600 | 0.013700 | 6.605389 | -1.275738 | 1.000000 | 7.881127 | -146.968491 | -48.847641 | -16.320650 | -16.836390 | | 700 | 0.007900 | 6.333577 | -2.010140 | 1.000000 | 8.343716 | -154.255066 | -50.590134 | -16.486574 | -16.987421 | | 800 | 0.006300 | 6.489099 | -2.076626 | 1.000000 | 8.565723 | -150.381393 | -49.992256 | -16.614525 | -17.117744 | | 900 | 0.005100 | 6.429256 | -2.340122 | 1.000000 | 8.769380 | -160.874405 | -51.164425 | -16.687891 | -17.165791 | | 1000 | 0.004700 | 6.494193 | -2.520164 | 1.000000 | 9.014358 | -163.852982 | -54.317467 | -16.757954 | -17.206339 | | 1100 | 0.005900 | 6.287598 | -2.524287 | 1.000000 | 8.811884 | -161.473770 | -52.012741 | -16.825716 | -17.266563 | | 1200 | 0.005200 | 6.246828 | -3.126722 | 0.998750 | 9.373549 | -167.766861 | -52.052780 | -16.795412 | -17.277397 | | 1300 | 0.004300 | 6.347938 | -2.930621 | 1.000000 | 9.278559 | -165.971939 | -50.738480 | -16.836918 | -17.304783 | | 1400 | 0.003900 | 6.232501 | -3.073614 | 1.000000 | 9.306114 | -165.787643 | -50.953049 | -16.813383 | -17.290031 |