dball
/

zephyr-7b-sft-qlora

alignment-handbook

Generated from Trainer

4-bit precision

Model card Files Files and versions Metrics Training metrics Community

Resources

View closed (0)

Adding Evaluation Results

#2 opened 10 months ago by

leaderboard-pr-bot

Is the drop in many metrics expected? Why do SFT first if it makes the model worse? Why not do DPO directly on the mistral model?

#1 opened 12 months ago by