llama2-7b-dolly-15k-japanese-brainstorming

This model is a fine-tuned version of NousResearch/Llama-2-7b-hf; the fine-tuning dataset is not specified. After the final epoch (step 1194), it achieves the following results on the evaluation set:

  • Loss: 1.0642
  • ROUGE scores: {'rouge1': 0.9653633690526473, 'rouge2': 0.9344851380895685, 'rougeL': 0.9554449155418876, 'rougeLsum': 0.9653412268073773}
  • BLEU scores: [0.900726205206837, 0.8659579174904524, 0.8282110782853057, 0.788805544882148]
  • Gen Len: 467.6554

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • num_epochs: 3
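
To make the cosine schedule concrete, here is a minimal sketch of how the learning rate would decay over the run. It assumes no warmup steps and a single half-cosine cycle, neither of which is stated in this card; the total of 1194 steps is taken from the training results table. The function name `cosine_lr` and the constants are illustrative, not part of the training code.

```python
import math

PEAK_LR = 2e-4        # learning_rate from the list above
TOTAL_STEPS = 1194    # final step in the training results table
WARMUP_STEPS = 0      # assumption: the card does not list a warmup setting


def cosine_lr(step: int) -> float:
    """Learning rate at `step`: linear warmup followed by half-cosine decay."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


# Learning rate at the start of training and at the end of each epoch.
for step in (0, 398, 796, 1194):
    print(step, f"{cosine_lr(step):.2e}")
```

Under these assumptions the rate starts at the peak value and reaches zero at the last step; a nonzero warmup would shift the curve accordingly.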

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge Scores | Bleu Scores | Gen Len |
|---|---|---|---|---|---|---|
| 0.9844 | 1.0 | 398 | 0.9678 | {'rouge1': 0.9691461778318393, 'rouge2': 0.9382969684393989, 'rougeL': 0.9578632643524403, 'rougeLsum': 0.9690528081779507} | [0.9044212000231915, 0.86963718873592, 0.8320476513312739, 0.7928883742871506] | 467.6554 |
| 0.6979 | 2.0 | 796 | 0.9827 | {'rouge1': 0.9660674664057384, 'rouge2': 0.935760349759682, 'rougeL': 0.9563208459437473, 'rougeLsum': 0.9658214441757752} | [0.9032042405966677, 0.8700278506363877, 0.8335211486398665, 0.7953375975436736] | 467.6554 |
| 0.4532 | 3.0 | 1194 | 1.0642 | {'rouge1': 0.9653633690526473, 'rouge2': 0.9344851380895685, 'rougeL': 0.9554449155418876, 'rougeLsum': 0.9653412268073773} | [0.900726205206837, 0.8659579174904524, 0.8282110782853057, 0.788805544882148] | 467.6554 |

Framework versions

  • PEFT 0.8.2
  • Transformers 4.39.0.dev0
  • Pytorch 2.1.0+cu121
  • Datasets 2.17.2.dev0
  • Tokenizers 0.15.2
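
Because the card lists PEFT among the framework versions, the weights are presumably published as an adapter on top of the base model. The following is a hedged sketch of how such an adapter is typically loaded; the repository ID is the one shown on this page, while the helper names, the generation settings, and the absence of any prompt template are assumptions, since the card does not document intended usage.

```python
BASE_MODEL_ID = "NousResearch/Llama-2-7b-hf"
ADAPTER_ID = "DrishtiSharma/llama2-7b-dolly-15k-japanese-brainstorming"


def load_model_with_adapter():
    """Load the base model and attach the adapter weights (downloads several GB)."""
    # Imported lazily so that defining this function does not require the libraries.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base_model = AutoModelForCausalLM.from_pretrained(BASE_MODEL_ID)
    model = PeftModel.from_pretrained(base_model, ADAPTER_ID)
    model.eval()
    return tokenizer, model


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Generate a completion for `prompt` (hypothetical helper; prompt format unknown)."""
    tokenizer, model = load_model_with_adapter()
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("...")` requires the `transformers` and `peft` packages and enough memory for a 7B-parameter model.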

Model tree for DrishtiSharma/llama2-7b-dolly-15k-japanese-brainstorming: Adapter (this model)