---
library_name: peft
base_model: meta-llama/Llama-2-7b-hf
license: mit
language:
- en
metrics:
- bertscore
- perplexity
---

# Model Card for Llama-2-7b QLoRA Story Generation

A meta-llama/Llama-2-7b-hf model fine-tuned with QLoRA for a story generation task.


### Model Description

We fine-tune the model on the "Hierarchical Neural Story Generation" dataset to generate stories.

The input to the model is structured as follows:

```
### Instruction: Below is a story idea. Write a short story based on this context.

### Input: [story idea here]

### Response:
```
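As a minimal sketch, the template above can be produced with a small helper function (the function name and the example story idea are illustrative, not part of the original repository):

```python
def build_prompt(story_idea: str) -> str:
    """Format a story idea using the instruction template above."""
    return (
        "### Instruction: Below is a story idea. "
        "Write a short story based on this context.\n\n"
        f"### Input: {story_idea}\n\n"
        "### Response:"
    )

# The model's completion after "### Response:" is the generated story.
prompt = build_prompt("A lighthouse keeper finds a message in a bottle.")
```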


- **Developed by:** Abdelrahman ’Boda’ Sadallah, Anastasiia Demidova, Daria Kotova
- **Model type:** Causal LM
- **Language(s) (NLP):** English
- **Finetuned from model:** meta-llama/Llama-2-7b-hf

### Model Sources

- **Repository:** https://github.com/BodaSadalla98/llm-optimized-fintuning

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

This model is the result of our AI project. If you intend to use it, please refer to the repository above.


### Recommendations

<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

To improve story generation, you can experiment with the generation parameters: temperature, top_p/top_k, repetition_penalty, etc.
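As a minimal sketch (the parameter values below are assumptions for illustration, not the settings used by the authors), these options can be passed to `model.generate`:

```python
# Illustrative sampling settings; tune them to trade off coherence vs. diversity.
generation_kwargs = {
    "do_sample": True,
    "temperature": 0.8,         # <1.0 sharper/safer, >1.0 more diverse
    "top_p": 0.9,               # nucleus sampling: keep smallest token set with cum. prob >= 0.9
    "top_k": 50,                # restrict sampling to the 50 most likely tokens
    "repetition_penalty": 1.2,  # >1.0 discourages repeating tokens
    "max_new_tokens": 512,
}

# outputs = model.generate(**inputs, **generation_kwargs)
```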


## Training Details

### Training Data

<!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

GitHub repository for the dataset: https://github.com/kevalnagda/StoryGeneration


## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

### Testing Data, Factors & Metrics


We evaluate on the test split of the same dataset.

#### Metrics

<!-- These are the evaluation metrics being used, ideally with a description of why. -->

We use perplexity and BERTScore: perplexity measures how well the model predicts the test text, while BERTScore measures semantic similarity between generated and reference stories.

### Results

Perplexity: 8.0546

BERTScore: 80.11
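For reference, perplexity is the exponential of the mean cross-entropy loss per token, so the reported value of 8.0546 corresponds to a mean test loss of roughly 2.086 nats per token (the loss value here is back-computed for illustration):

```python
import math

def perplexity_from_loss(mean_nll: float) -> float:
    """Perplexity is exp of the mean negative log-likelihood per token."""
    return math.exp(mean_nll)

# A mean test loss of ~2.0862 nats/token yields the reported perplexity.
print(perplexity_from_loss(2.0862))  # ~8.054
```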

## Training procedure


The following `bitsandbytes` quantization config was used during training:
- quant_method: bitsandbytes
- load_in_8bit: False
- load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: fp4
- bnb_4bit_use_double_quant: False
- bnb_4bit_compute_dtype: float32
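Assuming the Transformers integration of `bitsandbytes`, the settings above correspond roughly to the following configuration (a sketch; argument names follow `transformers.BitsAndBytesConfig`):

```python
import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="fp4",        # plain 4-bit floats (not nf4)
    bnb_4bit_use_double_quant=False,  # no nested quantization of the quant constants
    bnb_4bit_compute_dtype=torch.float32,
)

# model = AutoModelForCausalLM.from_pretrained(
#     "meta-llama/Llama-2-7b-hf", quantization_config=bnb_config
# )
```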

### Framework versions


- PEFT 0.6.0.dev0