Commit 7b05d2d (verified) by mohamedemam · 1 parent: f25cbbf

Update README.md

Files changed (1): README.md (+87 −69)

README.md:

---
language:
- en
license: gpl
tags:
- autograding
- essay question
- sentence similarity
metrics:
- accuracy
library_name: peft
datasets:
- mohamedemam/Essay-quetions-auto-grading
---

# Model Card for Model ID

<!-- Provide a quick summary of what the model is/does. -->

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

We are thrilled to introduce our graduation project, the EM5 model, designed for automated essay grading in both Arabic and English. 📝✨

To develop this model, we first created a custom dataset for training.

Our system builds on the following models, which achieved these accuracies:

- Mistral: 96%
- LLaMA: 93%
- FLAN-T5: 93%
- BLOOMZ (Arabic): 86%
- MT0 (Arabic): 84%

You can try our models for auto-grading on Hugging Face! 🌐

We then deployed these models for practical use.

#MachineLearning #AI #Education #EssayGrading #GraduationProject

- **Developed by:** Mohamed Emam
- **Model type:** decoder-only
- **Language(s) (NLP):** English
- **License:** GPL
- **Finetuned from model:** Llama-2 (NousResearch/Llama-2-7b-hf)

<!-- Provide the basic links for the model. -->

- **Repository:** https://github.com/mohamed-em2m/auto-grading

### Direct Use

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

Auto-grading for essay questions.
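
For a quick test without the full pipeline class shown later, here is a minimal inference sketch. It assumes the base checkpoint and adapter referenced elsewhere in this card; the context, question, and answer strings are made-up placeholders.

```python
# Minimal inference sketch (not the card's own pipeline): load the LoRA adapter
# on top of the Llama-2 base model and grade one student answer.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

base_model = AutoModelForCausalLM.from_pretrained("NousResearch/Llama-2-7b-hf")
tokenizer = AutoTokenizer.from_pretrained("NousResearch/Llama-2-7b-hf")
model = PeftModel.from_pretrained(base_model, "mohamedemam/Em2-llama-7b")

# Prompt layout follows the chat_Format function shown later in this card.
prompt = (
    "Instruction:/n check answer is true or false of next question using context below:\n"
    "#context: Water boils at 100 degrees Celsius at sea level.\n"
    "#question: At what temperature does water boil at sea level?\n"
    "#student answer: It boils at 100 degrees Celsius.\n"
    "#response:"
)
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=4, num_beams=2, do_sample=False)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
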
### Downstream Use [optional]

<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

Text generation

[More Information Needed]

### Training Data

- **mohamedemam/Essay-quetions-auto-grading-arabic**
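
The dataset is hosted on the Hugging Face Hub, so it can be inspected directly. A minimal sketch; the `train` split name is an assumption:

```python
# Sketch: load and inspect the grading dataset from the Hugging Face Hub.
from datasets import load_dataset

ds = load_dataset("mohamedemam/Essay-quetions-auto-grading-arabic")
print(ds)              # available splits and sizes
print(ds["train"][0])  # one example record (assumes a "train" split)
```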
 
### Training Procedure

Fine-tuned with the TRL library; a configuration sketch is given at the end of this card.

### Pipeline

```python
from transformers import Pipeline
import torch.nn.functional as F

# ... (the MyPipeline class definition and the example context/question/answer are omitted in this excerpt) ...

base_model = AutoModelForCausalLM.from_pretrained("NousResearch/Llama-2-7b-hf")
model = PeftModel.from_pretrained(base_model, "mohamedemam/Em2-llama-7b")
pipe = MyPipeline(model, tokenizer)
print(pipe(context, quetion, answer, generate=True, max_new_tokens=4, num_beams=2, do_sample=False, num_return_sequences=1))
```
- **output:** {'response': ["Instruction:/n check answer is true or false of next quetion using context below:\n#context: Large language models, such as GPT-4, are trained on vast amounts of text data to understand and generate human-like text. The deployment of these models involves several steps:\n\n Model Selection: Choosing a pre-trained model that fits the application's needs.\n Infrastructure Setup: Setting up the necessary hardware and software infrastructure to run the model efficiently, including cloud services, GPUs, and necessary libraries.\n Integration: Integrating the model into an application, which can involve setting up APIs or embedding the model directly into the software.\n Optimization: Fine-tuning the model for specific tasks or domains and optimizing it for performance and cost-efficiency.\n Monitoring and Maintenance: Ensuring the model performs well over time, monitoring for biases, and updating the model as needed..\n#quetion: What are the key considerations when choosing a cloud service provider for deploying a large language model like GPT-4?.\n#student answer: When choosing a cloud service provider for deploying a large language model like GPT-4, the key considerations include:\n Compute Power: Ensure the provider offers high-performance GPUs or TPUs capable of handling the computational requirements of the model.\n Scalability: The ability to scale resources up or down based on the application's demand to handle varying workloads efficiently.\n Cost: Analyze the pricing models to understand the costs associated with compute time, storage, data transfer, and any other services.\n Integration and Support: Availability of tools and libraries that support easy integration of the model into your applications, along with robust technical support and documentation.\n Security and Compliance: Ensure the provider adheres to industry standards for security and compliance, protecting sensitive data and maintaining privacy.\n Latency and Availability: Consider the geographical distribution of data centers to ensure low latency and high availability for your end-users.\n\nBy evaluating these factors, you can select a cloud service provider that aligns with your deployment needs, ensuring efficient and cost-effective operation of your large language model..\n#response: true the answer is"], 'true': 0.943033754825592}

### Chat Format Function

This function formats the input context, question, and answer into a specific structure for the model to process.

```python
def chat_Format(self, context, question, answer):
    # Prompt layout: instruction, #context, #question, #student answer, then "#response:" for the model to complete.
    return ("Instruction:/n check answer is true or false of next question using context below:\n"
            + "#context: " + context + ".\n#question: " + question
            + ".\n#student answer: " + answer + ".\n#response:")
```
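
The scoring step of `MyPipeline` is not shown in full above. As a rough, hypothetical sketch (the helper name, token handling, and details are assumptions, not the card's actual code), the `'true'` probability in the output could be obtained by comparing the logits of the `true` and `false` continuations with a softmax:

```python
import torch
import torch.nn.functional as F

def true_probability(model, tokenizer, prompt):
    """Hypothetical helper: probability that the token following the prompt is 'true'
    rather than 'false'. This mirrors, but is not guaranteed to match, MyPipeline."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]  # logits for the next token
    # Heuristic: take the last sub-token of each word as its token id.
    true_id = tokenizer(" true", add_special_tokens=False).input_ids[-1]
    false_id = tokenizer(" false", add_special_tokens=False).input_ids[-1]
    probs = F.softmax(torch.stack([logits[true_id], logits[false_id]]), dim=0)
    return probs[0].item()
```
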
## Configuration

### Dropout Probability for LoRA Layers

- **lora_dropout:** 0.05

### Quantization Settings

- **use_4bit:** True
- **bnb_4bit_compute_dtype:** "float16"
- **bnb_4bit_quant_type:** "nf4"
- **use_nested_quant:** False
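
These settings map onto a `bitsandbytes` quantization config roughly as follows (a sketch, not the card's original training script):

```python
# Sketch: the quantization settings above expressed as a BitsAndBytesConfig.
import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # use_4bit
    bnb_4bit_compute_dtype=torch.float16,  # bnb_4bit_compute_dtype
    bnb_4bit_quant_type="nf4",             # bnb_4bit_quant_type
    bnb_4bit_use_double_quant=False,       # use_nested_quant
)
```
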
### Output Directory

- **output_dir:** "./results"

### Training Parameters

- **num_train_epochs:** 1
- **fp16:** False
- **bf16:** False
- **per_device_train_batch_size:** 1
- **per_device_eval_batch_size:** 4
- **gradient_accumulation_steps:** 8
- **gradient_checkpointing:** True
- **max_grad_norm:** 0.3
- **learning_rate:** 5e-5
- **weight_decay:** 0.001
- **optim:** "paged_adamw_8bit"
- **lr_scheduler_type:** "constant"
- **max_steps:** -1
- **warmup_ratio:** 0.03
- **group_by_length:** True

### Logging and Saving

- **save_steps:** 100
- **logging_steps:** 25
- **max_seq_length:** False
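
Putting the pieces together, a TRL fine-tuning run with the hyperparameters listed above could look roughly like the sketch below. The LoRA rank/alpha, target modules, and dataset text field are assumptions (they are not listed in this card), and argument names may differ between TRL versions.

```python
# Sketch of the fine-tuning setup: QLoRA + TRL's SFTTrainer with the listed hyperparameters.
# LoRA r/alpha and dataset_text_field below are assumptions, not values from this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig
from datasets import load_dataset
from trl import SFTTrainer

dataset = load_dataset("mohamedemam/Essay-quetions-auto-grading-arabic", split="train")
tokenizer = AutoTokenizer.from_pretrained("NousResearch/Llama-2-7b-hf")
model = AutoModelForCausalLM.from_pretrained(
    "NousResearch/Llama-2-7b-hf",
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.float16,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=False,
    ),
)

peft_config = LoraConfig(lora_dropout=0.05, r=16, lora_alpha=32, task_type="CAUSAL_LM")

args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=1,
    fp16=False,
    bf16=False,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=8,
    gradient_checkpointing=True,
    max_grad_norm=0.3,
    learning_rate=5e-5,
    weight_decay=0.001,
    optim="paged_adamw_8bit",
    lr_scheduler_type="constant",
    max_steps=-1,
    warmup_ratio=0.03,
    group_by_length=True,
    save_steps=100,
    logging_steps=25,
)

trainer = SFTTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    peft_config=peft_config,
    tokenizer=tokenizer,
    dataset_text_field="text",  # assumed column name in the dataset
)
trainer.train()
```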