beyoru/MCQ-o1-512

@@ -1,23 +1,65 @@
 ---
-base_model: unsloth/Qwen2.5-3B-Instruct
 tags:
 - text-generation-inference
 - transformers
-- unsloth
 - qwen2
 - trl
 - sft
 license: apache-2.0
 language:
 - en
 ---
 # Uploaded  model
 - **Developed by:** beyoru
 - **License:** apache-2.0
-- **Finetuned from model :** unsloth/Qwen2.5-3B-Instruct
-This qwen2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
-[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

 ---
+base_model:
+- Qwen/Qwen2.5-3B-Instruct
 tags:
 - text-generation-inference
 - transformers
 - qwen2
 - trl
 - sft
 license: apache-2.0
 language:
 - en
+- vi
+datasets:
+- beyoru/Tin_hoc_mcq
 ---
 # Uploaded  model
 - **Developed by:** beyoru
 - **License:** apache-2.0
+# Usage
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+model_name = "beyoru/MCQ-o1-512"
+model = AutoModelForCausalLM.from_pretrained(
+    model_name,
+    torch_dtype="auto",
+    device_map="auto"
+)
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+messages = [
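+    # System prompt (Vietnamese): "You are an intelligent assistant that can generate a multiple-choice question from any context"
+    {"role": "system", "content": "Bạn là một trợ lý thông minh có khả năng tạo ra một câu hỏi trắc nghiệm từ bất kỳ ngữ cảnh"},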
+    {"role": "user", "content": "<YOUR CONTEXT>"}
+]
+text = tokenizer.apply_chat_template(
+    messages,
+    tokenize=False,
+    add_generation_prompt=True
+)
+model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
+generated_ids = model.generate(
+    **model_inputs,
+    do_sample=True
+)
+generated_ids = [
+    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
+]
+response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
+```
+# Notes:
+- For small datasets with narrow content, where the model already performs well in our domain and we don't want it to forget that knowledge, it is enough to focus on o.
+- Fine-tuned with LoRA: rank = 1, alpha = 64, epoch = 1, linear (optim)
+- DoRA
+# Improvement
+- Increasing the rank may help the model produce a more robust output structure.
+- Try more efficient fine-tuning methods.
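
To make the trade-off between the noted settings (rank = 1, alpha = 64) and a higher rank concrete, here is a minimal sketch of the standard LoRA arithmetic. It is not the author's training code. The layer shape (a square 2048-wide projection) and the layer count (36) are assumptions chosen for illustration, and `LoraSetup` and `compare_ranks` are hypothetical helper names. The extra magnitude parameters that DoRA adds are not counted.

```python
from dataclasses import dataclass


@dataclass
class LoraSetup:
    rank: int      # LoRA rank r
    alpha: int     # LoRA alpha
    d_in: int      # input width of the adapted projection (assumed)
    d_out: int     # output width of the adapted projection (assumed)
    n_layers: int  # number of adapted layers (assumed)

    @property
    def scaling(self) -> float:
        # Standard LoRA multiplies the low-rank update B @ A by alpha / rank.
        return self.alpha / self.rank

    @property
    def adapter_params(self) -> int:
        # A is (rank x d_in) and B is (d_out x rank), one pair per adapted layer.
        return self.n_layers * self.rank * (self.d_in + self.d_out)


# Assumed shapes, for illustration only.
HIDDEN, LAYERS = 2048, 36


def compare_ranks(ranks, alpha=64):
    """Build one LoraSetup per rank, keeping alpha fixed."""
    return {r: LoraSetup(r, alpha, HIDDEN, HIDDEN, LAYERS) for r in ranks}


setups = compare_ranks([1, 4, 16])
for r, s in setups.items():
    print(f"rank={r:>2}  scaling={s.scaling:>5.1f}  adapter_params={s.adapter_params}")
```

With alpha held at 64, raising the rank adds trainable parameters linearly while shrinking the per-update scaling factor, so alpha may need to be retuned together with the rank.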