lordjia commited on
Commit
a1d0285
·
verified ·
1 Parent(s): 1a997f3

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +49 -1
README.md CHANGED
@@ -11,4 +11,52 @@ tags:
11
  - Cantonese
12
  - Qwen2
13
  - chat
14
- ---
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
11
  - Cantonese
12
  - Qwen2
13
  - chat
14
+ ---
15
+
16
+ # Qwen2-Cantonese-7B-Instruct
17
+
18
+ ## Model Overview
19
+
20
+ Qwen2-Cantonese-7B-Instruct is a Cantonese language model based on Qwen2-7B-Instruct, fine-tuned using LoRA. It aims to enhance Cantonese text generation and comprehension capabilities, supporting various tasks such as dialogue generation, text summarization, and question-answering.
21
+
22
+ ## Model Features
23
+
24
+ - **Base Model**: Qwen2-7B-Instruct
25
+ - **Fine-tuning Method**: LoRA instruction tuning
26
+ - **Training Steps**: 4572 steps
27
+ - **Primary Language**: Cantonese
28
+ - **Datasets**:
29
+ - [jed351/cantonese-wikipedia](https://huggingface.co/datasets/jed351/cantonese-wikipedia)
30
+ - [raptorkwok/cantonese-traditional-chinese-parallel-corpus](https://huggingface.co/datasets/raptorkwok/cantonese-traditional-chinese-parallel-corpus)
31
+
32
+ ## Usage
33
+
34
+ You can easily load and use this model with Hugging Face's Transformers library. Here is a simple example:
35
+
36
+ ```python
37
+ from transformers import AutoModelForCausalLM, AutoTokenizer
38
+
39
+ tokenizer = AutoTokenizer.from_pretrained("lordjia/Qwen2-Cantonese-7B-Instruct")
40
+ model = AutoModelForCausalLM.from_pretrained("lordjia/Qwen2-Cantonese-7B-Instruct")
41
+
42
+ input_text = "唔該你用廣東話講下你係邊個。"
43
+ inputs = tokenizer(input_text, return_tensors="pt")
44
+ outputs = model.generate(**inputs)
45
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
46
+ ```
47
+
48
+ ## Quantized Version
49
+
50
+ A 4-bit quantized version of this model is also available: [qwen2-cantonese-7b-instruct-q4_0.gguf](https://huggingface.co/lordjia/Qwen2-Cantonese-7B-Instruct/blob/main/qwen2-cantonese-7b-instruct-q4_0.gguf).
51
+
52
+ ## License
53
+
54
+ This model is licensed under the Apache 2.0 license. Please review the terms before use.
55
+
56
+ ## Contributors
57
+
58
+ - LordJia
59
+
60
+ ## Acknowledgements
61
+
62
+ Thanks to Hugging Face for providing the platform and tools, and to all the developers and researchers contributing to the open-source community.