jefson08 committed on
Commit 7910dc4 · verified · 1 Parent(s): e621bc9

Update README.md

Files changed (1)
  1. README.md +142 -3
README.md CHANGED
@@ -1,3 +1,142 @@
- ---
- license: apache-2.0
- ---
# **Khasi Fill-Mask Model**

This project demonstrates how to use the Hugging Face Transformers library to perform a fill-mask task using the **`jefson08/kha-roberta`** model. The fill-mask task predicts the most likely token(s) to replace the `[MASK]` token in a given sentence.

---

## **Setup**

### **1. Clone the Repository**

```bash
git clone https://github.com/your-username/khasi-fill-mask.git
cd khasi-fill-mask
```

### **2. Install Dependencies**

Ensure you have Python 3.7 or later installed, then install the required libraries:

```bash
pip install transformers torch
```
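
After installing, it can help to confirm that both packages are importable before loading the model. The following is an illustrative check only, not part of the original project: the helper name `missing_packages` is hypothetical, and it relies solely on the standard library.

```python
import importlib.util


def missing_packages(names):
    """Return the names in `names` that the import system cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The two packages this README installs with pip.
    missing = missing_packages(["transformers", "torch"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```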

If you intend to use GPU acceleration, ensure CUDA is installed on your system and that your PyTorch build is compatible with it.

---

## **Usage**

### **1. Import Dependencies**

```python
from transformers import pipeline, AutoTokenizer
```

### **2. Initialize the Model and Tokenizer**

Load the tokenizer and model pipeline:

```python
# Initialisation
tokenizer = AutoTokenizer.from_pretrained('jefson08/kha-roberta')
fill_mask = pipeline(
    "fill-mask",
    model="jefson08/kha-roberta",
    tokenizer=tokenizer,
    device="cuda",  # Requires a CUDA-capable GPU; omit this argument to run on CPU
)
```
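
Passing `device="cuda"` fails on a machine without a usable GPU. One way to avoid editing the script per machine is to pick the device at runtime. This is a minimal sketch under stated assumptions: `pick_device` is a hypothetical helper, and it returns the integer form of the pipeline's `device` argument (`0` for the first GPU, `-1` for CPU) rather than the string used above.

```python
def pick_device():
    """Return a pipeline device index: 0 for the first CUDA GPU, -1 for CPU."""
    try:
        import torch
    except ImportError:
        return -1  # PyTorch is not installed, so fall back to the CPU index
    return 0 if torch.cuda.is_available() else -1


device = pick_device()
print("Using GPU 0" if device == 0 else "Using CPU")

# Assumed usage with the pipeline from this README (this downloads the model):
# fill_mask = pipeline("fill-mask", model="jefson08/kha-roberta", device=device)
```

The model load is left commented out so the snippet runs without network access.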

### **3. Predict the [MASK] Token**

Provide a sentence with a `[MASK]` token for prediction:

```python
# Predict [MASK] token
sentence = "Nga dei u briew u ba [MASK] bha."
predictions = fill_mask(sentence)

# Display predictions
for prediction in predictions:
    print(f"{prediction['sequence']} (score: {prediction['score']:.4f})")
```
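
The sentence above hard-codes `[MASK]`. Mask tokens differ between tokenizers (the stock RoBERTa tokenizer, for example, uses `<mask>`), so it can be safer to read the token from the loaded tokenizer instead of typing it. A minimal sketch, assuming a hypothetical helper `build_masked_sentence` and a `{mask}` placeholder convention, neither of which is part of the original project:

```python
PLACEHOLDER = "{mask}"  # hypothetical placeholder convention for this sketch


def build_masked_sentence(template, mask_token):
    """Replace the single {mask} placeholder in `template` with `mask_token`."""
    if template.count(PLACEHOLDER) != 1:
        raise ValueError("template must contain exactly one {mask} placeholder")
    return template.replace(PLACEHOLDER, mask_token)


# "[MASK]" is the token this README uses for jefson08/kha-roberta.
sentence = build_masked_sentence("Nga dei u briew u ba {mask} bha.", "[MASK]")
print(sentence)

# With a loaded pipeline, the token could be read from its tokenizer instead:
# sentence = build_masked_sentence(
#     "Nga dei u briew u ba {mask} bha.", fill_mask.tokenizer.mask_token
# )
```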

---

## **Example Output**

Given the input sentence:

```plaintext
"Nga dei u briew u ba [MASK] bha."
```

The pipeline returns a list of candidate dictionaries, ordered by score. The raw `predictions` value might look like this:

```plaintext
[{'score': 0.09230164438486099,
  'token': 6086,
  'token_str': 'mutlop',
  'sequence': 'Nga dei u briew u ba mutlop bha.'},
 {'score': 0.051360130310058594,
  'token': 2059,
  'token_str': 'stad',
  'sequence': 'Nga dei u briew u ba stad bha.'},
 {'score': 0.045497000217437744,
  'token': 1864,
  'token_str': 'khuid',
  'sequence': 'Nga dei u briew u ba khuid bha.'},
 {'score': 0.04180142655968666,
  'token': 668,
  'token_str': 'kham',
  'sequence': 'Nga dei u briew u ba kham bha.'},
 {'score': 0.027332570403814316,
  'token': 2817,
  'token_str': 'khlaiñ',
  'sequence': 'Nga dei u briew u ba khlaiñ bha.'}]
```
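
The `score` field is a probability. In the Transformers fill-mask pipeline it is understood to come from a softmax over the model's vocabulary logits at the masked position, after which the highest-scoring candidates are kept. The sketch below reproduces that idea in plain Python on a toy example. The vocabulary and logits are invented for illustration and are not outputs of `jefson08/kha-roberta`; `softmax` and `top_k` are hypothetical helper names.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probs, vocab, k):
    """Return the k (token, probability) pairs with the highest probability."""
    ranked = sorted(zip(vocab, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Toy vocabulary and logits, invented purely for illustration.
vocab = ["stad", "khuid", "kham", "bha"]
logits = [2.0, 1.0, 0.5, -1.0]

probs = softmax(logits)
for token, prob in top_k(probs, vocab, 2):
    print(f"{token}: {prob:.4f}")
```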

---

## **Model Information**

The `jefson08/kha-roberta` model is fine-tuned for Khasi text tasks. Used through the fill-mask pipeline, it predicts the token hidden behind `[MASK]` from the surrounding words, which gives a quick view of how well it captures Khasi context.

---

## **Project Structure**

```plaintext
├── README.md   # Documentation
└── example.py  # Example script for fill-mask task
```

---

## **Dependencies**

- [Transformers](https://huggingface.co/docs/transformers): Provides the pipeline and model-loading utilities.
- [PyTorch](https://pytorch.org/): Backend framework for running the model.

Install the dependencies with:

```bash
pip install transformers torch
```

---

## **Acknowledgements**

- Hugging Face [Transformers](https://huggingface.co/docs/transformers) library.
- Model by [jefson08](https://huggingface.co/jefson08/kha-roberta).

---

## **License**

This project is licensed under the MIT License. See the [LICENSE](./LICENSE) file for more details.

---