instruction-pretrain committed on
Commit
91c109a
1 Parent(s): 2e7b4f8

Update README.md

Files changed (1):
  1. README.md +7 -8
README.md CHANGED
@@ -128,16 +128,15 @@ We simply discard the system prompts.
 
 **To put it all together, the text before tokenization looks like this:**
 
-`general_instruction_response_text = "<|begin_of_text|>{question} {response}<|end_of_text|>"`
-
-or
-
-`instruction_augmented_text = "<|begin_of_text|>{instruction augmented text}<|end_of_text|>"`
+```python
+general_instruction_response_text = "<|begin_of_text|>{question} {response}<|end_of_text|>"
+instruction_augmented_text = "<|begin_of_text|>{instruction augmented text}<|end_of_text|>"
+```
 
 Then, for tokenization, you don't need to add BOS and EOS token ids. The tokenization code looks like this:
-
-`text_ids = tokenizer(text, add_special_tokens=False, **kwargs).input_ids`
-
+```python
+text_ids = tokenizer(text, add_special_tokens=False, **kwargs).input_ids
+```
 
 ## Citation
 If you find our work helpful, please cite us:
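To make the two templates concrete, here is a minimal sketch as plain string formatting. The helper function names are illustrative and not part of the original repository; only the special-token strings and the layout come from the README text above.

```python
# Minimal sketch of the two pre-tokenization templates described above.
# The special-token strings follow the Llama-3 convention shown in the README;
# the helper function names here are hypothetical, for illustration only.
BOS, EOS = "<|begin_of_text|>", "<|end_of_text|>"

def general_instruction_response_text(question: str, response: str) -> str:
    # General instruction-response pair: question and response joined by a space.
    return f"{BOS}{question} {response}{EOS}"

def instruction_augmented_text(text: str) -> str:
    # Instruction-augmented corpus text, wrapped as-is.
    return f"{BOS}{text}{EOS}"

example = general_instruction_response_text("What is 2+2?", "4")
print(example)  # → <|begin_of_text|>What is 2+2? 4<|end_of_text|>

# Because BOS/EOS are already embedded in the string, the tokenizer is later
# called with add_special_tokens=False so they are not inserted a second time.
```

Since the templates already carry the special tokens as literal text, skipping `add_special_tokens` at tokenization time is what keeps the sequence from gaining a duplicate BOS.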