iansotnek committed
Commit ea2ac33
1 Parent(s): 60062a7

update to use instruct_pipeline

Files changed (1)
  1. README.md +26 -66
README.md CHANGED
@@ -42,87 +42,47 @@ Just as with any other LLM, we advise users of this technology to exercise good
 
 ## Usage
 
- The code below shows how to use `chopt-2_7b` in the way which it was trained. While the model can be used "out of the box" using the
- `transformers` library, using the function defined below to create a response from the model will achieve better results.
-
- ### Load Model and Tokenizer from this Repository Using the `transformers` Package
 
 ```python
- from transformers import AutoModelForCausalLM, AutoTokenizer
- import numpy as np
- import re
-
- model_id = 'aisquared/chopt-2_7b'
-
- tokenizer = AutoTokenizer.from_pretrained(model_id, padding_side = 'left')
- model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code = True, device_map = 'auto')
 ```
 
-
- ### Create the Prompt Format and Other Variables
 
 ```python
- PROMPT = """Below is an instruction that describes a task. Write a response that appropriately completes the request.
-
- ### Instruction:
- {instruction}
-
- ### Response:
- """
-
- END_KEY = '### End'
- RESPONSE_KEY = '### Response:\n'
 ```
 
-
- ### Create a Function to Retrieve a Response
 
 ```python
- def create_response(
-         instruction,
-         model,
-         tokenizer,
-         do_sample = True,
-         max_new_tokens = 256,
-         top_p = 0.92,
-         top_k = 0,
-         **kwargs
- ):
-     """
-     Create a response from the model by using a formatted prompt
-     """
-     input_ids = tokenizer(
-         PROMPT.format(instruction=instruction), return_tensors="pt"
-     ).input_ids
-
-     gen_tokens = model.generate(
-         input_ids,
-         pad_token_id=tokenizer.pad_token_id,
-         do_sample=do_sample,
-         max_new_tokens=max_new_tokens,
-         top_p=top_p,
-         top_k=top_k,
-         **kwargs,
-     )
-     decoded = tokenizer.batch_decode(gen_tokens)[0]
-
-     # The response appears after "### Response:". The model has been trained to append "### End" at the end.
-     m = re.search(r"#+\s*Response:\s*(.+?)#+\s*End", decoded, flags=re.DOTALL)
-
-     response = None
-     if m:
-         response = m.group(1).strip()
-     else:
-         # The model might not generate the "### End" sequence before reaching the max tokens. In this case, return
-         # everything after "### Response:".
-         m = re.search(r"#+\s*Response:\s*(.+)", decoded, flags=re.DOTALL)
-         if m:
-             response = m.group(1).strip()
-         else:
-             pass
-     return response
  ```
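
The removed section stops at the function definition and never shows a call. Purely for illustration, a sketch of how the helper would have been invoked, based only on the signature above and reusing the instruction from the updated example:

```python
# Illustrative sketch only: invoke the removed helper using its signature above.
# The instruction string is borrowed from the updated README's example.
response = create_response("Who was George Washington?", model=model, tokenizer=tokenizer)
print(response)
```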
 
 ### Model Performance Metrics
 
 We present the results from various model benchmarks on the EleutherAI LLM Evaluation Harness for all models in the ChOPT family.
 
 
 ## Usage
 
+ To use the model with the `transformers` library on a machine with GPUs, first make sure you have the `transformers` and `accelerate` libraries installed.
+ From your terminal, run:
 
 ```python
+ pip install "accelerate>=0.16.0,<1" "transformers[torch]>=4.28.1,<5" "torch>=1.13.1,<2"
 ```
 
+ The instruction following pipeline can be loaded using the `pipeline` function as shown below. This loads a custom `InstructionTextGenerationPipeline`
+ found in the model repo [here](https://huggingface.co/aisquared/chopt-2_7b/blob/main/instruct_pipeline.py), which is why `trust_remote_code=True` is required.
+ Including `torch_dtype=torch.bfloat16` is generally recommended if this type is supported in order to reduce memory usage. It does not appear to impact output quality.
+ It is also fine to remove it if there is sufficient memory.
 
 ```python
+ from transformers import pipeline
+ import torch
 
+ generate_text = pipeline(model="aisquared/chopt-2_7b", torch_dtype=torch.bfloat16, trust_remote_code=True, device_map="auto")
+ ```
 
+ You can then use the pipeline to answer instructions:
 
+ ```python
+ res = generate_text("Who was George Washington?")
+ print(res[0]["generated_text"])
 ```
 
+ Alternatively, if you prefer to not use `trust_remote_code=True` you can download [instruct_pipeline.py](https://huggingface.co/aisquared/chopt-2_7b/blob/main/instruct_pipeline.py),
+ store it alongside your notebook, and construct the pipeline yourself from the loaded model and tokenizer:
 
 ```python
+ from instruct_pipeline import InstructionTextGenerationPipeline
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch
+
+ tokenizer = AutoTokenizer.from_pretrained("aisquared/chopt-2_7b", padding_side="left")
+ model = AutoModelForCausalLM.from_pretrained("aisquared/chopt-2_7b", device_map="auto", torch_dtype=torch.bfloat16)
+
+ generate_text = InstructionTextGenerationPipeline(model=model, tokenizer=tokenizer)
 ```
 
+
 ### Model Performance Metrics
 
  We present the results from various model benchmarks on the EleutherAI LLM Evaluation Harness for all models in the ChOPT family.
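
The manually constructed pipeline is then used in the same way as the `trust_remote_code` version above. A minimal sketch, assuming the same call signature and the same list-of-dicts return format shown in the earlier example:

```python
# Minimal usage sketch for the manually constructed pipeline. Assumes it behaves
# like the remote-code pipeline above, returning a list of dicts with a
# "generated_text" key.
res = generate_text("Who was George Washington?")
print(res[0]["generated_text"])
```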