Commit ba01de7 by Mrw33554432
Parent(s): 1ed66f8
Update README.md
README.md
CHANGED
@@ -24,9 +24,9 @@ The model is trained on a 3090(24GB) for 16 hours.
 
 ### For training code, check --placeholder--.
 
-The training code should be compatible with most of the LLMs in huggingface
+The training code should be compatible with most of the LLMs in huggingface.
 
-Using pretrained model weight will not work due to gradient explosion.
+Using pretrained model weight (normal models) for training will not work due to gradient explosion.
 
 ## Sample inference code
 