Update README.md

README.md CHANGED

@@ -10,9 +10,9 @@ Llama-13B converted from official [Llama-13B](https://github.com/facebookresearch/llama)

This is updated from [decapoda-research/llama-13b-hf](https://huggingface.co/decapoda-research/Llama-13b-hf) to include the following changes (since many pull requests have not yet been merged in decapoda's repo, I opened a new repo here):

-(1) The naming changes (LLaMA -> Llama) to best fit for `transformers` naming rule, in both `LlamaForCausalLM` and `LlamaTokenizer`. This works perfectly for `transformers
-(2) The model checkpoints are saved in
+(1) The naming changes (LLaMA -> Llama) to best fit the `transformers` naming rule, in both `LlamaForCausalLM` and `LlamaTokenizer`. This works with `transformers>=4.28.0`.
+(2) The model checkpoints are saved in 3 shards (instead of 61 shards in [decapoda-research/Llama-13b-hf](https://huggingface.co/decapoda-research/Llama-13b-hf)). Fewer shards speed up loading from disk.

---
license: other
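
For anyone verifying change (1), here is a minimal loading sketch with the renamed classes. The repo id below is a placeholder (the README does not state this repo's id), and `transformers>=4.28.0` is assumed per the note above:

```python
# Minimal sketch: load the converted checkpoint with the renamed classes.
# "your-org/llama-13b-hf" is a placeholder repo id, not confirmed by this README.
from transformers import LlamaForCausalLM, LlamaTokenizer

repo_id = "your-org/llama-13b-hf"  # placeholder; substitute this repo's actual id

tokenizer = LlamaTokenizer.from_pretrained(repo_id)
model = LlamaForCausalLM.from_pretrained(repo_id)  # loads all shards transparently

inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```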
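
On change (2): sharded checkpoints are read shard by shard, so fewer, larger files mean fewer open/read round-trips when loading from disk. A hedged sketch of how a coarser shard layout can be produced with `save_pretrained`; the `max_shard_size` value is an assumption, since the README only states the resulting shard count:

```python
# Hedged sketch: re-save a checkpoint in fewer shards via max_shard_size.
# The "10GB" value is an assumption; the README only reports the result
# (3 shards instead of 61), not the setting used to produce it.
from transformers import LlamaForCausalLM

model = LlamaForCausalLM.from_pretrained("your-org/llama-13b-hf")  # placeholder id
model.save_pretrained("./llama-13b-resharded", max_shard_size="10GB")
```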