Update README.md
Commit 5751faf · JosephusCheung · 1 Parent(s): 34a4df0
README.md CHANGED
@@ -38,6 +38,7 @@ tags:
 Use the transformers library that does not require remote/external code to load the model, AutoModelForCausalLM and AutoTokenizer (or manually specify LlamaForCausalLM to load LM, GPT2Tokenizer to load Tokenizer), and model quantization is fully compatible with GGUF (llama.cpp), GPTQ, and AWQ.
 
 # Friendly reminder: If your VRAM is insufficient, you should use the 7B model instead of the quantized version.
+Compared to the quantized versions, the 7B version and the 14B version demonstrate a high level of consistency.
 
 **llama.cpp GGUF models**
 GPT2Tokenizer fixed by [Kerfuffle](https://github.com/KerfuffleV2) on [https://github.com/ggerganov/llama.cpp/pull/3743](https://github.com/ggerganov/llama.cpp/pull/3743), new models are now reuploaded.
@@ -113,6 +114,7 @@ We are currently unable to produce accurate benchmark templates for non-QA tasks
 Load the model with the transformers library, which requires no remote/external code, using AutoModelForCausalLM and AutoTokenizer (or manually specify LlamaForCausalLM to load the LM and GPT2Tokenizer to load the tokenizer); model quantization is fully compatible with GGUF (llama.cpp), GPTQ, and AWQ.
 
 # Friendly reminder: If your VRAM is insufficient, you should use the 7B model instead of the quantized version.
+Compared to the quantized versions, the 7B version and the 14B version are highly consistent.
 
 **llama.cpp GGUF models**
 GPT2Tokenizer support was fixed by [Kerfuffle](https://github.com/KerfuffleV2) in [https://github.com/ggerganov/llama.cpp/pull/3743](https://github.com/ggerganov/llama.cpp/pull/3743); new models will be uploaded later.
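The README text in this diff names two ways to load the model with the transformers library and no remote code: the Auto classes, or LlamaForCausalLM with GPT2Tokenizer specified by hand. Below is a minimal sketch of both paths. The function names and the `model_id` parameter are hypothetical, and the diff does not state a repository ID, so none is filled in here. `trust_remote_code=False` is already the library default and is written out only to make the "no remote code" point visible.

```python
def load_with_auto_classes(model_id: str):
    """Load model and tokenizer via the Auto classes, without remote code.

    `model_id` is a placeholder for a Hub repository ID or a local path;
    the diff above does not name one.
    """
    # Imported lazily so that defining this helper does not require
    # transformers to be installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=False)
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=False)
    return model, tokenizer


def load_with_explicit_classes(model_id: str):
    """Load the same checkpoint by naming the classes explicitly.

    This mirrors the README's alternative: LlamaForCausalLM for the
    language model and GPT2Tokenizer for the tokenizer.
    """
    from transformers import GPT2Tokenizer, LlamaForCausalLM

    tokenizer = GPT2Tokenizer.from_pretrained(model_id)
    model = LlamaForCausalLM.from_pretrained(model_id)
    return model, tokenizer


# Example call (not executed here; substitute a real repository ID or path):
# model, tokenizer = load_with_auto_classes("<repository-id-or-local-path>")
```

Either helper returns a `(model, tokenizer)` pair. The explicit variant is useful mainly when you want to be certain which classes are instantiated rather than relying on the checkpoint's configuration.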