ChatGLM3-6B-GGML
Introduction
ChatGLM3-6B-GGML is a quantized version of ChatGLM3-6B that can run on CPU servers.
Dependencies
git clone --recursive https://github.com/li-plus/chatglm.cpp.git
python3 -m pip install torch tabulate tqdm transformers accelerate sentencepiece
pip install -U 'chatglm-cpp[api]'
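The commands above install the runtime only; the GGML weight file itself is produced by converting the original Hugging Face checkpoint with the conversion script shipped in chatglm.cpp. A hedged sketch of that step (the script path and flags follow the chatglm.cpp README and may differ between versions; `q4_0` is one of that project's 4-bit quantization types):

```shell
# Convert the original ChatGLM3-6B checkpoint to 4-bit GGML weights.
# The script lives inside the chatglm.cpp checkout cloned above;
# its exact path may vary by release.
python3 chatglm.cpp/chatglm_cpp/convert.py -i THUDM/chatglm3-6b -t q4_0 -o chatglm-ggml.bin
```

The resulting `chatglm-ggml.bin` is the file loaded by `chatglm_cpp.Pipeline` below.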
Model Usage
You can invoke the ChatGLM3-6B-GGML model to generate a dialogue with the following code:
>>> import chatglm_cpp
>>>
>>> pipeline = chatglm_cpp.Pipeline("./chatglm-ggml.bin")
>>> pipeline.chat([chatglm_cpp.ChatMessage(role="user", content="你好")])
ChatMessage(role="assistant", content="你好!我是人工智能助手 ChatGLM-6B,很高兴见到你,欢迎问我任何问题。", tool_calls=[])
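`pipeline.chat` is stateless, so a multi-turn conversation is carried by passing the accumulated message list back in on every call. That bookkeeping can be sketched with the standard library alone; here a placeholder function stands in for `pipeline.chat`, since actually running it requires the converted weights, and the dataclass only mirrors the `role`/`content` fields of `chatglm_cpp.ChatMessage`:

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ChatMessage:
    # Mirrors the role/content fields of chatglm_cpp.ChatMessage.
    role: str
    content: str


def chat_turn(history: List[ChatMessage], user_text: str,
              generate: Callable[[List[ChatMessage]], ChatMessage]) -> ChatMessage:
    """Append the user turn, call the model, and keep the reply in history."""
    history.append(ChatMessage(role="user", content=user_text))
    reply = generate(history)  # with chatglm_cpp this would be pipeline.chat(history)
    history.append(reply)
    return reply


def fake_generate(history: List[ChatMessage]) -> ChatMessage:
    # Placeholder standing in for pipeline.chat; echoes the last user turn.
    return ChatMessage(role="assistant", content=f"echo: {history[-1].content}")


history: List[ChatMessage] = []
chat_turn(history, "你好", fake_generate)
chat_turn(history, "再见", fake_generate)
print([m.role for m in history])  # user and assistant turns alternate
```

The same pattern applies with the real pipeline: append each assistant reply to the list before sending the next user message.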
For more usage instructions, including how to run the command-line and web demos, please refer to my article.
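The `chatglm-cpp[api]` extra installed above also provides an OpenAI-compatible API server. A hedged launch command, following the pattern in the chatglm.cpp README (the module path and the `MODEL` environment variable are that project's conventions and may change between releases):

```shell
# Serve the quantized model over an OpenAI-style REST API on port 8000.
MODEL=./chatglm-ggml.bin uvicorn chatglm_cpp.openai_api:app --host 127.0.0.1 --port 8000
```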
License
The code in this repository is open-sourced under the MIT License, while use of the ChatGLM3-6B model weights must comply with the Model License.