---
license: apache-2.0
language:
- en
base_model:
- deepseek-ai/deepseek-vl-1.3b-chat
pipeline_tag: image-to-text
---

# Deepseek-VL-1.3b-chat-4bit

![Deepseek Logo](https://cdn.deepseek.com/logo.png)

## Overview

**Deepseek-VL-1.3b-chat-4bit** is a 4-bit quantized version of [deepseek-ai/deepseek-vl-1.3b-chat](https://huggingface.co/deepseek-ai/deepseek-vl-1.3b-chat), a multimodal model that combines visual and language understanding. Quantizing the weights to 4 bits significantly reduces the model's size while largely preserving its capabilities.

### Model Details

- **Model Type**: Multimodal Causal Language Model
- **Base Model Size**: 1.3 billion parameters
- **Quantized Size**: approximately **1.72 GB** on disk, down from the full-precision original
- **Files Included** (see the download sketch after this list):
  - `config.json`: model configuration file.
  - `model.safetensors`: the quantized model weights.
  - `preprocessor_config.json`: configuration for the preprocessor.
  - `processor_config.json`: configuration for the processor.
  - `special_tokens_map.json`: mapping for special tokens used by the tokenizer.
  - `tokenizer.json`: serialized tokenizer (vocabulary and rules).
  - `tokenizer_config.json`: additional tokenizer settings.
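
All of these files can be fetched from the Hub in one call. Below is a minimal sketch using `huggingface_hub`; the repo id is a placeholder, not the actual id of this repository:

```python
# Minimal download sketch. The repo_id is a placeholder: substitute the id
# of the repository that hosts this model card.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="your-username/deepseek-vl-1.3b-chat-4bit",  # placeholder repo id
    allow_patterns=["*.json", "*.safetensors"],          # configs + weights only
)
print(local_dir)  # local path containing the files listed above
```
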
## Quantization

Quantization reduces model size and speeds up inference by storing weights at lower numerical precision. Here, each weight is represented with 4 bits instead of the usual 16 or 32. This results in:

- **Size Reduction**: the checkpoint shrinks from several gigabytes to approximately 1.72 GB.
- **Performance**: the quantized model typically retains most of the base model's accuracy and efficiency, making it suitable for deployment in environments with limited resources (a quantization sketch follows this list).
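
For context, this is roughly how a 4-bit checkpoint like this one can be produced with the `transformers` + `bitsandbytes` stack. It is a sketch under that assumption, not a record of the exact commands used for this repository:

```python
# Quantize the base model to 4 bits on load. This assumes the bitsandbytes
# NF4 scheme; the actual method used for this repo may differ.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # store linear-layer weights in 4 bits
    bnb_4bit_quant_type="nf4",             # NormalFloat4, a common 4-bit format
    bnb_4bit_compute_dtype=torch.float16,  # dequantize to fp16 for matmuls
)

model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/deepseek-vl-1.3b-chat",
    quantization_config=bnb_config,
    trust_remote_code=True,  # DeepSeek-VL ships custom modeling code
)

# Recent transformers/bitsandbytes releases can serialize 4-bit weights:
model.save_pretrained("deepseek-vl-1.3b-chat-4bit")
```

As a sanity check on the numbers: 1.3 billion weights at 4 bits each would come to only about 0.65 GB, so the 1.72 GB figure plausibly reflects per-block scaling constants plus components that are typically left unquantized (embeddings, normalization layers, and parts of the vision tower) remaining at higher precision.
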
## Installation

To use the **Deepseek-VL-1.3b-chat-4bit** model, follow these steps:

1. **Install the Required Libraries**:

   ```bash
   pip install transformers huggingface-hub
   ```
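
   Loading 4-bit checkpoints through `transformers` generally also requires `bitsandbytes` and `accelerate`. The sketch below is hypothetical: the repo id is a placeholder, and the Auto-class loading path assumes this repository's custom code supports it (DeepSeek-VL's reference examples instead use the `deepseek_vl` package from the official GitHub repository).

   ```python
   # Hypothetical inference setup for this 4-bit checkpoint. The repo id is
   # a placeholder; DeepSeek-VL's reference code uses its own `deepseek_vl`
   # package and VLChatProcessor rather than the Auto classes.
   from transformers import AutoModelForCausalLM, AutoProcessor

   repo_id = "your-username/deepseek-vl-1.3b-chat-4bit"  # placeholder repo id

   processor = AutoProcessor.from_pretrained(repo_id, trust_remote_code=True)
   model = AutoModelForCausalLM.from_pretrained(
       repo_id,
       device_map="auto",       # automatic device placement (needs `accelerate`)
       trust_remote_code=True,  # 4-bit settings are read from config.json
   )
   ```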