Deepseek-VL-1.3b-chat-4bit

Overview

Deepseek-VL-1.3b-chat-4bit is a multimodal model that combines visual and linguistic processing capabilities. Its weights have been quantized to 4 bits, significantly reducing the model's size on disk and in memory while maintaining strong performance.

Model Details

  • Model Type: Multimodal Causal Language Model
  • Base Model Size: 1.3 billion parameters
  • Quantized Size: Approximately 1.72 GB (down from the full-precision original)
  • Files Included:
    • config.json: Model configuration file.
    • model.safetensors: The quantized model weights.
    • preprocessor_config.json: Configuration for the preprocessor.
    • processor_config.json: Configuration for the processor.
    • special_tokens_map.json: Mapping for special tokens used in the tokenizer.
    • tokenizer.json: Tokenizer configuration.
    • tokenizer_config.json: Additional tokenizer settings.
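
A quick way to check these files is to pull one down and inspect it. The sketch below fetches config.json with huggingface_hub (the repo id zamal/Deepseek-VL-1.3b-chat-4bit is taken from this card); it is a minimal illustration, not part of any official tooling.

    import json

    from huggingface_hub import hf_hub_download

    # Fetch only the configuration file from the Hub.
    config_path = hf_hub_download(
        repo_id="zamal/Deepseek-VL-1.3b-chat-4bit",
        filename="config.json",
    )

    with open(config_path) as f:
        config = json.load(f)

    # model_type identifies which architecture the weights belong to.
    print(config.get("model_type"))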

Quantization

Quantization reduces model size and improves inference speed by representing weights with lower-precision numbers. Here, the model was quantized to 4 bits, meaning each weight is stored in 4 bits instead of the typical 16 or 32. This results in:

  • Size Reduction: The model size has been reduced from several gigabytes to approximately 1.72 GB.
  • Performance: The quantized model maintains a high level of accuracy and efficiency, making it suitable for deployment in environments with limited resources.
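
As a rough sanity check on these numbers: 1.3 billion weights at 4 bits each come to about 0.65 GB, so the 1.72 GB file presumably keeps some components at higher precision, though the card does not say. The sketch below shows one common way to apply 4-bit quantization at load time with transformers and bitsandbytes; it illustrates the technique in general and is not necessarily how this checkpoint was produced. The base repo id deepseek-ai/deepseek-vl-1.3b-chat is an assumption.

    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    # Illustrative only: the card does not state how this checkpoint was
    # quantized; NF4 via bitsandbytes is one common 4-bit scheme.
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,                     # store linear-layer weights in 4 bits
        bnb_4bit_quant_type="nf4",             # NormalFloat4 data type
        bnb_4bit_compute_dtype=torch.float16,  # dequantize to fp16 for matmuls
    )

    # Assumed full-precision base; DeepSeek-VL ships custom modeling code,
    # hence trust_remote_code.
    model = AutoModelForCausalLM.from_pretrained(
        "deepseek-ai/deepseek-vl-1.3b-chat",
        quantization_config=bnb_config,
        trust_remote_code=True,
    )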

Installation

To use the Deepseek-VL-1.3b-chat-4bit model, first install the required libraries:

    pip install transformers huggingface-hub
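
With those installed (plus PyTorch, and accelerate if you use device_map), a minimal loading sketch might look like the following. The exact entry points depend on the custom code bundled with the checkpoint, so AutoProcessor and AutoModelForCausalLM with trust_remote_code=True are assumptions here, not a documented API.

    from huggingface_hub import snapshot_download
    from transformers import AutoModelForCausalLM, AutoProcessor

    # Download all the files listed above into the local HF cache.
    local_dir = snapshot_download("zamal/Deepseek-VL-1.3b-chat-4bit")

    # Assumed entry points: the preprocessor/processor configs in the repo
    # suggest a processor class, but the exact classes come from the
    # checkpoint's bundled code.
    processor = AutoProcessor.from_pretrained(local_dir, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        local_dir,
        trust_remote_code=True,
        device_map="auto",  # requires accelerate; places weights automatically
    )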
    
