Update README.md
README.md CHANGED
@@ -1,5 +1,16 @@
----
-license: other
-license_name: qwen
-license_link: https://huggingface.co/Qwen/QVQ-72B-Preview/blob/main/LICENSE
-
+---
+license: other
+license_name: qwen
+license_link: https://huggingface.co/Qwen/QVQ-72B-Preview/blob/main/LICENSE
+language:
+- en
+pipeline_tag: image-text-to-text
+base_model: Qwen/Qwen2-VL-72B
+tags:
+- chat
+- awq
+library_name: transformers
+---
+# QVQ-72B-Preview AWQ 4-Bit Quantized Version
+
+This repository provides the AWQ 4-bit quantized version of the QVQ-72B-Preview model, originally developed by Qwen. This model's weights are padded with zeros before quantization to ensure compatibility with multi-GPU tensor parallelism by resolving divisibility constraints. The padding minimally impacts computation while enabling efficient scaling across multiple GPUs.
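The zero-padding described in the added paragraph can be pictured with the sketch below. It is only an illustration, not the script used to produce this checkpoint: the function name `pad_weight_for_tp`, the example dimensions, the tensor-parallel degree of 8, and the AWQ group size of 128 are all assumptions chosen to show why an output dimension may need padding so that every per-GPU shard contains a whole number of quantization groups.

```python
import torch

def pad_weight_for_tp(weight: torch.Tensor, tp_size: int, group_size: int = 128) -> torch.Tensor:
    """Zero-pad the output (row) dimension of a linear-layer weight so that
    out_features is divisible by tp_size * group_size, i.e. each tensor-parallel
    shard holds a whole number of quantization groups.
    (Illustrative sketch only; not the procedure used for this repository.)"""
    out_features, in_features = weight.shape
    multiple = tp_size * group_size
    padded_out = ((out_features + multiple - 1) // multiple) * multiple  # round up
    if padded_out == out_features:
        return weight  # already divisible; no padding needed
    pad = torch.zeros(padded_out - out_features, in_features,
                      dtype=weight.dtype, device=weight.device)
    # The appended rows are all zeros, so they only add unused output channels.
    return torch.cat([weight, pad], dim=0)

# Hypothetical example: out_features = 29568 is not divisible by 8 * 128 = 1024,
# so it is padded to 29696 (29696 / 8 = 3712 = 29 groups of 128 per shard).
w = torch.randn(29568, 1024, dtype=torch.float16)
print(pad_weight_for_tp(w, tp_size=8).shape)  # torch.Size([29696, 1024])
```

Because the appended rows are zero, the extra output channels carry no signal, which is why this kind of padding has only a minimal effect on computation while letting the quantized weights shard evenly across GPUs.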