---
title: README
emoji: 🐢
colorFrom: yellow
colorTo: blue
sdk: static
pinned: false
---
<h1 style="line-height: 50px;"> Spectra Suite </h1>
We release the Spectra Suite, consisting of 54 models ranging from 99M to 3.9B parameters across different bitwidths:
* FloatLM: LLMs pretrained in FP16 (half precision).
* TriLM: LLMs pretrained with an effective ternary bitwidth.
* QuantLM 8-bit: FloatLM LLMs quantized to 8 bits.
* QuantLM 6-bit: FloatLM LLMs quantized to 6 bits.
* QuantLM 4-bit: FloatLM LLMs quantized to 4 bits.
* QuantLM 3-bit: FloatLM LLMs quantized to 3 bits.
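
To make the difference between these bitwidths concrete, here is a minimal sketch of mapping a list of weights to ternary states or to a k-bit integer grid. It assumes a single absolute-mean scale for the ternary case and symmetric round-to-nearest for the k-bit case. Both are illustrative assumptions, not necessarily the procedures used to train TriLM or to produce QuantLM, and the function names are hypothetical.

```python
def ternarize(weights):
    """Map floats to {-1, 0, +1} states plus one shared scale.

    Assumption: the scale is the mean absolute value of the weights.
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    if scale == 0.0:
        return [0] * len(weights), 0.0
    # Round to the nearest integer, then clip to the ternary range.
    states = [max(-1, min(1, round(w / scale))) for w in weights]
    return states, scale


def quantize_rtn(weights, bits):
    """Symmetric round-to-nearest quantization to a signed `bits`-bit grid.

    Assumption: one scale per list, chosen so the largest magnitude
    maps to the largest representable code.
    """
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return [0] * len(weights), 0.0
    scale = max_abs / qmax
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float weights from integer codes and a scale."""
    return [c * scale for c in codes]
```

Calling `ternarize` on a few weights yields only the values -1, 0, and 1, while `quantize_rtn(weights, 4)` yields integer codes in the range -7 to 7; `dequantize` turns either result back into floats.
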
All models are released in unpacked form (FP16 format), compatible with FP16 GEMMs in any library that supports the LLaMA architecture.
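
Because the checkpoints are stored as ordinary FP16 weights, they can in principle be loaded like any other LLaMA-style model. The sketch below assumes the Hugging Face `transformers` and `torch` packages are installed and that the checkpoint is hosted on the Hugging Face Hub with a standard config. The helper name `load_unpacked_fp16` is hypothetical, and the repository id is left as a parameter because this README does not list exact repository names.

```python
def load_unpacked_fp16(repo_id: str):
    """Load an unpacked FP16 checkpoint and its tokenizer.

    `repo_id` is a Hub repository id supplied by the caller;
    no specific id is assumed here.
    """
    # Imported lazily so that defining this helper does not require
    # the heavy dependencies to be installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        torch_dtype=torch.float16,  # keep the weights in half precision
    )
    return tokenizer, model


# Example call (placeholder id, replace with a real repository):
# tokenizer, model = load_unpacked_fp16("<org>/<model-name>")
```

Loading in FP16 keeps the memory footprint the same for every family in the suite, since the unpacked format stores ternary and quantized weights as FP16 values.
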
## Citation
If you find these models or the associated paper useful, please cite the paper:
```bibtex
@misc{kaushal2024spectracomprehensivestudyternary,
  title={Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language Models},
  author={Ayush Kaushal and Tejas Pandey and Tejas Vaidhya and Aaryan Bhagat and Irina Rish},
  year={2024},
  eprint={2407.12327},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2407.12327},
}
```