gemma2-mitra-it-int8

This is an int8-quantized version of gemma-2-mitra-it: https://huggingface.co/buddhist-nlp/gemma-2-mitra-it
The quantization was done with LLM Compressor: https://github.com/vllm-project/llm-compressor

The template for prompting the model is:

Please translate into <target_language>: <input_sentence> 🔽 Translation::

Line breaks in the input should be replaced with the '🔽' character before running generation. '#' is used as a stop token.
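The template and stop-token handling described above can be sketched in Python. The helper names below are hypothetical and not part of this card; only the template itself, the '🔽' line-break marker, and the '#' stop token come from the card:

```python
# Hedged sketch of prompt formatting and output handling for
# gemma-2-mitra-it-int8. Function names are illustrative, not from the
# model card; only the template, the '🔽' line-break marker, and the
# '#' stop token are taken from the card itself.

def build_prompt(target_language: str, sentence: str) -> str:
    """Apply the card's translation template, replacing line breaks
    in the input with the '🔽' character as required."""
    flattened = sentence.replace("\n", "🔽")
    return f"Please translate into {target_language}: {flattened} 🔽 Translation::"

def extract_translation(generated: str) -> str:
    """Truncate generated text at the '#' stop token (in case the
    serving stack returns it) and trim surrounding whitespace."""
    return generated.split("#", 1)[0].strip()

prompt = build_prompt("English", "first line\nsecond line")
```

In practice you would also pass "#" as a stop string to whatever generation call you use, so decoding halts at the stop token rather than relying on post-hoc truncation.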

Model Details

For details on how to run this model, please see the gemma-2-9b repository: https://huggingface.co/google/gemma-2-9b
