---
license: apache-2.0
datasets:
- hamishivi/gsm8k-symbolic
language:
- en
base_model:
- hamishivi/tess2-v0.3-base
---

# TESS 2 v0.3 Symbolic - A Math-Specific Fine-Tuned Diffusion LM

This model is TESS 2 trained on the GSM8k symbolic data found [here](https://huggingface.co/datasets/hamishivi/gsm8k-symbolic), adapted from [here](https://github.com/HKUNLP/diffusion-of-thoughts).

TESS 2 is a simplex-based diffusion model adapted from Mistral v0.1 7B and further trained on Dolma 1.7 and Tulu 2 SFT data.
For more details, please check out our paper [TESS 2: A Large-Scale Generalist Diffusion Language Model](https://arxiv.org/abs/2502.13917).
This particular checkpoint is instead based on Mistral v0.3 and trained on the GSM8k data.

This model only works with our custom codebase, found [here](https://github.com/hamishivi/tess-2) -- please see that repository for details on how to run training and inference.

## Using this model

To run this model, first clone https://github.com/hamishivi/tess-2.
Then, after creating a Python environment with the required packages, you can run inference via an interactive UI with:

```sh
./shell_scripts/run_interactive_demo.sh hamishivi/tess2-v0.3
```

This lets you interact with the model directly and shows the diffusion generation process.
An end-to-end setup sketch is included at the end of this card.

For training or other evaluations, please see our main repository.

## Citation

If you find this work useful, please cite it as follows.

```bibtex
@misc{taeivison2025tess2,
  title={{TESS 2: A Large-Scale Generalist Diffusion Language Model}},
  author={Jaesung Tae and Hamish Ivison and Sachin Kumar and Arman Cohan},
  year={2025},
  eprint={2502.13917},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2502.13917},
}
```
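
## Example setup (sketch)

A minimal end-to-end sketch of the steps described above, assuming a standard `requirements.txt` in the repository root and a local Python with `venv` available; the authoritative setup instructions live in the [tess-2 repository](https://github.com/hamishivi/tess-2).

```sh
# Clone the custom codebase (required; the model does not run outside this repository's code).
git clone https://github.com/hamishivi/tess-2
cd tess-2

# Create and activate a fresh environment.
# Assumption: dependencies are listed in requirements.txt -- check the repository README for the exact packages.
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Launch the interactive demo UI, which shows the diffusion generation process.
./shell_scripts/run_interactive_demo.sh hamishivi/tess2-v0.3
```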