metadata
license: llama3.1
language:
- en
base_model:
- prithivMLmods/LwQ-Reasoner-10B
pipeline_tag: text-generation
library_name: transformers
tags:
- python
- math
- coder
- reasoner
PyThagorean-10B
PyThagorean [Python + Math] is a Python and mathematics-based model designed to solve mathematical problems using Python libraries and coding. It has been fine-tuned on 1.5 million entries and is built on LLaMA's architecture. The model supports different parameter sizes, including 10B, 3B, and 1B (Tiny). These instruction-tuned, text-only models are optimized for multilingual dialogue use cases, including agent-based retrieval and summarization tasks. PyThagorean leverages an auto-regressive language model that uses an optimized transformer architecture. The tuned versions employ supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.