PyThagorean-10B / README.md
prithivMLmods's picture
Update README.md
304eea4 verified
|
raw
history blame
1.06 kB
metadata
license: llama3.1
language:
  - en
base_model:
  - prithivMLmods/LwQ-Reasoner-10B
pipeline_tag: text-generation
library_name: transformers
tags:
  - python
  - math
  - coder
  - reasoner

python.gif

PyThagorean-10B

PyThagorean [Python + Math] is a Python and mathematics-based model designed to solve mathematical problems using Python libraries and coding. It has been fine-tuned on 1.5 million entries and is built on LLaMA's architecture. The model supports different parameter sizes, including 10B, 3B, and 1B (Tiny). These instruction-tuned, text-only models are optimized for multilingual dialogue use cases, including agent-based retrieval and summarization tasks. PyThagorean leverages an auto-regressive language model that uses an optimized transformer architecture. The tuned versions employ supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.