Dhanishtha-Large / README.md

Update README.md

54544eb verified 8 days ago

4.46 kB

	---
	library_name: transformers
	license: apache-2.0
	datasets:
	- Abhaykoul/Dhanishtha-R1
	- open-thoughts/OpenThoughts-114k
	language:
	- en
	base_model:
	- deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
	---
	# Dhanishtha Overview

	Dhanishtha is a cutting-edge reasoning AI model developed by HelpingAI, designed for deep introspection and structured logical analysis. Unlike traditional models that generate immediate responses, Dhanishtha employs a unique deep-thinking process process—an internal deliberation phase that enhances reasoning depth before presenting refined answers.

	## Model Capabilities
	Dhanishtha operates in Dhanishtha Mode, inspired by the Dhanishtha Nakshatra, known for wisdom, rhythm, and intellectual depth. The model engages in a multi-step thought process before providing responses, ensuring high accuracy and coherence.

	### Key Features:
	- Structured Internal Reasoning: Engages in self-dialogue within `<think></think>` tags, iterating through ideas and refining its thought process before responding.
	- Progressive Thought Refinement: Evaluates multiple perspectives, making logical connections and ensuring a well-rounded answer.
	- Emotionally Intelligent Conversational Style: Responses are expressive, engaging, and tailored for natural human interaction.
	- Optimized for Critical Thinking & Problem-Solving: Excels in analytical reasoning, debate, and deep philosophical discussions.
	- Context Awareness: Maintains logical coherence in extended interactions, avoiding contradictions and ensuring smooth thought progression.

	## Training & Architecture
	- Model Size: Optimized for high-performance reasoning with balanced efficiency.
	- Training Approach: Fine-tuned using advanced structured learning techniques to enhance deliberative thinking and introspective processing.
	- Data Sources: Trained on a diverse dataset covering philosophy, critical reasoning, and problem-solving scenarios to develop a deep intellectual foundation.

	## Performance & Benchmarks
	Dhanishtha outperforms conventional models in structured reasoning and contextual depth. The model has been rigorously evaluated across various metrics, demonstrating significant improvements in:
	- Logical Coherence & Argumentation: Enhanced ability to follow complex discussions and construct persuasive arguments.
	- Depth of Analysis: Excels in breaking down intricate topics into clear, structured responses.
	- Adaptive Conversational Flow: Seamlessly shifts between casual and analytical tones based on user input.

	## Deployment & Use Cases
	Dhanishtha is designed for:
	- High-precision academic and philosophical discussions
	- Deep problem-solving and strategic reasoning
	- Engaging and thought-provoking conversations
	- Use in AI-driven research and advanced dialogue systems

	## Benchmarks
	We report Pass@1 accuracy averaged over 16 samples for each problem.

	\| Model \| AIME 2024 \| MATH 500 \| AMC 2023 \| Minerva Math \| OlympiadBench \| Avg. \|
	\|-------------------------------\|-----------\|----------\|----------\|--------------\|----------------\|------\|
	\| 2.5-7B-Instruct \| 13.3 \| 79.8 \| 50.6 \| 34.6 \| 40.7 \| 43.8 \|
	\| rStar-Math-7B \| 26.7 \| 78.4 \| 47.5 \| - \| 47.1 \| - \|
	\| Eurus-2.7B-PRIME \| 26.7 \| 79.2 \| 57.8 \| 38.6 \| 42.1 \| 48.9 \|
	\| Qwen2.5-7B-SimpleRL \| 26.7 \| 82.4 \| 62.5 \| 39.7 \| 43.3 \| 50.9 \|
	\| DeepSeek-R1-Distill-Qwen-1.5B \| 28.8 \| 82.8 \| 62.9 \| 26.5 \| 43.3 \| 48.9 \|
	\| Still-1.5B \| 32.5 \| 84.4 \| 66.7 \| 29.0 \| 45.4 \| 51.6 \|
	\| DeepScaleR-1.5B-Preview \| 43.1 \| 87.8 \| 73.6 \| 30.2 \| 50.0 \| 57.0 \|
	\| O1-Preview \| 40.0 \| 81.4 \| - \| - \| - \| - \|
	\| Dhanishta \| 38.2 \| 85.1 \| 70.3 \| 30.5 \| 42.0 \| 53.2 \|
	\| Dhanishta-Large \| - \| - \| - \| - \| - \| - \|


	## Credits & License
	Dhanishtha is developed and maintained by HelpingAI, pushing the boundaries of AI-driven introspection and structured reasoning. The model is open-source and community-driven, encouraging contributions and collaborative innovation.

	---
	library_name: transformers
	license: apache-2.0
	datasets:
	- Abhaykoul/Dhanishtha-R1
	- open-thoughts/OpenThoughts-114k
	language:
	- en
	base_model:
	- deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
	---
	# Dhanishtha Overview

	Dhanishtha is a cutting-edge reasoning AI model developed by HelpingAI, designed for deep introspection and structured logical analysis. Unlike traditional models that generate immediate responses, Dhanishtha employs a unique deep-thinking process process—an internal deliberation phase that enhances reasoning depth before presenting refined answers.

	## Model Capabilities
	Dhanishtha operates in Dhanishtha Mode, inspired by the Dhanishtha Nakshatra, known for wisdom, rhythm, and intellectual depth. The model engages in a multi-step thought process before providing responses, ensuring high accuracy and coherence.

	### Key Features:
	- Structured Internal Reasoning: Engages in self-dialogue within `<think></think>` tags, iterating through ideas and refining its thought process before responding.
	- Progressive Thought Refinement: Evaluates multiple perspectives, making logical connections and ensuring a well-rounded answer.
	- Emotionally Intelligent Conversational Style: Responses are expressive, engaging, and tailored for natural human interaction.
	- Optimized for Critical Thinking & Problem-Solving: Excels in analytical reasoning, debate, and deep philosophical discussions.
	- Context Awareness: Maintains logical coherence in extended interactions, avoiding contradictions and ensuring smooth thought progression.

	## Training & Architecture
	- Model Size: Optimized for high-performance reasoning with balanced efficiency.
	- Training Approach: Fine-tuned using advanced structured learning techniques to enhance deliberative thinking and introspective processing.
	- Data Sources: Trained on a diverse dataset covering philosophy, critical reasoning, and problem-solving scenarios to develop a deep intellectual foundation.

	## Performance & Benchmarks
	Dhanishtha outperforms conventional models in structured reasoning and contextual depth. The model has been rigorously evaluated across various metrics, demonstrating significant improvements in:
	- Logical Coherence & Argumentation: Enhanced ability to follow complex discussions and construct persuasive arguments.
	- Depth of Analysis: Excels in breaking down intricate topics into clear, structured responses.
	- Adaptive Conversational Flow: Seamlessly shifts between casual and analytical tones based on user input.

	## Deployment & Use Cases
	Dhanishtha is designed for:
	- High-precision academic and philosophical discussions
	- Deep problem-solving and strategic reasoning
	- Engaging and thought-provoking conversations
	- Use in AI-driven research and advanced dialogue systems

	## Benchmarks
	We report Pass@1 accuracy averaged over 16 samples for each problem.

	\| Model \| AIME 2024 \| MATH 500 \| AMC 2023 \| Minerva Math \| OlympiadBench \| Avg. \|
	\|-------------------------------\|-----------\|----------\|----------\|--------------\|----------------\|------\|
	\| 2.5-7B-Instruct \| 13.3 \| 79.8 \| 50.6 \| 34.6 \| 40.7 \| 43.8 \|
	\| rStar-Math-7B \| 26.7 \| 78.4 \| 47.5 \| - \| 47.1 \| - \|
	\| Eurus-2.7B-PRIME \| 26.7 \| 79.2 \| 57.8 \| 38.6 \| 42.1 \| 48.9 \|
	\| Qwen2.5-7B-SimpleRL \| 26.7 \| 82.4 \| 62.5 \| 39.7 \| 43.3 \| 50.9 \|
	\| DeepSeek-R1-Distill-Qwen-1.5B \| 28.8 \| 82.8 \| 62.9 \| 26.5 \| 43.3 \| 48.9 \|
	\| Still-1.5B \| 32.5 \| 84.4 \| 66.7 \| 29.0 \| 45.4 \| 51.6 \|
	\| DeepScaleR-1.5B-Preview \| 43.1 \| 87.8 \| 73.6 \| 30.2 \| 50.0 \| 57.0 \|
	\| O1-Preview \| 40.0 \| 81.4 \| - \| - \| - \| - \|
	\| Dhanishta \| 38.2 \| 85.1 \| 70.3 \| 30.5 \| 42.0 \| 53.2 \|
	\| Dhanishta-Large \| - \| - \| - \| - \| - \| - \|


	## Credits & License
	Dhanishtha is developed and maintained by HelpingAI, pushing the boundaries of AI-driven introspection and structured reasoning. The model is open-source and community-driven, encouraging contributions and collaborative innovation.