	---
	license: apache-2.0
	pipeline_tag: text-generation
	---

	# Rationalyst

	This model is a fine-tuned version of [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct). It was
	introduced in [RATIONALYST: Pre-training Process-Supervision for Improving Reasoning](https://arxiv.org/pdf/2410.01044). The code for the rationale extraction, model training, and
	inference can be found [here](https://github.com/JHU-CLSP/reasoning_world_model).

	## Model description
	Implicit rationales are often embedded in unlabelled text, reflecting the natural thought processes behind speech and writing.
	RATIONALYST is a self-supervised approach that extracts and filters these implicit rationales from unlabelled text and applies
	them to supervise reasoning.

	## How to use
	To use it, input a question and a partial reasoning trajectory; the model outputs a rationale that supervises the next reasoning step.
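
A minimal sketch of that input/output flow, assuming the model loads through the standard `transformers` causal-LM classes. The prompt layout in `build_prompt`, the helper names, and the repo id in `MODEL_ID` (taken from this page's path) are illustrative assumptions, not the authors' documented format; the linked code repository is the authority on the exact prompt used in training and inference.

```python
# Assumption: repo id inferred from this model card's location on the Hub.
MODEL_ID = "Dongwei/Rationalyst_reasoning_datasets"


def build_prompt(question, steps):
    """Join a question and the reasoning steps so far into one prompt string.

    The labels and layout here are hypothetical, chosen only for illustration.
    """
    lines = [f"Question: {question}", "Reasoning so far:"]
    lines += [f"{i}. {step}" for i, step in enumerate(steps, start=1)]
    lines.append("Rationale for the next step:")
    return "\n".join(lines)


def generate_rationale(prompt, model_id=MODEL_ID, max_new_tokens=64):
    """Generate a rationale for the next step (downloads the model on first use)."""
    # Imported lazily so build_prompt can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    prompt = build_prompt(
        "Tom has 3 apples and buys 2 more. How many apples does he have?",
        ["Tom starts with 3 apples."],
    )
    print(generate_rationale(prompt))
```

The generated rationale can then be compared against candidate next steps from a separate reasoning model; how that comparison is scored is described in the paper and code rather than here.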

	## Training data

	Rationalyst is trained on 65k implicit rationales from The Pile and 14k implicit rationales from GSM8K and ECQA. The data can be found [here](https://huggingface.co/datasets/Dongwei/reasoning_world_model).


	## Evaluation results

	When evaluated on downstream tasks, this model achieves the following results:

	| Task | GSM8K | MATH | ECQA | HellaSwag | ProofWriter | ARC | MMLU-Pro |
	|:----:|:----:|:----:|:----:|:-----:|:----:|:-----:|:----:|
	| | 81.6 | 32.5 | 75.2 | 60.3 | 90.7 | 80.7 | 45.3 |