---
license: apache-2.0
pipeline_tag: text-generation
---

# Rationalyst

This model is a fine-tuned version of [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct). It was
introduced in [RATIONALYST: Pre-training Process-Supervision for Improving Reasoning](https://arxiv.org/pdf/2410.01044). The code for rationale extraction, model training, and
inference is available [here](https://github.com/JHU-CLSP/reasoning_world_model).

## Model description
Implicit rationales are often embedded in unlabelled text, reflecting the natural thought processes behind speech and writing.
RATIONALYST is a self-supervised approach that extracts and filters these implicit rationales from unlabelled text and uses
them to supervise reasoning.

## How to use
Provide a question and a partial reasoning trajectory as input, and the model outputs a rationale that supervises the next reasoning step.
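
The input/output flow described above can be sketched roughly as follows. This is a minimal illustration, not the authors' inference code: the prompt layout in `build_prompt` is a hypothetical format chosen for readability, `generate_rationale` is a hypothetical helper, and `model_id` must be supplied by you (this card's repository id is not hard-coded here). The exact prompt format used in training is defined in the linked GitHub repository and may differ.

```python
from typing import List


def build_prompt(question: str, steps: List[str]) -> str:
    """Combine a question and the reasoning steps so far into one prompt.

    NOTE: this layout is an assumption made for illustration only; check the
    official repository for the format the model was actually trained on.
    """
    lines = [f"Question: {question}", "Reasoning so far:"]
    lines += [f"Step {i}: {step}" for i, step in enumerate(steps, start=1)]
    lines.append("Rationale for the next step:")
    return "\n".join(lines)


def generate_rationale(model_id: str, prompt: str, max_new_tokens: int = 64) -> str:
    """Generate a rationale with Hugging Face Transformers.

    `model_id` is whatever repository id or local path holds the weights;
    it is a parameter here because it is not stated in this section.
    """
    # Imported lazily so that build_prompt can be used without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

A call would then look like `generate_rationale(model_id, build_prompt(question, steps))`, where `steps` holds the reasoning produced so far by whichever model is being supervised.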

## Training data

Rationalyst is trained on 65k implicit rationales from The Pile and 14k implicit rationales from GSM8K and ECQA. The data can be found [here](https://huggingface.co/datasets/Dongwei/reasoning_world_model).


## Evaluation results

When evaluated on downstream tasks, this model achieves the following results:

| Task | GSM8K | MATH | ECQA | HellaSwag | ProofWriter | ARC | MMLU-Pro |
|:----:|:-----:|:----:|:----:|:---------:|:-----------:|:---:|:--------:|
|      | 81.6  | 32.5 | 75.2 | 60.3      | 90.7        | 80.7 | 45.3    |