---
license: apache-2.0
datasets:
- Quest-AI/quest-corruption-14brestorations-2.6k-filter-v1
base_model:
- Qwen/Qwen2.5-14B
---
## Task Description
The task involves training a model to evaluate two pieces of text, one of which has been subtly augmented by an LLM (specifically, the larger 14B variant of my [corruption models](https://huggingface.co/Quest-AI/quest-corruption-7b-s375-v3-GRPO)). The model must write its notes, followed by its judgment, in consecutive XML tags.

![GRPO Task Overview](https://files.catbox.moe/r5dm86.png)

## Example Format

The base model is given a system prompt that establishes the expected template, followed by two input samples in random A/B order: one "real" and one "synthetic".

```
REQUEST: You are to judge the better of the two samples and determine which of the following samples is better using a short judgement that is no longer than (and no shorter than) exactly 128 tokens.

Respond with an exactly 128 tokens tag labeled <notes> that contains your notes, and then <judgement> which is just the letter that you are picking.

For example:

JUDGE: <notes>
Sample A is superior to Sample B... (example notes)
</notes>
<judgement>A</judgement>

Now, it is your turn.

[Sample A]:
Included is a pre-test, post-test, and vocabulary quiz on the 8th grade math standard functions (8.F). 1.) Determine if a graph represents a function 2.) State the domain and range of a relation 3.) Plot points on a graph to determine if the table represents a function 4.) State if a function is decreasing, increasing, or constant 5.) Determine the output of a function machine 6.) Determine the recursive and explicit equation 7.) Determine the minimum, maximum, increasing interval, and decreasing interval of a graph 8.) Determine the rate of change, initial value, independent value, and dependent variable given a graph 9.) Sketch a graph given a situation The vocabulary included is dependent, output, function, domain, range, decreasing function, input, range, non-linear function, relation, increasing function, and function notation. Total Pages: 9 (18 including answer key) Answer Key: Included Document File: PDF

[Sample B]:
Included is a pre-test, post-test, and vocabulary quiz on the 8th grade math standard functions (8.F). 1.) Determine if a graph represents a function 2.) State the domain and range of a relation 3.) Plot points on a graph to determine if the table represents a function 4.) State if a function is increasing, decreasing, or constant 5.) Determine the output of a function given 6.) Determine the input of a function given 7.) Determine a function rule given ordered pairs or a table of values. 8.) Graph functions using a table of values and determine a trend line in a graph 9.) Write a data table situation The vocabulary included is dependent, output, function, domain, range, decreasing function, input, range, non-linear function, relation, increasing function, and function notation. Total Pages: 9 (18 including answer key) Answer Key: Included Document File: PDF

JUDGE:
```
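The random A/B ordering described above could be sketched roughly as follows. This is a minimal illustration, not the actual data pipeline: `build_ab_prompt` and its parameters are hypothetical names, the header text is passed in by the caller, and it assumes the "real" sample is the one the judge is expected to pick.

```python
import random


def build_ab_prompt(real: str, synthetic: str, header: str, rng: random.Random):
    """Return (prompt, correct_letter) with the real sample placed in a random slot.

    Assumption: the real (unaugmented) sample is the correct answer.
    """
    real_is_a = rng.random() < 0.5
    sample_a, sample_b = (real, synthetic) if real_is_a else (synthetic, real)
    prompt = (
        f"{header}\n\n"
        f"[Sample A]:\n{sample_a}\n\n"
        f"[Sample B]:\n{sample_b}\n\n"
        "JUDGE:"
    )
    return prompt, "A" if real_is_a else "B"
```

Passing a seeded `random.Random` keeps the ordering reproducible, and returning the correct letter alongside the prompt makes it available for scoring later.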

A correct output follows this structure:
```
<notes>
Sample A provides more specific and thoroughly defined tasks. It mentions "function machine," "recursive and explicit equation," and detailed graph analysis with "minimum, maximum" and intervals. Sample B contains incomplete phrases like "output of a function given" without completing the thought, making it less coherent and precise than Sample A.
</notes>
<judgement>A</judgement>
```
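One way to check a completion against this structure is a small regex parser. This is a sketch under assumptions: the function names are hypothetical, it is not necessarily how outputs were scored during training, and it does not check the 128-token length, since that would need the model's tokenizer.

```python
import re

# Matches <notes>...</notes> immediately followed by <judgement>A|B</judgement>.
OUTPUT_RE = re.compile(
    r"\s*<notes>\n?(?P<notes>.*?)\n?</notes>\s*"
    r"<judgement>(?P<letter>[AB])</judgement>\s*",
    re.DOTALL,
)


def parse_judgement(completion: str):
    """Return (notes, letter) if the completion follows the template, else None."""
    match = OUTPUT_RE.fullmatch(completion)
    if match is None:
        return None
    return match.group("notes").strip(), match.group("letter")


def is_correct(completion: str, expected_letter: str) -> bool:
    """True only when the output is well-formed and picks the expected letter."""
    parsed = parse_judgement(completion)
    return parsed is not None and parsed[1] == expected_letter
```

Using `fullmatch` rejects completions with extra text before or after the two tags, so a malformed output counts as incorrect rather than being partially parsed.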