Organization Card
This is the organization grouping all the models and datasets used in the TRL library.
Collections: 7
Models: 81

trl-lib/Qwen2-0.5B-Reward-Math-Sheperd, Token Classification, 47 downloads
trl-lib/Qwen2-0.5B-XPO, Text Generation, 162 downloads
trl-lib/Qwen2-0.5B-OnlineDPO, Text Generation, 63 downloads
trl-lib/Qwen2-0.5B-KTO, Text Generation, 77 downloads
trl-lib/Qwen2-0.5B-ORPO, Text Generation, 287 downloads, 2 likes
trl-lib/Qwen2-0.5B-DPO, Text Generation, 42 downloads, 4 likes
trl-lib/Qwen2-0.5B-Reward, Text Classification, 920 downloads, 1 like
trl-lib/pythia-1b-deduped-tldr-rm, 418 downloads
trl-lib/pythia-2.8b-deduped-tldr-online-dpo, Text Generation, 6 downloads
trl-lib/pythia-6.9b-deduped-tldr-offline-dpo, Text Generation, 12 downloads
Datasets: 19

trl-lib/documentation-images, 1 row, 151k downloads
trl-lib/rlaif-v, 83.1k rows, 373 downloads, 3 likes
trl-lib/ultrafeedback-gpt-3.5-turbo-helpfulness, 16.6k rows, 260 downloads, 1 like
trl-lib/ultrafeedback-prompt, 39.8k rows, 871 downloads, 4 likes
trl-lib/tldr-preference, 179k rows, 660 downloads
trl-lib/tldr, 130k rows, 13.5k downloads, 9 likes
trl-lib/prm800k, 41.2k rows, 147 downloads, 2 likes
trl-lib/math_shepherd, 445k rows, 765 downloads, 5 likes
trl-lib/lm-human-preferences-sentiment, 6.26k rows, 94 downloads
trl-lib/lm-human-preferences-descriptiveness, 6.26k rows, 127 downloads, 1 like