RLHF-And-Friends
community
models
9
RLHF-And-Friends/Llama-3.2-3B-Instruct-DPO-Math
Text Generation
279
RLHF-And-Friends/Llama-3.2-3B-Instruct-BnB-4bit-DPO-Math-SF
Text Generation
5
RLHF-And-Friends/Llama-3.2-3B-Instruct
Text Generation
394
RLHF-And-Friends/Llama-3.2-3B-Instruct-BnB-4bit-DPO-Math
49
RLHF-And-Friends/Llama-3.2-3B-Instruct-BnB-4bit
43
RLHF-And-Friends/Llama3.1-8B
28
RLHF-And-Friends/Llama3.1-8B-DPO-0.05
47
RLHF-And-Friends/Zephyr-7B-DPO-0.05
48
RLHF-And-Friends/Zephyr-SFT-7B
44
datasets
None public yet