RLHF-And-Friends/Llama-3.2-1B-Instruct-Reward-ultrafeedback_binarized-max_length-1024-LoRA-8r • Updated 2 days ago
sam-at/heavy-200k-barc-llama3.2-1b-ins-fft-transduction_lr1e-5_epoch3 • Text Generation • Updated 2 days ago • 12