weepcat/summarization_sft_reward-model-deberta-v3-large-v2_RM-Gemma-2B_mask_partial_rm_random_length • Text Classification • Updated Jan 23 • 5
weepcat/hh_sft_RM-Gemma-2B_RM-Gemma-7B_mask_partial_rm_random_length • Text Classification • Updated Jan 8 • 4
weepcat/hh_sft_RM-Gemma-2B_RM-Gemma-7B_mask_partial_rm_token_by_token • Text Classification • Updated Jan 3 • 11
weepcat/compute_weights_summarization_partial_reward_model_random_length-2 • Viewer • Updated Jan 22 • 302k • 40
weepcat/compute_rewards_summarization_partial_reward_model_random_length-2 • Viewer • Updated Jan 21 • 302k • 40