See our paper at https://huggingface.co/papers/2405.19332.
Shenao Zhang
ZhangShenao
Collections
3
- ZhangShenao/SELM-Llama-3-8B-Instruct-iter-3 • Text Generation • 91 • 5
- ZhangShenao/SELM-Llama-3-8B-Instruct-iter-2 • Text Generation • 14
- ZhangShenao/SELM-Llama-3-8B-Instruct-iter-1 • Text Generation • 16
- Self-Exploring Language Models: Active Preference Elicitation for Online Alignment • Paper • 2405.19332 • 15
models
340
ZhangShenao/math_gsm-Meta-Llama-3-8B-Instruct-sft-sample_7500_tp_mlr_8e-7
ZhangShenao/math_math-Meta-Llama-3-8B-Instruct-sft-sample_7500_tp_mlr_5e-6
ZhangShenao/math_gsm-Meta-Llama-3-8B-Instruct-sft-sample_7500_tp_mlr_5e-6
ZhangShenao/math_math-Meta-Llama-3-8B-Instruct-sft-sample_7500_tp_mlr_5e-5
ZhangShenao/math_gsm-Meta-Llama-3-8B-Instruct-sft-sample_7500_tp_mlr_5e-5
ZhangShenao/math_gsm-Meta-Llama-3-8B-Instruct-m-iter-1_sample_2500_nsk_ml512_mlr5e-6
ZhangShenao/math_math-gemma-2-9b-it-rs_nnew-sample_7500_temp_1.0_gen_1_mlr5e-5
ZhangShenao/math_gsm-gemma-2-9b-it-rs_nnew-sample_7500_temp_1.0_gen_1_mlr5e-5
ZhangShenao/math_math-Mistral-7B-Instruct-v0.2-rs_nnew-sample_7500_temp_1.0_gen_1_mlr5e-5 • 2
ZhangShenao/math_math-Mistral-7B-Instruct-v0.2-sft-sample_7500_tp • 2
datasets
200
ZhangShenao/sft-math_gsm-Meta-Llama-3-8B-Instruct-iter_sample_7500_tp
ZhangShenao/math_gsm-Meta-Llama-3-8B-Instruct-iter1_sample_7500_nsk_ml512_mlr5e-6_ent0.0
ZhangShenao/rs_nnew-math_math-gemma-2-9b-it-iter_sample_7500_temp_1.0_gen_1_mlr5e-5
ZhangShenao/rs_nnew-math_gsm-gemma-2-9b-it-iter_sample_7500_temp_1.0_gen_1_mlr5e-5
ZhangShenao/rs_nnew-math_math-Mistral-7B-Instruct-v0.2-iter_sample_7500_temp_1.0_gen_1_mlr5e-5
ZhangShenao/rs_nnew-math_gsm-Mistral-7B-Instruct-v0.2-iter_sample_7500_temp_1.0_gen_1_mlr5e-5 • 1.85k
ZhangShenao/sft-math_math-Mistral-7B-Instruct-v0.2-iter_sample_7500_tp • 7.5k
ZhangShenao/sft-math_gsm-Mistral-7B-Instruct-v0.2-iter_sample_7500_tp • 7.47k
ZhangShenao/rs_nnew-math_math-Meta-Llama-3-8B-Instruct-iter_sample_7500_temp_1.0_gen_1_mlr5e-5 • 1.78k
ZhangShenao/rs_nnew-math_gsm-Meta-Llama-3-8B-Instruct-iter_sample_7500_temp_1.0_gen_1_mlr5e-5 • 3.83k