keeeeenw/Llama-3.2-1B-Instruct-Open-R1-Distill


keeeeenw committed on Feb 1

Commit f17e9c7 · verified · 1 Parent(s): 0b88607

Delete eval_llama3.sh

Files changed (1)

eval_llama3.sh +0 -17

eval_llama3.sh DELETED

@@ -1,17 +0,0 @@
-# MODEL=deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
-# ValueError: User-specified max_model_len (32768) is greater than the derived max_model_len (max_position_embeddings=2048 or model_max_length=None in model's config.json). This may lead
-# to incorrect model outputs or CUDA errors. To allow overriding this maximum, set the env var VLLM_ALLOW_LONG_MAX_MODEL_LEN=1
-# Only needed for MicroLlama V1
-# export VLLM_ALLOW_LONG_MAX_MODEL_LEN=1
-NUM_GPUS=4
-MODEL="/root/open-r1/data/meta-llama/Llama-3.2-1B-Instruct/checkpoint-900"
-MODEL_ARGS="pretrained=$MODEL,dtype=float16,data_parallel_size=$NUM_GPUS,max_model_length=32768,gpu_memory_utilisation=0.8"
-TASK=aime24
-OUTPUT_DIR=data/evals/$MODEL
-lighteval vllm $MODEL_ARGS "custom|$TASK|0|0" \
-    --custom-tasks src/open_r1/evaluate.py \
-    --use-chat-template \
-    --system-prompt="Please reason step by step, and put your final answer within \boxed{}." \
-    --output-dir $OUTPUT_DIR
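
The ValueError quoted in the deleted script's comments is raised when the requested context length (32768) exceeds the limit derived from the model's config.json (2048 in the quoted message). A minimal sketch of one way to guard against that without setting VLLM_ALLOW_LONG_MAX_MODEL_LEN is to clamp the requested length before building MODEL_ARGS. The helper name `clamp_max_len` is hypothetical and not part of the original script, and the sketch assumes `max_position_embeddings` appears as a plain integer field in config.json; it ignores `model_max_length` and any RoPE scaling that vLLM may also take into account.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original script): print the requested
# max length, capped at max_position_embeddings read from a config.json.
clamp_max_len() {
  local requested="$1" config="$2" derived
  # Assumption: the field is a plain integer, e.g. "max_position_embeddings": 2048
  derived=$(grep -o '"max_position_embeddings"[[:space:]]*:[[:space:]]*[0-9][0-9]*' "$config" \
    | grep -o '[0-9][0-9]*$' || true)
  if [ -z "$derived" ] || [ "$requested" -le "$derived" ]; then
    # Field not found, or the request already fits: keep the requested value.
    echo "$requested"
  else
    echo "$derived"
  fi
}

# Possible usage with the variables from the deleted script:
#   MAX_LEN=$(clamp_max_len 32768 "$MODEL/config.json")
#   MODEL_ARGS="pretrained=$MODEL,dtype=float16,data_parallel_size=$NUM_GPUS,max_model_length=$MAX_LEN,gpu_memory_utilisation=0.8"
```

Clamping trades long generations for a guaranteed valid configuration; for a reasoning benchmark such as aime24, a 2048-token limit may truncate answers, so the override variable can still be the better choice for models that genuinely support longer contexts.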