Add comparison with 70B distilled R1 model
#8 opened 1 day ago by blankohagen7

Update model card
#7 opened 8 days ago by minpeter

Temperature's effect on the performance of long chain reasoning models. Why was 0.7 used for the evals?
1
#6 opened 26 days ago by j456

License of your model
1
#4 opened about 1 month ago by chewkokwah

Evaluation
1
#3 opened about 1 month ago by PSM24

Merge with 32b coder?
14
#2 opened about 1 month ago by RDson
