WildEval
non-profit
wild_eval
Followers: 11
AI & ML interests
None defined yet.
Recent Activity
yuchenlin authored a paper 5 days ago: Small Models Struggle to Learn from Strong Reasoners
DongfuJiang authored a paper 20 days ago: ACECODER: Acing Coder RL via Automated Test-Case Synthesis
yuchenlin authored a paper 21 days ago: ZebraLogic: On the Scaling Limits of LLMs for Logical Reasoning
Team members: 9
spaces
1
Zebra Logic Bench 🦓 • pinned • Running • 6
Explore and compare Zebra Puzzle solving models
models
None public yet
datasets
9
WildEval/ZebraLogic • Viewer • Updated 21 days ago • 4.26k • 227 • 5
WildEval/G-PlanET • Viewer • Updated Aug 1, 2024 • 1.42k • 50
WildEval/ZeroEval • Viewer • Updated Jul 23, 2024 • 4.61k • 411
WildEval/WildBench-V2 • Viewer • Updated May 22, 2024 • 2.05k • 77
WildEval/WildBench-Results-v2-internal • Viewer • Updated May 21, 2024 • 30k • 224
WildEval/WildBench-Results-V2 • Viewer • Updated May 20, 2024 • 10.2k • 143
WildEval/WildBench-v2-dev • Viewer • Updated Apr 19, 2024 • 5.99k • 27
WildEval/WildBench-dev • Updated Apr 19, 2024 • 13 • 1
WildEval/NaturalChats • Viewer • Updated Apr 18, 2024 • 641k • 30