Huge disappointment with Claude Sonnet 3.7. Big performance regression: it scores worse than the June 2024 version.
onekq-ai/WebApp1K-models-leaderboard
I'm sure this version improves on something, just not the thing my leaderboard measures. It goes to show that no model can be the best at everything.