DontPlanToEnd
committed
Update app.py
app.py
CHANGED
@@ -288,7 +288,7 @@ with GraInter:
 <br>
 **Score:** A combination of Dif, Cor, and Std.
 <br><br>
-The question this leaderboard focuses on could've benefited from being multiple prediction prompts each with different
+The question this leaderboard focuses on could've benefited from being multiple prediction prompts each with different user and test lists, then averaging the accuracy of each list of predictions together. This would have reduced the variability of prediction accuracy and created a ranking with fewer outliers. Implementing these improvements will have to wait until the next time it is absolutely necessary to update the leaderboard's questions due to how long it takes to retest all of the models.
 """)
 
 gr.Markdown("### **NA models:**")
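The averaging idea described in the added paragraph could be sketched roughly as follows. This is only an illustration, not code from app.py: the function names `averaged_accuracy` and `rank_models`, the model names, and the accuracy values are all hypothetical, and the sketch assumes each model already has one accuracy value per prediction prompt.

```python
from statistics import mean


def averaged_accuracy(per_prompt_accuracy):
    """Average one model's accuracy over several prediction prompts.

    Hypothetical helper: each entry is the accuracy measured on one prompt,
    where every prompt uses a different user list and test list.
    """
    if not per_prompt_accuracy:
        raise ValueError("need at least one per-prompt accuracy")
    return mean(per_prompt_accuracy)


def rank_models(results):
    """Rank models by averaged accuracy, best first."""
    averaged = {
        model: averaged_accuracy(accuracies)
        for model, accuracies in results.items()
    }
    return sorted(averaged.items(), key=lambda item: item[1], reverse=True)


# Made-up accuracies for three prompts per model (illustrative only).
results = {
    "model_a": [0.5, 0.75, 0.25],
    "model_b": [0.5, 0.5, 0.75],
}
ranking = rank_models(results)
```

With a single prompt, one unusually easy or hard list decides a model's position; with the average, each list contributes only a share of the final value.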