Multi-needle In A Haystack
#25
by
ElliottDyson
Many models can easily be trained to perform well on the standard needle-in-a-haystack evaluation. A multi-needle evaluation, where several facts are hidden in the same context and the model has to retrieve all of them, is much more useful and more representative of long-context capabilities. It would be very interesting to see multi-needle results in these tests.
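To make the request concrete, here is a rough sketch of what a multi-needle test could look like. It is only an illustration, not how these tests are currently implemented: the helper names `build_haystack` and `needle_recall`, the filler text, and the substring-based scoring are all assumptions made for the example.

```python
import random


def build_haystack(filler, needles, seed=0):
    """Insert each needle sentence at a random position among the filler sentences."""
    rng = random.Random(seed)
    sentences = list(filler)
    for needle in needles:
        # randint is inclusive on both ends, so a needle can also land at the very end.
        sentences.insert(rng.randint(0, len(sentences)), needle)
    return " ".join(sentences)


def needle_recall(answer, expected_values):
    """Fraction of expected needle values that appear in the model's answer."""
    if not expected_values:
        return 0.0
    answer_lower = answer.lower()
    found = sum(1 for value in expected_values if value.lower() in answer_lower)
    return found / len(expected_values)


# Hypothetical data: 200 filler sentences and three needles, each paired with
# the value the model is expected to report.
filler = [f"Filler sentence number {i}." for i in range(200)]
needles = [
    ("The secret code for the red door is 4821.", "4821"),
    ("The secret code for the blue door is 1957.", "1957"),
    ("The secret code for the green door is 7360.", "7360"),
]

context = build_haystack(filler, [sentence for sentence, _ in needles])
expected = [value for _, value in needles]
prompt = context + "\n\nList the secret code for every door mentioned above."

# `prompt` would be sent to the model under test; its reply is then scored with
# needle_recall(reply, expected).
```

Scoring by the fraction of needles recovered, rather than pass/fail on a single fact, is what separates this from the standard test: a model that finds only one of the three needles would score about 0.33 here instead of passing.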