Multi-needle In A Haystack

by ElliottDyson - opened

Many models can easily be trained to perform well on the standard needle-in-a-haystack evaluation. A multi-needle evaluation, where several facts have to be retrieved from the same long context, is much more useful and more representative of long-context capabilities. It would be very interesting to see multi-needle results in these tests.
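To make the request concrete, here is a rough sketch of what a multi-needle check could look like. This is only an illustration, not how any existing benchmark implements it: the helper names `build_haystack` and `multi_needle_recall`, the filler text, and the substring-based scoring are all my own assumptions.

```python
import random


def build_haystack(filler_sentences, needles, seed=0):
    """Scatter every needle sentence at a random position in the filler text.

    Hypothetical helper: real harnesses usually control needle depth and
    context length much more carefully than this.
    """
    rng = random.Random(seed)
    sentences = list(filler_sentences)
    for needle in needles:
        position = rng.randint(0, len(sentences))  # both ends inclusive
        sentences.insert(position, needle)
    return " ".join(sentences)


def multi_needle_recall(answer, expected_values):
    """Return the fraction of expected values found in the model's answer.

    Assumption: a case-insensitive substring match is a good enough scorer
    for a sketch. It is crude and can over-count on short values.
    """
    if not expected_values:
        return 0.0
    answer_lower = answer.lower()
    found = sum(1 for value in expected_values if value.lower() in answer_lower)
    return found / len(expected_values)


if __name__ == "__main__":
    filler = ["The weather report was unremarkable."] * 200
    needles = [
        "The secret city is Lisbon.",
        "The secret colour is teal.",
        "The secret animal is an otter.",
    ]
    context = build_haystack(filler, needles)
    # In a real run, `context` plus a question would be sent to the model.
    # Here a made-up answer stands in for the model output.
    fake_answer = "The city is Lisbon and the animal is an otter."
    print(multi_needle_recall(fake_answer, ["Lisbon", "teal", "otter"]))
```

Scoring recall over all needles, rather than pass/fail on a single one, is what should make this harder to game than the single-needle version.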
