Context window 2k -> 16k
#1, opened by mtasic85
Hi @calvintwr @lemousehunter ,
How did you extend the context window from 2k to 16k? And what is the potential maximum?
Hi @mtasic85 ,
We trained it directly with 16k context from the start.
As for the maximum, it is limited only by hardware.
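To give a rough sense of why hardware becomes the limit: with a naive attention implementation, the score matrix grows with the square of the sequence length. The sketch below is only a back-of-the-envelope estimate. The head count, batch size, and 2-byte value width are made-up defaults, not this model's configuration, and memory-efficient attention kernels avoid materializing this matrix in full.

```python
def attention_scores_bytes(seq_len, num_heads=32, batch_size=1, bytes_per_value=2):
    """Bytes needed for one layer's full attention score matrix (naive attention).

    All default values are hypothetical and chosen only for illustration.
    """
    return batch_size * num_heads * seq_len * seq_len * bytes_per_value


for seq_len in (2048, 16384):
    gib = attention_scores_bytes(seq_len) / 2**30
    print(f"{seq_len:>6} tokens: {gib:.2f} GiB per layer")
```

Under these assumptions, going from 2k to 16k is 8x the tokens but 64x the score-matrix memory, which is why the practical ceiling depends on the GPUs and attention kernel available.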
As an alternative to training at 16k from the start, you can also explore the incremental approach taken by Llama 3.1, as described in their paper.
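For anyone curious what an incremental approach could look like in a training loop, here is a minimal sketch of a staged context-length schedule. The step boundaries and sequence lengths are invented for illustration; they are not the values from the Llama 3.1 paper, nor how this model was trained, and `context_length_for_step` is a hypothetical helper rather than part of any library.

```python
def context_length_for_step(step, schedule):
    """Return the sequence length to use at a given training step.

    schedule: list of (start_step, seq_len) pairs, sorted by start_step,
    with the first entry starting at step 0.
    """
    current = schedule[0][1]
    for start_step, seq_len in schedule:
        if step >= start_step:
            current = seq_len  # this stage has begun, so adopt its length
        else:
            break  # later stages have not started yet
    return current


# Hypothetical schedule: grow the context window in stages from 2k to 16k.
SCHEDULE = [(0, 2048), (10_000, 4096), (15_000, 8192), (18_000, 16384)]

for step in (0, 9_999, 10_000, 17_000, 20_000):
    print(step, context_length_for_step(step, SCHEDULE))
```

In a real run, the data loader would re-pack sequences to the returned length at each stage boundary, and the batch size would usually shrink to keep the tokens per step roughly constant.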