- Pretergeek/OpenChat-3.5-0106_10.7B_48Layers-Interleaved · Text Generation · Updated Oct 1, 2024
- Pretergeek/OpenChat-3.5-0106_8.99B_40Layers-Interleaved · Text Generation · Updated Nov 7, 2024
- Pretergeek/OpenChat-3.5-0106_8.11B_36Layers-Interleaved · Text Generation · Updated Oct 1, 2024
OpenChat-3.5-0106 with Additional Layers (Collection) · Upscaled models built with the Block Expansion method. Unlike the more common Depth Up-Scaling (DUS), Block Expansion does not require fine-tuning to recover the performance lost during upscaling. · 7 items · Updated Nov 29, 2024
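The collection's layer counts (e.g. 40 layers from a 32-layer base) come from interleaving newly added blocks among the original ones. A minimal sketch of that interleaving step, assuming the usual Block Expansion setup where each inserted block is a copy of its predecessor with a zero-initialized output projection so it starts out as the identity (the function name and layer counts are illustrative, not taken from the repos above):

```python
def interleave_identity_blocks(layers, num_new):
    """Insert `num_new` identity-initialized blocks at evenly spaced
    positions among the original `layers`, preserving their order."""
    group = len(layers) // num_new  # original blocks per inserted block
    expanded = []
    for i, block in enumerate(layers, start=1):
        expanded.append(block)
        inserted = len(expanded) - i
        if i % group == 0 and inserted < num_new:
            # A copy of the preceding block whose output projection is
            # zero-initialized, so the residual stream passes through
            # unchanged at initialization.
            expanded.append(("identity_copy_of", block))
    return expanded

# 32 original blocks + 8 interleaved copies -> 40 layers total,
# matching the 40-layer variant listed above.
base = [f"block_{i}" for i in range(32)]
expanded = interleave_identity_blocks(base, num_new=8)
print(len(expanded))  # 40
```

Because every inserted block computes the identity at initialization, the expanded model reproduces the base model's outputs exactly, which is why no recovery fine-tuning is needed before further training.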
Article: Rank-Stabilized LoRA: Unlocking the Potential of LoRA Fine-Tuning · By damjan-k · Feb 20, 2024
- PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training · Paper (arXiv:2309.10400) · Published Sep 19, 2023
- YaRN: Efficient Context Window Extension of Large Language Models · Paper (arXiv:2309.00071) · Published Aug 31, 2023