Selective Self-to-Supervised Fine-Tuning for Generalization in Large Language Models Paper • 2502.08130 • Published 14 days ago • 9
Granite Code Models Collection A series of code models trained by IBM and released under the Apache 2.0 license. We release both the base pretrained and instruct models. • 23 items • Updated about 24 hours ago • 184
Article Saving Memory Using Padding-Free Transformer Layers during Finetuning By mayank-mishra • Jun 11, 2024 • 16