A Survey on Model MoErging: Recycling and Routing Among Specialized Experts for Collaborative Learning Paper • 2408.07057 • Published Aug 13, 2024
Multi-Head Adapter Routing for Cross-Task Generalization Paper • 2211.03831 • Published Nov 7, 2022 • 2
Towards Modular LLMs by Building and Reusing a Library of LoRAs Paper • 2405.11157 • Published May 18, 2024 • 28
Does Pre-training Induce Systematic Inference? How Masked Language Models Acquire Commonsense Knowledge Paper • 2112.08583 • Published Dec 16, 2021
Deep Language Networks: Joint Prompt Training of Stacked LLMs using Variational Inference Paper • 2306.12509 • Published Jun 21, 2023 • 14
CLUE: A Clinical Language Understanding Evaluation for LLMs Paper • 2404.04067 • Published Apr 5, 2024
Comprehensive Study on German Language Models for Clinical and Biomedical Text Understanding Paper • 2404.05694 • Published Apr 8, 2024 • 2
On the Impact of Cross-Domain Data on German Language Models Paper • 2310.07321 • Published Oct 11, 2023 • 1