BlackMamba: Mixture of Experts for State-Space Models • arXiv:2402.01771 • Published Feb 1, 2024
OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models • arXiv:2402.01739 • Published Jan 29, 2024
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models • arXiv:2401.15947 • Published Jan 29, 2024
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models • arXiv:2401.06066 • Published Jan 11, 2024
MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts • arXiv:2401.04081 • Published Jan 8, 2024
Mixtures of Experts Unlock Parameter Scaling for Deep RL • arXiv:2402.08609 • Published Feb 13, 2024
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models • arXiv:2404.02258 • Published Apr 2, 2024
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model • arXiv:2405.04434 • Published May 7, 2024
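
The papers in this collection all build on sparse Mixture-of-Experts routing, where a learned router sends each token to a small subset of expert networks. As a rough orientation only, below is a minimal top-k routing sketch; it is not the implementation from any paper listed above, and the layer sizes, GELU experts, and softmax-over-selected-logits normalization are illustrative assumptions.

```python
# Minimal top-k sparse Mixture-of-Experts layer (illustrative sketch, not any
# specific paper's implementation; all sizes and design choices are assumptions).
import torch
import torch.nn as nn
import torch.nn.functional as F


class TopKMoE(nn.Module):
    def __init__(self, d_model: int, d_ff: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        # Router produces one logit per expert for every token.
        self.router = nn.Linear(d_model, num_experts, bias=False)
        # Each expert is a small feed-forward block.
        self.experts = nn.ModuleList(
            [
                nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
                for _ in range(num_experts)
            ]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model) -> flatten to a token list for routing.
        tokens = x.reshape(-1, x.shape[-1])
        logits = self.router(tokens)                      # (n_tokens, num_experts)
        weights, expert_ids = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)              # normalize over the chosen experts
        out = torch.zeros_like(tokens)
        for e, expert in enumerate(self.experts):
            # Which (token, slot) pairs routed to expert e.
            token_idx, slot_idx = (expert_ids == e).nonzero(as_tuple=True)
            if token_idx.numel() == 0:
                continue
            out[token_idx] += weights[token_idx, slot_idx, None] * expert(tokens[token_idx])
        return out.reshape(x.shape)


if __name__ == "__main__":
    layer = TopKMoE(d_model=64, d_ff=256)
    y = layer(torch.randn(2, 10, 64))
    print(y.shape)  # torch.Size([2, 10, 64])
```

Only top_k experts run per token, so parameter count grows with num_experts while per-token compute stays roughly constant; the individual papers above differ mainly in what is routed (FFN experts, state-space blocks, or entire layers as in Mixture-of-Depths) and how expert specialization is encouraged.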