MegaBlocks: Efficient Sparse Training with Mixture-of-Experts • Paper • 2211.15841 • Published Nov 29, 2022