Ladder-residual: parallelism-aware architecture for accelerating large model inference with communication overlapping
Paper • 2501.06589 • Published
Mixture of Experts, Branch Merge Train, International Cooperation, Reuse, https://github.com/ontocord/MDEL