arxiv:2501.08313
Yiran Zhong
IanZhong
AI & ML interests
LLM
Recent Activity
authored a paper 11 days ago
Scaling TransNormer to 175 Billion Parameters
authored a paper 14 days ago
Exploring Transformer Extrapolation
authored a paper 14 days ago
CO2: Efficient Distributed Training with Full Communication-Computation Overlap
Organizations
None yet
Papers: 16
Models: None public yet
Datasets: None public yet