Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention
Paper • 2502.11089 • Published 21 days ago • 141
Post: Hello Huggers!