yicui's Collections
Mechanistic
Coding
Benchmark
Training
ICL
Architecture
RL
TDD
Theory
Instructions
Mechanistic
Updated 3 days ago
Massive Activations in Large Language Models
Paper • 2402.17762 • Published Feb 27 • 1
What Matters in Transformers? Not All Attention is Needed
Paper • 2406.15786 • Published Jun 22 • 29
The Super Weight in Large Language Models
Paper • 2411.07191 • Published 6 days ago • 2