- Attention Is All You Need
  Paper • 1706.03762 • Published • 50
- Language Models are Few-Shot Learners
  Paper • 2005.14165 • Published • 12
- GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints
  Paper • 2305.13245 • Published • 5
- Llama 2: Open Foundation and Fine-Tuned Chat Models
  Paper • 2307.09288 • Published • 243
Eli Chen PRO
elichen3051
AI & ML interests
Learning Algorithms, Reinforcement Learning, Data Synthesis, Benchmarking
Recent Activity
updated a model about 2 hours ago
skymizer/Llama3.1-8B-relu-stage-1-fineweb-edu-45B-4096
updated a dataset about 7 hours ago
skymizer/Mistral-7B-v0.1-base-tokenized-fineweb-edu-45B-4096
updated a dataset about 9 hours ago
skymizer/Llama3.1-8B-base-tokenized-fineweb-edu-45B-4096
Organizations
Collections 2
Models
None public yet