Nanotron Research
community
https://github.com/huggingface/nanotron
Followers: 25
AI & ML interests
None defined yet.
Recent Activity
lvwerra authored a paper about 2 months ago: SelfCodeAlign: Self-Alignment for Code Generation
thomwolf authored a paper 6 months ago: The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale
lvwerra authored a paper 6 months ago: The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale
Team members
12
models
14
nanotron/temp_for_pr_review • Updated Sep 24
nanotron/fp8_for_nanotron • Updated Sep 21
nanotron/llama3-8b-infini-attention • Updated Aug 5 • 3 • 3
nanotron/bench_cluster_epfl • Updated Jul 12
nanotron/bench_cluster • Updated Jul 6
nanotron/test • Updated Jul 6
nanotron/old_bench • Updated Jul 6 • 2
nanotron/minicpm-nanotron • Updated Apr 11 • 6
nanotron/doremi-llama-2.5b-optimized-weights • Updated Feb 22
nanotron/doremi-llama-2.5b-reference • Updated Feb 22
datasets
12
nanotron/minipile_100_samples • Viewer • Updated Jul 10 • 100 • 38 • 1
nanotron/llama3-1024-passkey-retrieval-eval • Viewer • Updated Jul 4 • 12.6k • 47
nanotron/llama3-16k-passkey-retrieval-finetuning • Viewer • Updated Jun 20 • 77.3k • 38
nanotron/llama3-16k-passkey-retrieval-eval • Viewer • Updated Jun 19 • 712 • 44
nanotron/llama3_needle_16k_finetuning • Viewer • Updated Jun 15 • 3.57k • 38
nanotron/needle_32k_eval_dataset • Viewer • Updated May 29 • 1.79k • 39 • 1
nanotron/needle_32k_finetuning_dataset • Viewer • Updated May 16 • 35.5k • 46
nanotron/needle_in_a_hay_stack_finetuning_dataset • Viewer • Updated May 14 • 21 • 42
nanotron/needle_in_a_hay_stack_eval_dataset • Viewer • Updated May 14 • 1 • 34 • 1
nanotron/needle_in_a_hay_stack_finetune_dataset • Viewer • Updated May 3 • 12.6k • 42 • 1