SWE-bench is a benchmark for evaluating language models and AI systems on their ability to resolve real-world GitHub issues.
Princeton NLP group
princeton-nlp
Recent Activity
updated a dataset about 13 hours ago: princeton-nlp/prolong-data-512K
updated a model 1 day ago: WebOrganizer/LM-1b_1x-DCLMFasttext_over_Topics_x_Formats_for_MMLU_and_Hellaswag
published a model 1 day ago: WebOrganizer/LM-1b_1x-DCLMFasttext_over_Topics_x_Formats_for_MMLU_and_Hellaswag
Papers: 1
Models: 259
princeton-nlp/Llama-3-8B-ProLong-512k-Instruct • 2.61k • 20
princeton-nlp/Llama-3-8B-ProLong-512k-Base • 2.02k • 8
princeton-nlp/Llama-3-8B-ProLong-64k-Instruct • Text Generation • 2.55k • 13
princeton-nlp/Llama-3-8B-ProLong-64k-Base • Text Generation • 2.8k • 5
princeton-nlp/Mistral-7B-Base-SFT-CPO • Text Generation • 2.54k • 1
princeton-nlp/Mistral-7B-Base-SFT-RRHF • Text Generation • 2.54k
princeton-nlp/gemma-2-9b-it-SimPO • Text Generation • 21.9k • 156
princeton-nlp/gemma-2-9b-it-DPO • Text Generation • 2.45k • 9
princeton-nlp/Llama-3-Instruct-8B-SimPO-v0.2 • Text Generation • 2.47k • 6
princeton-nlp/Llama-3-Instruct-8B-RDPO-v0.2 • Text Generation • 2.51k • 1
Datasets: 45
princeton-nlp/prolong-data-512K • 12.7k • 6
princeton-nlp/SWE-bench_Lite • Viewer • 323 • 51.8k • 31
princeton-nlp/SWE-bench • Viewer • 21.5k • 42.5k • 98
princeton-nlp/SWE-bench_Verified • Viewer • 500 • 306k • 149
princeton-nlp/TextbooksBySubject • Viewer • 129 • 68
princeton-nlp/TextbookChapters • Viewer • 77.9k • 98 • 9
princeton-nlp/SWE-bench_Multimodal • Viewer • 612 • 649 • 19
princeton-nlp/fineweb_edu-swahili-translated • Viewer • 137k • 128
princeton-nlp/prolong-ultrachat-64K • Preview • 49
princeton-nlp/HELMET • Viewer • 516 • 428 • 5