Collections

Collections including paper arxiv:2403.13187

lshort-transformers

Papers useful when writing the paper: "The Not So Short Transformers"

ShortGPT: Layers in Large Language Models are More Redundant Than You Expect

Paper • 2403.03853 • Published Mar 6 • 62
SliceGPT: Compress Large Language Models by Deleting Rows and Columns

Paper • 2401.15024 • Published Jan 26 • 68
Your Transformer is Secretly Linear

Paper • 2405.12250 • Published May 19 • 150
Yi: Open Foundation Models by 01.AI

Paper • 2403.04652 • Published Mar 7 • 62

Evolutionary Optimization of Model Merging Recipes

Paper • 2403.13187 • Published Mar 19 • 50

papers interesting

Evolutionary Optimization of Model Merging Recipes

Paper • 2403.13187 • Published Mar 19 • 50

Foundational Model

Evolutionary Optimization of Model Merging Recipes

Paper • 2403.13187 • Published Mar 19 • 50

Papers - Image - Model Merging

Evolutionary Optimization of Model Merging Recipes

Paper • 2403.13187 • Published Mar 19 • 50

Papers - Image - Frankenmerging

Evolutionary Optimization of Model Merging Recipes

Paper • 2403.13187 • Published Mar 19 • 50
Model Stock: All we need is just a few fine-tuned models

Paper • 2403.19522 • Published Mar 28 • 10

Papers - Frankenmerging

Evolutionary Optimization of Model Merging Recipes

Paper • 2403.13187 • Published Mar 19 • 50
Model Stock: All we need is just a few fine-tuned models

Paper • 2403.19522 • Published Mar 28 • 10
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models

Paper • 2405.01535 • Published May 2 • 116

To read... eventually

MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training

Paper • 2403.09611 • Published Mar 14 • 124
Evolutionary Optimization of Model Merging Recipes

Paper • 2403.13187 • Published Mar 19 • 50
MobileVLM V2: Faster and Stronger Baseline for Vision Language Model

Paper • 2402.03766 • Published Feb 6 • 12
LLM Agent Operating System

Paper • 2403.16971 • Published Mar 25 • 65
