YuE Collection YuE: Open Full-song Generation Foundation Model • 9 items • Updated 8 days ago • 16
Unsloth 4-bit Dynamic Quants Collection Unsloth's Dynamic 4-bit Quants selectively skip quantizing certain parameters, greatly improving accuracy while using less than 10% more VRAM than BnB 4-bit • 15 items • Updated about 7 hours ago • 37
Phi-4 (All Versions) Collection Microsoft's new Phi-4 model in all formats. Includes GGUF, 4-bit bnb and original versions. Includes Unsloth's bug fixes. • 4 items • Updated 1 day ago • 39
DeepSeek R1 (All Versions) Collection DeepSeek R1 - the most powerful reasoning open-source model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 29 items • Updated about 4 hours ago • 145
Article The SOTA Text-to-speech and Zero Shot Voice cloning model that no one knows about... By srinivasbilla • 16 days ago • 55
Transformers.js demos Collection A collection of my favorite WebML demos, built with Transformers.js! • 30 items • Updated Jul 11, 2024 • 104
Oryx-1.5 Collection Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution • 4 items • Updated 21 days ago • 5
Dolphin 3.0 Collection Dolphin 3.0 is the next generation of the Dolphin series of instruct-tuned models. Designed to be the ultimate general purpose local model. • 7 items • Updated about 1 month ago • 61
Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion Paper • 2412.04424 • Published Dec 5, 2024 • 59
Oryx Collection Oryx: One Multi-Modal LLM for On-Demand Spatial-Temporal Understanding • 6 items • Updated Dec 11, 2024 • 16
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 15 items • Updated Dec 22, 2024 • 212
Llama 3.2 Collection This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 566
Ichigo v0.3 Collection The experimental family designed to train LLMs to understand sound natively. • 6 items • Updated Nov 11, 2024 • 17
SpeechT5 Collection The SpeechT5 framework consists of a shared seq2seq and six modal-specific (speech/text) pre/post-nets that can address a few audio-related tasks. • 8 items • Updated 28 days ago • 24
Minitron Collection A family of compressed models obtained via pruning and knowledge distillation • 12 items • Updated 19 days ago • 60
Leaderboards and benchmarks ✨ Collection Cool leaderboard spaces collection for models across modalities! Text, vision, audio, ... • 90 items • Updated about 2 hours ago • 93
LLM Compiler Collection Meta LLM Compiler is a state-of-the-art LLM that builds upon Code Llama with improved performance for code optimization and compiler reasoning. • 4 items • Updated Jun 27, 2024 • 148
Article The CVPR Survival Guide: Discovering Research That's Interesting to YOU! By harpreetsahota • Jun 14, 2024 • 9