Trangle Heshvp

Trangle

AI & ML interests

None yet

Recent Activity

liked a model about 9 hours ago

Qwen/Qwen2.5-32B-Instruct

liked a model 1 day ago

Alibaba-NLP/gte-reranker-modernbert-base

liked a model 1 day ago

Alibaba-NLP/gte-modernbert-base

Organizations

Trangle's activity

upvoted an article 17 days ago

Article

Open-source DeepResearch – Freeing our search agents

22 days ago

• 1.1k

upvoted 2 collections 2 months ago

ModernBERT

Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated Dec 19, 2024 • 139

Granite 3.1 Language Models

A series of language models with 128K context length, trained by IBM and licensed under Apache 2.0. • 9 items • Updated 1 day ago • 58

upvoted a collection 5 months ago

Llama 3.2

This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 570

upvoted an article 6 months ago

Article

SmolLM - blazingly fast and remarkably powerful

Jul 16, 2024

• 322

upvoted 4 collections 7 months ago

Gemma Scope Release

A comprehensive, open suite of sparse autoencoders for Gemma 2 2B and 9B. • 10 items • Updated Dec 13, 2024 • 17

Llama 3.1 Evals

This collection provides detailed information on how we derived the reported benchmark metrics for the Llama 3.1 models, including the configurations, • 6 items • Updated Dec 6, 2024 • 15

Minitron

A family of compressed models obtained via pruning and knowledge distillation • 12 items • Updated Jan 17 • 60

🪐 SmolLM

A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated 5 days ago • 216

upvoted 2 papers 7 months ago

MUSCLE: A Model Update Strategy for Compatible LLM Evolution

Paper • 2407.09435 • Published Jul 12, 2024 • 22

Qwen2 Technical Report

Paper • 2407.10671 • Published Jul 15, 2024 • 161

upvoted an article 7 months ago

Article

BM25 for Python: Achieving high performance while simplifying dependencies with BM25S⚡

Jul 9, 2024

• 43

upvoted a paper 8 months ago

AgentInstruct: Toward Generative Teaching with Agentic Flows

Paper • 2407.03502 • Published Jul 3, 2024 • 50

upvoted a collection 8 months ago

Step-DPO

Resources for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs" • 11 items • Updated Jul 1, 2024 • 5

upvoted an article 8 months ago

Article

Welcome Gemma 2 - Google's new open LLM

Jun 27, 2024

• 128

upvoted a paper 8 months ago

SpeechVerse: A Large-scale Generalizable Audio Language Model

Paper • 2405.08295 • Published May 14, 2024 • 19

upvoted a collection 8 months ago

TaskMeAnything

A collection of TaskMeAnything resources [https://github.com/JieyuZ2/TaskMeAnything] • 12 items • Updated Aug 4, 2024 • 3

upvoted 2 articles 8 months ago

Article

Unlocking Longer Generation with Key-Value Cache Quantization

May 16, 2024

• 43

Article

Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models

Jun 24, 2024

• 187

upvoted a collection 8 months ago

WildBench

4 items • Updated 15 days ago • 6