Aleksei Dorkin PRO

adorkin · slowwavesleep

AI & ML interests

Computational Linguistics

Recent Activity

updated a Space 4 days ago

adorkin/siglip2-clothes

published a Space 4 days ago

adorkin/siglip2-clothes

updated a Space 4 days ago

adorkin/m-clip-clothes

adorkin's activity

upvoted a paper 4 days ago

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published 5 days ago • 115

upvoted a paper 14 days ago

On Domain-Specific Post-Training for Multimodal Large Language Models

Paper • 2411.19930 • Published Nov 29, 2024 • 27

upvoted an article 18 days ago

Welcome to Inference Providers on the Hub 🔥

Article • 29 days ago • 387

upvoted a collection about 1 month ago

SmolVLM 256M & 500M

Collection for models & demos for even smoller SmolVLM release • 12 items • Updated 5 days ago • 69

upvoted a paper 2 months ago

How to Synthesize Text Data without Model Collapse?

Paper • 2412.14689 • Published Dec 19, 2024 • 50

upvoted an article 3 months ago

Use Models from the Hugging Face Hub in LM Studio

Article • Nov 28, 2024 • 138

upvoted 2 collections 3 months ago

Multilingual LLM Evaluation

Multilingual Evaluation Benchmarks • 6 items • Updated Dec 13, 2024 • 10

Moshi v0.1 Release

MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated Sep 18, 2024 • 227

upvoted a paper 3 months ago

LLMs for Extremely Low-Resource Finno-Ugric Languages

Paper • 2410.18902 • Published Oct 24, 2024 • 2

upvoted 6 collections 3 months ago

Llammas 🐑

4 items • Updated Jan 1 • 2

MaLA-LM

MaLA-LM: Massive Language Adaptation of Large Language Models • 7 items • Updated Oct 7, 2024 • 1

Models for dataset curation

9 items • Updated Dec 5, 2024 • 17

4M Models

Multimodal models from https://4m.epfl.ch/ • 14 items • Updated Jun 14, 2024 • 31

AIMv2

A collection of AIMv2 vision encoders that supports a number of resolutions, native resolution, and a distilled checkpoint. • 19 items • Updated Nov 22, 2024 • 73

LLM2CLIP

LLM2CLIP makes SOTA pretrained CLIP models more SOTA than ever. • 10 items • Updated Jan 8 • 55

upvoted an article 3 months ago

Releasing the largest multilingual open pretraining dataset

Article • Nov 13, 2024 • 99

upvoted an article 4 months ago

Decoding Strategies in Large Language Models

Article • Oct 29, 2024 • 44

upvoted a collection 4 months ago

October 25 Releases

19 items • Updated Oct 25, 2024 • 7

upvoted 2 collections 5 months ago

GLiClass

Generalist and Lightweight Models for Zero-shot Text Classification • 13 items • Updated Sep 17, 2024 • 14

Salamandra 🦎

17 items • Updated 7 days ago • 55