Aritra Roy Gosthipaty PRO

ariG23498

https://arig23498.github.io/

AI & ML interests

Deep Representation Learning

Recent Activity

updated a dataset about 5 hours ago

llm-scratch/wmt14-de-en-split

published a dataset about 5 hours ago

llm-scratch/wmt14-de-en-split

new activity about 7 hours ago

huggingface/documentation-images:added infographics for the blog post iisc-huggingface-collab.md

ariG23498's activity

upvoted an article 1 day ago

Article

Remote VAEs for decoding with HF endpoints 🤗

2 days ago

• 25

upvoted a collection 4 days ago

SigLIP 2

OpenCLIP and timm SigLIP 2 models • 45 items • Updated 4 days ago • 9

upvoted an article 4 days ago

Article

SigLIP 2: A better multilingual vision language encoder

5 days ago

• 90

upvoted a collection 4 days ago

SigLIP2

36 items • Updated 4 days ago • 46

upvoted a paper 4 days ago

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published 5 days ago • 115

upvoted an article 5 days ago

Article

SmolVLM2: Bringing Video Understanding to Every Device

6 days ago

• 162

upvoted 2 articles 6 days ago

Article

PaliGemma 2 Mix - New Instruction Vision Language Models by Google

7 days ago

• 58

Article

ColPali: Efficient Document Retrieval with Vision Language Models 👀

Jul 5, 2024

• 208

upvoted an article 7 days ago

Article

Introducing Three New Serverless Inference Providers: Hyperbolic, Nebius AI Studio, and Novita 🔥

8 days ago

• 89

upvoted a paper 12 days ago

Scaling Pre-training to One Hundred Billion Data for Vision Language Models

Paper • 2502.07617 • Published 14 days ago • 28

upvoted an article 14 days ago

Article

From Llasa to Llasagna 🍕: Finetuning LLaSA to generates Italian speech and other languages

14 days ago

• 25

upvoted a collection 14 days ago

SigLIP

Contrastive (sigmoid) image-text models from https://arxiv.org/abs/2303.15343 • 10 items • Updated 6 days ago • 53

upvoted an article 14 days ago

Article

The Open Arabic LLM Leaderboard 2

16 days ago

• 27

upvoted an article 16 days ago

Article

Open-source DeepResearch – Freeing our search agents

22 days ago

• 1.1k

upvoted an article 19 days ago

Article

Convert Transformers to ONNX with Hugging Face Optimum

Jun 22, 2022

• 4

upvoted 2 articles 27 days ago

Article

Mixture of Experts Explained

Dec 11, 2023

• 400

Article

KV Caching Explained: Optimizing Transformer Inference Efficiency

27 days ago

• 32

upvoted a collection 27 days ago

DeepSeek R1 (All Versions)

DeepSeek R1 - the most powerful reasoning open-source model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 29 items • Updated 18 days ago • 199

upvoted an article 28 days ago

Article

Welcome to Inference Providers on the Hub 🔥

29 days ago

• 387

upvoted a collection 28 days ago

Qwen2.5

The Qwen 2.5 models are a series of AI models trained on 18 trillion tokens, supporting 29 languages and offering advanced features such as instructio • 33 items • Updated Oct 12, 2024 • 7