Bulat

Bulat15g

Bulat15g's activity

liked a model about 1 month ago

deepseek-ai/DeepSeek-R1-Distill-Qwen-32B

Text Generation • Updated 11 days ago • 1.43M • 1.23k

liked a dataset 4 months ago

upstage/dp-bench

Updated Oct 24, 2024 • 1.47k • 64

upvoted an article 5 months ago

Article

Llama can now see and run on your device - welcome Llama 3.2

Sep 25, 2024

• 183

liked a dataset 7 months ago

NiGuLa/Russian_Sensitive_Topics

Viewer • Updated Dec 13, 2024 • 33.3k • 96 • 13

upvoted a collection 8 months ago

Chameleon

Collection

Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR. • 2 items • Updated Jul 9, 2024 • 28

liked a model 8 months ago

RLHFlow/pair-preference-model-LLaMA3-8B

Text Generation • Updated Oct 14, 2024 • 256 • 38

upvoted an article 9 months ago

Article

Uncensor any LLM with abliteration

Jun 13, 2024

• 452

liked a dataset 10 months ago

Lemhf14/EasyJailbreak_Datasets

Viewer • Updated Jan 20, 2024 • 1.63k • 1.07k • 13

liked a model 10 months ago

gradientai/Llama-3-8B-Instruct-Gradient-1048k

Text Generation • Updated Oct 29, 2024 • 5.09k • 681

upvoted 2 collections 10 months ago

LLaVA-Phi-3-mini

Collection

4 items • Updated Apr 28, 2024 • 14

Top 10% instruction tuning datasets

Collection

Collects datasets with 'instruction' in the name and more than 1 download and in the top 10% for the number of likes • 13 items • Updated Jul 3, 2024 • 8

upvoted a collection 12 months ago

VideoEncoder

Collection

Video Understanding, Video Embedding, Video Tasks • 5 items • Updated Mar 8, 2024 • 1

reacted to akhaliq's post with 👍 about 1 year ago

Post

MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (2402.14905)

This paper addresses the growing need for efficient large language models (LLMs) on mobile devices, driven by increasing cloud costs and latency concerns. We focus on designing top-quality LLMs with fewer than a billion parameters, a practical choice for mobile deployment. Contrary to prevailing belief emphasizing the pivotal role of data and parameter quantity in determining model quality, our investigation underscores the significance of model architecture for sub-billion scale LLMs. Leveraging deep and thin architectures, coupled with embedding sharing and grouped-query attention mechanisms, we establish a strong baseline network denoted as MobileLLM, which attains a remarkable 2.7%/4.3% accuracy boost over preceding 125M/350M state-of-the-art models. Additionally, we propose an immediate block-wise weight sharing approach with no increase in model size and only marginal latency overhead. The resultant models, denoted as MobileLLM-LS, demonstrate a further accuracy enhancement of 0.7%/0.8% over MobileLLM 125M/350M. Moreover, the MobileLLM model family shows significant improvements compared to previous sub-billion models on chat benchmarks, and demonstrates close correctness to LLaMA-v2 7B in API calling tasks, highlighting the capability of small models for common on-device use cases.