Ali El Filali

alielfilali01

AI & ML interests

AI Psychometrician ? | NLP (mainly for Arabic) | Other interests include Reinforcement Learning and Cognitive sciences among others

Recent Activity

updated a Space about 20 hours ago

inceptionai/AraGen-Leaderboard

updated a dataset about 22 hours ago

OALL/requests

updated a dataset 1 day ago

OALL/requests

Articles

Rethinking LLM Evaluation with 3C3H: AraGen Benchmark and Leaderboard

Introducing the Open Arabic LLM Leaderboard

alielfilali01's activity

upvoted 2 articles 7 days ago

Article

SmolVLM Grows Smaller – Introducing the 250M & 500M Models!

11 days ago

• 105

Article

We now support VLMs in smolagents!

10 days ago

• 69

upvoted an article 10 days ago

Article

TerjamaBench: A Cultural Benchmark for English-Darija Machine Translation

24 days ago

• 26

upvoted an article 13 days ago

Article

Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference

18 days ago

• 62

upvoted 2 articles 18 days ago

Article

Train 400x faster Static Embedding Models with Sentence Transformers

19 days ago

• 131

Article

CO₂ Emissions and Models Performance: Insights from the Open LLM Leaderboard

25 days ago

• 16

upvoted a paper 19 days ago

When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards

Paper • 2402.01781 • Published Feb 1, 2024 • 2

upvoted a paper 25 days ago

Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought

Paper • 2501.04682 • Published 25 days ago • 90

upvoted 2 papers about 1 month ago

CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings

Paper • 2501.01257 • Published Jan 2 • 48

ProcessBench: Identifying Process Errors in Mathematical Reasoning

Paper • 2412.06559 • Published Dec 9, 2024 • 79

upvoted a collection about 1 month ago

Deepseek Papers

Deepseek papers collection • 14 items • Updated Dec 30, 2024 • 38

upvoted 3 papers about 1 month ago

DeepSeek-V3 Technical Report

Paper • 2412.19437 • Published Dec 27, 2024 • 46

Data Laundering: Artificially Boosting Benchmark Results through Knowledge Distillation

Paper • 2412.15255 • Published Dec 15, 2024 • 3

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 345

upvoted 2 collections about 2 months ago

Falcon3

Falcon3 family of Open Foundation Models is a set of pretrained and instruct LLMs ranging from 1B to 10B parameters. • 40 items • Updated 26 days ago • 80

Multilingual LLM Evaluation

Multilingual Evaluation Benchmarks • 6 items • Updated Dec 13, 2024 • 10

upvoted a paper about 2 months ago

Phi-4 Technical Report

Paper • 2412.08905 • Published Dec 12, 2024 • 106

upvoted 3 collections about 2 months ago

🧪 FineWeb v1 data experiments

Ablation models trained for our data experiments. • 22 items • Updated Jun 12, 2024 • 4

📀 Dataset comparison models

1.8B models trained on 350BT to compare different pretraining datasets • 8 items • Updated Jun 12, 2024 • 35

AraDICE

AraDiCE: Benchmarks for Dialectal and Cultural Capabilities in LLMs • 12 items • Updated Dec 13, 2024 • 4