Oussema Harbi

Harbous

oharbi

AI & ML interests

None yet

Organizations

None yet

Harbous's activity

reacted to chansung's post with 👍 21 days ago

Post

1715

A new look for AI-powered paper reviews from the Hugging Face Daily Papers list (managed by @akhaliq).

Bookmark the webpage, check comprehensive reviews by Google DeepMind Gemini 1.5, and listen to audio podcasts made with the same tech used in NotebookLM.

Link: https://deep-diver.github.io/ai-paper-reviewer/

This is not an official service by Hugging Face. It is just a service developed by an individual developer using his own money :)

liked a model 27 days ago

openbmb/MiniCPM-o-2_6

Any-to-Any • Updated 3 days ago • 522k • 945

liked a Space about 1 month ago

Open Universal Arabic Asr Leaderboard

🥇

A benchmark for open-source multi-dialect Arabic ASR models

reacted to singhsidhukuldeep's post with 👍 about 1 month ago

Post

3182

Groundbreaking Research Alert: Rethinking RAG with Cache-Augmented Generation (CAG)

Researchers from National Chengchi University and Academia Sinica have introduced a paradigm-shifting approach that challenges the conventional wisdom of Retrieval-Augmented Generation (RAG).

Instead of the traditional retrieve-then-generate pipeline, their innovative Cache-Augmented Generation (CAG) framework preloads documents and precomputes key-value caches, eliminating the need for real-time retrieval during inference.

Technical Deep Dive:
- CAG preloads external knowledge and precomputes KV caches, storing them for future use
- The system processes documents only once, regardless of subsequent query volume
- During inference, it loads the precomputed cache alongside user queries, enabling rapid response generation
- The cache reset mechanism allows efficient handling of multiple inference sessions through strategic token truncation
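
The preload, reuse, and reset steps listed above can be sketched with a toy cache. This is only an illustration of the bookkeeping, not the paper's implementation: `ToyKVCache`, `preload`, and `answer` are hypothetical names, whitespace splitting stands in for tokenization, and each cache entry is a placeholder rather than a real key/value tensor.

```python
from dataclasses import dataclass, field


@dataclass
class ToyKVCache:
    """Stand-in for a transformer KV cache: one entry per token (hypothetical)."""
    entries: list = field(default_factory=list)

    def append(self, tokens):
        # A real model would run a forward pass here; this sketch only
        # records a placeholder (token, length) pair per token.
        for tok in tokens:
            self.entries.append((tok, len(tok)))

    def truncate(self, length):
        # Drop every entry after the first `length` entries.
        del self.entries[length:]


def preload(documents):
    """Process the documents once; return the cache and its document length."""
    cache = ToyKVCache()
    for doc in documents:
        cache.append(doc.split())
    return cache, len(cache.entries)


def answer(cache, doc_len, query):
    """Extend the preloaded cache with the query, then reset it by truncation."""
    cache.append(query.split())
    used = len(cache.entries)  # entries a model would attend over for this query
    cache.truncate(doc_len)    # cache reset: keep only the document entries
    return used


docs = ["Paris is the capital of France", "Berlin is the capital of Germany"]
cache, doc_len = preload(docs)
first = answer(cache, doc_len, "What is the capital of France ?")
second = answer(cache, doc_len, "And Germany ?")
```

After each call to `answer`, the cache is back at its document-only length, so the documents are processed once no matter how many queries follow.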

Performance Highlights:
- Achieved superior BERTScore metrics compared to both sparse and dense retrieval RAG systems
- Demonstrated up to 40x faster generation times compared to traditional approaches
- Particularly effective with both SQuAD and HotPotQA datasets, showing robust performance across different knowledge tasks

Why This Matters:
The approach significantly reduces system complexity, eliminates retrieval latency, and mitigates common RAG pipeline errors. As LLMs continue evolving with expanded context windows, this methodology becomes increasingly relevant for knowledge-intensive applications.

updated a model about 1 month ago

Harbous/SmolLM2-360-finetuned-sql-instruct

Updated Jan 4 • 4

liked a model about 1 month ago

PowerInfer/SmallThinker-3B-Preview

Text Generation • Updated 28 days ago • 115k • 381

reacted to hexgrad's post with ❤️ about 2 months ago

Post

4042

Merry Christmas! 🎄 Open sourced a small TTS model at hexgrad/Kokoro-82M

2 replies

liked a dataset about 2 months ago

MohamedRashad/Quran-Tafseer

Viewer • Updated Sep 13, 2024 • 219k • 215 • 37

New activity in MohamedRashad/Quran-Tafseer about 2 months ago

ideas about automatic summarization of qur'an-tafseer

#2 opened about 2 months ago by rhyssh

reacted to csabakecskemeti's post with 👍 2 months ago

Post

4545

The AMD Instinct MI50 (~$110) is surprisingly fast for inference with quantized models.

This runs a Llama 3.1 8B Q8 with llama.cpp
https://huggingface.co/spaces/DevQuasar/Mi50

A little blogpost about the HW
http://devquasar.com/uncategorized/amd-radeon-instinct-mi50-cheap-inference/

reacted to freddyaboulton's post with 👍 2 months ago

Post

1134

Just created a cookbook of real-time audio/video Spaces built with Gradio and WebRTC ⚡️

Use this and the [docs](https://freddyaboulton.github.io/gradio-webrtc/) to get started building the next gen of AI apps!

freddyaboulton/gradio-webrtc-cookbook-6758ba7745aeca7b1be7de0f

2 replies

reacted to etemiz's post with ➕ 2 months ago

Post

427

Apparently you can't count on centralized AI to perform consistently: some days it is great, some days it is bad. They may be distilling or doing other things to dumb it down and make it cost-effective. But you can count on open source LLMs that you run locally to perform at the same level every day.

So you always have to watch centralized AI, but you never have to watch the local LLM.

liked a model 3 months ago

MohamedRashad/arabic-large-nougat

Image-to-Text • Updated Nov 28, 2024 • 617 • 8

reacted to MohamedRashad's post with ❤️ 3 months ago

Post

1667

A while back I shared the model MohamedRashad/arabic-small-nougat, a finetune of facebook/nougat-small for the Arabic language.

Today this humble project has been scaled up with new models, new datasets, a new Space, and a new paper.

Check everything through this collection here:
MohamedRashad/arabic-nougat-673a3f540bd92904c9b92a8e

1 reply

reacted to singhsidhukuldeep's post with ❤️ 3 months ago

Post

1909

It's not every day you see the No. 1 ranked paper of the day open-sourcing a very powerful image editing app!

Fascinating to see MagicQuill - a groundbreaking interactive image editing system that makes precise photo editing effortless through advanced AI!

The system's architecture features three sophisticated components:

1. Editing Processor:
- Implements a dual-branch architecture integrated into a latent diffusion framework
- Utilizes PiDiNet for edge map extraction and content-aware per-pixel inpainting
- Features a specialized UNet architecture with zero-convolution layers for feature insertion
- Employs denoising score matching for training the control branch
- Processes both structural modifications via scribble guidance and color manipulation through downsampled color blocks
- Maintains pixel-level control through VAE-based latent space operations

2. Painting Assistor:
- Powered by a fine-tuned LLaVA multimodal LLM using Low-Rank Adaptation (LoRA)
- Trained on a custom dataset derived from Densely Captioned Images (DCI)
- Processes user brushstrokes through specialized Q&A tasks for add/subtract/color operations
- Features bounding box coordinate normalization for precise stroke localization
- Implements streamlined single-word/phrase outputs for real-time performance

3. Idea Collector:
- Built as a modular ReactJS component library
- Supports cross-platform deployment via HTTP protocols
- Compatible with Gradio and ComfyUI frameworks
- Features comprehensive layer management and parameter adjustment capabilities
- Implements real-time canvas updates and preview generation

The system outperforms existing solutions like SmartEdit and BrushNet in edge alignment and color fidelity while maintaining seamless integration with popular AI frameworks.

What are your thoughts on AI-powered creative tools?

liked 4 models 4 months ago