
dame rajee

damerajee

AI & ML interests

None yet

Recent Activity

liked a Space 3 days ago
AmitIsraeli/PopYou
liked a Space 7 days ago
multimodalart/LLaDA
liked a model 7 days ago
GSAI-ML/LLaDA-8B-Instruct

Organizations

Blog-explorers, Samanvay AI

damerajee's activity

reacted to ginipick's post with 😎πŸ€—πŸ‘€πŸš€πŸ”₯ 21 days ago
Gini's AI Spaces: Everything You Need for Visual Content Creation!

Hello! ✨ Let me introduce Gini's 5 AI Spaces that effortlessly generate various styles of visual content.

Each Space leverages Diffusers and Gradio, so you can create stunning images in just a few clicks!

1) Flowchart
Features: Hand-drawn style flowcharts for workflows or business processes
Use Cases: Software release pipelines, data pipelines, corporate workflows
Benefits: Clear stage-by-stage structure, simple icon usage

ginigen/Flowchart

2) Infographic
Features: Visually appealing infographics that communicate data or statistics
Use Cases: Global energy charts, startup growth metrics, health tips and more
Benefits: Eye-catching icons and layouts, perfect for storytelling at a glance

ginigen/Infographic

3) Mockup
Features: Sketch-style wireframes or UX mockups for apps/websites
Use Cases: Mobile login flows, dashboards, e-commerce site layouts
Benefits: Rapid prototyping of early design ideas, perfect for storyboarding

ginigen/Mockup

4) Diagram
Features: Educational diagrams (science, biology, geography, etc.)
Use Cases: Water cycle, photosynthesis, chemical reactions, human anatomy
Benefits: Vibrant, friendly illustrations, ideal for student-friendly materials

ginigen/Diagram

5) Design
Features: Product/industrial design concepts (coffee machines, smartphones, etc.)
Use Cases: Prototyping, concept car interiors, high-tech product sketches
Benefits: From 3D render-like visuals to simple sketches, unleash your creativity!

ginigen/Design

Click any link above and let AI spark your imagination. Enjoy a fun and productive creative process! πŸš€βœ¨
reacted to Tonic's post with πŸ”₯ about 1 month ago
πŸ™‹πŸ»β€β™‚οΈ hey there folks,

Goedel's Theorem Prover is now being demo'd on Hugging Face: Tonic/Math

give it a try!
reacted to lewtun's post with πŸ”₯πŸ€—πŸš€ about 1 month ago
We are reproducing the full DeepSeek R1 data and training pipeline so everybody can use their recipe. Instead of doing it in secret we can do it together in the open!

πŸ§ͺ Step 1: replicate the R1-Distill models by distilling a high-quality reasoning corpus from DeepSeek-R1.

🧠 Step 2: replicate the pure RL pipeline that DeepSeek used to create R1-Zero. This will involve curating new, large-scale datasets for math, reasoning, and code.

πŸ”₯ Step 3: show we can go from base model -> SFT -> RL via multi-stage training.

Follow along: https://github.com/huggingface/open-r1
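Step 1 above boils down to rejection-sampling a teacher model. Here is a minimal, hypothetical sketch: `query_teacher` is a stand-in for whatever inference API serves DeepSeek-R1, and the `<think>...</think>` trace format is an assumption for illustration, not the project's actual data format.

```python
# Hypothetical sketch of Step 1: sample reasoning traces from a teacher and
# keep only the ones whose final answer matches the reference answer,
# yielding a distillation/SFT corpus.
def query_teacher(prompt: str) -> str:
    # Placeholder: in practice, call the teacher model's inference API here.
    return "<think>2 + 2 = 4</think> 4"

def extract_answer(completion: str) -> str:
    # Assumption: the final answer follows the closing </think> tag.
    return completion.split("</think>")[-1].strip()

def build_distill_corpus(problems):
    corpus = []
    for prompt, gold in problems:
        completion = query_teacher(prompt)
        if extract_answer(completion) == gold:  # keep only verified-correct traces
            corpus.append({"prompt": prompt, "completion": completion})
    return corpus

corpus = build_distill_corpus([("What is 2+2?", "4")])
```

Filtering on answer correctness is what makes the distilled corpus "high-quality": traces where the teacher reasons its way to a wrong answer are simply discarded.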
reacted to danielhanchen's post with πŸ€—πŸ‘ 3 months ago
reacted to mervenoyan's post with πŸ”₯ 5 months ago
posted an update 5 months ago
On the 2nd of October a really cool paper was released called "Were RNNs All We Needed?": https://arxiv.org/abs/2410.01201

This paper introduces the MinGRU model, a simplified version of the traditional Gated Recurrent Unit (GRU) designed to enhance efficiency by removing hidden state dependencies from its gates. This allows for parallel training, making it significantly faster than conventional GRUs. Additionally, MinGRU eliminates non-linear activations like tanh, streamlining computations.

So I read the paper and tried training this model, and it seems to be doing quite well. You can check out the pre-trained model on Hugging Face Spaces:

- damerajee/mingru-stories
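The core change the paper makes can be shown in a few lines. A minimal NumPy sketch (not the trained model above): the update gate and candidate state depend only on the input x_t, never on h_{t-1}, and the tanh is gone, so the recurrence is a plain linear interpolation that admits a parallel scan at training time.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class MinGRU:
    """Minimal GRU: gate and candidate are functions of x_t only,
    which removes the hidden-state dependency from the gates."""
    def __init__(self, d_in, d_hidden, seed=0):
        rng = np.random.default_rng(seed)
        self.Wz = rng.normal(0, 0.1, (d_in, d_hidden))  # update-gate weights
        self.Wh = rng.normal(0, 0.1, (d_in, d_hidden))  # candidate weights

    def forward(self, x, h0):
        # x: (T, d_in), h0: (d_hidden,)
        z = sigmoid(x @ self.Wz)   # gate from input only, no h_{t-1}
        h_tilde = x @ self.Wh      # candidate state, no tanh
        h, outs = h0, []
        for t in range(x.shape[0]):
            # linear interpolation; written sequentially here for clarity,
            # but this form is exactly what a parallel scan can compute
            h = (1 - z[t]) * h + z[t] * h_tilde[t]
            outs.append(h)
        return np.stack(outs)

out = MinGRU(d_in=4, d_hidden=8).forward(
    np.random.default_rng(1).normal(size=(10, 4)), np.zeros(8))
print(out.shape)  # (10, 8)
```

A conventional GRU cannot be parallelized this way because its gates read h_{t-1}, forcing strictly sequential computation.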
  • 1 reply
Β·
reacted to onekq's post with 🧠 5 months ago
Here is my latest study on OpenAI o1.
A Case Study of Web App Coding with OpenAI Reasoning Models (2409.13773)

I wrote an easy-to-read blog post to explain the findings.
https://huggingface.co/blog/onekq/daily-software-engineering-work-reasoning-models

INSTRUCTION FOLLOWING is the key.

100% instruction following + Reasoning = new SOTA

But if the model misses or misunderstands one instruction, it can perform far worse than non-reasoning models.
replied to reach-vb's post 6 months ago
reacted to reach-vb's post with πŸ”₯🧠 6 months ago
Less than two days ago Kyutai Labs open-sourced Moshi, a ~7.6B on-device speech-to-speech foundation model, and Mimi, a SoTA streaming speech codec! πŸ”₯

The release includes:

1. Moshiko & Moshika - Moshi finetuned on synthetic data (CC-BY license) ( kyutai/moshi-v01-release-66eaeaf3302bef6bd9ad7acd)
2. Mimi - Streaming audio codec that processes 24 kHz audio down to a 12.5 Hz representation with a bandwidth of 1.1 kbps (CC-BY license) ( kyutai/mimi)
3. Model checkpoints & Inference codebase written in Rust (Candle), PyTorch & MLX (Apache license) (https://github.com/kyutai-labs/moshi)

How does Moshi work?

1. Moshi processes two audio streams: one for itself and one for the user, with the user's stream coming from audio input and Moshi's stream generated by the model.

2. Along with these audio streams, Moshi predicts text tokens for its speech, enhancing its generation quality.

3. The model uses a small Depth Transformer for codebook dependencies and a large 7B parameter Temporal Transformer for temporal dependencies.

4. The theoretical latency is 160ms, with a practical latency of around 200ms on an L4 GPU.

Model size & inference:

Moshiko/ka are 7.69B param models

bf16 ~16GB VRAM
8-bit ~8GB VRAM
4-bit ~4GB VRAM
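Those VRAM figures are essentially parameter count times bytes per parameter. A quick back-of-envelope check (weights only; activations and KV cache are ignored, so real usage runs somewhat higher):

```python
def vram_estimate_gb(n_params: float, bits_per_param: int) -> float:
    """Weight memory in GB: parameters * (bits / 8) bytes each."""
    return n_params * bits_per_param / 8 / 1e9

# Moshiko/ka at 7.69B params:
for bits in (16, 8, 4):
    print(f"{bits}-bit: ~{vram_estimate_gb(7.69e9, bits):.1f} GB")
```

This reproduces the ~16/~8/~4 GB figures above (15.4, 7.7, and 3.8 GB before runtime overhead).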

You can run inference via Candle πŸ¦€, PyTorch and MLX - based on your hardware.

The Kyutai team, @adefossez @lmz and colleagues, are cracked AF; they're bringing some serious firepower to the open source/science AI scene. Looking forward to what's next! 🐐
  • 1 reply
Β·
reacted to MohamedRashad's post with ❀️ 6 months ago
For all the Muslims out there who are interested in Quran and its tafsir (explanations). This humble dataset consists of 84 different books of tafsir for nearly all the ayat in the Quran:
MohamedRashad/Quran-Tafseer

I hope it helps someone to build something nice and useful with it ^_^
reacted to merve's post with πŸš€πŸ‘ 6 months ago
NVIDIA just dropped NVEagle πŸ¦…

Super impressive vision language model that comes in 7B, 13B and 13B fine-tuned on chat πŸ’¬
Model repositories: merve/nveagle-66d0705108582d73bb235c26
Try it: NVEagle/Eagle-X5-13B-Chat πŸ’¬ (works very well! 🀯)

This model essentially explores having different experts (MoE) for the image encoder part of a vision language model.
How? 🧐
The authors concatenate the vision encoder output tokens together and apply "pre-alignment": essentially fine-tuning the experts with a frozen text encoder.

Then they freeze both the experts and the decoder and train just the projection layer; finally, they unfreeze everything for supervised fine-tuning ✨

In the paper, they explore different fusion strategies and vision encoders, extending the basic CLIP encoder, and find that simply concatenating visual tokens works well.
The rest of the architecture is quite similar to LLaVA.
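The winning fusion strategy is simple enough to sketch in a few lines. This is a toy NumPy illustration with made-up shapes (token counts and dimensions are hypothetical, not NVEagle's actual configuration): two vision experts emit per-token features for the same image, the features are concatenated channel-wise, and a single linear projection maps the fused tokens into the language model's embedding space.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two hypothetical vision experts, each producing 256 tokens for one image
tokens_a = rng.normal(size=(256, 1024))  # expert A features, dim 1024
tokens_b = rng.normal(size=(256, 768))   # expert B features, dim 768

# Channel-wise concatenation of the per-token features
fused = np.concatenate([tokens_a, tokens_b], axis=-1)  # (256, 1792)

# Linear projection into a hypothetical 4096-dim LLM embedding space;
# in the staged training described above, this projection is the only
# trainable piece while the experts and decoder stay frozen.
W_proj = rng.normal(scale=0.02, size=(fused.shape[-1], 4096))
llm_tokens = fused @ W_proj  # (256, 4096)
```

Because the fusion is just concatenation plus a projection, adding another expert only widens `fused`; nothing else in the pipeline changes.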