Krinal Joshi

krinal

AI & ML interests

NLP, Speech


Organizations

Blog-explorers Β· Hugging Face Discord Community

krinal's activity

upvoted an article about 13 hours ago

Powerful ASR + diarization + speculative decoding with Hugging Face Inference Endpoints β€’ 72
reacted to frimelle's post with πŸ‘ about 13 hours ago
Seeing AI develop has been a wild ride, from trying to explain why we'd bother to generate a single sentence with a *neural network* to explaining that AI is not a magic, all-knowing box. The recent weeks and months have involved a lot of talking about how AI works: to policy makers, to other developers, but also, and mainly, to friends and family without a technical background.

Yesterday, the first provisions of the EU AI Act came into force, and one of the key highlights is the AI literacy requirement for organisations deploying AI systems. This isn't just a box-ticking exercise. Ensuring that employees and stakeholders understand AI systems is crucial for fostering responsible and transparent AI development. From recognising biases to understanding model limitations, AI literacy empowers individuals to engage critically with these technologies and make informed decisions.

In the context of Hugging Face, AI literacy has many facets: allowing more people to contribute to AI development, providing courses and documentation to ensure access is possible, and building accessible AI tools that empower users to better understand how AI systems function. This isn't just a regulatory milestone; it's an opportunity to foster a culture where AI literacy becomes foundational, enabling stakeholders to recognise biases, assess model limitations, and engage critically with technology.

Embedding these principles into daily practice, and eventually extending our learnings in AI literacy to the general public, is essential for building trustworthy AI that aligns with societal values.
  • 1 reply
reacted to singhsidhukuldeep's post with πŸ‘ about 13 hours ago
Exciting breakthrough in Streaming Recommendation Systems! @BytedanceTalk researchers have developed "Long-Term Interest Clock" (LIC), a revolutionary approach to understanding user preferences throughout the day.

>> Technical Innovation
The system introduces two groundbreaking modules:
- Clock-based General Search Unit (Clock-GSU): Intelligently retrieves relevant user behaviors by analyzing time patterns and content similarity
- Clock-based Exact Search Unit (Clock-ESU): Employs a time-gap-aware attention mechanism to precisely model user interests

>> Key Advantages
LIC addresses critical limitations of existing systems by:
- Providing fine-grained time perception instead of discrete hour-based recommendations
- Analyzing long-term user behavior patterns rather than just short-term interactions
- Operating at item-level granularity versus broad category-level interests

>> Real-World Impact
Already deployed in Douyin Music App, the system has demonstrated remarkable results:
- 0.122% improvement in user active days
- Significant boost in engagement metrics including likes and play rates
- Enhanced user satisfaction with reduced dislike rates

>> Under the Hood
The system processes user behavior sequences spanning an entire year, utilizing multi-head attention mechanisms and sophisticated time-gap calculations to understand user preferences. It pre-computes embeddings stored in parameter servers for real-time performance, making it highly scalable for production environments.
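As a rough illustration only (not the paper's actual Clock-ESU), here is a minimal PyTorch sketch of what a time-gap-aware attention term can look like: a learned scalar bias per bucketized time gap added to the attention logits over the behavior sequence.

```python
import torch
import torch.nn as nn

class TimeGapAwareAttention(nn.Module):
    """Toy single-head attention over a user's behavior sequence in which each
    past item's score is biased by an embedding of the (bucketized) time gap
    between that item and the current request. Illustration only."""

    def __init__(self, d_model: int, n_gap_buckets: int = 64):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.gap_bias = nn.Embedding(n_gap_buckets, 1)  # one scalar bias per gap bucket
        self.scale = d_model ** -0.5

    def forward(self, query, behaviors, gap_buckets):
        # query: (B, d), behaviors: (B, L, d), gap_buckets: (B, L) integer ids
        q = self.q_proj(query).unsqueeze(1)                        # (B, 1, d)
        k = self.k_proj(behaviors)                                 # (B, L, d)
        v = self.v_proj(behaviors)                                 # (B, L, d)
        scores = (q * k).sum(-1) * self.scale                      # (B, L) dot-product scores
        scores = scores + self.gap_bias(gap_buckets).squeeze(-1)   # add time-gap bias
        attn = scores.softmax(dim=-1)                              # (B, L)
        return (attn.unsqueeze(-1) * v).sum(dim=1)                 # (B, d) interest vector

# Example: 2 users, 5 past behaviors each, 32-dim embeddings
attn = TimeGapAwareAttention(d_model=32)
out = attn(torch.randn(2, 32), torch.randn(2, 5, 32), torch.randint(0, 64, (2, 5)))
print(out.shape)  # torch.Size([2, 32])
```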

This innovation marks a significant step forward in personalized content delivery, especially for streaming platforms where user preferences vary throughout the day. The research has been accepted for presentation at WWW '25, Sydney.
upvoted 4 articles 1 day ago

Visualize and understand GPU memory in PyTorch β€’ 179
Deploying Speech-to-Speech on Hugging Face β€’ 37
Open-R1: a fully open reproduction of DeepSeek-R1 β€’ 620
How to deploy and fine-tune DeepSeek models on AWS β€’ 30

reacted to csabakecskemeti's post with πŸ‘ 1 day ago
Check out my idea:
LLmaaS - Local LLM as a Service

With LLmaaS, I propose leveraging locally running LLMs as a service, providing a standardized way for websites to access and utilize them for LLM-powered operations directly on the user’s device.

Demo, code, and a more detailed description:
https://devquasar.com/llmaas/
https://github.com/csabakecskemeti/LLmaaS
https://youtu.be/OOWGr8jcP5Q
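For intuition, here is a hedged Python sketch of the general client-side pattern: calling an LLM server that runs on the user's own machine through an OpenAI-compatible endpoint. The URL, port, and model name are placeholders, and this is not the LLmaaS proxy API itself; in LLmaaS the call would come from page-side JavaScript via the proxy.

```python
import requests

# Placeholder endpoint; local servers such as llama.cpp server or Ollama expose
# similar OpenAI-compatible routes, but check your server's docs for the exact path.
LOCAL_LLM_URL = "http://localhost:8080/v1/chat/completions"

payload = {
    "model": "local-model",  # name is server-dependent
    "messages": [{"role": "user", "content": "Summarize this page in one sentence."}],
}
resp = requests.post(LOCAL_LLM_URL, json=payload, timeout=60)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```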

Call for contributors
Join me in developing the LLmaaS proxy to make this a general-purpose tool for leveraging local LLMs on the web, with built-in security measures.
I'm looking for help to make the proxy more generic, so it supports multiple local LLM services without any change on the HTML side.
Also looking for ideas on how to make the HTML part more modular and easy to use.
  • 4 replies
reacted to Pendrokar's post with πŸ‘ 1 day ago
TTS: Added Kokoro v1, Parler Large, LlaSa 3B & MARS 6 TTS models to the Arena.
Pendrokar/TTS-Spaces-Arena

Also added MaskGCT, GPT-SoVITS & OuteTTS a month ago. The OuteTTS devs did say that it is too early for it to be added to TTS Arenas.

Mars 5 does have a space with open weights models, but inference is way too slow (2 minutes+).
  • 2 replies
reacted to openfree's post with πŸ‘ 6 days ago
πŸ“š Multilingual RAG Chatbot with PDF Support

Chat naturally with your documents! 🌟

✨ Key Features:
β€’ 🌏 Multilingual Q&A support (English, Korean, etc.)
β€’ πŸ“„ Real-time PDF and text file processing
β€’ πŸ” Context-aware accurate responses
β€’ ⚑ Intuitive Chainlit-powered chat interface

πŸ› οΈ Tech Stack:
β€’ πŸ’» Clean, documented open-source code
β€’ 🀝 User-friendly Chainlit UI
β€’ πŸ“Š Vector database for efficient retrieval
β€’ πŸ”„ Real-time streaming responses
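As a toy sketch of the retrieval step behind such a stack (TF-IDF stands in for the real embedding model and vector database, and the LLM call is left as a comment):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Chunks extracted from an uploaded PDF/text file (placeholder content).
chunks = [
    "Chapter 1 introduces the dataset and its licensing terms.",
    "Chapter 2 describes the preprocessing and tokenization pipeline.",
    "Chapter 3 reports evaluation results on the held-out test set.",
]

vectorizer = TfidfVectorizer()
chunk_vecs = vectorizer.fit_transform(chunks)

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    scores = cosine_similarity(vectorizer.transform([question]), chunk_vecs)[0]
    return [chunks[i] for i in scores.argsort()[::-1][:k]]

question = "How was the data preprocessed?"
context = retrieve(question)
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
# A (streaming) LLM call on `prompt` would go here; the chat UI renders the reply.
print(context)
```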

πŸ“± Try it now!
β†’ Demo: openfree/PDF-RAG

πŸ”§ Special Features:
β€’ πŸ“Š Support for PDF/text files up to 2MB
β€’ 🎯 Precise context understanding
β€’ ⚑ Fast response time
β€’ πŸ”’ Secure file handling

Full source code available - ready to integrate into your projects!

#RAG #NLP #Chatbot #OpenSource #PDFProcessing
reacted to tegridydev's post with πŸ‘ 6 days ago
So, what is #MechanisticInterpretability πŸ€”

Mechanistic Interpretability (MI) is the discipline of opening the black box of large language models (and other neural networks) to understand the underlying circuits, features and/or mechanisms that give rise to specific behaviours

Instead of treating a model as a monolithic function, we can:

1. Trace how input tokens propagate through attention heads & MLP layers
2. Identify localized β€œcircuit motifs”
3. Develop methods to systematically break down or β€œedit” these circuits to confirm we understand the causal structure.
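A minimal starting point for step 1, using the Hugging Face transformers library to inspect per-head attention patterns (GPT-2 is chosen here only because it is small; most transformer models on the Hub expose attentions the same way):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_attentions=True)
model.eval()

inputs = tok("The cat sat on the mat", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs)

# out.attentions: tuple of (batch, heads, seq, seq) tensors, one per layer
layer0 = out.attentions[0][0]  # layer 0, first (and only) batch item: (heads, seq, seq)
tokens = tok.convert_ids_to_tokens(inputs["input_ids"][0])

print(tokens)
# Attention paid by the last token to each earlier token, head by head:
print(layer0[:, -1, :])
```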

Mechanistic Interpretability aims to yield human-understandable explanations of how advanced models represent and manipulate concepts, which hopefully leads to:

1. Trust & Reliability
2. Safety & Alignment
3. Better Debugging / Development Insights

https://bsky.app/profile/mechanistics.bsky.social/post/3lgvvv72uls2x
  • 1 reply
replied to tegridydev's post 6 days ago
reacted to hexgrad's post with πŸ‘ 6 days ago
reacted to davidberenstein1957's post with πŸ‘ 6 days ago
tl;dr: Parquet is awesome, DuckDB too!

Datasets on the Hugging Face Hub rely on Parquet files. We can interact with these files using DuckDB as a fast in-memory database system. One of DuckDB's features is vector similarity search, which can be used with or without an index.
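A hedged sketch of that workflow; the dataset path and the embedding column are placeholders, and it assumes a recent DuckDB where the hf:// path scheme and list_cosine_similarity are available (see the blog below for the exact recipe):

```python
import duckdb

# Placeholder repo/file; point this at a real dataset's Parquet files on the Hub.
PARQUET = "hf://datasets/<user>/<dataset>/**/*.parquet"

con = duckdb.connect()

# Plain SQL over the remote Parquet files.
con.sql(f"SELECT * FROM '{PARQUET}' LIMIT 5").show()

# Brute-force vector similarity search (no index), assuming an `embedding`
# list-of-floats column and a query vector of the same dimensionality.
con.sql(f"""
    SELECT text, list_cosine_similarity(embedding, [0.1, 0.2, 0.3]) AS score
    FROM '{PARQUET}'
    ORDER BY score DESC
    LIMIT 10
""").show()
```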

blog:
https://huggingface.co/learn/cookbook/vector_search_with_hub_as_backend
upvoted an article 6 days ago

Welcome to Inference Providers on the Hub πŸ”₯ β€’ 238
reacted to alibabasglab's post with πŸ‘ 8 days ago
reacted to mmaguero's post with πŸ‘ 8 days ago
πŸš€ Multidimensional Affective Analysis for Guarani/Jopara! 🌎

This project explored affective computing for low-resource languages, focusing on emotion recognition, humor detection, and offensive language identification in Guarani and Jopara (a code-switching mix of Guarani and Spanish).

Highlights:
🧡 Corpora:
- Emotion Recognition
- Humor Detection
- Offensive Language Identification
πŸ’» Base Models for Fine-Tuning (trained on Guarani Wiki):
- From scratch: BERT-based tiny, small, base and large models
- Continuously pre-trained models: Multilingual-BERT and BETO
πŸ““ Baseline Notebooks:
- Fine-tuning BERT-based models
- NCRF++ models via GitHub
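For orientation, a generic sketch of fine-tuning a BERT-style baseline on a binary task (e.g., offensive vs. not offensive); the model name and data below are placeholders, not the project's actual configuration:

```python
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "bert-base-multilingual-cased"  # stand-in for a Guarani-pretrained checkpoint
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Placeholder examples; the real corpora are linked above.
data = Dataset.from_dict({"text": ["example text 1", "example text 2"], "label": [1, 0]})
data = data.map(lambda batch: tok(batch["text"], truncation=True,
                                  padding="max_length", max_length=64),
                batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="affective-baseline",
                           num_train_epochs=1, per_device_train_batch_size=8),
    train_dataset=data,
)
trainer.train()
```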

πŸ’‘ Check the repo!
https://github.com/mmaguero/guarani-multi-affective-analysis

πŸ“– Check out the publication here:
- https://digibug.ugr.es/handle/10481/98843
- https://link.springer.com/article/10.1007/s12559-023-10165-0

#NLP #AffectiveComputing #LowResourceLanguages #Guarani #Jopara #SentimentAnalysis #AIForAll
reacted to nicolay-r's post with πŸ‘ 17 days ago
πŸ“’ So far I have been passionate about making an NLP pipeline for handling iterators of texts with no-string dependencies, i.e. nothing besides the third-party providers of your choice.

Starting with text translation, I'm delighted to share the related notebooks that might save you time when handling your data:

⭐ https://github.com/nicolay-r/nlp-thirdgate

Example of using the GoogleTranslate API in no-string fashion for handling textual data iterators with spans:

πŸ“™ https://github.com/nicolay-r/nlp-thirdgate/blob/master/tutorials/translate_texts_with_spans_via_googletrans.ipynb
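A rough, self-contained illustration of the span-preserving iterator idea; the `translate` callable is a placeholder for googletrans or any other provider, not the notebook's actual wrapper:

```python
from typing import Callable, Iterable, Iterator, List, Tuple

def translate_with_spans(texts: Iterable[Tuple[str, List[str]]],
                         translate: Callable[[str], str]) -> Iterator[str]:
    """Lazily translate texts while shielding spans (e.g., entity mentions)
    behind placeholders so the provider cannot alter them."""
    for text, spans in texts:
        protected = text
        for i, span in enumerate(spans):
            protected = protected.replace(span, f"[S{i}]")      # shield the span
        translated = translate(protected)
        for i, span in enumerate(spans):
            translated = translated.replace(f"[S{i}]", span)    # restore it
        yield translated

# Example with an identity "translator" standing in for a real provider:
docs = [("[USER] visited Paris last week", ["[USER]"])]
print(list(translate_with_spans(docs, translate=lambda t: t)))
```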

The key concept is that all these API examples can be tied into a single pipeline using AREkit:

πŸ“˜ https://github.com/nicolay-r/AREkit

πŸ› οΈ The further plan is to popualte this repo with
1. NER (DeepPavlov models wrapper)
2. LLM with fancy out-of-the-box chain-of-thought declaration support.