Krinal Joshi

krinal

AI & ML interests

NLP, Speech


Organizations

Blog-explorers Β· Hugging Face Discord Community

krinal's activity

upvoted an article about 13 hours ago

Powerful ASR + diarization + speculative decoding with Hugging Face Inference Endpoints β€’ 72
reacted to frimelle's post with πŸ‘ about 13 hours ago
Seeing AI develop has been a wild ride, from trying to explain why we'd bother to generate a single sentence with a *neural network* to explaining that AI is not a magic, all-knowing box. The recent weeks and months have involved a lot of talking about how AI works: to policy makers, to other developers, but also, and mainly, to friends and family without a technical background.

Yesterday, the first provisions of the EU AI Act came into force, and one of the key highlights is the AI literacy requirement for organisations deploying AI systems. This isn't just a box-ticking exercise. Ensuring that employees and stakeholders understand AI systems is crucial for fostering responsible and transparent AI development. From recognising biases to understanding model limitations, AI literacy empowers individuals to engage critically with these technologies and make informed decisions.

In the context of Hugging Face, AI literacy has many facets: allowing more people to contribute to AI development, providing courses and documentation to ensure access is possible, and building accessible AI tools that empower users to better understand how AI systems function. This isn't just a regulatory milestone; it's an opportunity to foster a culture where AI literacy becomes foundational, enabling stakeholders to recognise biases, assess model limitations, and engage critically with technology.

Embedding these principles into daily practice, and eventually extending our learnings in AI literacy to the general public, is essential for building trustworthy AI that aligns with societal values.
  • 1 reply
reacted to singhsidhukuldeep's post with πŸ‘ about 13 hours ago
Exciting breakthrough in Streaming Recommendation Systems! @BytedanceTalk researchers have developed "Long-Term Interest Clock" (LIC), a revolutionary approach to understanding user preferences throughout the day.

>> Technical Innovation
The system introduces two groundbreaking modules:
- Clock-based General Search Unit (Clock-GSU): Intelligently retrieves relevant user behaviors by analyzing time patterns and content similarity
- Clock-based Exact Search Unit (Clock-ESU): Employs a time-gap-aware attention mechanism to precisely model user interests

>> Key Advantages
LIC addresses critical limitations of existing systems by:
- Providing fine-grained time perception instead of discrete hour-based recommendations
- Analyzing long-term user behavior patterns rather than just short-term interactions
- Operating at item-level granularity versus broad category-level interests

>> Real-World Impact
Already deployed in Douyin Music App, the system has demonstrated remarkable results:
- 0.122% improvement in user active days
- Significant boost in engagement metrics including likes and play rates
- Enhanced user satisfaction with reduced dislike rates

>> Under the Hood
The system processes user behavior sequences spanning an entire year, utilizing multi-head attention mechanisms and sophisticated time-gap calculations to understand user preferences. It pre-computes embeddings stored in parameter servers for real-time performance, making it highly scalable for production environments.
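As a rough illustration only (not the paper's actual Clock-ESU), here is a minimal PyTorch sketch of what a time-gap-aware attention term can look like: a learned scalar bias per bucketized time gap added to the attention logits over the behavior sequence.

```python
import torch
import torch.nn as nn

class TimeGapAwareAttention(nn.Module):
    """Toy single-head attention over a user's behavior sequence in which each
    past item's score is biased by an embedding of the (bucketized) time gap
    between that item and the current request. Illustration only."""

    def __init__(self, d_model: int, n_gap_buckets: int = 64):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.gap_bias = nn.Embedding(n_gap_buckets, 1)  # one scalar bias per gap bucket
        self.scale = d_model ** -0.5

    def forward(self, query, behaviors, gap_buckets):
        # query: (B, d), behaviors: (B, L, d), gap_buckets: (B, L) integer ids
        q = self.q_proj(query).unsqueeze(1)                        # (B, 1, d)
        k = self.k_proj(behaviors)                                 # (B, L, d)
        v = self.v_proj(behaviors)                                 # (B, L, d)
        scores = (q * k).sum(-1) * self.scale                      # (B, L) dot-product scores
        scores = scores + self.gap_bias(gap_buckets).squeeze(-1)   # add time-gap bias
        attn = scores.softmax(dim=-1)                              # (B, L)
        return (attn.unsqueeze(-1) * v).sum(dim=1)                 # (B, d) interest vector

# Example: 2 users, 5 past behaviors each, 32-dim embeddings
attn = TimeGapAwareAttention(d_model=32)
out = attn(torch.randn(2, 32), torch.randn(2, 5, 32), torch.randint(0, 64, (2, 5)))
print(out.shape)  # torch.Size([2, 32])
```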

This innovation marks a significant step forward in personalized content delivery, especially for streaming platforms where user preferences vary throughout the day. The research has been accepted for presentation at WWW '25, Sydney.
upvoted 4 articles 1 day ago

Visualize and understand GPU memory in PyTorch β€’ 179
Deploying Speech-to-Speech on Hugging Face β€’ 37
Open-R1: a fully open reproduction of DeepSeek-R1 β€’ 620
How to deploy and fine-tune DeepSeek models on AWS β€’ 30

reacted to csabakecskemeti's post with πŸ‘ 1 day ago
Check out my idea:
LLmaaS - Local LLM as a Service

With LLmaaS, I propose leveraging locally running LLMs as a service, providing a standardized way for websites to access and utilize them for LLM-powered operations directly on the user’s device.

Demo, code, and a more detailed description:
https://devquasar.com/llmaas/
https://github.com/csabakecskemeti/LLmaaS
https://youtu.be/OOWGr8jcP5Q
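For intuition, here is a hedged Python sketch of the general client-side pattern: calling an LLM server that runs on the user's own machine through an OpenAI-compatible endpoint. The URL, port, and model name are placeholders, and this is not the LLmaaS proxy API itself; in LLmaaS the call would come from page-side JavaScript via the proxy.

```python
import requests

# Placeholder endpoint; local servers such as llama.cpp server or Ollama expose
# similar OpenAI-compatible routes, but check your server's docs for the exact path.
LOCAL_LLM_URL = "http://localhost:8080/v1/chat/completions"

payload = {
    "model": "local-model",  # name is server-dependent
    "messages": [{"role": "user", "content": "Summarize this page in one sentence."}],
}
resp = requests.post(LOCAL_LLM_URL, json=payload, timeout=60)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```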

Call for contributors
Join me in developing the LLmaaS proxy to make this a general-purpose tool for leveraging local LLMs on the web, with built-in security measures.
I'm looking for help to make the proxy more generic, so it supports multiple local LLM services without any change on the HTML side.
Also looking for ideas on how to make the HTML part more modular and easy to use.
  • 4 replies
reacted to Pendrokar's post with πŸ‘ 1 day ago
TTS: Added Kokoro v1, Parler Large, LlaSa 3B & MARS 6 TTS models to the Arena.
Pendrokar/TTS-Spaces-Arena

Also added MaskGCT, GPT-SoVITS & OuteTTS a month ago. The OuteTTS devs did say that it is too early for it to be added to TTS Arenas.

Mars 5 does have a space with open weights models, but inference is way too slow (2 minutes+).
  • 2 replies
reacted to openfree's post with πŸ‘ 6 days ago
πŸ“š Multilingual RAG Chatbot with PDF Support

Chat naturally with your documents! 🌟

✨ Key Features:
β€’ 🌏 Multilingual Q&A support (English, Korean, etc.)
β€’ πŸ“„ Real-time PDF and text file processing
β€’ πŸ” Context-aware accurate responses
β€’ ⚑ Intuitive Chainlit-powered chat interface

πŸ› οΈ Tech Stack:
β€’ πŸ’» Clean, documented open-source code
β€’ 🀝 User-friendly Chainlit UI
β€’ πŸ“Š Vector database for efficient retrieval
β€’ πŸ”„ Real-time streaming responses
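As a toy sketch of the retrieval step behind such a stack (TF-IDF stands in for the real embedding model and vector database, and the LLM call is left as a comment):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Chunks extracted from an uploaded PDF/text file (placeholder content).
chunks = [
    "Chapter 1 introduces the dataset and its licensing terms.",
    "Chapter 2 describes the preprocessing and tokenization pipeline.",
    "Chapter 3 reports evaluation results on the held-out test set.",
]

vectorizer = TfidfVectorizer()
chunk_vecs = vectorizer.fit_transform(chunks)

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    scores = cosine_similarity(vectorizer.transform([question]), chunk_vecs)[0]
    return [chunks[i] for i in scores.argsort()[::-1][:k]]

question = "How was the data preprocessed?"
context = retrieve(question)
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
# A (streaming) LLM call on `prompt` would go here; the chat UI renders the reply.
print(context)
```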

πŸ“± Try it now!
β†’ Demo: openfree/PDF-RAG

πŸ”§ Special Features:
β€’ πŸ“Š Support for PDF/text files up to 2MB
β€’ 🎯 Precise context understanding
β€’ ⚑ Fast response time
β€’ πŸ”’ Secure file handling

Full source code available - ready to integrate into your projects!

#RAG #NLP #Chatbot #OpenSource #PDFProcessing
reacted to tegridydev's post with πŸ‘ 6 days ago
So, what is #MechanisticInterpretability πŸ€”

Mechanistic Interpretability (MI) is the discipline of opening the black box of large language models (and other neural networks) to understand the underlying circuits, features and/or mechanisms that give rise to specific behaviours

Instead of treating a model as a monolithic function, we can:

1. Trace how input tokens propagate through attention heads & MLP layers
2. Identify localized β€œcircuit motifs”
3. Develop methods to systematically break down or β€œedit” these circuits to confirm we understand the causal structure.
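A minimal starting point for step 1, using the Hugging Face transformers library to inspect per-head attention patterns (GPT-2 is chosen here only because it is small; most transformer models on the Hub expose attentions the same way):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_attentions=True)
model.eval()

inputs = tok("The cat sat on the mat", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs)

# out.attentions: tuple of (batch, heads, seq, seq) tensors, one per layer
layer0 = out.attentions[0][0]  # layer 0, first (and only) batch item: (heads, seq, seq)
tokens = tok.convert_ids_to_tokens(inputs["input_ids"][0])

print(tokens)
# Attention paid by the last token to each earlier token, head by head:
print(layer0[:, -1, :])
```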

Mechanistic Interpretability aims to yield human-understandable explanations of how advanced models represent and manipulate concepts, which hopefully leads to:

1. Trust & Reliability
2. Safety & Alignment
3. Better Debugging / Development Insights

https://bsky.app/profile/mechanistics.bsky.social/post/3lgvvv72uls2x
  • 1 reply
replied to tegridydev's post 6 days ago
reacted to hexgrad's post with πŸ‘ 6 days ago
reacted to davidberenstein1957's post with πŸ‘ 6 days ago
tl;dr: Parquet is awesome, DuckDB too!

Datasets on the Hugging Face Hub rely on Parquet files. We can interact with these files using DuckDB as a fast in-memory database system. One of DuckDB's features is vector similarity search, which can be used with or without an index.
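A hedged sketch of that workflow; the dataset path and the embedding column are placeholders, and it assumes a recent DuckDB where the hf:// path scheme and list_cosine_similarity are available (see the blog below for the exact recipe):

```python
import duckdb

# Placeholder repo/file; point this at a real dataset's Parquet files on the Hub.
PARQUET = "hf://datasets/<user>/<dataset>/**/*.parquet"

con = duckdb.connect()

# Plain SQL over the remote Parquet files.
con.sql(f"SELECT * FROM '{PARQUET}' LIMIT 5").show()

# Brute-force vector similarity search (no index), assuming an `embedding`
# list-of-floats column and a query vector of the same dimensionality.
con.sql(f"""
    SELECT text, list_cosine_similarity(embedding, [0.1, 0.2, 0.3]) AS score
    FROM '{PARQUET}'
    ORDER BY score DESC
    LIMIT 10
""").show()
```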

blog:
https://huggingface.co/learn/cookbook/vector_search_with_hub_as_backend
upvoted an article 6 days ago

Welcome to Inference Providers on the Hub πŸ”₯ β€’ 238
reacted to alibabasglab's post with πŸ‘ 8 days ago
reacted to mmaguero's post with πŸ‘ 8 days ago
πŸš€ Multidimensional Affective Analysis for Guarani/Jopara! 🌎

This project explored affective computing for low-resource languages, focusing on emotion recognition, humor detection, and offensive language identification in Guarani and Jopara (a code-switching mix of Guarani and Spanish).

Highlights:
🧡 Corpora:
- Emotion Recognition
- Humor Detection
- Offensive Language Identification
πŸ’» Base Models for Fine-Tuning (trained on Guarani Wiki):
- From scratch: BERT-based tiny, small, base and large models
- Continuously pre-trained models: Multilingual-BERT and BETO
πŸ““ Baseline Notebooks:
- Fine-tuning BERT-based models
- NCRF++ models via GitHub
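For orientation, a generic sketch of fine-tuning a BERT-style baseline on a binary task (e.g., offensive vs. not offensive); the model name and data below are placeholders, not the project's actual configuration:

```python
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "bert-base-multilingual-cased"  # stand-in for a Guarani-pretrained checkpoint
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Placeholder examples; the real corpora are linked above.
data = Dataset.from_dict({"text": ["example text 1", "example text 2"], "label": [1, 0]})
data = data.map(lambda batch: tok(batch["text"], truncation=True,
                                  padding="max_length", max_length=64),
                batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="affective-baseline",
                           num_train_epochs=1, per_device_train_batch_size=8),
    train_dataset=data,
)
trainer.train()
```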

πŸ’‘ Check the repo!
https://github.com/mmaguero/guarani-multi-affective-analysis

πŸ“– Check out the publication here:
- https://digibug.ugr.es/handle/10481/98843
- https://link.springer.com/article/10.1007/s12559-023-10165-0

#NLP #AffectiveComputing #LowResourceLanguages #Guarani #Jopara #SentimentAnalysis #AIForAll
reacted to nicolay-r's post with πŸ‘ 17 days ago
πŸ“’ So far I have been passionate about making an NLP pipeline for handling iterators of texts with no-string dependencies, i.e. nothing besides the third-party providers of your choice.

Starting with text translation, I'm delighted to share the related notebooks that might save you time when handling your data:

⭐ https://github.com/nicolay-r/nlp-thirdgate

Example of using the GoogleTranslate API in no-string fashion for handling textual data iterators with spans:

πŸ“™ https://github.com/nicolay-r/nlp-thirdgate/blob/master/tutorials/translate_texts_with_spans_via_googletrans.ipynb
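A rough, self-contained illustration of the span-preserving iterator idea; the `translate` callable is a placeholder for googletrans or any other provider, not the notebook's actual wrapper:

```python
from typing import Callable, Iterable, Iterator, List, Tuple

def translate_with_spans(texts: Iterable[Tuple[str, List[str]]],
                         translate: Callable[[str], str]) -> Iterator[str]:
    """Lazily translate texts while shielding spans (e.g., entity mentions)
    behind placeholders so the provider cannot alter them."""
    for text, spans in texts:
        protected = text
        for i, span in enumerate(spans):
            protected = protected.replace(span, f"[S{i}]")      # shield the span
        translated = translate(protected)
        for i, span in enumerate(spans):
            translated = translated.replace(f"[S{i}]", span)    # restore it
        yield translated

# Example with an identity "translator" standing in for a real provider:
docs = [("[USER] visited Paris last week", ["[USER]"])]
print(list(translate_with_spans(docs, translate=lambda t: t)))
```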

The key concept is that all these API examples can be tied into a single pipeline using AREkit:

πŸ“˜ https://github.com/nicolay-r/AREkit

πŸ› οΈ The further plan is to popualte this repo with
1. NER (DeepPavlov models wrapper)
2. LLM with fancy out-of-the-box chain-of-thought declaration support.