
Kuldeep Singh Sidhu

singhsidhukuldeep

AI & ML interests

😃 TOP 3 on HuggingFace for posts 🤗 Seeking contributors for a completely open-source 🚀 Data Science platform! singhsidhukuldeep.github.io

Recent Activity

posted an update 1 day ago
Exciting News in AI: JinaAI Releases JINA-CLIP-v2!

The team at Jina AI has just released a groundbreaking multilingual multimodal embedding model that's pushing the boundaries of text-image understanding. Here's why this is a big deal:

🚀 Technical Highlights:
- Dual encoder architecture combining a 561M parameter Jina XLM-RoBERTa text encoder and a 304M parameter EVA02-L14 vision encoder
- Supports 89 languages with 8,192 token context length
- Processes images up to 512×512 pixels with 14×14 patch size
- Implements FlashAttention2 for text and xFormers for vision processing
- Uses Matryoshka Representation Learning for efficient vector storage

⚡️ Under The Hood:
- Multi-stage training process with progressive resolution scaling (224→384→512)
- Contrastive learning using InfoNCE loss in both directions
- Trained on massive multilingual dataset including 400M English and 400M multilingual image-caption pairs
- Incorporates specialized datasets for document understanding, scientific graphs, and infographics
- Uses hard negative mining with 7 negatives per positive sample

📊 Performance:
- Outperforms previous models on visual document retrieval (52.65% nDCG@5)
- Achieves 89.73% image-to-text and 79.09% text-to-image retrieval on CLIP benchmark
- Strong multilingual performance across 30 languages
- Maintains performance even with 75% dimension reduction (256D vs 1024D)

🎯 Key Innovation:
The model solves the long-standing challenge of unifying text-only and multi-modal retrieval systems while adding robust multilingual support. Perfect for building cross-lingual visual search systems!

Kudos to the research team at Jina AI for this impressive advancement in multimodal AI!
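Two ideas from the post above translate fairly directly into code: the InfoNCE loss applied in both directions, and Matryoshka-style truncation of an embedding to fewer dimensions. What follows is a minimal pure-Python sketch of both, not Jina AI's implementation. The function names, the toy vectors, and the temperature of 0.07 are assumptions chosen for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 norm (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def truncate_embedding(vec, dim):
    """Matryoshka-style truncation: keep the first `dim` values, then re-normalize."""
    return l2_normalize(vec[:dim])


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(v - row_max) for v in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_info_nce(text_embs, image_embs, temperature=0.07):
    """InfoNCE averaged over the text-to-image and image-to-text directions.

    The temperature default is an assumed value, not one taken from the post.
    """
    text = [l2_normalize(t) for t in text_embs]
    image = [l2_normalize(v) for v in image_embs]
    # logits[i][j] is the cosine similarity of text i and image j, scaled by temperature
    logits = [
        [sum(a * b for a, b in zip(t, v)) / temperature for v in image]
        for t in text
    ]
    logits_transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (
        cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_transposed)
    )


# Toy batch: two captions and their two matching images (hypothetical values).
texts = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
images_matched = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
images_swapped = [[0.0, 1.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0]]

print(symmetric_info_nce(texts, images_matched))  # small loss: pairs line up
print(symmetric_info_nce(texts, images_swapped))  # large loss: pairs are mismatched
print(1 - 256 / 1024)  # 0.75, the dimension reduction quoted in the post
```

Truncation here re-normalizes after slicing so that cosine similarity stays meaningful at the smaller size. Whether the shortened vectors retain retrieval quality depends on the model having been trained with a Matryoshka objective, as the post says this one was.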
posted an update 2 days ago
Fascinating insights from @Pinterest's latest research on improving feature interactions in recommendation systems!

Pinterest's engineering team has tackled a critical challenge in their Homefeed ranking system that serves 500M+ monthly active users. Here's what makes their approach remarkable:

>> Technical Deep Dive

Architecture Overview
• The ranking model combines dense features, sparse features, and embedding features to represent users, Pins, and context
• Sparse features are processed using learnable embeddings with size based on feature cardinality
• User sequence embeddings are generated using a transformer architecture processing past engagements

Feature Processing Pipeline
• Dense features undergo normalization for numerical stability
• Sparse and embedding features receive L2 normalization
• All features are concatenated into a single feature embedding

Key Innovations
• Implemented parallel MaskNet layers with 3 blocks
• Used projection ratio of 2.0 and output dimension of 512
• Stacked 4 DCNv2 layers on top for higher-order interactions

Performance Improvements
• Achieved +1.42% increase in Homefeed Save Volume
• Boosted Overall Time Spent by +0.39%
• Limited the memory consumption increase to just 5%

>> Industry Constraints Addressed

Memory Management
• Optimized for 60% GPU memory utilization
• Prevented OOM errors while maintaining batch size efficiency

Latency Optimization
• Removed input-output concatenation before MLP
• Reduced hidden layer sizes in MLP
• Achieved zero latency increase while improving performance

System Stability
• Ensured reproducible results across retraining
• Maintained model stability across different data distributions
• Successfully deployed in production environment

This work brilliantly demonstrates how to balance academic innovations with real-world industrial constraints. Kudos to the Pinterest team!
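The feature pipeline and the stacked DCNv2 layers described above can be sketched in a few lines. This is a minimal pure-Python illustration using the generic DCNv2 cross-layer formula, x_{l+1} = x_0 * (W x_l + b) + x_l with an element-wise product. It is not Pinterest's code: the function names, the tiny 3-dimensional feature vector, and the weight values are assumptions, and the MaskNet blocks are omitted.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 norm (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cross_layer(x0, xl, weight, bias):
    """One DCNv2 cross layer: x0 * (W @ xl + b) + xl, with an element-wise product."""
    projected = [
        sum(w * x for w, x in zip(row, xl)) + b for row, b in zip(weight, bias)
    ]
    return [a * p + x for a, p, x in zip(x0, projected, xl)]


def cross_network(x0, layers):
    """Apply a stack of cross layers; every layer multiplies against the original x0."""
    xl = list(x0)
    for weight, bias in layers:
        xl = cross_layer(x0, xl, weight, bias)
    return xl


# Hypothetical inputs: one L2-normalized sparse embedding plus one dense feature,
# concatenated into a single feature vector as the post describes.
sparse_embedding = l2_normalize([3.0, 4.0])
dense_feature = [0.5]
features = sparse_embedding + dense_feature

# Four stacked layers, matching the count in the post; the weights are made up.
dim = len(features)
small_identity = [[0.1 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
layers = [(small_identity, [0.0] * dim) for _ in range(4)]

print(cross_network(features, layers))
```

Because each layer multiplies by the original input again, l stacked layers can represent feature interactions up to degree l + 1, which is why stacking more of them yields higher-order interactions. The residual term keeps the input intact when the learned projection is close to zero.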
updated a Space 3 days ago
singhsidhukuldeep/posts_leaderboard

Organizations

MLX Community, Social Post Explorers, C4AI Community

singhsidhukuldeep's activity

upvoted an article 5 months ago
Article

Making LLMs lighter with AutoGPTQ and transformers

• 37
upvoted an article 7 months ago
Article

LLM Comparison/Test: Llama 3 Instruct 70B + 8B HF/GGUF/EXL2 (20 versions tested and compared!)

By wolfram • 59
upvoted an article 8 months ago
Article

Train custom AI models with the trainer API and adapt them to 🤗

By not-lain • 33