This week in open AI was 🔥 Let's recap! 🤗 merve/january-31-releases-679a10669bd4030090c5de4d

LLMs 💬
> Huge: AllenAI released new Tülu models that outperform DeepSeek R1, trained with Reinforcement Learning with Verifiable Rewards (RLVR) on top of Llama 3.1 405B 🔥
> Mistral AI is back to open source with their "small" 24B models (base & SFT) under the Apache 2.0 license 😱
> Alibaba Qwen released Qwen2.5-Instruct-1M, 1M-context-length models great for agentic use, with the Apache 2.0 license 🔥
> Arcee AI released Virtuoso-Medium, a 32.8B LLM distilled from DeepSeek V3 on a dataset of 5B+ tokens
> Velvet-14B is a new family of 14B Italian LLMs trained on 10T tokens across six languages
> OpenThinker-7B is a fine-tuned version of Qwen2.5-7B-Instruct on the OpenThoughts dataset
VLMs & vision 👀
> Alibaba Qwen is back with Qwen2.5-VL, bringing amazing new capabilities ranging from agentic computer use to zero-shot localization 🔥
> NVIDIA released a new series of Eagle2 models in 1B and 9B sizes
> DeepSeek released Janus-Pro, a new any-to-any model (image-text generation from image-text input) under the MIT license
> BEN2 is a new background removal model with an MIT license!
Audio 🗣️
> YuE is a new open-source music generation foundation model for lyrics-to-song generation
The OpenAI o3-mini model is a significant improvement over o1-mini, reaching o1 performance levels. While generally strong, its performance isn't universally better than previous models (o1, o1-preview) or GPT-4o across all benchmarks. This means workflows should be re-evaluated with each model upgrade.
The o3-mini comes in "low," "medium," and "high" versions, with "low" being the base model used for benchmarking. It's speculated that the higher versions simply spend more compute thinking. A fair comparison with other models like Gemini 2.0 Flash Thinking or DeepSeek-R1 would likely need to use the "low" version plus a similar "think more" mechanism.
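If you want to reproduce such a comparison in your own workflow, the effort level is selectable per request. A minimal sketch, assuming the OpenAI Python SDK's reasoning_effort parameter maps onto the "low"/"medium"/"high" versions discussed above:

```python
# Sketch: pinning o3-mini to "low" reasoning effort so cross-model comparisons
# aren't inflated by extra test-time compute. Assumes the OpenAI Python SDK
# and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="o3-mini",
    reasoning_effort="low",  # "low" | "medium" | "high"
    messages=[{"role": "user", "content": "Summarize this week's open AI releases."}],
)
print(response.choices[0].message.content)
```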
The system card is recommended reading due to its comprehensive benchmark data.
Excited to share groundbreaking research in Knowledge Graph-based Retrieval-Augmented Generation (KG-RAG)!
Researchers from the University of Science and Technology of China have developed FRAG, a novel, flexible, and modular framework that revolutionizes how Large Language Models (LLMs) reason over knowledge graphs.
What makes FRAG special? It intelligently adapts retrieval strategies based on query complexity without requiring expensive KG fine-tuning. The framework uses a reasoning-aware module to classify queries as simple or complex, then applies tailored retrieval pipelines.
Under the hood (a code sketch follows this list):
- For simple queries: uses breadth-first search and ranking to efficiently find relevant paths
- For complex queries: employs shortest-path algorithms to minimize computational overhead
- Features a preprocessing-retrieval-postprocessing pipeline with flexible components
- Leverages traditional algorithms like Personalized PageRank for subgraph extraction
- Implements edge and path ranking models for precise information filtering
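To make the adaptive routing concrete, here is a minimal sketch in Python. It assumes networkx as a stand-in KG store; the function names and the word-count "classifier" are hypothetical placeholders for FRAG's learned reasoning-aware module and its ranking models, not the paper's actual implementation.

```python
# Illustrative sketch of FRAG-style adaptive retrieval over a knowledge graph.
# networkx stands in for a real KG backend; all names here are hypothetical.
import networkx as nx

def classify_query(question: str) -> str:
    """Placeholder for FRAG's reasoning-aware module (a learned classifier in
    the paper); a simple word-count heuristic is used purely for illustration."""
    return "complex" if len(question.split()) > 12 else "simple"

def extract_subgraph(kg: nx.DiGraph, topic_entity: str, keep: int = 50) -> nx.DiGraph:
    """Subgraph extraction via Personalized PageRank, as described above."""
    scores = nx.pagerank(kg, personalization={topic_entity: 1.0})
    top_nodes = sorted(scores, key=scores.get, reverse=True)[:keep]
    return kg.subgraph(top_nodes)

def retrieve(kg: nx.DiGraph, question: str, topic_entity: str,
             candidates: list[str]) -> list:
    sub = extract_subgraph(kg, topic_entity)
    if classify_query(question) == "simple":
        # Simple queries: shallow breadth-first expansion from the topic
        # entity; the edge/path ranking models are omitted for brevity.
        return list(nx.bfs_tree(sub, topic_entity, depth_limit=2).edges())
    # Complex queries: shortest paths keep candidate reasoning chains short,
    # minimizing the context ultimately handed to the LLM.
    return [nx.shortest_path(sub, topic_entity, c)
            for c in candidates if c in sub and nx.has_path(sub, topic_entity, c)]
```

The retrieved edges or paths would then be verbalized and handed to the LLM in the postprocessing stage of the pipeline.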
The results are impressive - FRAG achieves state-of-the-art performance while maintaining high efficiency and low resource consumption. On benchmark datasets like WebQSP and CWQ, it outperforms existing approaches by significant margins.
Most importantly, FRAG maintains flexibility and modularity while improving retrieval quality - no expensive LLM fine-tuning required! This makes it highly practical for real-world applications.
This work represents a major step forward in making LLMs more reliable and capable of complex reasoning tasks. Looking forward to seeing how this technology evolves!
Prompt: Using HTML, CSS, and JavaScript in a single HTML file to create a simulation of the solar system. Pay extreme attention to the UI to make it as intuitive as possible. Ensure that every planet appears as a sphere and is labeled with its corresponding name.