Open-source embeddings and LLMs outperform Gemini and OpenAI for Web Navigation while being faster and cheaper Jun 21, 2024 • 7
Introducing BlindChat, an open-source and privacy-by-design Conversational AI fully in-browser Sep 22, 2023 • 1
AI Total Cost of Ownership Calculator: Evaluate the cost of in-house AI deployment vs AI APIs Sep 20, 2023 • 1
VideoWorld: Exploring Knowledge Learning from Unlabeled Videos Paper • 2501.09781 • Published 8 days ago • 20
Multiple Choice Questions: Reasoning Makes Large Language Models (LLMs) More Self-Confident Even When They Are Wrong Paper • 2501.09775 • Published 9 days ago • 26
MLLM-as-a-Judge for Image Safety without Human Labeling Paper • 2501.00192 • Published 25 days ago • 25
Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives Paper • 2501.04003 • Published 17 days ago • 24
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis Paper • 2412.19723 • Published 28 days ago • 81
Demystifying Domain-adaptive Post-training for Financial LLMs Paper • 2501.04961 • Published 16 days ago • 11
ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding Paper • 2501.05452 • Published 15 days ago • 15
Stronger Models are NOT Stronger Teachers for Instruction Tuning Paper • 2411.07133 • Published Nov 11, 2024 • 35
From Medprompt to o1: Exploration of Run-Time Strategies for Medical Challenge Problems and Beyond Paper • 2411.03590 • Published Nov 6, 2024 • 10
Law of the Weakest Link: Cross Capabilities of Large Language Models Paper • 2409.19951 • Published Sep 30, 2024 • 54
Attention Prompting on Image for Large Vision-Language Models Paper • 2409.17143 • Published Sep 25, 2024 • 7
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning Paper • 2409.12183 • Published Sep 18, 2024 • 37
A Controlled Study on Long Context Extension and Generalization in LLMs Paper • 2409.12181 • Published Sep 18, 2024 • 44
MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans? Paper • 2408.13257 • Published Aug 23, 2024 • 26
Building and better understanding vision-language models: insights and future directions Paper • 2408.12637 • Published Aug 22, 2024 • 124
To Code, or Not To Code? Exploring Impact of Code in Pre-training Paper • 2408.10914 • Published Aug 20, 2024 • 42