neo-9981 (Aadesh )

upvoted a collection 4 days ago

AI Engineering

A collection of arXiv papers from Chip Huyen's AI Engineering organized by chapter and ordered by when each appears in the book. • 238 items • Updated 5 days ago • 14

liked a model 7 days ago

meta-llama/Llama-3.3-70B-Instruct

Text Generation • Updated Dec 21, 2024 • 740k • • 2.11k

upvoted an article 10 days ago

Article

Common AI Model Formats

By

•

11 days ago

• 27

liked a model 14 days ago

deepseek-ai/DeepSeek-R1-Distill-Qwen-32B

Text Generation • Updated 14 days ago • 1.49M • • 1.24k

reacted to jjokah's post with 👍 14 days ago

Post

4615

The past few years have been a blast for artificial intelligence, with large language models (LLMs) stunning everyone with their capabilities and powering everything from chatbots to code assistants. However, not all applications demand the massive size and complexity of LLMs, the computational power required makes them impractical for many use cases. This is why Small Language Models (SLMs) entered the scene to make powerful AI models more accessible by shrinking in size.

In this article we went through what SLMs are, how they are made small, their benefits and limitations, real-world use cases, and how they can be used on mobile and desktop devices.
https://huggingface.co/blog/jjokah/small-language-model

2 replies

·

upvoted 2 articles 21 days ago

Article

🦸🏻#10: Does Present-Day GenAI Actually Reason?

By

•

22 days ago

• 7

Article

What is test-time compute and how to scale it?

By

and 1 other •

Feb 6

• 53

reacted to Kseniase's post with 🚀 21 days ago

Post

3251

8 New Applications of Test-Time Scaling

We've noticed a huge interest in test-time scaling (TTS), so we decided to explore this concept further. Test-time compute (TTC) refers to the amount of computational power used by an AI model when generating a response. Many researchers are now focused on scaling TTC, as it enables slow, deep "thinking" and step-by-step reasoning, which improves overall models' performance.

Here are 8 fresh studies on test-time scaling:

1. Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach (2502.05171)
Introduces an LM that scales TTC by reasoning in latent space instead of generating more tokens with no special training. Here, a recurrent block to processes information iteratively.

2. Generating Symbolic World Models via Test-time Scaling of Large Language Models (2502.04728)
Shows how TTS is applied to enhance model's Planning Domain Definition Language (PDDL) reasoning capabilities, which can be used to generate a symbolic world model.

3. Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling (2502.06703)
Analyzes optimal TTS strategies and shows how small models can outperform much larger ones.

4. Llasa: Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis (2502.04128)
Shows how TTS improves expressiveness, timbre consistency and accuracy in speech synthesis with Llasa framework. It also dives into benefits of scaling train-time compute.

5. Rethinking Fine-Tuning when Scaling Test-Time Compute: Limiting Confidence Improves Mathematical Reasoning (2502.07154)
Suggests a modified training loss for better reasoning of LLMs when scaling TTC.

6. Adaptive Graph of Thoughts: Test-Time Adaptive Reasoning Unifying Chain, Tree, and Graph Structures (2502.05078)
Unifies the strengths of chain, tree, and graph paradigms into one framework that expands reasoning only on necessary subproblems.

7. Sample, Scrutinize and Scale: Effective Inference-Time Search by Scaling Verification (2502.01839)
Explores scaling trends of self-verification and how to improve its capabilities with TTC.

8. CodeMonkeys: Scaling Test-Time Compute for Software Engineering (2501.14723)
Explores how scaling serial compute (iterations) and parallel compute (trajectories), can improve accuracy in real-world software engineering issues.

Also, explore our article about TTS for more -> https://huggingface.co/blog/Kseniase/testtimecompute