Aadesh

neo-9981

AI & ML interests

NLP

Recent Activity

upvoted a collection 4 days ago
AI Engineering
liked a model 7 days ago
meta-llama/Llama-3.3-70B-Instruct
upvoted an article 10 days ago
Common AI Model Formats

Organizations

None yet

neo-9981's activity

upvoted an article 10 days ago
reacted to jjokah's post with 👍 14 days ago
The past few years have been a blast for artificial intelligence, with large language models (LLMs) stunning everyone with their capabilities and powering everything from chatbots to code assistants. However, not all applications demand the massive size and complexity of LLMs, and the computational power they require makes them impractical for many use cases. This is why Small Language Models (SLMs) entered the scene: to make powerful AI models more accessible by shrinking them in size.

In this article, we go through what SLMs are, how they are made small, their benefits and limitations, real-world use cases, and how they can be used on mobile and desktop devices.
https://huggingface.co/blog/jjokah/small-language-model
upvoted 2 articles 21 days ago

🦸🏻#10: Does Present-Day GenAI Actually Reason?

By Kseniase • 7

What is test-time compute and how to scale it?

By Kseniase and 1 other • 53
reacted to Kseniase's post with 🚀 21 days ago
8 New Applications of Test-Time Scaling

We've noticed a huge interest in test-time scaling (TTS), so we decided to explore this concept further. Test-time compute (TTC) refers to the amount of computational power used by an AI model when generating a response. Many researchers are now focused on scaling TTC, as it enables slow, deep "thinking" and step-by-step reasoning, which improves overall model performance.

Here are 8 fresh studies on test-time scaling:

1. Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach (2502.05171)
Introduces an LM that scales TTC by reasoning in latent space instead of generating more tokens, with no special training required. Here, a recurrent block processes information iteratively.

2. Generating Symbolic World Models via Test-time Scaling of Large Language Models (2502.04728)
Shows how TTS is applied to enhance a model's Planning Domain Definition Language (PDDL) reasoning capabilities, which can be used to generate a symbolic world model.

3. Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling (2502.06703)
Analyzes optimal TTS strategies and shows how small models can outperform much larger ones.

4. Llasa: Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis (2502.04128)
Shows how TTS improves expressiveness, timbre consistency, and accuracy in speech synthesis with the Llasa framework. It also dives into the benefits of scaling train-time compute.

5. Rethinking Fine-Tuning when Scaling Test-Time Compute: Limiting Confidence Improves Mathematical Reasoning (2502.07154)
Suggests a modified training loss for better reasoning of LLMs when scaling TTC.

6. Adaptive Graph of Thoughts: Test-Time Adaptive Reasoning Unifying Chain, Tree, and Graph Structures (2502.05078)
Unifies the strengths of chain, tree, and graph paradigms into one framework that expands reasoning only on necessary subproblems.

7. Sample, Scrutinize and Scale: Effective Inference-Time Search by Scaling Verification (2502.01839)
Explores scaling trends of self-verification and how to improve its capabilities with TTC.

8. CodeMonkeys: Scaling Test-Time Compute for Software Engineering (2501.14723)
Explores how scaling serial compute (iterations) and parallel compute (trajectories) can improve accuracy on real-world software engineering issues.

Also, explore our article about TTS for more -> https://huggingface.co/blog/Kseniase/testtimecompute
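
For readers who want a feel for what "scaling test-time compute" means in practice, here is a minimal toy sketch of one parallel strategy: sample several answers and take a majority vote. It is not taken from any of the papers listed above. The `sample_answer` function is a hypothetical stand-in for a model call, and the 40% per-sample success rate and the answer 42 are made-up assumptions used only for illustration.

```python
import random
from collections import Counter


def sample_answer(rng, true_answer=42, p_correct=0.4):
    """Hypothetical stand-in for one sampled model response.

    Returns the right answer with probability p_correct, otherwise a
    random integer in [0, 99] (which may match by chance).
    """
    if rng.random() < p_correct:
        return true_answer
    return rng.randint(0, 99)


def majority_vote(n_samples, seed=0, p_correct=0.4):
    """Draw n_samples answers and return the most frequent one."""
    rng = random.Random(seed)
    answers = [sample_answer(rng, p_correct=p_correct) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]


def accuracy(n_samples, trials=200):
    """Fraction of trials in which the voted answer equals 42."""
    hits = sum(majority_vote(n_samples, seed=t) == 42 for t in range(trials))
    return hits / trials


if __name__ == "__main__":
    # More samples per question means more compute spent at inference time.
    for n in (1, 5, 25):
        print(f"samples={n:>2}  accuracy={accuracy(n):.2f}")
```

In this toy setup the wrong answers are scattered while the right one repeats, so voting over more samples tends to recover it more often. Real systems usually replace the vote with a learned verifier or reward model, and may also scale serial compute by letting the model refine its answer over several iterations.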
upvoted an article about 2 months ago

Train 400x faster Static Embedding Models with Sentence Transformers

• 156
upvoted an article 5 months ago

Model2Vec: Distill a Small Fast Model from any Sentence Transformer

By Pringled and 1 other • 77
upvoted an article 5 months ago

ColPali: Efficient Document Retrieval with Vision Language Models 👀

By manu • 214