Article #90: Why AI's Reasoning Tests Keep Failing Us By Kseniase • 3 days ago • 8
Guardians of the Agentic System: Preventing Many Shots Jailbreak with Agentic System Paper • 2502.16750 • Published 11 days ago • 10
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution Paper • 2502.18449 • Published 9 days ago • 63
Beyond Release: Access Considerations for Generative AI Systems Paper • 2502.16701 • Published 11 days ago • 11
Slamming: Training a Speech Language Model on One GPU in a Day Paper • 2502.15814 • Published 15 days ago • 65
Gödel Agent: A Self-Referential Agent Framework for Recursive Self-Improvement Paper • 2410.04444 • Published Oct 6, 2024 • 2
Tree of Thoughts: Deliberate Problem Solving with Large Language Models Paper • 2305.10601 • Published May 17, 2023 • 12
Chain of Hindsight Aligns Language Models with Feedback Paper • 2302.02676 • Published Feb 6, 2023 • 1
ReAct: Synergizing Reasoning and Acting in Language Models Paper • 2210.03629 • Published Oct 6, 2022 • 22
Reflexion: Language Agents with Verbal Reinforcement Learning Paper • 2303.11366 • Published Mar 20, 2023 • 5
Article #89: AI in Action: How AI Engineers, Self-Optimizing Models, and Humanoid Robots Are Reshaping 2025 By Kseniase • 10 days ago • 4
Demonstrating specification gaming in reasoning models Paper • 2502.13295 • Published 16 days ago • 1
A Survey of Reinforcement Learning from Human Feedback Paper • 2312.14925 • Published Dec 22, 2023 • 1
Self-Consistency Improves Chain of Thought Reasoning in Language Models Paper • 2203.11171 • Published Mar 21, 2022 • 4
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models Paper • 2201.11903 • Published Jan 28, 2022 • 11
Least-to-Most Prompting Enables Complex Reasoning in Large Language Models Paper • 2205.10625 • Published May 21, 2022 • 1
PDDLEGO: Iterative Planning in Textual Environments Paper • 2405.19793 • Published May 30, 2024 • 1