Benhao Tang PRO

benhaotang

AI & ML interests

Physics Master student in theoretical particle physics at Universität Heidelberg, actively looking into the possibilities of integrating AI into future physics research.

Recent Activity

reacted to albertvillanova's post with 👍 about 13 hours ago

🚀 New smolagents update: Safer Local Python Execution! 🦾🐍 With the latest release, we've added security checks to the local Python interpreter: every evaluation is now analyzed for dangerous builtins, modules, and functions. 🔒 Here's why this matters & what you need to know! 🧵👇 1️⃣ Why is local execution risky? ⚠️ AI agents that run arbitrary Python code can unintentionally (or maliciously) access system files, run unsafe commands, or exfiltrate data. 2️⃣ New Safety Layer in smolagents 🛡️ We now inspect every return value during execution: ✅ Allowed: Safe built-in types (e.g., numbers, strings, lists) ⛔ Blocked: Dangerous functions/modules (e.g., os.system, subprocess, exec, shutil) 3️⃣ Immediate Benefits 💡 - Prevent agents from accessing unsafe builtins - Block unauthorized file or network access - Reduce accidental security vulnerabilities 4️⃣ Security Disclaimer ⚠️ 🚨 Despite these improvements, local Python execution is NEVER 100% safe. 🚨 If you need true isolation, use a remote sandboxed executor like Docker or E2B. 5️⃣ The Best Practice: Use Sandboxed Execution 🔐 For production-grade AI agents, we strongly recommend running code in a Docker or E2B sandbox to ensure complete isolation. 6️⃣ Upgrade Now & Stay Safe! 🚀 Check out the latest smolagents release and start building safer AI agents today. 🔗 https://github.com/huggingface/smolagents What security measures do you take when running AI-generated code? Let’s discuss! 👇 #AI #smolagents #Python #Security

liked a model 3 days ago

deepseek-ai/DeepSeek-R1

liked a model 3 days ago

Qwen/QwQ-32B

View all activity

Organizations

None yet

Posts 1

Post

2391

Try out my updated implementation of forked OpenDeepResearcher(link below) as an OpenAI compatible endpoint, but with full control, can be deployed completely free with Gemini api or completely locally with ollama, or pay-as-you-go in BYOK format, the AI agents will think dynamically based on the difficulties of given research, compatible with any OpenAI compatible configurable clients(Msty, Chatbox, even vscode AI Toolkit playground).

If you don't want to pay OpenAI $200 to use or want to take control of your deep research, check out here:
👉 https://github.com/benhaotang/OpenDeepResearcher-via-searxng

**Personal take**

Based on my testing against Perplexity's and Gemini's implementation with some Physics domain questions, mine is comparable and very competent at finding even the most rare articles or methods.

Also a funny benchmark of mine to test all these searching models, is to trouble shot a WSL2 hanging issue I experienced last year, with prompt:

> wsl2 in windows hangs in background with high vmmem cpu usage once in a while, especially after hibernation, no error logs captured in linux, also unable to shutdown in powershell, provide solutions

the final solution that took me a day last year to find is to patch the kernel with some steps documented in carlfriedrich's repo and wait Microsoft to solve it(it is buried deep in wsl issues). Out of the three, only my Deep Research agent has found this solution, Perplexity and Gemini just focus on other force restart or memory management methods. I am very impressed with how it has this kind of obscure and scarce trouble shooting ability.

**Limitations**

Some caveats to be done later:
- Multi-turn conversation is not yet supported, so no follow-up questions
- System message is only extra writing instructions, don't affect on search
- Small local model may have trouble citing source reliably, I am working on a fix to fact check all citation claims