Auto-Arena (Auto-Arena)

chiayewken

authored 3 papers 4 months ago

PuzzleVQA: Diagnosing Multimodal Reasoning Challenges of Language Models with Abstract Visual Patterns

Paper • 2403.13315 • Published Mar 20, 2024

Reasoning Paths Optimization: Learning to Reason and Explore From Diverse Paths

Paper • 2410.10858 • Published Oct 7, 2024

Can-Do! A Dataset and Neuro-Symbolic Grounded Framework for Embodied Planning with Large Multimodal Models

Paper • 2409.14277 • Published Sep 22, 2024

xww033

authored a paper 6 months ago

SeaLLMs 3: Open Foundation and Chat Multilingual Large Language Models for Southeast Asian Languages

Paper • 2407.19672 • Published Jul 29, 2024 • 56

isakzhang

authored 2 papers 6 months ago

Auto Arena of LLMs: Automating LLM Evaluations with Agent Peer-battles and Committee Discussions

Paper • 2405.20267 • Published May 30, 2024 • 1

SeaLLMs 3: Open Foundation and Chat Multilingual Large Language Models for Southeast Asian Languages

Paper • 2407.19672 • Published Jul 29, 2024 • 56

xww033

authored a paper 7 months ago

On the Transformations across Reward Model, Parameter Update, and In-Context Prompt

Paper • 2406.16377 • Published Jun 24, 2024 • 12

ruochenzhao

authored a paper 8 months ago

Auto Arena of LLMs: Automating LLM Evaluations with Agent Peer-battles and Committee Discussions

Paper • 2405.20267 • Published May 30, 2024 • 1

isakzhang

updated a Space 8 months ago

README

Space • Running

isakzhang

authored 8 papers 9 months ago

Zero-Shot Text Classification via Self-Supervised Tuning

Paper • 2305.11442 • Published May 19, 2023 • 1

Easy-to-Hard Learning for Information Extraction

Paper • 2305.09193 • Published May 16, 2023

M3Exam: A Multilingual, Multimodal, Multilevel Benchmark for Examining Large Language Models

Paper • 2306.05179 • Published Jun 8, 2023 • 2

Knowledge-enhanced Mixed-initiative Dialogue System for Emotional Support Conversations

Paper • 2305.10172 • Published May 17, 2023

Is Translation All You Need? A Study on Solving Multilingual Tasks with Large Language Models

Paper • 2403.10258 • Published Mar 15, 2024

xww033

authored a paper about 1 year ago

Reasons to Reject? Aligning Language Models with Judgments

Paper • 2312.14591 • Published Dec 22, 2023 • 18

isakzhang

authored a paper about 1 year ago

SeaLLMs -- Large Language Models for Southeast Asia

Paper • 2312.00738 • Published Dec 1, 2023 • 24

Multilingual Jailbreak Challenges in Large Language Models

How do Large Language Models Handle Multilingualism?

Plug-and-Play Policy Planner for Large Language Model Powered Dialogue Agents

Team members 4
