-
Reinforcement Learning on Web Interfaces Using Workflow-Guided Exploration
Paper • 1802.08802 • Published -
Mapping Natural Language Commands to Web Elements
Paper • 1808.09132 • Published -
Learning to Navigate the Web
Paper • 1812.09195 • Published -
Interactive Task and Concept Learning from Natural Language Instructions and GUI Demonstrations
Paper • 1909.00031 • Published
Collections
Discover the best community collections!
Collections including paper arxiv:2401.16158
-
Speculative Streaming: Fast LLM Inference without Auxiliary Models
Paper • 2402.11131 • Published • 41 -
Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception
Paper • 2401.16158 • Published • 17 -
Octopus v2: On-device language model for super agent
Paper • 2404.01744 • Published • 57
-
AnimateLCM: Accelerating the Animation of Personalized Diffusion Models and Adapters with Decoupled Consistency Learning
Paper • 2402.00769 • Published • 20 -
LCM-LoRA: A Universal Stable-Diffusion Acceleration Module
Paper • 2311.05556 • Published • 80 -
LongAlign: A Recipe for Long Context Alignment of Large Language Models
Paper • 2401.18058 • Published • 21 -
Efficient Tool Use with Chain-of-Abstraction Reasoning
Paper • 2401.17464 • Published • 16
-
Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception
Paper • 2401.16158 • Published • 17 -
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
Paper • 2402.14905 • Published • 126 -
Imp: Highly Capable Large Multimodal Models for Mobile Devices
Paper • 2405.12107 • Published • 25 -
Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration
Paper • 2406.01014 • Published • 30
-
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 143 -
ReFT: Reasoning with Reinforced Fine-Tuning
Paper • 2401.08967 • Published • 27 -
Tuning Language Models by Proxy
Paper • 2401.08565 • Published • 20 -
TrustLLM: Trustworthiness in Large Language Models
Paper • 2401.05561 • Published • 64
-
A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions
Paper • 2312.08578 • Published • 16 -
ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks
Paper • 2312.08583 • Published • 9 -
Vision-Language Models as a Source of Rewards
Paper • 2312.09187 • Published • 11 -
StemGen: A music generation model that listens
Paper • 2312.08723 • Published • 47
-
MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models
Paper • 2310.11954 • Published • 24 -
Training Chain-of-Thought via Latent-Variable Inference
Paper • 2312.02179 • Published • 8 -
Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception
Paper • 2401.16158 • Published • 17 -
A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts
Paper • 2402.09727 • Published • 35
-
A Zero-Shot Language Agent for Computer Control with Structured Reflection
Paper • 2310.08740 • Published • 14 -
AgentTuning: Enabling Generalized Agent Abilities for LLMs
Paper • 2310.12823 • Published • 35 -
AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors
Paper • 2308.10848 • Published • 1 -
CLEX: Continuous Length Extrapolation for Large Language Models
Paper • 2310.16450 • Published • 9
-
MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning
Paper • 2310.09478 • Published • 19 -
Can GPT models be Financial Analysts? An Evaluation of ChatGPT and GPT-4 on mock CFA Exams
Paper • 2310.08678 • Published • 12 -
Llama 2: Open Foundation and Fine-Tuned Chat Models
Paper • 2307.09288 • Published • 242 -
LLaMA: Open and Efficient Foundation Language Models
Paper • 2302.13971 • Published • 13