Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2310.11954

there's many more on arxiv if you search for CLAP

Large-scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword-to-Caption Augmentation

Paper • 2211.06687 • Published Nov 12, 2022 • 3
EnCLAP: Combining Neural Audio Codec and Audio-Text Joint Embedding for Automated Audio Captioning

Paper • 2401.17690 • Published Jan 31 • 5
Amphion: An Open-Source Audio, Music and Speech Generation Toolkit

Paper • 2312.09911 • Published Dec 15, 2023 • 53
Audiobox: Unified Audio Generation with Natural Language Prompts

Paper • 2312.15821 • Published Dec 25, 2023 • 12

MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models

Paper • 2310.11954 • Published Oct 18, 2023 • 24

Papers related to audio and music

Music ControlNet: Multiple Time-varying Controls for Music Generation

Paper • 2311.07069 • Published Nov 13, 2023 • 43
FLAP: Fast Language-Audio Pre-training

Paper • 2311.01615 • Published Nov 2, 2023 • 16
MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models

Paper • 2310.11954 • Published Oct 18, 2023 • 24
MusicLDM: Enhancing Novelty in Text-to-Music Generation Using Beat-Synchronous Mixup Strategies

Paper • 2308.01546 • Published Aug 3, 2023 • 17

MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models

Paper • 2310.11954 • Published Oct 18, 2023 • 24
Training Chain-of-Thought via Latent-Variable Inference

Paper • 2312.02179 • Published Nov 28, 2023 • 8
Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception

Paper • 2401.16158 • Published Jan 29 • 17
A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts

Paper • 2402.09727 • Published Feb 15 • 35

MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models

Paper • 2310.11954 • Published Oct 18, 2023 • 24

Loop Copilot: Conducting AI Ensembles for Music Generation and Iterative Editing

Paper • 2310.12404 • Published Oct 19, 2023 • 15
MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models

Paper • 2310.11954 • Published Oct 18, 2023 • 24
A Survey of AI Music Generation Tools and Models

Paper • 2308.12982 • Published Aug 24, 2023 • 1
UniAudio: An Audio Foundation Model Toward Universal Audio Generation

Paper • 2310.00704 • Published Oct 1, 2023 • 19

Branch-Solve-Merge Improves Large Language Model Evaluation and Generation

Paper • 2310.15123 • Published Oct 23, 2023 • 7
ToolChain*: Efficient Action Space Navigation in Large Language Models with A* Search

Paper • 2310.13227 • Published Oct 20, 2023 • 12
LASER: LLM Agent with State-Space Exploration for Web Navigation

Paper • 2309.08172 • Published Sep 15, 2023 • 11
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models

Paper • 2310.04406 • Published Oct 6, 2023 • 8

Woodpecker: Hallucination Correction for Multimodal Large Language Models

Paper • 2310.16045 • Published Oct 24, 2023 • 14
HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models

Paper • 2310.14566 • Published Oct 23, 2023 • 25
SILC: Improving Vision Language Pretraining with Self-Distillation

Paper • 2310.13355 • Published Oct 20, 2023 • 6
Conditional Diffusion Distillation

Paper • 2310.01407 • Published Oct 2, 2023 • 20

AI Music Models

MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models

Paper • 2310.11954 • Published Oct 18, 2023 • 24

MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models

Paper • 2310.11954 • Published Oct 18, 2023 • 24

Previous
1
2
Next

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs