Zero-shot voice cloning with Llasa 3B (Unofficial Demo)
Visual Retrieval with ColPali and Vespa
An end-to-end (e2e) Voice Language Model by Fish Audio.