Visual Question Answering
Transformers
English
videollama2_qwen2
text-generation
multimodal large language model
large video-language model
Inference Endpoints
---
license: apache-2.0
---