VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation Paper • 2412.00927 • Published Dec 1, 2024 • 26