EgoCVR: An Egocentric Benchmark for Fine-Grained Composed Video Retrieval • Paper • 2407.16658 • Published Jul 23, 2024
Video-adverb retrieval with compositional adverb-action embeddings • Paper • 2309.15086 • Published Sep 26, 2023
Semantic Image Synthesis with Semantically Coupled VQ-Model • Paper • 2209.02536 • Published Sep 6, 2022
Temporal and cross-modal attention for audio-visual zero-shot learning • Paper • 2207.09966 • Published Jul 20, 2022
Text-to-feature diffusion for audio-visual few-shot learning • Paper • 2309.03869 • Published Sep 7, 2023