Gedas Bertasius

gberta

https://www.gedasbertasius.com/

AI & ML interests

None yet

Organizations

None yet

gberta's activity

upvoted a paper about 18 hours ago

BIMBA: Selective-Scan Compression for Long-Range Video Question Answering

Paper • 2503.09590 • Published 1 day ago • 2

commented on a paper about 18 hours ago

BIMBA: Selective-Scan Compression for Long-Range Video Question Answering

Paper • 2503.09590 • Published 1 day ago • 2

authored a paper 6 months ago

VMAS: Video-to-Music Generation via Semantic Alignment in Web Music Videos

Paper • 2409.07450 • Published Sep 11, 2024 • 11

upvoted a paper about 1 year ago

Video ReCap: Recursive Captioning of Hour-Long Videos

Paper • 2402.13250 • Published Feb 20, 2024 • 26

authored 6 papers about 1 year ago

Is Space-Time Attention All You Need for Video Understanding?

Paper • 2102.05095 • Published Feb 9, 2021 • 1

SimpleClick: Interactive Image Segmentation with Simple Vision Transformers

Paper • 2210.11006 • Published Oct 20, 2022

Unified Coarse-to-Fine Alignment for Video-Text Retrieval

Paper • 2309.10091 • Published Sep 18, 2023

VX2TEXT: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs

Paper • 2101.12059 • Published Jan 28, 2021

Mementos: A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences

Paper • 2401.10529 • Published Jan 19, 2024 • 1

Video ReCap: Recursive Captioning of Hour-Long Videos

Paper • 2402.13250 • Published Feb 20, 2024 • 26