TIGER-Lab/VISTA-LongVA

Video-Text-to-Text

wren93 committed 5 days ago

Commit fa425cc · verified · 1 parent: 4a06f31

Update README.md

Files changed (1)

README.md +0 -1

@@ -7,7 +7,6 @@ pipeline_tag: video-text-to-text
 This repo contains model checkpoints for **VISTA-LongVA**. [VISTA](https://huggingface.co/papers/2412.00927) is a video spatiotemporal augmentation method that generates long-duration and high-resolution video instruction-following data to enhance the video understanding capabilities of video LMMs.
-### This repo is under construction. Please stay tuned.
 [**🌐 Homepage**](https://tiger-ai-lab.github.io/VISTA/) | [**📖 arXiv**](https://arxiv.org/abs/2412.00927) | [**💻 GitHub**](https://github.com/TIGER-AI-Lab/VISTA) | [**🤗 VISTA-400K**](https://huggingface.co/datasets/TIGER-Lab/VISTA-400K) | [**🤗 Models**](https://huggingface.co/collections/TIGER-Lab/vista-674a2f0fab81be728a673193) | [**🤗 HRVideoBench**](https://huggingface.co/datasets/TIGER-Lab/HRVideoBench)
 ## Video Instruction Data Synthesis Pipeline
