	# Data Preparation

	Create a new directory `data` to store all the datasets.

	## Ref-COCO

	Download the images from the official [COCO](https://cocodataset.org/#download) website.
	RefCOCO, RefCOCO+, and RefCOCOg all use the COCO 2014 train split (`train2014`).
	Download the annotation files from the [refer](https://github.com/lichengunc/refer) repository on GitHub.

	Convert the annotation files:

	```
	python3 tools/data/convert_refexp_to_coco.py
	```

	Finally, we expect the directory structure to be the following:

	```
	ReferFormer
	├── data
	│ ├── coco
	│ │ ├── train2014
	│ │ ├── refcoco
	│ │ │ ├── instances_refcoco_train.json
	│ │ │ ├── instances_refcoco_val.json
	│ │ ├── refcoco+
	│ │ │ ├── instances_refcoco+_train.json
	│ │ │ ├── instances_refcoco+_val.json
	│ │ ├── refcocog
	│ │ │ ├── instances_refcocog_train.json
	│ │ │ ├── instances_refcocog_val.json
	```
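Before training, it can help to confirm that the tree above is in place. The following is a minimal sketch, not part of the repository: the helper `missing_paths` and the list `EXPECTED_REFCOCO` are hypothetical names introduced here, and the relative paths are copied from the tree above on the assumption that the script is run from the repository root.

```python
from pathlib import Path

# Relative paths copied from the directory tree above (relative to `data/`).
# Adjust this list if your layout differs.
EXPECTED_REFCOCO = [
    "coco/train2014",
    "coco/refcoco/instances_refcoco_train.json",
    "coco/refcoco/instances_refcoco_val.json",
    "coco/refcoco+/instances_refcoco+_train.json",
    "coco/refcoco+/instances_refcoco+_val.json",
    "coco/refcocog/instances_refcocog_train.json",
    "coco/refcocog/instances_refcocog_val.json",
]


def missing_paths(data_root, expected):
    """Return the entries of `expected` that do not exist under `data_root`."""
    root = Path(data_root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    # Hypothetical usage: run from the repository root.
    for rel in missing_paths("data", EXPECTED_REFCOCO):
        print(f"missing: data/{rel}")
```

The same helper can be reused for the other datasets in this document by passing a different list of expected paths.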


	## Ref-Youtube-VOS

	Download the dataset from the competition's website [here](https://competitions.codalab.org/competitions/29139#participate-get_data).
	Then, extract and organize the files. We expect the directory structure to be the following:

	```
	ReferFormer
	├── data
	│ ├── ref-youtube-vos
	│ │ ├── meta_expressions
	│ │ ├── train
	│ │ │ ├── JPEGImages
	│ │ │ ├── Annotations
	│ │ │ ├── meta.json
	│ │ ├── valid
	│ │ │ ├── JPEGImages
	```
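As a quick sanity check after extraction, one might compare the video folders under `train/JPEGImages` and `train/Annotations`. This is a sketch under an assumption not stated above: that each of these two directories contains one subfolder per video, named by video ID. The helper `unmatched_videos` is a hypothetical name, not part of the repository.

```python
from pathlib import Path


def unmatched_videos(split_dir):
    """Compare video folder names under JPEGImages and Annotations.

    Returns (frames_only, masks_only): sorted folder names that appear
    in one of the two directories but not in the other.
    """
    split = Path(split_dir)
    frames = {p.name for p in (split / "JPEGImages").iterdir() if p.is_dir()}
    masks = {p.name for p in (split / "Annotations").iterdir() if p.is_dir()}
    return sorted(frames - masks), sorted(masks - frames)


if __name__ == "__main__":
    # Hypothetical usage: run from the repository root.
    train = Path("data/ref-youtube-vos/train")
    if train.is_dir():
        frames_only, masks_only = unmatched_videos(train)
        print(f"{len(frames_only)} videos without masks, "
              f"{len(masks_only)} videos without frames")
```

Two empty lists suggest the training split was extracted completely. The `valid` split only contains `JPEGImages`, so the check does not apply there.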

	## Ref-DAVIS17

	Download the DAVIS2017 dataset from the [website](https://davischallenge.org/davis2017/code.html). Note that you only need to download two zip files: `DAVIS-2017-Unsupervised-trainval-480p.zip` and `DAVIS-2017_semantics-480p.zip`.
	Download the text annotations from the [website](https://www.mpi-inf.mpg.de/departments/computer-vision-and-machine-learning/research/video-segmentation/video-object-segmentation-with-language-referring-expressions).
	Then, put the zip files in the directory as follows.


	```
	ReferFormer
	├── data
	│ ├── ref-davis
	│ │ ├── DAVIS-2017_semantics-480p.zip
	│ │ ├── DAVIS-2017-Unsupervised-trainval-480p.zip
	│ │ ├── davis_text_annotations.zip
	```

	Unzip these files from inside the `data/ref-davis` directory.
	```
	unzip -o davis_text_annotations.zip
	unzip -o DAVIS-2017_semantics-480p.zip
	unzip -o DAVIS-2017-Unsupervised-trainval-480p.zip
	```
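If `unzip` is not available, the same extraction can be sketched with Python's standard `zipfile` module. This is an illustrative alternative rather than part of the repository: the helper `extract_all` is a hypothetical name, and the sketch assumes the three archives sit in `data/ref-davis` as shown above. Like `unzip -o`, `ZipFile.extractall` overwrites files that already exist.

```python
import zipfile
from pathlib import Path

# Archive names copied from the commands above.
ARCHIVES = [
    "davis_text_annotations.zip",
    "DAVIS-2017_semantics-480p.zip",
    "DAVIS-2017-Unsupervised-trainval-480p.zip",
]


def extract_all(archive_dir, names=ARCHIVES):
    """Extract each named archive into `archive_dir`, overwriting existing files."""
    root = Path(archive_dir)
    for name in names:
        with zipfile.ZipFile(root / name) as zf:
            zf.extractall(root)


if __name__ == "__main__":
    # Hypothetical usage: run from the repository root once all archives are downloaded.
    if all((Path("data/ref-davis") / name).is_file() for name in ARCHIVES):
        extract_all("data/ref-davis")
```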

	Convert the dataset to the Ref-Youtube-VOS format. Run the command from the repository's main directory.

	```
	python tools/data/convert_davis_to_ytvos.py
	```

	Finally, unzip `DAVIS-2017-Unsupervised-trainval-480p.zip` again, because the preprocessing script moves the extracted files with `mv` for efficiency rather than copying them.

	```
	unzip -o DAVIS-2017-Unsupervised-trainval-480p.zip
	```




	## A2D-Sentences

	Follow the instructions and download the dataset from the website [here](https://kgavrilyuk.github.io/publication/actor_action/).
	Then, extract the files. We also use the JSON annotation files generated by [MTTR](https://github.com/mttr2021/MTTR). Please download these files from [OneDrive](https://connecthkuhk-my.sharepoint.com/:f:/g/personal/wjn922_connect_hku_hk/EnvcpWsMsY5NrMF5If3F6DwBseMrqmzQwpTtL8HXoLAChw?e=Vlv1et).
	We expect the directory structure to be the following:

	```
	ReferFormer
	├── data
	│ ├── a2d_sentences
	│ │ ├── Release
	│ │ ├── text_annotations
	│ │ │ ├── a2d_annotation_with_instances
	│ │ │ ├── a2d_annotation.txt
	│ │ │ ├── a2d_missed_videos.txt
	│ │ ├── a2d_sentences_single_frame_test_annotations.json
	│ │ ├── a2d_sentences_single_frame_train_annotations.json
	│ │ ├── a2d_sentences_test_annotations_in_coco_format.json
	```

	## JHMDB-Sentences

	Follow the instructions and download the dataset from the website [here](https://kgavrilyuk.github.io/publication/actor_action/).
	Then, extract the files. We also use the JSON annotation files generated by [MTTR](https://github.com/mttr2021/MTTR). Please download these files from [OneDrive](https://connecthkuhk-my.sharepoint.com/:f:/g/personal/wjn922_connect_hku_hk/EjPyzXq93s5Jm4GU07JrWIMBb6nObY8fEmLyuiGg-0uBtg?e=GsZ6jP).
	We expect the directory structure to be the following:

	```
	ReferFormer
	├── data
	│ ├── jhmdb_sentences
	│ │ ├── Rename_Images
	│ │ ├── puppet_mask
	│ │ ├── jhmdb_annotation.txt
	│ │ ├── jhmdb_sentences_samples_metadata.json
	│ │ ├── jhmdb_sentences_gt_annotations_in_coco_format.json
	```