# Data Preparation
Create a new directory `data` to store all the datasets.
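For example, from the repository root (assuming `ReferFormer/` is the project root):
```
# create the top-level data directory under the project root
mkdir -p data
```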
## Ref-COCO
Download the images from the official [COCO](https://cocodataset.org/#download) website. RefCOCO, RefCOCO+, and RefCOCOg all use the COCO 2014 train split.
Download the referring-expression annotation files from [github](https://github.com/lichengunc/refer).
Convert the annotation files to COCO format:
```
python3 tools/data/convert_refexp_to_coco.py
```
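As a rough sketch, the image folder can be prepared like this. The archive name `train2014.zip` is the one offered on the COCO download page, but treat it and the paths below as assumptions and adjust them to your actual downloads:
```
# place the COCO 2014 train images under data/coco
mkdir -p data/coco
unzip train2014.zip -d data/coco    # yields data/coco/train2014
# the instances_*.json files under refcoco/, refcoco+/ and refcocog/
# are expected to come from the conversion step above
# (assumption; check the script's output paths)
```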
Finally, we expect the directory structure to be the following:
```
ReferFormer
├── data
│   ├── coco
│   │   ├── train2014
│   │   ├── refcoco
│   │   │   ├── instances_refcoco_train.json
│   │   │   ├── instances_refcoco_val.json
│   │   ├── refcoco+
│   │   │   ├── instances_refcoco+_train.json
│   │   │   ├── instances_refcoco+_val.json
│   │   ├── refcocog
│   │   │   ├── instances_refcocog_train.json
│   │   │   ├── instances_refcocog_val.json
```
## Ref-Youtube-VOS
Download the dataset from the competition's website [here](https://competitions.codalab.org/competitions/29139#participate-get_data).
Then, extract and organize the files. We expect the directory structure to be the following:
```
ReferFormer
├── data
│   ├── ref-youtube-vos
│   │   ├── meta_expressions
│   │   ├── train
│   │   │   ├── JPEGImages
│   │   │   ├── Annotations
│   │   │   ├── meta.json
│   │   ├── valid
│   │   │   ├── JPEGImages
```
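For example, if the CodaLab download provides archives named `train.zip`, `valid.zip`, and `meta_expressions.zip` (these names are assumptions; check your actual download), the extraction might look like:
```
mkdir -p data/ref-youtube-vos
unzip train.zip -d data/ref-youtube-vos              # JPEGImages, Annotations, meta.json
unzip valid.zip -d data/ref-youtube-vos              # JPEGImages
unzip meta_expressions.zip -d data/ref-youtube-vos   # meta_expressions
```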
## Ref-DAVIS17
Download the DAVIS2017 dataset from the [website](https://davischallenge.org/davis2017/code.html). Note that you only need to download the two zip files `DAVIS-2017-Unsupervised-trainval-480p.zip` and `DAVIS-2017_semantics-480p.zip`.
Download the text annotations from the [website](https://www.mpi-inf.mpg.de/departments/computer-vision-and-machine-learning/research/video-segmentation/video-object-segmentation-with-language-referring-expressions).
Then, put the zip files in the directory as follows.
```
ReferFormer
├── data
│   ├── ref-davis
│   │   ├── DAVIS-2017_semantics-480p.zip
│   │   ├── DAVIS-2017-Unsupervised-trainval-480p.zip
│   │   ├── davis_text_annotations.zip
```
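For example, assuming the three archives were saved to `~/Downloads` (a hypothetical download location):
```
mkdir -p data/ref-davis
mv ~/Downloads/DAVIS-2017_semantics-480p.zip data/ref-davis/
mv ~/Downloads/DAVIS-2017-Unsupervised-trainval-480p.zip data/ref-davis/
mv ~/Downloads/davis_text_annotations.zip data/ref-davis/
```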
Unzip these zip files.
```
unzip -o davis_text_annotations.zip
unzip -o DAVIS-2017_semantics-480p.zip
unzip -o DAVIS-2017-Unsupervised-trainval-480p.zip
```
Preprocess the dataset into Ref-Youtube-VOS format (make sure you run this from the repository root):
```
python tools/data/convert_davis_to_ytvos.py
```
Finally, unzip `DAVIS-2017-Unsupervised-trainval-480p.zip` again, since the preprocessing script moves (`mv`) the original frames rather than copying them, for efficiency.
```
unzip -o DAVIS-2017-Unsupervised-trainval-480p.zip
```
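Putting the Ref-DAVIS17 steps together, one possible end-to-end sequence looks like the following. It assumes the zip files already sit in `data/ref-davis` and that the unzip commands are run from inside that directory (the exact working directories are assumptions):
```
cd data/ref-davis
unzip -o davis_text_annotations.zip
unzip -o DAVIS-2017_semantics-480p.zip
unzip -o DAVIS-2017-Unsupervised-trainval-480p.zip
cd ../..                                    # back to the repository root
python tools/data/convert_davis_to_ytvos.py
cd data/ref-davis
unzip -o DAVIS-2017-Unsupervised-trainval-480p.zip   # restore the frames moved by the script
cd ../..
```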
## A2D-Sentences
Follow the instructions and download the dataset from the website [here](https://kgavrilyuk.github.io/publication/actor_action/).
Then, extract the files. Additionally, we use the same json annotation files generated by [MTTR](https://github.com/mttr2021/MTTR). Please download these files from [onedrive](https://connecthkuhk-my.sharepoint.com/:f:/g/personal/wjn922_connect_hku_hk/EnvcpWsMsY5NrMF5If3F6DwBseMrqmzQwpTtL8HXoLAChw?e=Vlv1et).
We expect the directory structure to be the following:
```
ReferFormer
├── data
│   ├── a2d_sentences
│   │   ├── Release
│   │   ├── text_annotations
│   │   │   ├── a2d_annotation_with_instances
│   │   │   ├── a2d_annotation.txt
│   │   │   ├── a2d_missed_videos.txt
│   │   ├── a2d_sentences_single_frame_test_annotations.json
│   │   ├── a2d_sentences_single_frame_train_annotations.json
│   │   ├── a2d_sentences_test_annotations_in_coco_format.json
```
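As a sketch, assuming the A2D release has already been extracted into local folders `Release/` and `text_annotations/` and the three MTTR json files sit in the current directory (these locations are assumptions):
```
mkdir -p data/a2d_sentences
mv Release text_annotations data/a2d_sentences/
mv a2d_sentences_single_frame_test_annotations.json \
   a2d_sentences_single_frame_train_annotations.json \
   a2d_sentences_test_annotations_in_coco_format.json data/a2d_sentences/
```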
## JHMDB-Sentences
Follow the instructions and download the dataset from the website [here](https://kgavrilyuk.github.io/publication/actor_action/).
Then, extract the files. Additionally, we use the same json annotation files generated by [MTTR](https://github.com/mttr2021/MTTR). Please download these files from [onedrive](https://connecthkuhk-my.sharepoint.com/:f:/g/personal/wjn922_connect_hku_hk/EjPyzXq93s5Jm4GU07JrWIMBb6nObY8fEmLyuiGg-0uBtg?e=GsZ6jP).
We expect the directory structure to be the following:
```
ReferFormer
├── data
│   ├── jhmdb_sentences
│   │   ├── Rename_Images
│   │   ├── puppet_mask
│   │   ├── jhmdb_annotation.txt
│   │   ├── jhmdb_sentences_samples_metadata.json
│   │   ├── jhmdb_sentences_gt_annotations_in_coco_format.json
```
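Once everything is in place, a quick sanity check can confirm the layout matches the trees above. The list below only covers a few representative paths taken from those trees; extend it as needed:
```
# run from the repository root; prints MISSING for any absent path
for p in \
  data/coco/train2014 \
  data/ref-youtube-vos/meta_expressions \
  data/ref-davis \
  data/a2d_sentences/Release \
  data/jhmdb_sentences/Rename_Images
do
  [ -e "$p" ] && echo "ok      $p" || echo "MISSING $p"
done
```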