## Prepare datasets

In our paper, we conduct experiments on three commonly used datasets: Ref-COCO, Ref-COCO+, and G-Ref (also known as Ref-COCOg).

### 1. COCO 2014

The data can be found [here](https://cocodataset.org/#download). Run the following commands to download and unpack the 2014 training images.

```shell
# download
mkdir datasets && cd datasets
wget http://images.cocodataset.org/zips/train2014.zip

# unzip
unzip train2014.zip -d images/ && rm train2014.zip

```
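As a quick sanity check after unpacking, the extracted images can be counted. This is only a sketch: `count_jpgs` is a hypothetical helper, not part of this repository, and it assumes the images were unpacked to `images/train2014` as in the commands above. The COCO download page lists the 2014 train split at roughly 83K images, so a much smaller count suggests an interrupted download or unzip.

```shell
# Hypothetical helper (not part of this repo): count .jpg files under a directory.
count_jpgs() {
    find "$1" -type f -name '*.jpg' | wc -l | tr -d ' '
}

# Run from inside datasets/; prints nothing if the folder is not there yet.
if [ -d images/train2014 ]; then
    echo "train2014 images: $(count_jpgs images/train2014)"
fi
```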

### 2. Ref-COCO

The data can be found [here](https://github.com/lichengunc/refer). Run the following commands from the `datasets` directory to download and convert it.

```shell
# download
wget https://bvisionweb1.cs.unc.edu/licheng/referit/data/refcoco.zip

# unzip
unzip refcoco.zip && rm refcoco.zip

# convert
python ../tools/data_process.py --data_root . --output_dir . --dataset refcoco --split unc --generate_mask

# lmdb
python ../tools/folder2lmdb.py -j anns/refcoco/train.json -i images/train2014/ -m masks/refcoco -o lmdb/refcoco
python ../tools/folder2lmdb.py -j anns/refcoco/val.json -i images/train2014/ -m masks/refcoco -o lmdb/refcoco
python ../tools/folder2lmdb.py -j anns/refcoco/testA.json -i images/train2014/ -m masks/refcoco -o lmdb/refcoco
python ../tools/folder2lmdb.py -j anns/refcoco/testB.json -i images/train2014/ -m masks/refcoco -o lmdb/refcoco

# clean
rm -r refcoco

```

### 3. Ref-COCO+

The data can be found [here](https://github.com/lichengunc/refer). Run the following commands to download and convert it.

```shell
# download
wget https://bvisionweb1.cs.unc.edu/licheng/referit/data/refcoco+.zip

# unzip
unzip refcoco+.zip && rm refcoco+.zip

# convert
python ../tools/data_process.py --data_root . --output_dir . --dataset refcoco+ --split unc --generate_mask

# lmdb
python ../tools/folder2lmdb.py -j anns/refcoco+/train.json -i images/train2014/ -m masks/refcoco+ -o lmdb/refcoco+
python ../tools/folder2lmdb.py -j anns/refcoco+/val.json -i images/train2014/ -m masks/refcoco+ -o lmdb/refcoco+
python ../tools/folder2lmdb.py -j anns/refcoco+/testA.json -i images/train2014/ -m masks/refcoco+ -o lmdb/refcoco+
python ../tools/folder2lmdb.py -j anns/refcoco+/testB.json -i images/train2014/ -m masks/refcoco+ -o lmdb/refcoco+

# clean
rm -r refcoco+

```

### 4. Ref-COCOg

The data can be found [here](https://github.com/lichengunc/refer). Run the following commands to download and convert it.
Note that we use two different splits of this dataset, `umd` and `google`.

```shell
# download
wget https://bvisionweb1.cs.unc.edu/licheng/referit/data/refcocog.zip

# unzip
unzip refcocog.zip && rm refcocog.zip

# convert
python ../tools/data_process.py --data_root . --output_dir . --dataset refcocog --split umd --generate_mask  # umd split
mv anns/refcocog anns/refcocog_u
mv masks/refcocog masks/refcocog_u

python ../tools/data_process.py --data_root . --output_dir . --dataset refcocog --split google --generate_mask  # google split
mv anns/refcocog anns/refcocog_g
mv masks/refcocog masks/refcocog_g

# lmdb
python ../tools/folder2lmdb.py -j anns/refcocog_u/train.json -i images/train2014/ -m masks/refcocog_u -o lmdb/refcocog_u
python ../tools/folder2lmdb.py -j anns/refcocog_u/val.json -i images/train2014/ -m masks/refcocog_u -o lmdb/refcocog_u
python ../tools/folder2lmdb.py -j anns/refcocog_u/test.json -i images/train2014/ -m masks/refcocog_u -o lmdb/refcocog_u

python ../tools/folder2lmdb.py -j anns/refcocog_g/train.json -i images/train2014/ -m masks/refcocog_g -o lmdb/refcocog_g
python ../tools/folder2lmdb.py -j anns/refcocog_g/val.json -i images/train2014/ -m masks/refcocog_g -o lmdb/refcocog_g

# clean
rm -r refcocog

```
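The `# lmdb` steps in sections 2 to 4 repeat one `folder2lmdb.py` command per dataset and split. A minimal sketch of generating the same commands in a loop is shown here; `print_lmdb_cmds` is a hypothetical helper, not part of this repository, and the dataset and split names are copied from the commands above. It only prints the commands, so they can be reviewed before anything is run.

```shell
# Hypothetical helper (not part of this repo): print, but do not run,
# one folder2lmdb.py command per dataset/split pair used in this README.
print_lmdb_cmds() {
    for pair in \
        "refcoco:train val testA testB" \
        "refcoco+:train val testA testB" \
        "refcocog_u:train val test" \
        "refcocog_g:train val"
    do
        name=${pair%%:*}    # dataset folder name (text before the colon)
        splits=${pair#*:}   # space-separated split names (text after the colon)
        for split in $splits; do
            echo "python ../tools/folder2lmdb.py -j anns/$name/$split.json -i images/train2014/ -m masks/$name -o lmdb/$name"
        done
    done
}

print_lmdb_cmds
```

To execute the printed commands, pipe them to a shell from inside `datasets/`, for example `print_lmdb_cmds | sh`, once the convert steps for every dataset have finished.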

### 5. Dataset structure

After running the commands above, the dataset folder should have the following structure:

```none
datasets
├── anns
│   ├── refcoco
│   │   ├── xxx.json
│   ├── refcoco+
│   │   ├── xxx.json
│   ├── refcocog_g
│   │   ├── xxx.json
│   ├── refcocog_u
│   │   ├── xxx.json
├── images
│   ├── train2014
│   │   ├── xxx.jpg
├── lmdb
│   ├── refcoco
│   │   ├── xxx.lmdb
│   │   ├── xxx.lmdb-lock
│   ├── refcoco+
│   │   ├── xxx.lmdb
│   │   ├── xxx.lmdb-lock
│   ├── refcocog_g
│   │   ├── xxx.lmdb
│   │   ├── xxx.lmdb-lock
│   ├── refcocog_u
│   │   ├── xxx.lmdb
│   │   ├── xxx.lmdb-lock
├── masks
│   ├── refcoco
│   │   ├── xxx.png
│   ├── refcoco+
│   │   ├── xxx.png
│   ├── refcocog_g
│   │   ├── xxx.png
│   ├── refcocog_u
│   │   ├── xxx.png

```
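The layout above can be checked with a short script. This is a sketch under stated assumptions: `check_layout` is a hypothetical helper, not part of this repository, and it only tests that the directories exist, not that their contents are complete.

```shell
# Hypothetical helper (not part of this repo): print every expected
# directory that is missing under the given root (default: datasets).
check_layout() {
    root=${1:-datasets}
    status=0
    for name in refcoco refcoco+ refcocog_g refcocog_u; do
        for kind in anns lmdb masks; do
            if [ ! -d "$root/$kind/$name" ]; then
                echo "missing: $root/$kind/$name"
                status=1
            fi
        done
    done
    if [ ! -d "$root/images/train2014" ]; then
        echo "missing: $root/images/train2014"
        status=1
    fi
    return $status
}

# Run from the directory that contains datasets/.
check_layout datasets || echo "dataset layout is incomplete"
```

The function returns a non-zero status when anything is missing, so it can also gate a training script, for example `check_layout datasets && echo "ready"`.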