---
license: apache-2.0
language:
- en
tags:
- bridge_data
- v2
- image
- language
---

This dataset contains [`BridgeData V1` and `BridgeData V2`](https://rail-berkeley.github.io/bridgedata/), originally downloaded from [this archive (`demos_8_17.zip`)](https://rail.eecs.berkeley.edu/datasets/bridge_release/data/demos_8_17.zip). The data was preprocessed with the following scripts:

- [Preprocess V1](https://github.com/Kiteretsu77/This_and_That_VDM/blob/main/curation_pipeline/match_dataset_v1.py)
- [Preprocess V2](https://github.com/Kiteretsu77/This_and_That_VDM/blob/main/curation_pipeline/match_dataset_v2.py)
- [Train/test split](https://github.com/Kiteretsu77/This_and_That_VDM/blob/main/scripts/train_test_split.py)

After preprocessing, the dataset has the following structure:

| Folder name | Number of trajectories | Size |
| :---------: | :--------------------: | :--: |
| bridge_data_v1 (train) | 11007 | 30 GB |
| bridge_data_v2 (train) | 16527 | 62 GB |
| bridge_data_v1_test | 1222 | 3.3 GB |
| bridge_data_v2_test | 1836 | 6.9 GB |