KarlP verityw committed
Commit de45859 · verified · 1 Parent(s): 50e9c84

add-filter (#7)

- update gitattr w/ new file (c2e8f5cf7b74ba0cb2c4c1715fd23bb428eae522)
- add filter ranges (7bff5a46a11256821b44435a3fe588daab6f1e94)
- update readme with filter info (ac2c562d1b8386789e420f056b959434f9973590)


Co-authored-by: William Chen <[email protected]>

Files changed (3)
  1. .gitattributes +1 -0
  2. README.md +9 -0
  3. keep_ranges_1_0_1.json +3 -0
.gitattributes CHANGED
@@ -37,3 +37,4 @@ droid_language_annotations.json filter=lfs diff=lfs merge=lfs -text
  cam2base_extrinsic_superset.json filter=lfs diff=lfs merge=lfs -text
  cam2base_extrinsics.json filter=lfs diff=lfs merge=lfs -text
  cam2cam_extrinsics.json filter=lfs diff=lfs merge=lfs -text
+ keep_ranges_1_0_1.json filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -74,6 +74,15 @@ pixel_positions = intrinsics_matrix @ robot_gripper_position_cam
  pixel_positions = pixel_positions[:2] / pixel_positions[2] # Shape 2 x 1 # Done!
  ```
 
+ ## Filtering Data
+ Many episodes in DROID contain significant pauses. This is an issue when training models, as these pauses typically happen at the start of episodes, causing the policy to likewise output idle actions when in the home position. To remedy this, we recommend filtering the data you train your policy on, removing all frames that map to idle actions.
+
+ We provide `keep_ranges_1_0_1.json`, which maps episode keys to a list of time step ranges that should *not* be filtered out. The episode keys uniquely identify each episode and are defined as `f"{recording_folderpath}--{file_path}"`. We opt for this identifier because both pieces of information are found in the episodes' RLDS metadata, so the key is easy to compute (even with TensorFlow symbolic operations).
+
+ To use this data, we recommend creating a `tf.lookup.StaticHashTable` identifying all frames that should not be filtered (with all other frames being filtered by default). Frames can be uniquely identified by concatenating their episode key with their time step within the episode.
+
+ This particular filter `json` is meant for `droid/1.0.1`, NOT `droid/1.0.0`. It was computed by finding all continuous sequences of non-idle actions in each episode that are at least 16 steps long (1 second of wallclock time) and are not interrupted by 8 or more idle actions.
+
  ## Accessing Annotation Data
 
  All annotations are stored in `json` files which you can download from this repository.
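
The last paragraph added to the README describes how the keep ranges were computed. The sketch below shows one possible reading of that rule, not the authors' actual script: it assumes a per-step boolean idle mask is already available (how an action is judged idle is not stated in the commit), that idle runs shorter than 8 steps are bridged, and that ranges are half-open `[start, end)`. The function name `keep_ranges_from_idle` and its parameters are hypothetical.

```python
def keep_ranges_from_idle(is_idle, min_len=16, max_idle_gap=8):
    """Return half-open [start, end) ranges of mostly non-idle steps.

    Assumptions (not confirmed by the commit): a segment starts at a
    non-idle step, tolerates idle runs shorter than `max_idle_gap`,
    ends at the last non-idle step before a longer idle run, and is
    kept only if it spans at least `min_len` steps.
    """
    ranges = []
    start = None        # first step of the current candidate segment
    last_active = None  # most recent non-idle step seen
    idle_run = 0        # length of the current run of idle steps

    for t, idle in enumerate(is_idle):
        if idle:
            idle_run += 1
            if start is not None and idle_run >= max_idle_gap:
                # A long pause closes the segment at the last active step.
                if last_active + 1 - start >= min_len:
                    ranges.append([start, last_active + 1])
                start = None
        else:
            idle_run = 0
            if start is None:
                start = t
            last_active = t

    # Close a segment that is still open at the end of the episode.
    if start is not None and last_active + 1 - start >= min_len:
        ranges.append([start, last_active + 1])
    return ranges


# Toy episode: 10 idle, 20 active, 3 idle, 5 active, 8 idle, 4 active.
mask = [True] * 10 + [False] * 20 + [True] * 3 + [False] * 5 + [True] * 8 + [False] * 4
print(keep_ranges_from_idle(mask))  # [[10, 38]]
```

In the toy episode, the 3-step pause is bridged, the 8-step pause ends the segment, and the trailing 4 active steps are dropped because they are shorter than 16 steps.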
keep_ranges_1_0_1.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5046049ab62a2df2f802df89cf0888b720f852ce2557849417d40899c9a38bc8
+ size 28573266
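
Once the LFS object has been pulled (the three lines above are only the pointer, so `json.load` would fail on them), the file can be turned into a per-frame lookup as the README suggests. The following is a minimal pure-Python sketch under stated assumptions: each range is taken to be a half-open `[start, end)` pair, and the frame key is taken to be the episode key joined to the time step with `--`. Neither detail is spelled out in the commit, and the helper names are hypothetical.

```python
import json


def episode_key(recording_folderpath: str, file_path: str) -> str:
    # Key format as given in the README.
    return f"{recording_folderpath}--{file_path}"


def load_keep_ranges(path: str) -> dict:
    # Expects the real JSON file, not the Git LFS pointer.
    with open(path) as f:
        return json.load(f)


def build_keep_frames(keep_ranges: dict) -> set:
    """Expand {episode_key: [[start, end], ...]} into a set of frame keys.

    Assumes half-open ranges and a "--" separator before the time step.
    """
    keep = set()
    for key, ranges in keep_ranges.items():
        for start, end in ranges:
            for t in range(start, end):
                keep.add(f"{key}--{t}")
    return keep


# Toy stand-in for the real file contents (paths are made up).
toy_key = episode_key("/data/rec_a", "/data/rec_a/trajectory.h5")
keep_frames = build_keep_frames({toy_key: [[2, 5], [9, 11]]})


def should_keep(recording_folderpath: str, file_path: str, t: int) -> bool:
    # Frames absent from the set are filtered out by default.
    return f"{episode_key(recording_folderpath, file_path)}--{t}" in keep_frames
```

In a `tf.data` pipeline, the same keys could instead be loaded into a `tf.lookup.StaticHashTable` through a `tf.lookup.KeyValueTensorInitializer`, with a default value of `False`, so the lookup works on symbolic string tensors.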