KarlP verityw committed on
Commit
3abba70
·
verified ·
1 Parent(s): 738a16b

add-calib-docs (#3)

- update (b1c833951b16e034df3df6e5feeaa86dc3842a81)
- update formatting error of readme (77ed38f715f8af7098b58ac295182f4a40afbafe)


Co-authored-by: William Chen <[email protected]>

Files changed (4)
  1. README.md +65 -2
  2. camera_serials.json +3 -0
  3. episode_id_to_path.json +3 -0
  4. intrinsics.json +3 -0
README.md CHANGED
@@ -1,5 +1,4 @@
1
  # DROID Annotations
2
-
3
  This repo contains additional annotation data for the DROID dataset which we completed after the initial dataset release.
4
 
5
  Concretely, it contains the following information:
@@ -20,6 +19,60 @@ for a subset of the DROID episodes. Concretely, we provide the following three c
20
  - `cam2cam_extrinsics.json`: Contains ~90k entries with cam2cam relative poses and camera parameters for all of DROID.
21
  - `cam2base_extrinsic_superset.json`: Contains ~24k unique entries, total ~48k poses for both left and right camera calibrated with respect to the base.
22
 
23
 
24
  ## Accessing Annotation Data
25
 
@@ -35,4 +88,14 @@ import tensorflow as tf
35
  episode_paths = tf.io.gfile.glob("gs://gresearch/robotics/droid_raw/1.0.1/*/success/*/*/metadata_*.json")
36
  for p in episode_paths:
37
      episode_id = p[:-5].split("/")[-1].split("_")[-1]
38
- ```
1
  # DROID Annotations
 
2
  This repo contains additional annotation data for the DROID dataset which we completed after the initial dataset release.
3
 
4
  Concretely, it contains the following information:
 
19
  - `cam2cam_extrinsics.json`: Contains ~90k entries with cam2cam relative poses and camera parameters for all of DROID.
20
  - `cam2base_extrinsic_superset.json`: Contains ~24k unique entries, total ~48k poses for both left and right camera calibrated with respect to the base.
21
 
22
+ These files map each episode's unique ID (see Accessing Annotation Data below) to a dictionary containing metadata (e.g., detection quality metrics; see Appendix G of the paper) and a map from camera ID to extrinsics. Each extrinsics entry is a 6-element list of floats: the translation (first three values) followed by the rotation as `xyz` Euler angles (last three values). It can be converted into a homogeneous pose matrix as follows:
23
+ ```
24
+ import numpy as np
+ from scipy.spatial.transform import Rotation as R
25
+
26
+ # Assume extrinsics is that 6-element list
27
+ pos = extrinsics[0:3]
28
+ rot_mat = R.from_euler("xyz", extrinsics[3:6]).as_matrix()
29
+
30
+ # Make homogeneous transformation matrix
31
+ cam_to_target_extrinsics_matrix = np.eye(4)
32
+ cam_to_target_extrinsics_matrix[:3, :3] = rot_mat
33
+ cam_to_target_extrinsics_matrix[:3, 3] = pos
34
+ ```
35
+ This matrix transforms points from the camera's frame to the target frame. Inverting it gives the transformation from the target frame to the camera frame, which is usually what one wants (e.g., to project a point in the robot frame into the camera frame).
36
+
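Since the extrinsics matrix is a rigid transform (a rotation plus a translation), its inverse can also be written in closed form instead of being computed with a general matrix inverse. The following is a minimal sketch, assuming a 4 x 4 matrix built as above; the helper name `invert_rigid_transform` and the example pose values are made up for illustration and are not part of the dataset or its tooling.

```python
import numpy as np


def invert_rigid_transform(cam_to_target: np.ndarray) -> np.ndarray:
    """Invert a 4x4 rigid transform [R | t; 0 0 0 1] in closed form."""
    rot = cam_to_target[:3, :3]
    pos = cam_to_target[:3, 3]
    target_to_cam = np.eye(4)
    target_to_cam[:3, :3] = rot.T        # inverse of a rotation is its transpose
    target_to_cam[:3, 3] = -rot.T @ pos  # undo the translation in the rotated frame
    return target_to_cam


# Made-up pose for illustration: 90 degree rotation about z plus a translation
cam_to_target = np.eye(4)
cam_to_target[:3, :3] = np.array([
    [0.0, -1.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0],
])
cam_to_target[:3, 3] = [0.5, -0.2, 1.0]

target_to_cam = invert_rigid_transform(cam_to_target)
```

For a valid rigid transform this should agree with `np.linalg.inv` up to floating-point error; the closed form only holds when the upper-left 3 x 3 block is a pure rotation.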
37
+ As the raw DROID video files were recorded on ZED cameras and saved in SVO format, they contain camera intrinsics which can be used in conjunction with the above. For convenience, we have extracted and saved these intrinsics to `intrinsics.json` (~72k entries). This JSON file has the following format:
38
+ ```
39
+ <episode ID>:
40
+     <external camera 1's serial>: [fx, cx, fy, cy for camera 1]
41
+     <external camera 2's serial>: [fx, cx, fy, cy for camera 2]
42
+     <wrist camera 1's serial>: [fx, cx, fy, cy for wrist camera]
43
+ ```
44
+ One can thus convert the list for a particular camera to a 3 x 3 intrinsics matrix via the following:
45
+ ```
46
+ import numpy as np
47
+
48
+ # Assume intrinsics is that 4-element list
49
+ fx, cx, fy, cy = intrinsics
50
+ intrinsics_matrix = np.array([
51
+     [fx, 0, cx],
52
+     [0, fy, cy],
53
+     [0, 0, 1]
54
+ ])
55
+ ```
56
+ Note that the intrinsics tend to not change much between episodes, but using the specific values corresponding to a particular episode tends to give the best results.
57
+
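To use the episode-specific values, the nested layout described above (episode ID, then camera serial, then `[fx, cx, fy, cy]`) can be indexed directly. The following is a minimal sketch on an in-memory dictionary; the function name `intrinsics_matrix_for`, the episode ID, the serial, and the numeric values are all hypothetical placeholders rather than real entries from `intrinsics.json`.

```python
import numpy as np


def intrinsics_matrix_for(intrinsics_by_episode, episode_id, camera_serial):
    """Look up [fx, cx, fy, cy] for one camera in one episode and build a 3x3 matrix."""
    fx, cx, fy, cy = intrinsics_by_episode[episode_id][camera_serial]
    return np.array([
        [fx, 0.0, cx],
        [0.0, fy, cy],
        [0.0, 0.0, 1.0],
    ])


# Hypothetical stand-in for the parsed contents of intrinsics.json
example_intrinsics = {
    "example-episode-id": {
        "12345678": [520.0, 640.0, 521.0, 360.0],  # fx, cx, fy, cy (made up)
    }
}

K = intrinsics_matrix_for(example_intrinsics, "example-episode-id", "12345678")
```

Note the `fx, cx, fy, cy` ordering of the stored list: the principal point components are interleaved with the focal lengths, so unpacking in a different order would silently produce a wrong matrix.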
58
+ ## Example Calibration Use Case
59
+ Using the calibration information, one can project points in the robot's frame into pixel coordinates for the cameras. Below, we demonstrate how to map the robot gripper position to pixel coordinates for the external cameras with extrinsics in `cam2base_extrinsics.json`; see <TODO> for the full code.
60
+ ```
61
+ gripper_position_base = <Homogeneous gripper position in the base frame, as obtained from the TFDS episode. Shape 4 x 1>
62
+ cam_to_base_extrinsics_matrix = <extrinsics matrix for some camera>
63
+ intrinsics_matrix = <intrinsics matrix for that same camera>
64
+
65
+ # Invert to get transform from base to camera frame
66
+ base_to_cam_extrinsics_matrix = np.linalg.inv(cam_to_base_extrinsics_matrix)
67
+
68
+ # Transform gripper position to camera frame, then remove homogeneous component
69
+ robot_gripper_position_cam = base_to_cam_extrinsics_matrix @ gripper_position_base
70
+ robot_gripper_position_cam = robot_gripper_position_cam[:3] # Now 3 x 1
71
+
72
+ # Project into pixel coordinates
73
+ pixel_positions = intrinsics_matrix @ robot_gripper_position_cam
74
+ pixel_positions = pixel_positions[:2] / pixel_positions[2] # Perspective divide; shape 2 x 1
75
+ ```
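The snippet above uses placeholders for its inputs. The following is a runnable sketch of the same steps on synthetic values, with one addition that is not in the original: it returns `None` for points behind the camera, where the perspective divide would give a meaningless pixel. The function name `project_base_point`, the camera pose, the intrinsics, and the gripper position are all made up for illustration.

```python
import numpy as np


def project_base_point(point_base, cam_to_base, intrinsics_matrix):
    """Project a 3D point in the base frame to (u, v) pixel coordinates."""
    point_h = np.append(np.asarray(point_base, dtype=float), 1.0)  # homogeneous, shape (4,)
    # Base frame -> camera frame, then drop the homogeneous component
    point_cam = (np.linalg.inv(cam_to_base) @ point_h)[:3]
    if point_cam[2] <= 0:
        return None  # point is behind the camera (added check, an assumption)
    uvw = intrinsics_matrix @ point_cam
    return uvw[:2] / uvw[2]  # perspective divide


# Synthetic camera: axes aligned with the base frame, located at (0, 0, -1)
cam_to_base = np.eye(4)
cam_to_base[:3, 3] = [0.0, 0.0, -1.0]

# Synthetic intrinsics: fx = fy = 500, principal point at (320, 240)
intrinsics_matrix = np.array([
    [500.0, 0.0, 320.0],
    [0.0, 500.0, 240.0],
    [0.0, 0.0, 1.0],
])

pixel = project_base_point([0.1, 0.2, 1.0], cam_to_base, intrinsics_matrix)
behind = project_base_point([0.0, 0.0, -2.0], cam_to_base, intrinsics_matrix)
```

Here the gripper sits at depth 2 in the camera frame, so `pixel` works out to `(500 * 0.1 / 2 + 320, 500 * 0.2 / 2 + 240) = (345, 290)`, while `behind` is `None`.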
76
 
77
  ## Accessing Annotation Data
78
 
 
88
  episode_paths = tf.io.gfile.glob("gs://gresearch/robotics/droid_raw/1.0.1/*/success/*/*/metadata_*.json")
89
  for p in episode_paths:
90
      episode_id = p[:-5].split("/")[-1].split("_")[-1]
91
+ ```
92
+
93
+ Using the above annotations requires these episode IDs, but the TFDS dataset only contains paths, so we have included `episode_id_to_path.json` for convenience. The code snippet below loads this file, then builds the inverse mapping from episode paths to IDs.
94
+
95
+ ```
96
+ import json
97
+ episode_id_to_path_path = "<path/to/episode_id_to_path.json>"
98
+ with open(episode_id_to_path_path, "r") as f:
99
+     episode_id_to_path = json.load(f)
100
+ episode_path_to_id = {v: k for k, v in episode_id_to_path.items()}
101
+ ```
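Inverting a dictionary with a comprehension silently drops entries if two IDs share the same path. The following is a small sketch of a stricter inversion plus a path-based annotation lookup; whether duplicates actually occur in `episode_id_to_path.json` is not stated in the source, and the helper names, IDs, path strings, and annotation values below are hypothetical.

```python
def invert_unique(mapping):
    """Invert a dict, raising if two keys share a value instead of dropping one."""
    inverted = {}
    for key, value in mapping.items():
        if value in inverted:
            raise ValueError(
                f"duplicate value {value!r} for keys {inverted[value]!r} and {key!r}"
            )
        inverted[value] = key
    return inverted


# Hypothetical entries; real IDs and paths come from episode_id_to_path.json
episode_id_to_path = {"episode-a": "some/path/a", "episode-b": "some/path/b"}
episode_path_to_id = invert_unique(episode_id_to_path)

# Hypothetical annotation dict keyed by episode ID (e.g., extrinsics per camera)
annotations = {"episode-a": {"12345678": [0.1, 0.2, 0.3, 0.0, 0.0, 0.0]}}


def annotation_for_path(path):
    """Return the annotation for an episode path, or None if it has none."""
    episode_id = episode_path_to_id.get(path)
    if episode_id is None:
        return None
    return annotations.get(episode_id)
```

Returning `None` for missing entries reflects that the calibration files cover only a subset of episodes, so a lookup miss is an expected case rather than an error.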
camera_serials.json ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c8d346c51dcef71248e280e44dcd7985a94433f6911460b31dcc098cab30acc4
3
+ size 12743876
episode_id_to_path.json ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:0e88ee7da94a40602cde4aacf22f2b48068f4f582c8ab38cf1888e06162a8085
3
+ size 7237770
intrinsics.json ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:78c76755b075ae53e74a28c543bb1b185c50aa976458e95fbc9ba880a8cd2d51
3
+ size 125812944