Update README.md
README.md
CHANGED
@@ -28,6 +28,8 @@ MolmoAct is a fully open-source action reasoning model for robotic manipulation
 
 This checkpoint is a **preview** of the MolmoAct release. All artifacts used in creating MolmoAct (data, training code, evaluations, intermediate checkpoints) will be made available at a later date, furthering our commitment to open-source AI development and reproducibility.
 
+**Update:** Checkpoints are now stored in FP32 (previously BF16). The model was trained in FP32, so publishing FP32 weights aligns with training and enables fine-tuning or continued training directly from this repo. For inference, you can still run BF16 by casting at load, which is what we did for evaluations. See more in the [instructions](#quick-start) below.
+
 Quick links:
 - π [All Models](https://huggingface.co/collections/allenai/molmoact-689697591a3936fba38174d7)
 - π [All Data](https://huggingface.co/collections/allenai/molmoact-data-mixture-6897e583e13b6c2cf3ea2b80)
@@ -61,7 +63,7 @@ ckpt = "allenai/MolmoAct-7B-D-Pretrain-0812"
 processor = AutoProcessor.from_pretrained(
     ckpt,
     trust_remote_code=True,
-    torch_dtype="
+    torch_dtype="bfloat16",
     device_map="auto",
     padding_side="left",
 )
@@ -70,7 +72,7 @@ processor = AutoProcessor.from_pretrained(
 model = AutoModelForImageTextToText.from_pretrained(
     ckpt,
     trust_remote_code=True,
-    torch_dtype="
+    torch_dtype="bfloat16",
     device_map="auto",
 )
 
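
The update note says the checkpoints are now published in FP32 and can be cast to BF16 at load for inference. As a rough illustration of what that cast means for weight memory, here is a minimal sketch. It assumes roughly 7 billion parameters, a figure inferred only from the "7B" in the checkpoint name and not an exact count. The helper `weight_memory_gib` is hypothetical and not part of the repo, and the estimate ignores activations and any runtime overhead.

```python
# Back-of-the-envelope estimate of weight memory per dtype.
# ASSUMPTION: ~7e9 parameters, inferred from "7B" in the checkpoint name;
# the real parameter count may differ.
ASSUMED_NUM_PARAMS = 7_000_000_000

# Storage size of one parameter in each dtype.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}


def weight_memory_gib(num_params: int, dtype: str) -> float:
    """Approximate size of the weights alone, in GiB (hypothetical helper)."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


for dtype in ("float32", "bfloat16"):
    size = weight_memory_gib(ASSUMED_NUM_PARAMS, dtype)
    print(f"{dtype}: ~{size:.1f} GiB of weights")
```

Under these assumptions, casting to BF16 at load halves the memory needed for the weights compared with loading the stored FP32 values directly, while the FP32 files remain available for fine-tuning or continued training.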