nieshen committed on
Commit
a7e4772
·
verified ·
1 Parent(s): 6cb62b9

Create README.md

Files changed (1)
  1. README.md +15 -0
README.md ADDED
@@ -0,0 +1,15 @@
+ ## Pretrained models for the paper *Scaling up Masked Diffusion Models on Text*
+
+ **Scaling law experiments**: We provide all pretrained models in the *ar_safetensors* and *mdm_safetensors* folders.
+ For instance, the checkpoint `mdm-1028M-1600e18.safetensors` is an MDM with 1,028 million non-embedding
+ parameters trained with 1,600e18 FLOPs. Similarly, the checkpoint `mdm-170M-100e18-rsl-0.01.safetensors` is
+ an MDM with 170 million non-embedding parameters, trained with 100e18 FLOPs, with 1% of the dataset subjected
+ to random sequence lengths during pretraining.
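The naming convention described above can be decoded mechanically. The following is a minimal sketch, not part of the README: the regular expression is inferred only from the two example file names, and `CheckpointInfo` and `parse_checkpoint_name` are hypothetical names introduced for illustration. Other checkpoints in the repository may not follow the same pattern.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Pattern inferred from the two example names in the README (an assumption);
# it expects <family>-<N>M-<F>e<E>[-rsl-<fraction>].<ext>.
_NAME_RE = re.compile(
    r"^(?P<family>[a-z]+)"
    r"-(?P<params>\d+)M"
    r"-(?P<flops>\d+)e(?P<exp>\d+)"
    r"(?:-rsl-(?P<rsl>\d*\.?\d+))?"
    r"\.(?P<ext>safetensors|pth)$"
)


@dataclass
class CheckpointInfo:
    """Hypothetical container for the fields encoded in a checkpoint name."""
    family: str                               # e.g. "mdm"
    non_embedding_params: int                 # absolute parameter count
    train_flops: int                          # absolute training FLOPs
    random_seq_len_fraction: Optional[float]  # None when no "-rsl-" suffix
    extension: str                            # "safetensors" or "pth"


def parse_checkpoint_name(filename: str) -> CheckpointInfo:
    """Decode a checkpoint file name into its components."""
    m = _NAME_RE.match(filename)
    if m is None:
        raise ValueError(f"unrecognized checkpoint name: {filename!r}")
    return CheckpointInfo(
        family=m["family"],
        non_embedding_params=int(m["params"]) * 10**6,
        train_flops=int(m["flops"]) * 10 ** int(m["exp"]),
        random_seq_len_fraction=float(m["rsl"]) if m["rsl"] else None,
        extension=m["ext"],
    )


info = parse_checkpoint_name("mdm-170M-100e18-rsl-0.01.safetensors")
print(info)
```

Keeping the parameter and FLOP counts as exact integers avoids floating-point rounding when checkpoints are grouped by compute budget.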
+
+ **Conditional generation**: please see the *sharegpt_safetensors* folder.
+
+ **Reverse curse**: please see the *reverse_safetensors* folder.
+
+ All models are provided in both `.pth` and `.safetensors` formats.
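Since both formats are offered, a loader can choose by file extension. The sketch below is an illustration and not part of the README: `load_state_dict` is a hypothetical helper, and it assumes the `.pth` files hold a plain state dict readable by `torch.load`, which the README does not state. It uses `safetensors.torch.load_file` and `torch.load`, imported lazily so that only the package for the chosen format needs to be installed.

```python
from pathlib import Path


def load_state_dict(path):
    """Load checkpoint tensors, picking the loader from the file extension.

    Hypothetical helper: assumes `.pth` files contain a plain state dict.
    """
    suffix = Path(path).suffix
    if suffix == ".safetensors":
        # Requires the `safetensors` package (and PyTorch for tensor output).
        from safetensors.torch import load_file
        return load_file(str(path))
    if suffix == ".pth":
        # Requires PyTorch; map_location keeps tensors on the CPU.
        import torch
        return torch.load(path, map_location="cpu")
    raise ValueError(f"unsupported checkpoint format: {suffix!r}")


# Example call, assuming the file was downloaded to this local path:
# state = load_state_dict("mdm_safetensors/mdm-1028M-1600e18.safetensors")
```

The `.safetensors` format is generally the safer choice for downloaded weights, because loading it does not execute pickled code.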
+
+