njeffrie committed

Commit a753236 · verified · 1 Parent(s): b091215

Update config.json


Add a `pad_head_dim_to_multiple_of` field to allow the use of [memory-efficient attention](https://pytorch.org/blog/accelerated-pytorch-2/#:~:text=The%20head%20dimension%20must%20be%20a%20multiple%20of%208%20for%2016%2Dbit%20floating%20point%20numbers%20and%20a%20multiple%20of%204%20for%2032%2Dbit%20floating%20point%20numbers.%20At%20present%2C%20the%20maximum%20head_dim%20support%20for%20the%20Flash%20Attention%20custom%20kernel%20is%20128), whose fused kernels require the head dimension to be a multiple of 8 for 16-bit floating point (and a multiple of 4 for 32-bit).
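For context, here is a minimal sketch of how such a field could be consumed around PyTorch's `scaled_dot_product_attention`. The helper name `padded_sdpa` and its wiring are illustrative assumptions, not this model's actual code:

```python
# Illustrative sketch only -- not the model's actual implementation.
# Zero-pads head_dim up to a multiple of `pad_head_dim_to_multiple_of` so the
# memory-efficient / Flash SDPA kernels (which require head_dim to be a
# multiple of 8 for 16-bit floats) remain eligible.
import math
import torch
import torch.nn.functional as F

def padded_sdpa(q, k, v, pad_head_dim_to_multiple_of=8):
    # q, k, v: (batch, num_heads, seq_len, head_dim)
    head_dim = q.shape[-1]
    remainder = head_dim % pad_head_dim_to_multiple_of
    if remainder == 0:
        return F.scaled_dot_product_attention(q, k, v)
    pad = pad_head_dim_to_multiple_of - remainder
    # Zero padding leaves the q.k^T dot products unchanged.
    q, k, v = (F.pad(t, (0, pad)) for t in (q, k, v))
    # Pass the original 1/sqrt(head_dim) scale explicitly, since the default
    # would use the padded dimension (the `scale` kwarg needs PyTorch >= 2.1).
    out = F.scaled_dot_product_attention(q, k, v, scale=1.0 / math.sqrt(head_dim))
    # Slice off the padded channels to restore the original head_dim.
    return out[..., :head_dim]
```

Zero-padding is a no-op on the attention scores themselves, so the only correction needed is keeping the softmax scale tied to the original head dimension.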

Files changed (1)
  1. config.json +2 -1
config.json CHANGED
@@ -29,5 +29,6 @@
   "torch_dtype": "float32",
   "transformers_version": "4.48.0.dev0",
   "use_cache": true,
- "vocab_size": 32768
+ "vocab_size": 32768,
+ "pad_head_dim_to_multiple_of": 8
  }