Update README.md (#3)
Commit: af077e000916bc4aee445e5e8c11599b63a83a9d
Co-authored-by: Heloise Chomet <[email protected]>
# ViSNet

## Reference

Yusong Wang, Tong Wang, Shaoning Li, Xinheng He, Mingyu Li, Zun Wang, Nanning Zheng, Bin Shao, and Tie-Yan Liu.
Enhancing geometric representations for molecules with equivariant vector-scalar interactive message passing.
Nature Communications, 15(1), January 2024. ISSN: 2041-1723.
URL: https://dx.doi.org/10.1038/s41467-023-43720-2.
## How to Use

For complete usage instructions and more information, please refer to our [documentation](https://instadeep.github.io/mlip).
## Model architecture

| Parameter           | Value     | Description                                                               |
|---------------------|-----------|---------------------------------------------------------------------------|
| `num_layers`        | `4`       | Number of ViSNet layers.                                                  |
| `num_channels`      | `128`     | Number of channels.                                                       |
| `l_max`             | `2`       | Highest harmonic order included in the spherical harmonics series.        |
| `num_heads`         | `8`       | Number of heads in the attention block.                                   |
| `num_rbf`           | `32`      | Number of radial basis functions in the embedding block.                  |
| `trainable_rbf`     | `False`   | Whether to add learnable weights to the radial embedding basis functions. |
| `activation`        | `silu`    | Activation function for the output block.                                 |
| `attn_activation`   | `silu`    | Activation function for the attention block.                              |
| `vecnorm_type`      | `None`    | Type of the vector norm.                                                  |
| `atomic_energies`   | `average` | Treatment of the atomic energies.                                         |
| `avg_num_neighbors` | `None`    | Mean number of neighbors.                                                 |

For more information about ViSNet hyperparameters,
please refer to our [documentation](https://instadeep.github.io/mlip/api_reference/models/visnet.html#mlip.models.visnet.config.VisnetConfig).
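As a quick reference, the table above can be collected into a plain Python dictionary. This is only a hand-written sketch: the key names follow the table, and whether they map one-to-one onto the `VisnetConfig` fields should be checked against the documentation linked above.

```python
# Hyperparameters from the model-architecture table. Whether these map
# exactly onto mlip's VisnetConfig should be verified against the docs.
visnet_hyperparameters = {
    "num_layers": 4,
    "num_channels": 128,
    "l_max": 2,
    "num_heads": 8,
    "num_rbf": 32,
    "trainable_rbf": False,
    "activation": "silu",
    "attn_activation": "silu",
    "vecnorm_type": None,
    "atomic_energies": "average",
    "avg_num_neighbors": None,
}
```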
## Training

Training is performed over 220 epochs, with an exponential moving average (EMA) decay rate of 0.99.
The model employs a Huber loss function with scheduled weights for the energy and force components.
Initially, the energy term is weighted at 40 and the force term at 1000.
At epoch 115, these weights are flipped.
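The weight flip can be sketched as a simple step schedule. This is illustrative only; the helper names and the exact way mlip combines the per-term Huber losses are our assumptions, not the library's implementation.

```python
def loss_weights(epoch: int, flip_epoch: int = 115) -> tuple[float, float]:
    """Return (energy_weight, force_weight) for a given epoch.

    Before `flip_epoch`, the energy term is weighted 40 and the force
    term 1000; from `flip_epoch` on, the weights are swapped.
    """
    return (40.0, 1000.0) if epoch < flip_epoch else (1000.0, 40.0)


def weighted_huber_loss(huber_energy: float, huber_forces: float,
                        epoch: int) -> float:
    """Combine per-term Huber losses with the scheduled weights
    (illustrative combination, not mlip's exact loss code)."""
    w_e, w_f = loss_weights(epoch)
    return w_e * huber_energy + w_f * huber_forces
```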
We use our default MLIP optimizer in v1.0.0 with the following settings:
| Parameter                         | Value    | Description                                          |
|-----------------------------------|----------|------------------------------------------------------|
| `init_learning_rate`              | `0.0001` | Initial learning rate.                               |
| `peak_learning_rate`              | `0.0001` | Peak learning rate.                                  |
| `final_learning_rate`             | `0.0001` | Final learning rate.                                 |
| `weight_decay`                    | `0`      | Weight decay.                                        |
| `warmup_steps`                    | `4000`   | Number of optimizer warm-up steps.                   |
| `transition_steps`                | `360000` | Number of optimizer transition steps.                |
| `grad_norm`                       | `500`    | Gradient norm used for gradient clipping.            |
| `num_gradient_accumulation_steps` | `1`      | Steps to accumulate before taking an optimizer step. |

For more information about the optimizer,
please refer to our [documentation](https://instadeep.github.io/mlip/api_reference/training/optimizer.html#mlip.training.optimizer_config.OptimizerConfig).
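To illustrate how the schedule parameters interact, here is a generic warmup-then-decay learning-rate function. The linear shape is an assumption (the exact schedule family used by the MLIP optimizer is described in the docs linked above); with `init == peak == final`, as in the table, the schedule reduces to a constant learning rate.

```python
def learning_rate(step: int, init_lr: float = 1e-4, peak_lr: float = 1e-4,
                  final_lr: float = 1e-4, warmup_steps: int = 4000,
                  transition_steps: int = 360000) -> float:
    """Linear warmup from init_lr to peak_lr over `warmup_steps`, then
    linear transition to final_lr over `transition_steps` (sketch only).

    With init_lr == peak_lr == final_lr, as configured for this model,
    the result is a constant learning rate at every step.
    """
    if step < warmup_steps:
        frac = step / warmup_steps
        return init_lr + frac * (peak_lr - init_lr)
    frac = min((step - warmup_steps) / transition_steps, 1.0)
    return peak_lr + frac * (final_lr - peak_lr)
```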
## Dataset

| Parameter               | Value | Description                                 |
|-------------------------|-------|---------------------------------------------|
| `graph_cutoff_angstrom` | `5`   | Graph cutoff distance (in Å).               |
| `max_n_node`            | `32`  | Maximum number of nodes allowed in a batch. |
| `max_n_edge`            | `288` | Maximum number of edges allowed in a batch. |
| `batch_size`            | `16`  | Number of graphs in a batch.                |

This model was trained on the [SPICE2_curated dataset](https://huggingface.co/datasets/InstaDeepAI/SPICE2-curated).
For more information about dataset configuration,
please refer to our [documentation](https://instadeep.github.io/mlip/api_reference/data/dataset_configs.html#mlip.data.configs.GraphDatasetBuilderConfig).
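The node and edge caps above constrain how graphs can be packed into a batch. The greedy packer below is a sketch of that idea under our own assumptions; it is not the mlip batching code, which is described in the documentation linked above.

```python
def pack_graphs(graphs, max_n_node=32, max_n_edge=288, batch_size=16):
    """Greedily pack (n_node, n_edge) pairs into batches that respect
    the caps above (illustrative sketch, not mlip's implementation).

    A new batch is started whenever adding the next graph would exceed
    the node cap, the edge cap, or the graphs-per-batch limit.
    """
    batches, current, nodes, edges = [], [], 0, 0
    for n_node, n_edge in graphs:
        over_limit = (len(current) == batch_size
                      or nodes + n_node > max_n_node
                      or edges + n_edge > max_n_edge)
        if current and over_limit:
            batches.append(current)
            current, nodes, edges = [], 0, 0
        current.append((n_node, n_edge))
        nodes += n_node
        edges += n_edge
    if current:
        batches.append(current)
    return batches
```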
## License summary

1. The Licensed Models are **only** available under this License for Non-Commercial Purposes.