Add initial draft of model card

README.md
---
model_id: TOTO-v1-Base
tags:
- time-series-forecasting
- foundation models
- pretrained models
- time series foundation models
- time series
- time-series
- transformers
- forecasting
- safetensors
- apache-2.0
paper: [Link to Paper] # TODO(Anna)
datasets:
# - BOOM [Link to BOOM Dataset] # TODO(Anna)
- GiftEvalPretrain
- Chronos # TODO(Anna) - is there a tag for this?
leaderboards:
- GiftEval (if results are public) # TODO(Anna) - check how to do that
license: apache-2.0 # TODO(Anna) - check if renders correctly
# TODO(Anna) - check if rendered correctly when uploaded to Hub
# TODO(Anna) - export to RTF or anything to make it reviewable in Google Docs
---
# {{ model_id | default("TOTO", true) }}

<!-- TODO: Update this section to align with the new abstract of the paper once finalized. -->

TOTO (Time Series Optimized Transformer for Observability) is a time-series foundation model designed for multivariate time-series forecasting, with a focus on observability metrics. TOTO leverages new architectural innovations and training recipes that enable it to efficiently handle the high-dimensional, sparse, and non-stationary time series that are hallmarks of the observability domain.

Trained on one trillion time-series data points, 43% of which are in-house real-world observability data, TOTO demonstrates state-of-the-art zero-shot performance on observability-specific tasks, as well as top-ranking performance on the multi-domain GiftEval time-series forecasting benchmark.

---

![image](https://cdn-uploads.huggingface.co/production/uploads/63d8c27eb5d70a622bbb41a7/Pp6cOMzJLKYHLSJzE9Qmd.png)

Figure 1: Overview of the {{ model_id | default("TOTO", true) }} model architecture. **A.** Multivariate input time series of `L` steps are scaled using causal patch-based instance normalization, transformed into patch embeddings, and passed through a decoder-only transformer stack. The transformed features are unembedded and passed through a Student-T mixture model (Section: Probabilistic Prediction), which generates probabilistic next-patch predictions. **B.** The patch embedding takes as input a time series of `M` channels by `L` time steps. It divides the time dimension into patches of size `P` and projects these linearly into an embedding space of latent dimension `D`. This results in an output of size `M × (L/P) × D`, which is fed to the transformer decoder. **C.** The transformer stack consists of `F` identical segments. Each segment contains `N` time-wise transformer blocks followed by one channel-wise block.
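
The shape bookkeeping in panel B can be sketched in a few lines of NumPy. The dimensions below are illustrative placeholders, not the model's actual configuration, and the random matrix stands in for the learned projection:

```python
import numpy as np

# Hypothetical values for illustration only (not TOTO's real config).
M, L, P, D = 7, 512, 64, 768   # channels, context length, patch size, latent dim

rng = np.random.default_rng(0)
series = rng.normal(size=(M, L))     # multivariate input: M x L

# Divide the time dimension into L/P patches of size P: M x (L/P) x P
patches = series.reshape(M, L // P, P)

# Linearly project each patch into the D-dimensional embedding space
W = rng.normal(size=(P, D))          # stand-in for a learned weight matrix
embeddings = patches @ W             # M x (L/P) x D

print(embeddings.shape)              # (7, 8, 768)
```

The resulting `M × (L/P) × D` tensor is what the caption describes being fed to the transformer decoder.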

## Key Features - TODO develop these or remove them
<!-- TODO: Update this section to align with the introduction in the paper once finalized. -->
- **Multivariate Time Series Support:** uses **Proportional Factorized Space-Time Attention** to efficiently group multivariate features, reducing computational overhead while maintaining high accuracy.
- **Tailored for Observability:** observability metrics are machine-generated time series collected in near-real-time to monitor and optimize the performance and reliability of modern infrastructure and applications.
- **Decoder-Only Transformer Architecture:** supports variable prediction horizons and context lengths.
- **Point and Probabilistic Forecasting**
- **Causal Patch-Wise Instance Normalization:** improves forecasting performance and training stability in decoder-only models.
- **Student-T Mixture Model Prediction Head:** produces probabilistic forecasts that model the complex, varied distributions typical of observability data.
- **Extensive Pretraining on Large-Scale Data:** pretrained on 5–10× more data than leading time series foundation models, using a combination of synthetic, public, and observability-specific datasets.
- **High-Dimensional Time Series Support:** efficiently handles datasets with a large number of variables.
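
To illustrate the causal normalization idea: each patch is scaled using statistics computed only from time steps up to and including that patch, so no future information leaks into the scaling. The snippet below is a simplified sketch of one plausible formulation, not the exact scheme from the paper:

```python
import numpy as np

def causal_patch_norm(x, patch_size=4, eps=1e-6):
    """Normalize each patch of a 1-D series using only causally available statistics."""
    out = np.empty_like(x, dtype=float)
    for start in range(0, len(x), patch_size):
        end = start + patch_size
        history = x[:end]                       # past + current patch only
        mu, sigma = history.mean(), history.std()
        out[start:end] = (x[start:end] - mu) / (sigma + eps)
    return out

# A series with a level shift, common in observability metrics
series = np.array([1.0, 2.0, 3.0, 4.0, 10.0, 12.0, 11.0, 13.0])
normalized = causal_patch_norm(series)
print(normalized.round(2))
```

Because later patches are normalized with statistics over a growing history, the scaling adapts to level shifts without ever looking ahead.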

### Resources - TODO

- **Paper:** "[Link to arxiv paper]"
- **Repository:** "[Link to github repo]"
- **Blog Post:** "[Link to Datadog BlogPost]"
- **BOOM:** "[Link to BOOM's Dataset card]"

## Usage

### Installation

```bash
# TODO(Anna) - update these with correct instructions
# Clone the repository
git clone https://github.com/DataDogFutureOpenSource/TOTO.git

# Navigate to the project directory
cd foundation-models-research/toto

# Install the required dependencies
pip install -r requirements.txt
```

### Running Inference

For a step-by-step guide on running inference with TOTO, please refer to our [GitHub repository's inference tutorial notebook](https://github.com/DataDogFutureOpenSource/TOTO/XXX/notebooks/inference_tutorial.ipynb).
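
Until the tutorial notebook is linked, the shape of TOTO's probabilistic output can be illustrated generically: the prediction head parameterizes a Student-T mixture, from whose samples a point forecast and prediction interval can be derived. The mixture parameters below are made-up illustrative values, not real model output:

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up illustrative mixture parameters (NOT produced by the model).
weights = np.array([0.7, 0.3])      # mixture weights, sum to 1
df      = np.array([5.0, 3.0])      # degrees of freedom per component
loc     = np.array([100.0, 180.0])  # location per component
scale   = np.array([4.0, 20.0])     # scale per component

n = 10_000
comp = rng.choice(len(weights), size=n, p=weights)            # pick a component per draw
samples = loc[comp] + scale[comp] * rng.standard_t(df[comp])  # sample that component

point = np.median(samples)                   # a robust point forecast
lo, hi = np.quantile(samples, [0.05, 0.95])  # a 90% prediction interval
print(f"point={point:.1f}  interval=({lo:.1f}, {hi:.1f})")
```

The same recipe (sample, then reduce to a median and quantiles) is how point and interval forecasts are typically read off any mixture-density head.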

### Usage Recommendations - TODO remove or develop
<!-- TODO: Share best practices, e.g. optimal context length and prediction length. -->

## Training Details - TODO keep or remove?

### Pretraining Data

| Dataset Name      | Link to Dataset Card                     |
|-------------------|------------------------------------------|
| GiftEval Pretrain | [Link to GiftEval Pretrain Dataset Card] |
| Chronos           | [Link to Chronos Dataset Card]           |
| Synthetic         | [Link to Synthetic Dataset Card]         |
| Observability     | [Link to Observability Dataset Card]     |

For more details about the pretraining data and preprocessing steps, please refer to the [paper](#TODO-Link-to-Paper) or the [GitHub repository](https://github.com/DataDogFutureOpenSource/TOTO).

### Training Hyperparameters - TODO keep or remove?

The training hyperparameters for TOTO are defined in a YAML configuration file in our GitHub repository; see [the configuration file](https://github.com/DataDogFutureOpenSource/TOTO/blob/main/configs/toto_config.yaml).
## Results - TODO keep or remove?

| Dataset Name | Link to Dataset Card            | CRPS | MASE |
|--------------|---------------------------------|------|------|
| BOOM         | [Link to BOOM Dataset Card]     | TBD  | TBD  |
| GiftEval     | [Link to GiftEval Dataset Card] | TBD  | TBD  |

For more detailed information, please refer to the results section in our [paper](#TODO-Link-to-Paper).

## Citation - TODO

If you use TOTO in your research or applications, please cite us using the following:

```bibtex
@article{TOTO-v1-Base-2025,
  title={TOTO: Time Series Optimized Transformer for Observability},
  author={Your Author Names Here},
  journal={arXiv preprint arXiv:XXXX.XXXXX},
  year={2025},
  url={https://arxiv.org/abs/XXXX.XXXXX}
}
```