	---
	license: apache-2.0
	datasets:
	- ETDataset/ett
	language:
	- en
	metrics:
	- mse
	- mae
	library_name: transformers
	pipeline_tag: time-series-forecasting
	tags:
	- Time-series
	- foundation-model
	- forecasting
	- TSFM
	base_model:
	- openai-community/gpt2
	---
	# [TEMPO: Prompt-based Generative Pre-trained Transformer for Time Series Forecasting](https://arxiv.org/abs/2310.04948)

	[![preprint](https://img.shields.io/static/v1?label=arXiv&message=2310.04948&color=B31B1B&logo=arXiv)](https://arxiv.org/pdf/2310.04948)

	![TEMPO_logo\|50%](pics/TEMPO_logo.png)



	This is the official code for [["TEMPO: Prompt-based Generative Pre-trained Transformer for Time Series Forecasting (ICLR 2024)"]](https://arxiv.org/pdf/2310.04948). TEMPO v1.0 is one of the first open-source time series foundation models for forecasting.

	![TEMPO-architecture](pics/TEMPO.png)


	## 💡 Demos

	### 1. Reproducing zero-shot experiments on ETTh2:

	You can reproduce the zero-shot experiments on ETTh2 [[here on Colab]](https://colab.research.google.com/drive/11qGpT7H1JMaTlMlm9WtHFZ3_cJz7p-og?usp=sharing).

	### 2. Zero-shot experiments on a custom dataset:

	The following Colab notebook shows how to build a custom dataset and run inference on it directly with our pre-trained foundation model: [[Colab]](https://colab.research.google.com/drive/1ZpWbK0L6mq1pav2yDqOuORo4rHbv80-A?usp=sharing)




	# 🔧 Hands-on: Using Foundation Model

	## 1. Download the repo

	```
	git clone git@github.com:DC-research/TEMPO.git
	```

	## 2. [Optional] Download the model and config file via commands
	```
	huggingface-cli download Melady/TEMPO config.json --local-dir ./TEMPO/TEMPO_checkpoints
	```
	```
	huggingface-cli download Melady/TEMPO TEMPO-80M_v1.pth --local-dir ./TEMPO/TEMPO_checkpoints
	```

	```
	huggingface-cli download Melady/TEMPO TEMPO-80M_v2.pth --local-dir ./TEMPO/TEMPO_checkpoints
	```
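
The same files can also be fetched from Python. The sketch below assumes the `huggingface_hub` package is installed and relies on its `hf_hub_download` function; the helper name `download_tempo_files` and its injectable `downloader` argument are illustrative and not part of the TEMPO codebase.

```python
from pathlib import Path

# Files named in the CLI commands above.
TEMPO_FILES = ("config.json", "TEMPO-80M_v1.pth", "TEMPO-80M_v2.pth")


def download_tempo_files(local_dir="./TEMPO/TEMPO_checkpoints",
                         files=TEMPO_FILES,
                         repo_id="Melady/TEMPO",
                         downloader=None):
    """Download each file into local_dir and return the list of local paths."""
    if downloader is None:
        # Imported lazily so the helper can be tested without network access.
        from huggingface_hub import hf_hub_download
        downloader = hf_hub_download

    Path(local_dir).mkdir(parents=True, exist_ok=True)
    return [
        downloader(repo_id=repo_id, filename=name, local_dir=local_dir)
        for name in files
    ]


# Example (requires network access):
# paths = download_tempo_files()
```

Passing a shorter `files` tuple, for example `("config.json", "TEMPO-80M_v1.pth")`, skips the checkpoint you do not need.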

	## 3. Build the environment

	```
	conda create -n tempo python=3.8
	```
	```
	conda activate tempo
	```
	```
	cd TEMPO
	```
	```
	pip install -r requirements.txt
	```

	## 4. Script Demo

	A streamlined example showing how to perform forecasting with TEMPO:

	```python
	# Third-party library imports
	import numpy as np
	import torch

	# Local imports (run this from the root of the cloned TEMPO repo)
	from models.TEMPO import TEMPO

	# Load the pre-trained checkpoint from the Hugging Face Hub repo
	model = TEMPO.load_pretrained_model(
	    device=torch.device('cuda:0' if torch.cuda.is_available() else 'cpu'),
	    repo_id="Melady/TEMPO",
	    filename="TEMPO-80M_v1.pth",
	    cache_dir="./checkpoints/TEMPO_checkpoints",
	)

	input_data = np.random.rand(336)  # Random input data: 336 past time steps
	with torch.no_grad():
	    # Forecast the next 96 time steps
	    predicted_values = model.predict(input_data, pred_length=96)
	print("Predicted values:")
	print(predicted_values)
	```
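
The demo above feeds a 336-step history and requests 96 future steps. For a longer series of your own, one simple option is to slice off the most recent window as context and hold out the following steps for comparison. The sketch below is illustrative only: `split_context_target` is a hypothetical helper, and the synthetic sine series stands in for real data.

```python
import numpy as np


def split_context_target(series, context_length=336, pred_length=96):
    """Split the tail of a 1-D series into a context window and a held-out target."""
    series = np.asarray(series, dtype=float)
    if series.ndim != 1:
        raise ValueError("expected a 1-D series")
    needed = context_length + pred_length
    if series.size < needed:
        raise ValueError(f"need at least {needed} points, got {series.size}")
    tail = series[-needed:]
    return tail[:context_length], tail[context_length:]


# Synthetic signal with a 24-step cycle and a slow upward trend.
t = np.arange(1000)
series = np.sin(2 * np.pi * t / 24) + 0.01 * t

context, target = split_context_target(series)
print(context.shape, target.shape)  # (336,) (96,)

# With the model loaded as in the demo above:
# predicted_values = model.predict(context, pred_length=96)
```

The held-out `target` can then be compared against `predicted_values` to get a rough sense of forecast quality on your own data.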


	## 5. Online demo

	Please try our foundation model demo [[here]](https://4171a8a7484b3e9148.gradio.live).

	![TEMPO_demo.jpg](pics/TEMPO_demo.jpg)


	# 🔨 Advanced Practice: Full Training Workflow!

	We also updated our models on HuggingFace: [[Melady/TEMPO]](https://huggingface.co/Melady/TEMPO).



	## 1. Get Data

	Download the data from [[Google Drive]](https://drive.google.com/drive/folders/13Cg1KYOlzM5C7K8gK8NfC-F3EYxkM3D2?usp=sharing) or [[Baidu Drive]](https://pan.baidu.com/s/1r3KhGd0Q9PJIUZdfEYoymg?pwd=i9iy), and place the downloaded data in the folder `./dataset`. You can also download the STL results from [[Google Drive]](https://drive.google.com/file/d/1gWliIGDDSi2itUAvYaRgACru18j753Kw/view?usp=sharing), and place the downloaded data in the folder `./stl`.
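
Before launching the training scripts, it can help to confirm that both folders are in place. A minimal sketch, assuming it is run from the repository root; `check_data_dirs` is an illustrative helper, not part of the TEMPO codebase.

```python
from pathlib import Path


def check_data_dirs(root=".", names=("dataset", "stl")):
    """Return a {folder name: exists?} mapping for the expected data folders."""
    root = Path(root)
    return {name: (root / name).is_dir() for name in names}


for name, present in check_data_dirs().items():
    print(f"./{name}: {'found' if present else 'missing'}")
```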

	## 2. Run Scripts

	### 2.1 Pre-Training Stage
	```
	# Run the script for one dataset: ecl, etth1, etth2, ettm1, ettm2, traffic, or weather
	bash scripts/etth2.sh
	```

	### 2.2 Test/Inference Stage

	After training, we can test the TEMPO model under the zero-shot setting:

	```
	# Run the test script for one dataset: ecl, etth1, etth2, ettm1, ettm2, traffic, or weather
	bash scripts/etth2_test.sh
	```

	![TEMPO-results](pics/results.jpg)
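
The model card lists MSE and MAE as its metrics. As a reference for how these two quantities are computed over a forecast window, here is a minimal NumPy sketch. The function names are illustrative, and the constant forecast is only a stand-in for model output. The test scripts may compute metrics on normalized data, so values from this sketch are not necessarily comparable with the results above.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error between two equally shaped arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


def mae(y_true, y_pred):
    """Mean absolute error between two equally shaped arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


target = np.array([1.0, 2.0, 3.0, 4.0])
naive = np.full_like(target, 2.0)  # constant forecast standing in for model output

print(mse(target, naive))  # (1 + 0 + 1 + 4) / 4 = 1.5
print(mae(target, naive))  # (1 + 0 + 1 + 2) / 4 = 1.0
```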


	# Pre-trained Models

	You can download the pre-trained model from [[Google Drive]](https://drive.google.com/file/d/11Ho_seP9NGh-lQCyBkvQhAQFy_3XVwKp/view?usp=drive_link) and then run the test script for fun.

	# TETS dataset

	Here are the prompts used to generate the corresponding textual information for the time series via the [[OPENAI ChatGPT-3.5 API]](https://platform.openai.com/docs/guides/text-generation):

	![TEMPO-prompt](pics/TETS_prompt.png)

	The time series data come from the [[S&P 500]](https://www.spglobal.com/spdji/en/indices/equity/sp-500/#overview). Here is the EBITDA case for one company from the dataset:


	![Company1_ebitda_summary](pics/Company1_ebitda_summary.png)

	Example of generated contextual information for the Company marked above:

	![Company1_ebitda_summary_words.jpg](pics/Company1_ebitda_summary_words.jpg)




	You can download the processed data with GPT2 text embeddings from: [[TETS]](https://drive.google.com/file/d/1Hu2KFj0kp4kIIpjbss2ciLCV_KiBreoJ/view?usp=drive_link).

	# 🚀 News


	- Oct 2024: 🚀 We've streamlined our code structure, enabling users to download the pre-trained model and perform zero-shot inference with a single line of code! Check out our [demo](./run_TEMPO_demo.py) for more details. Our model's download count on HuggingFace is now trackable!

	- Jun 2024: 🚀 We added demos for reproducing zero-shot experiments in [Colab](https://colab.research.google.com/drive/11qGpT7H1JMaTlMlm9WtHFZ3_cJz7p-og?usp=sharing). We also added a demo that builds a custom dataset and runs inference directly with our pre-trained foundation model: [Colab](https://colab.research.google.com/drive/1ZpWbK0L6mq1pav2yDqOuORo4rHbv80-A?usp=sharing)
	- May 2024: 🚀 TEMPO has launched a GUI-based online [demo](https://4171a8a7484b3e9148.gradio.live/), allowing users to directly interact with our foundation model!
	- May 2024: 🚀 TEMPO published the 80M pretrained foundation model in [HuggingFace](https://huggingface.co/Melady/TEMPO)!
	- May 2024: 🧪 We added the code for pre-training TEMPO models and running inference with them. You can find a demo pre-training script [here](./scripts/etth2.sh). We also added [a script](./scripts/etth2_test.sh) for the inference demo.

	- Mar 2024: 📈 Released [TETS dataset](https://drive.google.com/file/d/1Hu2KFj0kp4kIIpjbss2ciLCV_KiBreoJ/view?usp=drive_link) from [S&P 500](https://www.spglobal.com/spdji/en/indices/equity/sp-500/#overview) used in multimodal experiments in TEMPO.
	- Mar 2024: 🧪 TEMPO published the project [code](https://github.com/DC-research/TEMPO) and the pre-trained checkpoint [online](https://drive.google.com/file/d/11Ho_seP9NGh-lQCyBkvQhAQFy_3XVwKp/view?usp=drive_link)!
	- Jan 2024: 🚀 TEMPO [paper](https://openreview.net/pdf?id=YH5w12OUuU) was accepted by ICLR!
	- Oct 2023: 🚀 TEMPO [paper](https://arxiv.org/pdf/2310.04948) released on arXiv!



	## ⏳ Upcoming Features

	- [✅] Parallel pre-training pipeline
	- [] Probabilistic forecasting
	- [] Multimodal dataset
	- [] Multimodal pre-training script


	# Contact
	Feel free to contact [email protected] / [email protected] if you’re interested in applying TEMPO to your real-world application.

	# Cite our work
	```
	@inproceedings{
	cao2024tempo,
	title={{TEMPO}: Prompt-based Generative Pre-trained Transformer for Time Series Forecasting},
	author={Defu Cao and Furong Jia and Sercan O Arik and Tomas Pfister and Yixiang Zheng and Wen Ye and Yan Liu},
	booktitle={The Twelfth International Conference on Learning Representations},
	year={2024},
	url={https://openreview.net/forum?id=YH5w12OUuU}
	}
	```

	```
	@article{
	Jia_Wang_Zheng_Cao_Liu_2024,
	title={GPT4MTS: Prompt-based Large Language Model for Multimodal Time-series Forecasting},
	volume={38},
	url={https://ojs.aaai.org/index.php/AAAI/article/view/30383},
	DOI={10.1609/aaai.v38i21.30383},
	number={21},
	journal={Proceedings of the AAAI Conference on Artificial Intelligence},
	author={Jia, Furong and Wang, Kevin and Zheng, Yixiang and Cao, Defu and Liu, Yan},
	year={2024}, month={Mar.}, pages={23343-23351}
	}
	```