PID: Physics-Informed Diffusion Model for Infrared Image Generation

Update
- 2025/05: Our paper has been accepted by Pattern Recognition: https://doi.org/10.1016/j.patcog.2025.111816
- arXiv version: https://arxiv.org/abs/2407.09299
- We have released our code.
Environment
We recommend creating the conda environment from environment.yaml:
conda env create --file=environment.yaml
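Then activate the environment before running any commands. The environment name below is an assumption; use the name defined at the top of environment.yaml.
conda activate pid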
Datasets
- Download the KAIST dataset from https://github.com/SoonminHwang/rgbt-ped-detection
- Download the FLIRv1 dataset from https://www.flir.com/oem/adas/adas-dataset-form/
We adopt the official dataset splits in our experiments.
Checkpoints
VQGAN weights can be downloaded from https://ommer-lab.com/files/latent-diffusion/vq-f8.zip (other pretrained VQGAN models are available at https://github.com/CompVis/latent-diffusion).
TeVNet and PID checkpoints can be found on Hugging Face.
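As an example, the VQGAN checkpoint can be fetched and unpacked from the command line. The target directory models/vqgan is only a suggestion; place the files wherever your config points to.
wget https://ommer-lab.com/files/latent-diffusion/vq-f8.zip
unzip vq-f8.zip -d models/vqgan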
Evaluation
Use the shell script below to evaluate. `indir` is the input directory of visible RGB images, `outdir` is the output directory of translated infrared images, and `config` is the chosen config in `configs/latent-diffusion/config.yaml`. We prepare some RGB images in `dataset/KAIST` for quick evaluation.
bash run_test_kaist512_vqf8.sh
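To evaluate on your own images, edit the variables at the top of the script. The snippet below is a hypothetical example; check run_test_kaist512_vqf8.sh for the actual variable names it reads.
indir=dataset/KAIST          # directory of visible RGB inputs
outdir=results/kaist_ir      # directory for translated infrared outputs
config=configs/latent-diffusion/config.yaml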
Train
Dataset preparation
Prepare corresponding RGB and infrared images with the same filenames in two separate directories, as sketched below.
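A minimal sketch of such a layout, assuming illustrative directory names rgb and ir; what matters is that paired images share the same filename across the two directories.
dataset/my_pairs/
├── rgb/
│   ├── 0001.png
│   └── 0002.png
└── ir/
    ├── 0001.png
    └── 0002.png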
Stage 1: Train TeVNet
cd TeVNet
bash shell/train.sh
Stage 2: Train PID
To accelerate training, we recommend using our pretrained model.
bash shell/run_train_kaist512_vqf8.sh
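If you start from our pretrained model, download the weights from Hugging Face first and point the script (or its config) at them. The line below is a placeholder, not the actual variable name used by the script; check run_train_kaist512_vqf8.sh for where the checkpoint path is set.
ckpt=checkpoints/pid_pretrained.ckpt   # placeholder — adjust to your download location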
Acknowledgements
Our code is built upon LDM and HADAR. We thank the authors for their excellent work.
Citation
If you find this work helpful in your research, please consider citing our paper:
@article{mao2026pid,
  title={PID: physics-informed diffusion model for infrared image generation},
  author={Mao, Fangyuan and Mei, Jilin and Lu, Shun and Liu, Fuyang and Chen, Liang and Zhao, Fangzhou and Hu, Yu},
  journal={Pattern Recognition},
  volume={169},
  pages={111816},
  year={2026},
  publisher={Elsevier}
}
If you have any questions, feel free to contact [email protected].