---
license: apache-2.0
library_name: diffusers
pipeline_tag: text-to-image
---

# TRCE: Towards Reliable Malicious Concept Erasure in Text-to-Image Diffusion Models

## Setup

We ran our experiments in the following environment:

+ python: 3.10
+ torch: 2.1.2
+ CUDA Version: 12.4

Run `pip install -r requirement.txt` to install the dependencies.

The erased model can be found :hugs:[here](https://huggingface.co/ddgoodgood/trce-erased-model/tree/main).

Currently, our implementation is based only on SD1.4. We will release an implementation of TRCE for newer models in the future.

## RUN

You can find the pre-cached COCO embeddings :hugs:[here](https://huggingface.co/ddgoodgood/trce-erased-model/tree/main). Download the `cache` directory and place it in `data/cache`.

### Run stage-1 TRCE

In the first stage, TRCE applies a closed-form edit to the cross-attention layers. Simply run:

```bash
# for erasing "sexual"
python run_trce_stage1.py config/stage1/stage1_sexual_default.yaml

# for erasing multiple malicious concepts
python run_trce_stage1.py config/stage1/stage1_unsafe_default.yaml
```

You can modify the base model path and the output directory for the first-stage fine-tuned model in the configuration files.

### Run stage-2 TRCE

Before the second stage, you need to prepare the denoising trajectory samples used for fine-tuning:

```bash
python stage2_data_preparation.py
```

This script generates samples for both "sexual" and "multi-concept" fine-tuning, as well as unconditional samples for the regularization loss.

Then, run stage 2 with the following scripts:

```bash
# for erasing "sexual"
python run_trce_stage2.py config/stage2/stage2_sexual_default.yaml

# for erasing multiple malicious concepts
python run_trce_stage2.py config/stage2/stage2_unsafe_default.yaml
```

## Evaluation

The evaluation relies on the following repositories: [NudeNet](https://github.com/notAI-tech/NudeNet), [Q16 Detector](https://github.com/ml-research/Q16), [Pytorch FID](https://github.com/mseitzer/pytorch-fid), and [CLIP Score](https://github.com/Taited/clip-score).
Please install these repositories according to their instructions before proceeding with the evaluation.

### Generate images using the erased model

First, run the following scripts with the UNet path and output path set to generate images for the different evaluation tasks:

```bash
# for evaluating "sexual" erasure
python gen_sexual.py

# for evaluating "multi-concept" erasure
python gen_unsafe.py

# for evaluating knowledge preservation on COCO
python gen_coco.py
```

Then, follow the instructions in `eval_nudenet_batch.ipynb`, `eval_unsafe.ipynb`, and `eval_coco_batch.ipynb` to evaluate and review the erasure performance.

If you encounter any issues while using this repository, please open an issue or contact me at chenruidong@tju.edu.cn. I will respond as soon as possible.

## Citation

```
@article{chen2025reliable,
  title={TRCE: Towards Reliable Malicious Concept Erasure in Text-to-Image Diffusion Models},
  author={Chen, Ruidong and Guo, Honglin and Wang, Lanjun and Zhang, Chenyu and Nie, Weizhi and Liu, An-An},
  journal={arXiv preprint arXiv:2503.07389},
  year={2025}
}
```

## Acknowledgement

We built this repository on the excellent work of previous projects: [RECE](https://github.com/CharlesGong12/RECE/tree/main), [MACE](https://github.com/Shilin-LU/MACE), and [Safree](https://github.com/jaehong31/SAFREE). Thank you to all who contributed.
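
## Example: chaining the two stages

The RUN section lists three commands per erasure target (stage 1, data preparation, stage 2). As a convenience, they could be chained in a small driver script. The following is only a sketch: the helper names `build_commands` and `run_all` are hypothetical and not part of this repository, and it assumes the default configuration files already point stage 2 at the stage-1 output and that the script is launched from the repository root.

```python
import subprocess
import sys
from pathlib import Path

# Script and config names are taken from the RUN section of this README.
STAGE1_SCRIPT = "run_trce_stage1.py"
STAGE2_PREP_SCRIPT = "stage2_data_preparation.py"
STAGE2_SCRIPT = "run_trce_stage2.py"


def build_commands(target: str, python: str = sys.executable) -> list[list[str]]:
    """Return the stage-1, data-preparation, and stage-2 commands for `target`.

    `target` is "sexual" or "unsafe", matching the default config file names.
    (Hypothetical helper, not part of the repository.)
    """
    if target not in ("sexual", "unsafe"):
        raise ValueError(f"unknown target: {target!r}")
    return [
        [python, STAGE1_SCRIPT, f"config/stage1/stage1_{target}_default.yaml"],
        [python, STAGE2_PREP_SCRIPT],
        [python, STAGE2_SCRIPT, f"config/stage2/stage2_{target}_default.yaml"],
    ]


def run_all(target: str, cache_dir: str = "data/cache") -> None:
    """Run the three steps in order, stopping at the first failure.

    Assumption: the pre-cached COCO embeddings must already be in `cache_dir`.
    """
    if not Path(cache_dir).is_dir():
        raise FileNotFoundError(f"expected pre-cached COCO embeddings in {cache_dir}")
    for cmd in build_commands(target):
        print("running:", " ".join(cmd))
        # check=True raises CalledProcessError if a step exits with a non-zero code.
        subprocess.run(cmd, check=True)
```

Calling `run_all("sexual")` or `run_all("unsafe")` would then run the corresponding default configurations one after another. Note that `stage2_data_preparation.py` generates samples for both targets, so it only needs to run once if you erase both.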