Update README.md
Keywords: Video Inpainting, Video Editing, Video Generation

<p align="center">
  <a href='https://yxbian23.github.io/project/video-painter'><img src='https://img.shields.io/badge/Project-Page-Green'></a>
  <a href="https://arxiv.org/abs/2503.05639"><img src="https://img.shields.io/badge/arXiv-2503.05639-b31b1b.svg"></a>
  <a href="https://github.com/TencentARC/VideoPainter"><img src="https://img.shields.io/badge/GitHub-Code-black?logo=github"></a>
  <a href="https://youtu.be/HYzNfsD3A0s"><img src="https://img.shields.io/badge/YouTube-Video-red?logo=youtube"></a>
  <a href='https://huggingface.co/datasets/TencentARC/VPData'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Dataset-blue'></a>
  <a href='https://huggingface.co/datasets/TencentARC/VPBench'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Benchmark-blue'></a>
  <a href="https://huggingface.co/TencentARC/VideoPainter"><img src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Model-blue"></a>
</p>

**Your star means a lot for us to develop this project!** ⭐⭐⭐

**VPData and VPBench have been fully uploaded (containing 390K mask sequences and video captions). Welcome to use our biggest video segmentation dataset, VPData, with video captions!** 🔥🔥🔥

**📋 Table of Contents**

- [VideoPainter](#videopainter)
  - [🔥 Update Log](#-update-log)
  - [📋 TODO](#todo)
  - [🛠️ Method Overview](#️-method-overview)
  - [🚀 Getting Started](#-getting-started)
    - [Environment Requirement 🌍](#environment-requirement-)
    - [Data Download ⬇️](#data-download-️)
  - [🏃🏼 Running Scripts](#-running-scripts)
    - [Training 🤯](#training-)
    - [Inference 📜](#inference-)
    - [Evaluation 📏](#evaluation-)
  - [🤝🏼 Cite Us](#-cite-us)
  - [💖 Acknowledgement](#-acknowledgement)

## 🔥 Update Log
- [2025/3/09] 📢 📢 [VideoPainter](https://huggingface.co/TencentARC/VideoPainter) is released: an efficient, any-length video inpainting & editing framework with plug-and-play context control.
- [2025/3/09] 📢 📢 [VPData](https://huggingface.co/datasets/TencentARC/VPData) and [VPBench](https://huggingface.co/datasets/TencentARC/VPBench) are released: the largest video inpainting dataset with precise segmentation masks and dense video captions (>390K clips).
- [2025/3/25] 📢 📢 The 390K+ high-quality video segmentation masks in [VPData](https://huggingface.co/datasets/TencentARC/VPData) have been fully released.
- [2025/3/25] 📢 📢 The raw videos of the Videovo subset have been uploaded to [VPData](https://huggingface.co/datasets/TencentARC/VPData) to resolve the raw-video link expiration issue.

## TODO

- [x] Release training and inference code
- [x] Release evaluation code
- [x] Release [VideoPainter checkpoints](https://huggingface.co/TencentARC/VideoPainter) (based on CogVideoX-5B)
- [x] Release [VPData and VPBench](https://huggingface.co/collections/TencentARC/videopainter-67cc49c6146a48a2ba93d159) for large-scale training and evaluation.
- [x] Release Gradio demo

</details>

<details>
<summary><b>VPBench and VPData Download ⬇️</b></summary>

You can download VPBench [here](https://huggingface.co/datasets/TencentARC/VPBench) and VPData [here](https://huggingface.co/datasets/TencentARC/VPData) (as well as our re-processed Davis), which are used for training and testing VideoPainter. By downloading the data, you agree to the terms and conditions of the license. The data structure should be as follows:

```
git lfs install
git clone https://huggingface.co/datasets/TencentARC/VPData
mv VPData data

# 1. unzip the masks in VPData
python data_utils/unzip_folder.py --source_dir ./data/videovo_masks --target_dir ./data/video_inpainting/videovo
python data_utils/unzip_folder.py --source_dir ./data/pexels_masks --target_dir ./data/video_inpainting/pexels

# 2. unzip the raw videos of the Videovo subset in VPData
python data_utils/unzip_folder.py --source_dir ./data/videovo_raw_videos --target_dir ./data/videovo/raw_video
```
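
The unzip step above is handled by the repository's own `data_utils/unzip_folder.py`. As a rough, hypothetical sketch of what such a helper involves (the function name, signature, and return value below are illustrative assumptions, not the repo's actual script):

```python
# Illustrative sketch only: the repo ships its own data_utils/unzip_folder.py.
# This hypothetical helper extracts every .zip archive found in source_dir
# into target_dir, which is the kind of work that step performs.
import os
import zipfile

def unzip_folder(source_dir, target_dir):
    """Extract each .zip in source_dir into target_dir; return the archives handled."""
    os.makedirs(target_dir, exist_ok=True)
    handled = []
    for name in sorted(os.listdir(source_dir)):
        if not name.endswith(".zip"):
            continue  # skip non-archive files (e.g. checksums, READMEs)
        with zipfile.ZipFile(os.path.join(source_dir, name)) as zf:
            zf.extractall(target_dir)
        handled.append(name)
    return handled
```

The real script may differ (progress bars, parallelism, path handling); the point is simply that each mask archive is unpacked under the matching `video_inpainting` subdirectory.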

Note: *Due to the space limit, you need to run the following script to download the raw videos of the Pexels subset in VPData. The format should be consistent with VPData/VPBench above (after downloading VPData/VPBench, the script will automatically place the raw videos of VPData into the corresponding dataset directories created by VPBench).*

```
cd data_utils
```

```
git clone https://huggingface.co/black-forest-labs/FLUX.1-Fill-dev
mv ckpt/FLUX.1-Fill-dev ckpt/flux_inp
```

[Optional] You need to download [SAM2](https://huggingface.co/facebook/sam2-hiera-large) for video segmentation in the Gradio demo:

```
git lfs install
cd ckpt
wget https://huggingface.co/facebook/sam2-hiera-large/resolve/main/sam2_hiera_large.pt
```

You can also choose segmentation checkpoints of other sizes to balance efficiency and performance, such as [SAM2-Tiny](https://huggingface.co/facebook/sam2-hiera-tiny).

The ckpt structure should be as follows:

```
|-- transformer
|-- vae
|-- ...
|-- sam2_hiera_large.pt
```
</details>
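
Once the downloads finish, a quick sanity check of the `ckpt` directory can catch a missing file before launching the demo. A minimal, hypothetical helper (the required-entry list mirrors the tree above and is an assumption, not an official tool from this repo):

```python
# Hypothetical sanity check for the ckpt/ layout sketched above;
# the entry names come from the tree in this README.
import os

def missing_ckpt_entries(ckpt_dir,
                         required=("transformer", "vae", "sam2_hiera_large.pt")):
    """Return the required files/dirs that are absent from ckpt_dir."""
    return [name for name in required
            if not os.path.exists(os.path.join(ckpt_dir, name))]
```

Running it on your checkpoint root and printing the result tells you exactly which download or `mv` step was skipped.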