Spaces:
Configuration error
Configuration error
| # One-Shot Free-View Neural Talking Head Synthesis | |
| Unofficial pytorch implementation of paper "One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing". | |
| ```Python 3.6``` and ```Pytorch 1.7``` are used. | |
| Updates: | |
| -------- | |
| ```2021.11.05``` : | |
| * <s>Replace Jacobian with the rotation matrix (Assuming J = R) to avoid estimating Jacobian.</s> | |
| * Correct the rotation matrix. | |
| ```2021.11.17``` : | |
| * Better Generator, better performance (models and checkpoints have been released). | |
| Driving | Beta Version | FOMM | New Version: | |
| https://user-images.githubusercontent.com/17874285/142828000-db7b324e-c2fd-4fdc-a272-04fb8adbc88a.mp4 | |
| -------- | |
| Driving | FOMM | Ours: | |
|  | |
| Free-View: | |
|  | |
| Train: | |
| -------- | |
| ``` | |
| python run.py --config config/vox-256.yaml --device_ids 0,1,2,3,4,5,6,7 | |
| ``` | |
| Demo: | |
| -------- | |
| ``` | |
| python demo.py --config config/vox-256.yaml --checkpoint path/to/checkpoint --source_image path/to/source --driving_video path/to/driving --relative --adapt_scale --find_best_frame | |
| ``` | |
| free-view (e.g. yaw=20, pitch=roll=0): | |
| ``` | |
| python demo.py --config config/vox-256.yaml --checkpoint path/to/checkpoint --source_image path/to/source --driving_video path/to/driving --relative --adapt_scale --find_best_frame --free_view --yaw 20 --pitch 0 --roll 0 | |
| ``` | |
| Note: run ```crop-video.py --inp driving_video.mp4``` first to get the cropping suggestion and crop the raw video. | |
| Pretrained Model: | |
| -------- | |
| Model | Train Set | Baidu Netdisk | Media Fire | | |
| ------- |------------ |----------- |-------- | | |
| Vox-256-Beta| VoxCeleb-v1 | [Baidu](https://pan.baidu.com/s/1lLS4ArbK2yWelsL-EtwU8g) (PW: c0tc)| [MF](https://www.mediafire.com/folder/rw51an7tk7bh2/TalkingHead) | | |
| Vox-256-New | VoxCeleb-v1 | - | [MF](https://www.mediafire.com/folder/fcvtkn21j57bb/TalkingHead_Update) | | |
| Vox-512 | VoxCeleb-v2 | soon | soon | | |
| Note: | |
| 1. <s>For now, the Beta Version is not well tuned.</s> | |
| 2. For free-view synthesis, it is recommended that Yaw, Pitch and Roll are within ±45°, ±20° and ±20° respectively. | |
| 3. Face Restoration algorithms ([GPEN](https://github.com/yangxy/GPEN)) can be used for post-processing to significantly improve the resolution. | |
|  | |
| Acknowlegement: | |
| -------- | |
| Thanks to [NV](https://github.com/NVlabs/face-vid2vid), [AliaksandrSiarohin](https://github.com/AliaksandrSiarohin/first-order-model) and [DeepHeadPose](https://github.com/DriverDistraction/DeepHeadPose). | |