|
|
--- |
|
|
tags: |
|
|
- Human Mesh Recovery |
|
|
- Human Pose and Shape Estimation |
|
|
- Multi-Person Mesh Recovery |
|
|
arxiv: '2411.19824' |
|
|
license: apache-2.0 |
|
|
--- |
|
|
|
|
|
# SAT-HMR |
|
|
|
|
|
Official [PyTorch](https://pytorch.org/) implementation of our paper:
|
|
|
|
|
<h3 align="center">SAT-HMR: Real-Time Multi-Person 3D Mesh Estimation via Scale-Adaptive Tokens <br> (CVPR 2025)</h3> |
|
|
|
|
|
<h4 align="center" style="text-decoration: none;"> |
|
|
<a href="https://github.com/ChiSu001/", target="_blank"><b>Chi Su</b></a> |
|
|
, |
|
|
<a href="https://shirleymaxx.github.io/", target="_blank"><b>Xiaoxuan Ma</b></a> |
|
|
, |
|
|
<a href="https://scholar.google.com/citations?user=DoUvUz4AAAAJ&hl=en", target="_blank"><b>Jiajun Su</b></a> |
|
|
, |
|
|
<a href="https://cfcs.pku.edu.cn/english/people/faculty/yizhouwang/index.htm", target="_blank"><b>Yizhou Wang</b></a> |
|
|
|
|
|
</h4> |
|
|
|
|
|
<h3 align="center"> |
|
|
<a href="https://arxiv.org/abs/2411.19824", target="_blank">Paper</a> | |
|
|
<a href="https://ChiSu001.github.io/SAT-HMR", target="_blank">Project Page</a> | |
|
|
<a href="https://youtu.be/wLfNrDYFAns", target="_blank">Video</a> | |
|
|
<a href="https://github.com/ChiSu001/SAT-HMR", target="_blank">GitHub</a> |
|
|
</h3> |
|
|
|
|
|
<!-- <div align="center"> |
|
|
<img src="figures/results.png" width="70%"> |
|
|
<img src="figures/results_3d.gif" width="29%"> |
|
|
</div> --> |
|
|
|
|
|
|
|
|
<!-- <h3> Overview of SAT-HMR </h3> --> |
|
|
|
|
|
<p align="center"> |
|
|
<img src="figures/pipeline.png"/> |
|
|
</p> |
|
|
|
|
|
<!-- <p align="center"> |
|
|
<img src="figures/pipeline.png" style="height: 300px; object-fit: cover;"/> |
|
|
</p> --> |
|
|
|
|
|
## Installation |
|
|
|
|
|
We tested with Python 3.11, PyTorch 2.4.1, and CUDA 12.1.
|
|
|
|
|
1. Create a conda environment. |
|
|
```bash |
|
|
conda create -n sathmr python=3.11 -y |
|
|
conda activate sathmr |
|
|
``` |
|
|
|
|
|
2. Install [PyTorch](https://pytorch.org/) and [xFormers](https://github.com/facebookresearch/xformers). |
|
|
```bash |
|
|
# Install PyTorch. We recommend following the official instructions (https://pytorch.org/) and adapting the CUDA version to your setup.
|
|
conda install pytorch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 pytorch-cuda=12.1 -c pytorch -c nvidia |
|
|
|
|
|
# Install xFormers. We recommend following the official instructions (https://github.com/facebookresearch/xformers) and adapting the CUDA version to your setup.
|
|
pip install -U xformers==0.0.28.post1 --index-url https://download.pytorch.org/whl/cu121 |
|
|
``` |
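After installation, a quick check like the one below (a suggestion, not part of the original instructions) confirms that PyTorch sees your GPU and was built against the expected CUDA version:

```bash
# Print the PyTorch version, the CUDA version it was built with, and whether a GPU is visible.
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"
```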
|
|
|
|
|
3. Install other dependencies. |
|
|
```bash |
|
|
pip install -r requirements.txt |
|
|
``` |
|
|
|
|
|
4. You may need to modify the `chumpy` package to avoid errors. For detailed instructions, please check [this guidance](docs/fix_chumpy.md); a rough sketch of where to look follows below.
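The errors typically come from `chumpy` importing NumPy aliases (`bool`, `int`, `float`, ...) that NumPy 1.24+ removed. As a sketch only (the linked guidance is authoritative), you can locate the file to edit with:

```bash
# Print the path of chumpy's __init__.py, where the deprecated NumPy imports usually live.
python -c "import chumpy, os; print(os.path.join(os.path.dirname(chumpy.__file__), '__init__.py'))"
```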
|
|
|
|
|
## Download Models & Weights |
|
|
|
|
|
1. Download SMPL-related weights. |
|
|
- Download `basicModel_f_lbs_10_207_0_v1.0.0.pkl`, `basicModel_m_lbs_10_207_0_v1.0.0.pkl`, and `basicModel_neutral_lbs_10_207_0_v1.0.0.pkl` from [here](https://smpl.is.tue.mpg.de/) (female & male) and [here](http://smplify.is.tue.mpg.de/) (neutral), and place them in `${Project}/weights/smpl_data/smpl`. Rename them to `SMPL_FEMALE.pkl`, `SMPL_MALE.pkl`, and `SMPL_NEUTRAL.pkl`, respectively.
|
|
- Download the remaining files from [Google Drive](https://drive.google.com/drive/folders/1wmd_pjmmDn3eSl3TLgProgZgCQZgtZIC?usp=sharing) and put them in `${Project}/weights/smpl_data/smpl`.
|
|
|
|
|
2. Download the DINOv2 pretrained weights from [the official repository](https://github.com/facebookresearch/dinov2?tab=readme-ov-file#pretrained-models). We use `ViT-B/14 distilled (without registers)`. Put `dinov2_vitb14_pretrain.pth` in `${Project}/weights/dinov2`. These weights are used to initialize our encoder. **You can skip this step if you are not going to train SAT-HMR.**
|
|
|
|
|
3. Download the pretrained weights for inference and evaluation from [Google Drive](https://drive.google.com/file/d/12tGbqcrJ8YACcrfi5qslZNEciIHxcScZ/view?usp=sharing) or [🤗 HuggingFace](https://huggingface.co/ChiSu001/SAT-HMR/blob/main/weights/sat_hmr/sat_644.pth). Put the checkpoint in `${Project}/weights/sat_hmr`.
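The commands below sketch one way to lay out these files from the project root. The paths are illustrative, the SMPL files still have to be downloaded manually after registering on the official sites, and the Hugging Face in-repo path is assumed from the link above:

```bash
# Create the expected directories.
mkdir -p weights/smpl_data/smpl weights/dinov2 weights/sat_hmr

# Rename the SMPL model files (adjust the source paths to wherever you downloaded them).
mv basicModel_f_lbs_10_207_0_v1.0.0.pkl       weights/smpl_data/smpl/SMPL_FEMALE.pkl
mv basicModel_m_lbs_10_207_0_v1.0.0.pkl       weights/smpl_data/smpl/SMPL_MALE.pkl
mv basicModel_neutral_lbs_10_207_0_v1.0.0.pkl weights/smpl_data/smpl/SMPL_NEUTRAL.pkl

# Optionally fetch the SAT-HMR checkpoint from the Hugging Face Hub instead of Google Drive
# (requires the huggingface_hub package; the repo path is preserved under the current directory).
huggingface-cli download ChiSu001/SAT-HMR weights/sat_hmr/sat_644.pth --local-dir .
```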
|
|
|
|
|
The `weights` directory should now look like this:
|
|
|
|
|
```
${Project}
|-- weights
    |-- dinov2
    |   `-- dinov2_vitb14_pretrain.pth
    |-- sat_hmr
    |   `-- sat_644.pth
    `-- smpl_data
        `-- smpl
            |-- body_verts_smpl.npy
            |-- J_regressor_h36m_correct.npy
            |-- SMPL_FEMALE.pkl
            |-- SMPL_MALE.pkl
            |-- smpl_mean_params.npz
            `-- SMPL_NEUTRAL.pkl
```
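Before running inference, a quick check like this (a throwaway helper, not part of the repo) can confirm the key files are in place:

```bash
# Report whether the main weight files are where SAT-HMR expects them.
for f in weights/dinov2/dinov2_vitb14_pretrain.pth \
         weights/sat_hmr/sat_644.pth \
         weights/smpl_data/smpl/SMPL_NEUTRAL.pkl; do
  if [ -f "$f" ]; then echo "OK       $f"; else echo "MISSING  $f"; fi
done
```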
|
|
|
|
|
## Inference on Images |
|
|
<h4>Inference with 1 GPU</h4>
|
|
|
|
|
We provide some demo images in `${Project}/demo`. You can run SAT-HMR on all images on a single GPU via: |
|
|
|
|
|
|
|
|
```bash |
|
|
python main.py --mode infer --cfg demo |
|
|
``` |
|
|
|
|
|
Results with overlaid meshes will be saved in `${Project}/demo_results`.
|
|
|
|
|
You can specify your own inference configuration by modifying `${Project}/configs/run/demo.yaml`:
|
|
|
|
|
- `input_dir` specifies the input image folder. |
|
|
- `output_dir` specifies the output folder. |
|
|
- `conf_thresh` specifies a list of confidence thresholds used for detection. SAT-HMR will run inference once with each threshold in the list.
|
|
- `infer_batch_size` specifies the batch size used for inference (on a single GPU). |
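For example, a minimal custom config might look like the sketch below; the key names come from the list above, the values are placeholders, and `configs/run/demo.yaml` remains the reference for the exact format and any additional fields:

```bash
# Write a hypothetical custom inference config.
cat > configs/run/my_infer.yaml << 'EOF'
input_dir: my_images
output_dir: my_results
conf_thresh: [0.5]
infer_batch_size: 8
EOF

# Then point --cfg at its name (the `demo` config maps to configs/run/demo.yaml,
# so `my_infer` is assumed to map the same way).
python main.py --mode infer --cfg my_infer
```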
|
|
|
|
|
<h4>Inference with Multiple GPUs</h4>
|
|
|
|
|
You can also try distributed inference on multiple GPUs if your input folder contains a large number of images. |
|
|
Since we use [🤗 Accelerate](https://huggingface.co/docs/accelerate/index) to launch distributed runs, you first need to configure Accelerate for how your system is set up for distributed processing. To do so, run the following command and answer the prompts:
|
|
|
|
|
```bash |
|
|
accelerate config |
|
|
``` |
|
|
|
|
|
Then run: |
|
|
```bash |
|
|
accelerate launch main.py --mode infer --cfg demo |
|
|
``` |
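Launch options can also be passed directly on the command line; for example, to run on 4 GPUs of a single machine (standard 🤗 Accelerate flags; adjust the process count to your hardware):

```bash
# Distributed inference on 4 local GPUs without relying on a saved `accelerate config`.
accelerate launch --multi_gpu --num_processes 4 main.py --mode infer --cfg demo
```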
|
|
|
|
|
<!-- ## Datasets Preparation |
|
|
|
|
|
Coming soon. |
|
|
|
|
|
## Training and Evaluation |
|
|
|
|
|
Coming soon. --> |
|
|
|
|
|
## Citing |
|
|
|
|
|
If you find this code useful for your research, please consider citing our paper: |
|
|
```bibtex |
|
|
@InProceedings{Su_2025_CVPR, |
|
|
author = {Su, Chi and Ma, Xiaoxuan and Su, Jiajun and Wang, Yizhou}, |
|
|
title = {SAT-HMR: Real-Time Multi-Person 3D Mesh Estimation via Scale-Adaptive Tokens}, |
|
|
booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)}, |
|
|
month = {June}, |
|
|
year = {2025}, |
|
|
pages = {16796-16806} |
|
|
} |
|
|
``` |
|
|
|
|
|
## Acknowledgement |
|
|
This repo is built on the excellent work of [DINOv2](https://github.com/facebookresearch/dinov2), [DAB-DETR](https://github.com/IDEA-Research/DAB-DETR), [DINO](https://github.com/IDEA-Research/DINO), and [🤗 Accelerate](https://huggingface.co/docs/accelerate/index). Thanks to the authors of these great projects.