File size: 3,327 Bytes
63a2ca2
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
# Download Pretrained Models

All models are stored in `HunyuanVideo/ckpts` by default, and the file structure is as follows
```shell

HunyuanVideo

  ├──ckpts

  │  ├──README.md

  │  ├──hunyuan-video-t2v-720p

  │  │  ├──transformers

  │  │  │  ├──mp_rank_00_model_states.pt

  │  │  │  ├──mp_rank_00_model_states_fp8.pt

  │  │  │  ├──mp_rank_00_model_states_fp8_map.pt

  ├  │  ├──vae

  │  ├──text_encoder

  │  ├──text_encoder_2

  ├──...

```

## Download HunyuanVideo model
To download the HunyuanVideo model, first install the huggingface-cli. (Detailed instructions are available [here](https://huggingface.co/docs/huggingface_hub/guides/cli).)

```shell

python -m pip install "huggingface_hub[cli]"

```

Then download the model using the following commands:

```shell

# Switch to the directory named 'HunyuanVideo'

cd HunyuanVideo

# Use the huggingface-cli tool to download HunyuanVideo model in HunyuanVideo/ckpts dir.

# The download time may vary from 10 minutes to 1 hour depending on network conditions.

huggingface-cli download tencent/HunyuanVideo --local-dir ./ckpts

```

<details>
<summary>💡Tips for using huggingface-cli (network problem)</summary>

##### 1. Using HF-Mirror

If you encounter slow download speeds in China, you can try a mirror to speed up the download process. For example,

```shell

HF_ENDPOINT=https://hf-mirror.com huggingface-cli download tencent/HunyuanVideo --local-dir ./ckpts

```

##### 2. Resume Download

`huggingface-cli` supports resuming downloads. If the download is interrupted, you can just rerun the download 
command to resume the download process.

Note: If an `No such file or directory: 'ckpts/.huggingface/.gitignore.lock'` like error occurs during the download 
process, you can ignore the error and rerun the download command.

</details>

---

## Download Text Encoder

HunyuanVideo uses an MLLM model and a CLIP model as text encoder.

1. MLLM model (text_encoder folder)



HunyuanVideo supports different MLLMs (including HunyuanMLLM and open-source MLLM models). At this stage, we have not yet released HunyuanMLLM. We recommend the user in community to use [llava-llama-3-8b](https://huggingface.co/xtuner/llava-llama-3-8b-v1_1-transformers) provided by [Xtuer](https://huggingface.co/xtuner), which can be downloaded by the following command



```shell

cd HunyuanVideo/ckpts

huggingface-cli download xtuner/llava-llama-3-8b-v1_1-transformers --local-dir ./llava-llama-3-8b-v1_1-transformers

```



In order to save GPU memory resources for model loading, we separate the language model parts of `llava-llama-3-8b-v1_1-transformers` into `text_encoder`.

```

cd HunyuanVideo

python hyvideo/utils/preprocess_text_encoder_tokenizer_utils.py --input_dir ckpts/llava-llama-3-8b-v1_1-transformers --output_dir ckpts/text_encoder

```



2. CLIP model (text_encoder_2 folder)



We use [CLIP](https://huggingface.co/openai/clip-vit-large-patch14) provided by [OpenAI](https://openai.com) as another text encoder, users in the community can download this model by the following command



```

cd HunyuanVideo/ckpts

huggingface-cli download openai/clip-vit-large-patch14 --local-dir ./text_encoder_2

```