jbilcke-hf (HF Staff) committed 1595c43 · 1 Parent(s): d72fa8b
Files changed (1): CLAUDE.md +76 -0
CLAUDE.md ADDED
@@ -0,0 +1,76 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

LIA-X is a Portrait Animator application built with Gradio that enables image animation, image editing, and video editing using deep learning models. It is deployed as a Hugging Face Space with GPU acceleration.

## Architecture

### Core Components

1. **Main Application** (`app.py`): Gradio web interface that loads the model and serves three main tabs
2. **Generator Network** (`networks/generator.py`): Core neural network model that handles animation and editing
   - Uses an encoder-decoder architecture
   - Implements motion encoding and style transfer
   - Pre-allocates tensors for performance optimization
3. **Gradio Tabs** (`gradio_tabs/`): UI modules for the different functionalities
   - `animation.py`: Handles image-to-video animation
   - `img_edit.py`: Image editing interface
   - `vid_edit.py`: Video editing interface
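The pre-allocation pattern mentioned above can be sketched as follows. This is a minimal NumPy illustration of the idea only, not code from the repository (the actual generator pre-allocates torch tensors, and `process_frame` is a hypothetical helper):

```python
import numpy as np

# One output buffer is allocated once, up front, and reused for every
# frame, avoiding a fresh allocation inside the per-frame loop.
H = W = 512
out = np.empty((3, H, W), dtype=np.float32)  # allocated once

def process_frame(frame: np.ndarray, out: np.ndarray) -> np.ndarray:
    # Write the result directly into the reused buffer.
    np.multiply(frame, 0.5, out=out)
    return out

frame = np.ones((3, H, W), dtype=np.float32)
result = process_frame(frame, out)
```

The returned array is the same object as the pre-allocated buffer, so repeated calls do not grow allocation pressure.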

### Model Architecture

- **Encoder** (`networks/encoder.py`): Encodes source images and motion
- **Decoder** (`networks/decoder.py`): Reconstructs edited/animated outputs
- **Custom Ops** (`networks/op/`): CUDA kernels for optimized operations (fused_act, upfirdn2d)

## Development Commands

### Running the Application

```bash
python app.py
```

The app launches a Gradio interface on a local server. Note: a CUDA-capable GPU is required.

### Installing Dependencies

```bash
pip install -r requirements.txt
```

Key dependencies: PyTorch 2.5.1, torchvision, Gradio 5.42.0, einops, imageio, av

### Model Loading

The model checkpoint is automatically downloaded from the Hugging Face Hub:
- Repository: `YaohuiW/LIA-X`
- File: `lia-x.pt`
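The download step can be reproduced directly with `huggingface_hub` (a minimal sketch; the helper name `fetch_checkpoint` is illustrative, not from the repo):

```python
from huggingface_hub import hf_hub_download

# Repository and filename as stated above.
REPO_ID = "YaohuiW/LIA-X"
FILENAME = "lia-x.pt"

def fetch_checkpoint() -> str:
    # Downloads the checkpoint on first use and returns the cached local path.
    return hf_hub_download(repo_id=REPO_ID, filename=FILENAME)

if __name__ == "__main__":
    print(fetch_checkpoint())
```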

## Important Notes

- This is a GPU-only application (it uses `torch.device("cuda")`)
- Uses the `@spaces` decorator for Hugging Face Spaces GPU allocation
- The model operates at 512x512 resolution with `motion_dim=40`
- Videos are processed in chunks of 16 frames
- The custom CUDA kernels in `networks/op/` require compilation with ninja
- Git LFS is configured for large files (models, videos, images)
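The 16-frame chunking can be sketched as follows (an illustrative helper, not code from the repo):

```python
def chunk_frames(frames, chunk_size=16):
    """Yield successive fixed-size chunks of a frame sequence."""
    for start in range(0, len(frames), chunk_size):
        yield frames[start:start + chunk_size]

# 40 frames split into chunks of 16, 16, and 8
sizes = [len(c) for c in chunk_frames(list(range(40)))]
```

Processing fixed-size chunks rather than the whole clip keeps GPU memory usage bounded regardless of video length.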

## File Processing

- Images: loaded as RGB, resized to 512x512, normalized to [-1, 1]
- Videos: processed with torchvision; the original FPS is preserved
- Cropping tools are supported for better results (referenced in `instruction.md`)
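The normalization step above can be sketched as a [0, 255] → [-1, 1] mapping. This is a NumPy illustration of the arithmetic only; the app itself applies the equivalent transform on torch tensors:

```python
import numpy as np

def normalize(img_uint8: np.ndarray) -> np.ndarray:
    # Map uint8 pixel values from [0, 255] to float32 in [-1, 1].
    x = img_uint8.astype(np.float32) / 255.0  # [0, 1]
    return x * 2.0 - 1.0                      # [-1, 1]

img = np.array([[0, 128, 255]], dtype=np.uint8)
out = normalize(img)
```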

## Testing

No explicit test suite is present; testing is done manually through the Gradio interface.

## Data Structure

- `data/source/`: Source images for the examples
- `data/driving/`: Driving videos for the animation examples
- `assets/`: Documentation and UI text (`instruction.md`, `title.md`)