Commit 7b99bfa
1 Parent(s): 718c1e6

fix
CLAUDE.md ADDED
@@ -0,0 +1,123 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

This is the AI Toolkit by Ostris, packaged as a Hugging Face Space for Docker deployment. It is a training suite for diffusion models that supports recent models on consumer-grade hardware. The toolkit includes both a CLI and a web UI for training LoRA models, with a particular focus on FLUX.1 models.

## Architecture

### Core Structure
- **Main Entry Points**:
  - `run.py` - CLI interface for running training jobs with config files
  - `flux_train_ui.py` - Gradio-based simple training interface
  - `start.sh` - Docker entry point that launches the web UI

- **Web UI** (`ui/`): Next.js application with TypeScript
  - Frontend in `src/app/` with API routes
  - Background worker process for job management
  - SQLite database via Prisma for job persistence

- **Core Toolkit** (`toolkit/`): Python modules for ML operations
  - Model implementations in `toolkit/models/`
  - Training processes in `jobs/process/`
  - Configuration management and data loading utilities

- **Extensions** (`extensions_built_in/`): Modular training components
  - Support for various model types (FLUX, SDXL, SD 1.5, etc.)
  - Different training strategies (LoRA, fine-tuning, etc.)

### Key Configuration
- Training configs live in `config/examples/` in YAML format
- Docker setup supports GPU passthrough with the NVIDIA runtime
- Environment variables supply HuggingFace tokens and authentication settings

## Common Development Commands

### Setup and Installation
```bash
# Python environment setup
python3 -m venv venv
source venv/bin/activate  # or .\venv\Scripts\activate on Windows
pip3 install --no-cache-dir torch==2.7.0 torchvision==0.22.0 torchaudio==2.7.0 --index-url https://download.pytorch.org/whl/cu126
pip3 install -r requirements.txt
```

### Running Training Jobs
```bash
# CLI training with config file
python run.py config/your_config.yml

# Simple Gradio UI for FLUX training
python flux_train_ui.py
```

### Web UI Development
```bash
# Development mode (from ui/ directory)
cd ui
npm install
npm run dev

# Production build and start
npm run build_and_start

# Database updates
npm run update_db
```

### Docker Operations
```bash
# Run with docker-compose
docker-compose up

# Build custom image
docker build -f docker/Dockerfile -t ai-toolkit .
```

## Authentication Requirements

### HuggingFace Access
- FLUX.1-dev requires accepting the license at https://huggingface.co/black-forest-labs/FLUX.1-dev
- Set the `HF_TOKEN` environment variable to an access token with READ permission
- Create a `.env` file in the root directory containing `HF_TOKEN=your_key_here`
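
As a rough illustration of the token setup above, the following Python sketch reads `HF_TOKEN` from the process environment and falls back to a `.env` file. The helper names `read_env_file` and `resolve_hf_token` are hypothetical, and the precedence (environment first, then file) is an assumption; the toolkit's own loading code may behave differently.

```python
import os
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_hf_token(env_path=".env"):
    """Return HF_TOKEN from the environment, else from the .env file, else None.

    The lookup order is an assumption made for this sketch.
    """
    token = os.environ.get("HF_TOKEN")
    if token:
        return token
    if Path(env_path).is_file():
        return read_env_file(env_path).get("HF_TOKEN")
    return None
```

Calling `resolve_hf_token()` from the directory that holds `.env` returns the token string, or `None` when neither source defines it, which is a convenient point to fail early before a FLUX.1-dev download is attempted.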

### UI Security
- Set the `AI_TOOLKIT_AUTH` environment variable to the password used for UI authentication
- If it is not set, the password defaults to "password"
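
The fallback described above can be sketched in Python as follows. This is not the UI's code (the web UI is a Next.js application); the function names are hypothetical, and treating an empty `AI_TOOLKIT_AUTH` the same as an unset one is an assumption made for the sketch.

```python
import hmac
import os

DEFAULT_PASSWORD = "password"  # fallback stated in the notes above


def expected_password(environ=None):
    """Return the effective UI password: AI_TOOLKIT_AUTH, else the default."""
    env = os.environ if environ is None else environ
    # Assumption: an empty value is treated like an unset variable.
    return env.get("AI_TOOLKIT_AUTH") or DEFAULT_PASSWORD


def is_using_default(environ=None):
    """True when the effective password is still the well-known default."""
    return expected_password(environ) == DEFAULT_PASSWORD


def check_password(candidate, environ=None):
    """Compare in constant time to avoid leaking information through timing."""
    return hmac.compare_digest(
        candidate.encode("utf-8"), expected_password(environ).encode("utf-8")
    )
```

A deployment script could call `is_using_default()` at startup and refuse to expose the UI publicly while it returns `True`.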

## Training Configuration

### Model Support
- **FLUX.1-dev**: Requires an HF token; non-commercial license
- **FLUX.1-schnell**: Apache 2.0 license; needs a training adapter
- **SDXL, SD 1.5**: Standard Stable Diffusion models
- **Video models**: Various I2V and text-to-video architectures

### Memory Requirements
- FLUX.1 training requires at least 24 GB of VRAM
- Use `low_vram: true` in the config if the training GPU also has displays attached
- Various quantization options are supported to reduce memory usage

### Dataset Format
- Images: JPG, JPEG, PNG (no WebP)
- Captions: `.txt` files with the same base name as their images
- Use the `[trigger]` placeholder in captions; it is replaced by the `trigger_word` config value
- Images are auto-resized and bucketed, so no manual preprocessing is needed
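
To make the dataset conventions above concrete, here is a minimal Python sketch that pairs each image with its same-named `.txt` caption and substitutes the `[trigger]` placeholder. The function name `load_captions` is hypothetical, and treating a missing caption file as an empty caption is an assumption; the toolkit's actual data loader is not shown in this file and may differ.

```python
from pathlib import Path

# Extensions listed in the notes above; WebP is deliberately excluded.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def load_captions(dataset_dir, trigger_word):
    """Return (image_name, caption) pairs with [trigger] replaced."""
    pairs = []
    for image in sorted(Path(dataset_dir).iterdir()):
        if image.suffix.lower() not in IMAGE_EXTENSIONS:
            continue  # skips caption files and unsupported formats
        caption_file = image.with_suffix(".txt")
        # Assumption: an image without a caption file gets an empty caption.
        caption = ""
        if caption_file.is_file():
            caption = caption_file.read_text(encoding="utf-8").strip()
        pairs.append((image.name, caption.replace("[trigger]", trigger_word)))
    return pairs
```

Running `load_captions("dataset", "sks")` over a folder is a quick way to check, before training, that every image has the caption you expect and that no `[trigger]` placeholder is left unreplaced.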

## Key Files to Understand

- `run.py:46-85` - Main training job runner and argument parsing
- `toolkit/job.py` - Job management and configuration loading
- `ui/src/app/api/jobs/route.ts` - API endpoints for job management
- `config/examples/train_lora_flux_24gb.yaml` - Standard FLUX training template
- `extensions_built_in/sd_trainer/SDTrainer.py` - Core training logic

## Development Notes

- Jobs run independently of the UI; the UI is only for management
- Training can be stopped and resumed via checkpoints
- Output is stored in the `output/` directory, with samples and models
- The extensions system allows custom training implementations
- Multi-GPU support via the accelerate library

start.sh CHANGED
@@ -2,4 +2,4 @@
 set -e  # Exit the script if any statement returns a non-true return value
 
 echo "Starting AI Toolkit UI..."
-cd ui && npm run start
+cd /app/ai-toolkit/ui && npm run start