# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

This is the AI Toolkit by Ostris, packaged as a Hugging Face Space for Docker deployment. It is a training suite for diffusion models that targets consumer-grade hardware and provides both a CLI and a web UI for training LoRA models, with a particular focus on FLUX.1.

## Architecture

### Core Structure
- **Main Entry Points**: 
  - `run.py` - CLI interface for running training jobs with config files
  - `flux_train_ui.py` - Gradio-based simple training interface
  - `start.sh` - Docker entry point that launches the web UI

- **Web UI** (`ui/`): Next.js application with TypeScript
  - Frontend in `src/app/` with API routes
  - Background worker process for job management
  - SQLite database via Prisma for job persistence

- **Core Toolkit** (`toolkit/`): Python modules for ML operations
  - Model implementations in `toolkit/models/`
  - Training processes in `jobs/process/`
  - Configuration management and data loading utilities

- **Extensions** (`extensions_built_in/`): Modular training components
  - Support for various model types (FLUX, SDXL, SD 1.5, etc.)
  - Different training strategies (LoRA, fine-tuning, etc.)

### Key Configuration
- Training configs in `config/examples/` with YAML format
- Docker setup supports GPU passthrough with nvidia runtime
- Environment variables for HuggingFace tokens and authentication
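
Configs follow a nested job/process layout. The sketch below is illustrative only: field names mirror the bundled examples, but values are placeholders, so compare against `config/examples/train_lora_flux_24gb.yaml` before using it.

```yaml
# Minimal sketch of a FLUX LoRA training config (placeholder values;
# verify field names against config/examples/train_lora_flux_24gb.yaml)
job: extension
config:
  name: my_flux_lora
  process:
    - type: sd_trainer
      training_folder: output
      trigger_word: my_subject
      network:
        type: lora
        linear: 16
        linear_alpha: 16
      datasets:
        - folder_path: /path/to/images
      train:
        steps: 2000
        lr: 1e-4
      model:
        name_or_path: black-forest-labs/FLUX.1-dev
        is_flux: true
        quantize: true
```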

## Common Development Commands

### Setup and Installation
```bash
# Python environment setup
python3 -m venv venv
source venv/bin/activate  # or .\venv\Scripts\activate on Windows
pip3 install --no-cache-dir torch==2.7.0 torchvision==0.22.0 torchaudio==2.7.0 --index-url https://download.pytorch.org/whl/cu126
pip3 install -r requirements.txt
```

### Running Training Jobs
```bash
# CLI training with config file
python run.py config/your_config.yml

# Simple Gradio UI for FLUX training
python flux_train_ui.py
```

### Web UI Development
```bash
# Development mode (from ui/ directory)
cd ui
npm install
npm run dev

# Production build and start
npm run build_and_start

# Database updates
npm run update_db
```

### Docker Operations
```bash
# Run with docker-compose
docker-compose up

# Build custom image
docker build -f docker/Dockerfile -t ai-toolkit .
```
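
GPU passthrough in compose generally requires declaring the nvidia device reservation. The service name and volume paths below are assumptions; check them against the repository's own compose file.

```yaml
# Sketch of a compose service with GPU passthrough (service name and
# container paths are assumptions; adapt to the repo's compose file)
services:
  ai-toolkit:
    build:
      context: .
      dockerfile: docker/Dockerfile
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    env_file: .env
    volumes:
      - ./output:/app/ai-toolkit/output  # container path is an assumption
```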

## Authentication Requirements

### HuggingFace Access
- FLUX.1-dev requires accepting license at https://huggingface.co/black-forest-labs/FLUX.1-dev
- Set `HF_TOKEN` environment variable with READ access token
- Create a `.env` file in the repository root: `HF_TOKEN=your_key_here`
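
For example, from the repository root (the token value is a placeholder):

```bash
# Write the HuggingFace token into .env (replace your_key_here with a
# real READ-access token from your HF account settings)
echo "HF_TOKEN=your_key_here" > .env
```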

### UI Security
- Set `AI_TOOLKIT_AUTH` environment variable for UI authentication
- Default password is "password" if not set
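
For example, set a non-default password in the shell before launching the UI:

```bash
# Set a UI password; without this, the UI falls back to "password"
export AI_TOOLKIT_AUTH="choose-a-strong-password"
```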

## Training Configuration

### Model Support
- **FLUX.1-dev**: Requires HF token, non-commercial license
- **FLUX.1-schnell**: Apache 2.0, needs training adapter
- **SDXL, SD 1.5**: Standard Stable Diffusion models
- **Video models**: Various I2V and text-to-video architectures

### Memory Requirements
- FLUX.1 training requires minimum 24GB VRAM
- Use `low_vram: true` in the config if the training GPU also drives a display
- Supports various quantization options to reduce memory usage
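
In config terms, the relevant model options look roughly like this (key names mirror the bundled examples; verify against your template):

```yaml
model:
  name_or_path: black-forest-labs/FLUX.1-dev
  is_flux: true
  quantize: true   # quantize the transformer to reduce VRAM usage
  low_vram: true   # enable when the GPU also drives a display
```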

### Dataset Format
- Images: JPG, JPEG, PNG (no WebP)
- Captions: `.txt` files with same name as images
- Use `[trigger]` placeholder in captions, replaced by `trigger_word` config
- Images are automatically resized and bucketed; no manual preprocessing is needed
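
A sketch of the expected layout (directory and file names are hypothetical):

```bash
# Hypothetical dataset layout: one caption .txt per image, same basename
mkdir -p dataset/my_subject
echo "a photo of [trigger] standing in a garden" > dataset/my_subject/img_001.txt
# the matching image sits alongside its caption:
#   dataset/my_subject/img_001.jpg
#   dataset/my_subject/img_001.txt
```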

## Key Files to Understand

- `run.py:46-85` - Main training job runner and argument parsing
- `toolkit/job.py` - Job management and configuration loading
- `ui/src/app/api/jobs/route.ts` - API endpoints for job management
- `config/examples/train_lora_flux_24gb.yaml` - Standard FLUX training template
- `extensions_built_in/sd_trainer/SDTrainer.py` - Core training logic

## Development Notes

- Jobs run independently of the UI; the UI is only for management
- Training can be stopped/resumed via checkpoints
- Output stored in `output/` directory with samples and models
- Extensions system allows custom training implementations
- Multi-GPU support via accelerate library