# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
This is the AI Toolkit by Ostris, packaged as a Hugging Face Space for Docker deployment. It is a training suite for diffusion models that supports recent models on consumer-grade hardware. The toolkit includes both a CLI and a web UI for training LoRA models, with a particular focus on FLUX.1 models.
## Architecture
### Core Structure
- **Main Entry Points**:
  - `run.py` - CLI interface for running training jobs with config files
  - `flux_train_ui.py` - Gradio-based simple training interface
  - `start.sh` - Docker entry point that launches the web UI
- **Web UI** (`ui/`): Next.js application with TypeScript
  - Frontend in `src/app/` with API routes
  - Background worker process for job management
  - SQLite database via Prisma for job persistence
- **Core Toolkit** (`toolkit/`): Python modules for ML operations
  - Model implementations in `toolkit/models/`
  - Training processes in `jobs/process/`
  - Configuration management and data loading utilities
- **Extensions** (`extensions_built_in/`): Modular training components
  - Support for various model types (FLUX, SDXL, SD 1.5, etc.)
  - Different training strategies (LoRA, fine-tuning, etc.)
### Key Configuration
- Training configs live in `config/examples/` in YAML format
- The Docker setup supports GPU passthrough with the NVIDIA runtime
- Environment variables supply the HuggingFace token and UI authentication
## Common Development Commands
### Setup and Installation
```bash
# Python environment setup
python3 -m venv venv
source venv/bin/activate # or .\venv\Scripts\activate on Windows
pip3 install --no-cache-dir torch==2.7.0 torchvision==0.22.0 torchaudio==2.7.0 --index-url https://download.pytorch.org/whl/cu126
pip3 install -r requirements.txt
```
### Running Training Jobs
```bash
# CLI training with config file
python run.py config/your_config.yml
# Simple Gradio UI for FLUX training
python flux_train_ui.py
```
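When launching training from a script rather than a shell, the CLI call above can be wrapped in a small helper. The following is a minimal sketch, not part of the toolkit: `build_train_command` and `launch_training` are hypothetical names, and the extension check is an assumption based only on the YAML configs mentioned in this file.

```python
import subprocess
import sys
from pathlib import Path


def build_train_command(config_path: str) -> list:
    """Return the argv for `python run.py <config>` after a basic sanity check.

    Hypothetical helper: it only mirrors the CLI call documented above.
    """
    path = Path(config_path)
    # Assumption: training configs are YAML files, as described in this document.
    if path.suffix.lower() not in {".yml", ".yaml"}:
        raise ValueError(f"expected a YAML config, got: {path.name}")
    return [sys.executable, "run.py", str(path)]


def launch_training(config_path: str) -> None:
    """Run the training job from the repository root and raise on failure."""
    subprocess.run(build_train_command(config_path), check=True)
```

Calling `launch_training("config/your_config.yml")` from the repository root would be equivalent to the CLI command shown above.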
### Web UI Development
```bash
# Development mode (from ui/ directory)
cd ui
npm install
npm run dev
# Production build and start
npm run build_and_start
# Database updates
npm run update_db
```
### Docker Operations
```bash
# Run with docker-compose
docker-compose up
# Build custom image
docker build -f docker/Dockerfile -t ai-toolkit .
```
## Authentication Requirements
### HuggingFace Access
- FLUX.1-dev requires accepting the license at https://huggingface.co/black-forest-labs/FLUX.1-dev
- Set the `HF_TOKEN` environment variable to an access token with READ permission
- Alternatively, create a `.env` file in the repository root containing `HF_TOKEN=your_key_here`
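To check that a token is actually visible before starting a job, the lookup order described above (environment variable, then `.env` file) can be sketched with the standard library alone. This is an illustrative sketch, not how the toolkit itself loads the file: `load_env_file` and `get_hf_token` are hypothetical helpers, and the simple `KEY=value` parsing is an assumption about the file's shape.

```python
import os
from pathlib import Path
from typing import Dict, Optional


def load_env_file(path: str = ".env") -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_hf_token(env_file: str = ".env") -> Optional[str]:
    """Prefer the HF_TOKEN environment variable, then fall back to the .env file."""
    return os.environ.get("HF_TOKEN") or load_env_file(env_file).get("HF_TOKEN")
```

A `None` result from `get_hf_token()` means neither source provides a token, so gated models such as FLUX.1-dev would not be downloadable.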
### UI Security
- Set the `AI_TOOLKIT_AUTH` environment variable to the password used for UI authentication
- If it is not set, the default password is "password"
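The fallback behaviour described above can be illustrated with a short check that flags when the default password is in effect. This is a sketch under stated assumptions: `resolve_ui_password` is a hypothetical helper, not the UI's actual code, and treating an empty value as "not set" is an assumption.

```python
import os
from typing import Mapping, Tuple

DEFAULT_PASSWORD = "password"  # default stated in this document


def resolve_ui_password(env: Mapping[str, str] = os.environ) -> Tuple[str, bool]:
    """Return (password, is_default) based on AI_TOOLKIT_AUTH.

    Assumption: an empty or whitespace-only value counts as unset.
    """
    value = env.get("AI_TOOLKIT_AUTH", "").strip()
    if value:
        return value, False
    return DEFAULT_PASSWORD, True
```

A deployment script could call `resolve_ui_password()` and refuse to expose the UI publicly when the second element is `True`.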
## Training Configuration
### Model Support
- **FLUX.1-dev**: Requires HF token, non-commercial license
- **FLUX.1-schnell**: Apache 2.0, needs training adapter
- **SDXL, SD 1.5**: Standard Stable Diffusion models
- **Video models**: Various I2V and text-to-video architectures
### Memory Requirements
- FLUX.1 training requires at least 24 GB of VRAM
- Set `low_vram: true` in the config if the training GPU also has displays attached
- Various quantization options are available to reduce memory usage
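As an illustration of the `low_vram` setting, the sketch below toggles the flag on a config that has already been loaded into a Python dictionary. The nesting (`config` -> `process` -> `model`) is an assumption about the YAML layout and should be checked against `config/examples/train_lora_flux_24gb.yaml`; `set_low_vram` and the example dictionary are hypothetical.

```python
import copy


def set_low_vram(config: dict, enabled: bool = True) -> dict:
    """Return a copy of the config with low_vram set on every process's model.

    Assumption: the flag lives at config -> process[i] -> model -> low_vram.
    """
    updated = copy.deepcopy(config)
    for process in updated.get("config", {}).get("process", []):
        process.setdefault("model", {})["low_vram"] = enabled
    return updated


# Hypothetical, heavily trimmed config used only for illustration.
example_config = {
    "config": {
        "process": [
            {"model": {"name_or_path": "black-forest-labs/FLUX.1-dev"}},
        ]
    }
}
```

Because the function works on a deep copy, `example_config` itself is left unchanged and the returned dictionary can be written back out as YAML.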
### Dataset Format
- Images: JPG, JPEG, PNG (no WebP)
- Captions: `.txt` files with the same base name as the corresponding image
- Use the `[trigger]` placeholder in captions; it is replaced by the `trigger_word` value from the config
- Images are auto-resized and bucketed, so no manual preprocessing is needed
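The dataset rules above lend themselves to a quick pre-flight check. The following is a minimal sketch, not the toolkit's own loader: `find_missing_captions` and `fill_trigger` are hypothetical helpers that only mirror the conventions listed here (supported extensions, same-name `.txt` captions, `[trigger]` substitution).

```python
from pathlib import Path
from typing import List

# Extensions listed as supported in this document (WebP is excluded).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def find_missing_captions(folder: str) -> List[str]:
    """Return names of supported images that lack a same-name .txt caption."""
    missing = []
    for image in sorted(Path(folder).iterdir()):
        if not image.is_file() or image.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        if not image.with_suffix(".txt").is_file():
            missing.append(image.name)
    return missing


def fill_trigger(caption: str, trigger_word: str) -> str:
    """Replace the [trigger] placeholder the way the config's trigger_word would."""
    return caption.replace("[trigger]", trigger_word)
```

Running `find_missing_captions` on a dataset folder before training gives a list of images that would otherwise be trained without a caption file.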
## Key Files to Understand
- `run.py:46-85` - Main training job runner and argument parsing
- `toolkit/job.py` - Job management and configuration loading
- `ui/src/app/api/jobs/route.ts` - API endpoints for job management
- `config/examples/train_lora_flux_24gb.yaml` - Standard FLUX training template
- `extensions_built_in/sd_trainer/SDTrainer.py` - Core training logic
## Development Notes
- Jobs run independently of the UI; the UI is only for management
- Training can be stopped and resumed via checkpoints
- Output is stored in the `output/` directory, including samples and models
- The extensions system allows custom training implementations
- Multi-GPU support is provided via the `accelerate` library