jbilcke-hf (HF Staff) committed
Commit 7b99bfa · 1 Parent(s): 718c1e6
Files changed (2):
1. CLAUDE.md +123 -0
2. start.sh +1 -1
CLAUDE.md ADDED
@@ -0,0 +1,123 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

This is the AI Toolkit by Ostris, packaged as a Hugging Face Space for Docker deployment. It is a comprehensive training suite for diffusion models that supports the latest models on consumer-grade hardware. The toolkit includes both CLI and web UI interfaces for training LoRA models, with a particular focus on FLUX.1 models.

## Architecture

### Core Structure

- **Main Entry Points**:
  - `run.py` - CLI interface for running training jobs with config files
  - `flux_train_ui.py` - Gradio-based simple training interface
  - `start.sh` - Docker entry point that launches the web UI

- **Web UI** (`ui/`): Next.js application with TypeScript
  - Frontend in `src/app/` with API routes
  - Background worker process for job management
  - SQLite database via Prisma for job persistence

- **Core Toolkit** (`toolkit/`): Python modules for ML operations
  - Model implementations in `toolkit/models/`
  - Training processes in `jobs/process/`
  - Configuration management and data-loading utilities

- **Extensions** (`extensions_built_in/`): Modular training components
  - Support for various model types (FLUX, SDXL, SD 1.5, etc.)
  - Different training strategies (LoRA, fine-tuning, etc.)

### Key Configuration

- Training configs in `config/examples/` use YAML format
- Docker setup supports GPU passthrough via the NVIDIA runtime
- Environment variables configure Hugging Face tokens and authentication

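The YAML configs follow a job/process structure. The sketch below is illustrative only (field names and nesting are assumptions, not copied from the repository); consult `config/examples/train_lora_flux_24gb.yaml` for the authoritative template:

```yaml
# Illustrative sketch of a training config; field names are assumptions.
# See config/examples/ for the real templates.
job: extension
config:
  name: my_flux_lora
  process:
    - type: sd_trainer
      trigger_word: mytoken          # replaces [trigger] in captions
      network:
        type: lora
        linear: 16
      model:
        name_or_path: black-forest-labs/FLUX.1-dev
        low_vram: true               # use if running with displays attached
      datasets:
        - folder_path: /path/to/dataset
```
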
## Common Development Commands

### Setup and Installation

```bash
# Python environment setup
python3 -m venv venv
source venv/bin/activate  # or .\venv\Scripts\activate on Windows
pip3 install --no-cache-dir torch==2.7.0 torchvision==0.22.0 torchaudio==2.7.0 --index-url https://download.pytorch.org/whl/cu126
pip3 install -r requirements.txt
```

### Running Training Jobs

```bash
# CLI training with config file
python run.py config/your_config.yml

# Simple Gradio UI for FLUX training
python flux_train_ui.py
```

### Web UI Development

```bash
# Development mode (from ui/ directory)
cd ui
npm install
npm run dev

# Production build and start
npm run build_and_start

# Database updates
npm run update_db
```

### Docker Operations

```bash
# Run with docker-compose
docker-compose up

# Build custom image
docker build -f docker/Dockerfile -t ai-toolkit .
```

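GPU passthrough in Compose is typically declared as a device reservation. The fragment below is a sketch of that pattern (the service name and build paths are assumptions), not the repository's actual docker-compose.yml:

```yaml
# Sketch only: GPU passthrough via the NVIDIA runtime in Compose.
# Service name and build context are assumptions.
services:
  ai-toolkit:
    build:
      context: .
      dockerfile: docker/Dockerfile
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```
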
## Authentication Requirements

### HuggingFace Access

- FLUX.1-dev requires accepting the license at https://huggingface.co/black-forest-labs/FLUX.1-dev
- Set the `HF_TOKEN` environment variable to a token with READ access
- Create a `.env` file in the repository root: `HF_TOKEN=your_key_here`

### UI Security

- Set the `AI_TOOLKIT_AUTH` environment variable for UI authentication
- The password defaults to "password" if the variable is not set
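
Putting both together, a minimal setup might look like this (both values are placeholders you must replace with your own):

```shell
# Write the Hugging Face token to .env in the repo root (placeholder value)
cat > .env <<'EOF'
HF_TOKEN=your_key_here
EOF

# Protect the web UI with a password instead of the default "password"
export AI_TOOLKIT_AUTH='choose_a_strong_password'
```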

## Training Configuration

### Model Support

- **FLUX.1-dev**: Requires an HF token; non-commercial license
- **FLUX.1-schnell**: Apache 2.0; needs a training adapter
- **SDXL, SD 1.5**: Standard Stable Diffusion models
- **Video models**: Various I2V and text-to-video architectures

### Memory Requirements

- FLUX.1 training requires a minimum of 24GB VRAM
- Use `low_vram: true` in the config if running with displays attached
- Various quantization options are supported to reduce memory usage

### Dataset Format

- Images: JPG, JPEG, PNG (no WebP)
- Captions: `.txt` files with the same basename as the images
- Use the `[trigger]` placeholder in captions; it is replaced by the `trigger_word` config value
- Images are auto-resized and bucketed; no manual preprocessing is needed

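As a concrete example, the commands below create one caption file in a dataset folder (folder and file names are arbitrary choices, not required by the toolkit):

```shell
# Each image gets a .txt caption with the same basename.
mkdir -p dataset/my_subject
# (place your images here, e.g. dataset/my_subject/photo1.jpg)

# The [trigger] placeholder is substituted with trigger_word at training time.
echo 'a photo of [trigger] standing in a park' > dataset/my_subject/photo1.txt
```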
## Key Files to Understand

- `run.py:46-85` - Main training job runner and argument parsing
- `toolkit/job.py` - Job management and configuration loading
- `ui/src/app/api/jobs/route.ts` - API endpoints for job management
- `config/examples/train_lora_flux_24gb.yaml` - Standard FLUX training template
- `extensions_built_in/sd_trainer/SDTrainer.py` - Core training logic

## Development Notes

- Jobs run independently of the UI; the UI is only for management
- Training can be stopped and resumed via checkpoints
- Output is stored in the `output/` directory with samples and models
- The extensions system allows custom training implementations
- Multi-GPU support via the accelerate library

start.sh CHANGED
@@ -2,4 +2,4 @@
 set -e # Exit the script if any statement returns a non-true return value
 
 echo "Starting AI Toolkit UI..."
-cd ui && npm run start
+cd /app/ai-toolkit/ui && npm run start