
βš™οΈ CompI Phase 3.E: Performance, Model Management & Reliability - Complete Guide

🎯 What Phase 3.E Delivers

Phase 3.E transforms CompI into a production-grade platform with professional performance management, intelligent reliability, and advanced model capabilities.

🤖 Model Manager

  • Dynamic Model Switching: Switch between SD 1.5 and SDXL based on requirements
  • Auto-Availability Checking: Intelligent detection of model compatibility and VRAM requirements
  • Universal LoRA Support: Load and scale LoRA weights across all models and generation modes
  • Smart Recommendations: Hardware-based model suggestions and optimization advice

⚡ Performance Controls

  • xFormers Integration: Memory-efficient attention with automatic fallback
  • Advanced Memory Optimization: Attention slicing, VAE slicing/tiling, CPU offloading
  • Precision Control: Automatic dtype selection (fp16/bf16/fp32) based on hardware
  • Batch Optimization: Memory-aware batch processing with intelligent sizing
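The precision-control logic above can be sketched as a small helper. `select_dtype` and its arguments are illustrative names for this guide, not CompI's actual API:

```python
def select_dtype(has_cuda: bool, supports_bf16: bool, force_fp32: bool = False) -> str:
    """Pick a precision based on hardware (illustrative helper, not CompI's API)."""
    if force_fp32 or not has_cuda:
        return "fp32"  # CPU paths and debugging stay in full precision
    if supports_bf16:
        return "bf16"  # Ampere+ GPUs: bfloat16 resists fp16 overflow
    return "fp16"      # older CUDA GPUs: half precision saves activation VRAM
```

With PyTorch, the `supports_bf16` input would typically come from `torch.cuda.is_bf16_supported()`.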

📊 VRAM Monitoring

  • Real-time Tracking: Live GPU memory usage monitoring and alerts
  • Usage Analytics: Memory usage patterns and optimization suggestions
  • Threshold Warnings: Automatic alerts when approaching memory limits
  • Cache Management: Intelligent GPU cache clearing and memory cleanup

πŸ›‘οΈ Reliability Engine

  • OOM-Safe Generation: Automatic retry with progressive fallback strategies
  • Intelligent Fallbacks: Reduce size → reduce steps → CPU fallback progression
  • Error Classification: Smart error detection and appropriate response strategies
  • Graceful Degradation: Maintain functionality even under resource constraints

📦 Batch Processing

  • Seed-Controlled Batches: Deterministic seed sequences for reproducible results
  • Memory-Aware Batching: Automatic batch size optimization based on available VRAM
  • Progress Tracking: Detailed progress monitoring with per-image status
  • Failure Recovery: Continue batch processing even if individual images fail
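The two batching ideas above can be sketched in a few lines; the helper names and the 20% headroom figure are illustrative assumptions, not CompI's actual implementation:

```python
def batch_seeds(base_seed: int, count: int) -> list:
    """Deterministic seed sequence: image i always gets base_seed + i,
    so any single image in a batch can be reproduced on its own."""
    return [base_seed + i for i in range(count)]

def safe_batch_size(free_vram_gb: float, per_image_gb: float, cap: int = 8) -> int:
    """Memory-aware batch size: keep ~20% VRAM headroom, always allow 1 image."""
    usable = free_vram_gb * 0.8
    return max(1, min(cap, int(usable // per_image_gb)))
```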

πŸ” Upscaler Integration

  • Latent Upscaler: Optional 2x upscaling using Stable Diffusion Latent Upscaler
  • Graceful Degradation: Clean fallback when upscaler unavailable
  • Memory Management: Intelligent memory allocation for upscaling operations
  • Quality Enhancement: Professional-grade image enhancement capabilities

🚀 Quick Start Guide

1. Launch Phase 3.E

```bash
# Method 1: Using launcher script (recommended)
python run_phase3e_performance_manager.py

# Method 2: Direct Streamlit launch
streamlit run src/ui/compi_phase3e_performance_manager.py --server.port 8505
```

2. System Requirements Check

The launcher automatically checks:

  • GPU Setup: CUDA availability and VRAM capacity
  • Dependencies: Required and optional packages
  • Model Support: SD 1.5 and SDXL availability
  • Performance Features: xFormers and upscaler support
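A dependency probe of this kind can be written with the standard library alone; the default package list below is a guess at what the launcher checks, not its actual code:

```python
import importlib.util

def check_optional_features(packages=("xformers", "diffusers", "transformers")) -> dict:
    """Report which optional packages are importable, without importing them."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}
```

Probing with `find_spec` avoids paying the import cost (or triggering CUDA initialization) just to build a status report.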

3. Access the Interface

  • URL: http://localhost:8505
  • Interface: Professional Streamlit dashboard with real-time monitoring
  • Sidebar: Live VRAM monitoring and system status

🎨 Professional Workflow

Step 1: Model Selection

  1. Choose Base Model: SD 1.5 (fast, compatible) or SDXL (high quality, more VRAM)
  2. Select Generation Mode: txt2img or img2img
  3. Check Compatibility: System automatically validates model/mode combinations
  4. Review VRAM Requirements: See memory requirements and availability status

Step 2: LoRA Integration (Optional)

  1. Enable LoRA: Toggle LoRA support
  2. Specify Path: Enter path to LoRA weights (diffusers format)
  3. Set Scale: Adjust LoRA influence (0.1-2.0)
  4. Verify Status: Check LoRA loading status and compatibility

Step 3: Performance Optimization

  1. Choose Optimization Level: Conservative, Balanced, Aggressive, or Extreme
  2. Monitor VRAM: Watch real-time memory usage in sidebar
  3. Adjust Settings: Fine-tune individual optimization features
  4. Enable Reliability: Configure OOM retry and CPU fallback options

Step 4: Generation

  1. Single Images: Generate individual images with full control
  2. Batch Processing: Create multiple images with seed sequences
  3. Monitor Progress: Track generation progress and memory usage
  4. Review Results: Analyze generation statistics and performance metrics

🔧 Advanced Features

🤖 Model Manager Deep Dive

Model Compatibility Matrix

| Capability | SD 1.5 | SDXL |
| --- | --- | --- |
| txt2img | ✅ 512x512 optimal | ✅ 1024x1024 optimal |
| img2img | ✅ all strengths | ✅ limited support |
| ControlNet | ✅ full support | ⚠️ requires special handling |
| LoRA | ✅ universal compatibility | ✅ SDXL-compatible weights only |
| 💾 VRAM | 4+ GB recommended | 8+ GB recommended |

Automatic Model Selection Logic

  • VRAM < 6GB: Recommends SD 1.5 only
  • VRAM 6-8GB: SD 1.5 preferred, SDXL with warnings
  • VRAM 8GB+: Full SDXL support with optimizations
  • CPU Mode: SD 1.5 only with aggressive optimizations
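The selection logic above amounts to a simple threshold function. The sketch below mirrors the table, with names of our own choosing rather than CompI's:

```python
from typing import Optional, Tuple

def recommend_model(vram_gb: Optional[float]) -> Tuple[str, str]:
    """Map available VRAM (None = CPU-only mode) to a model recommendation."""
    if vram_gb is None:
        return "sd15", "CPU mode: SD 1.5 only, with aggressive optimizations"
    if vram_gb < 6:
        return "sd15", "Under 6 GB: SD 1.5 only"
    if vram_gb < 8:
        return "sd15", "6-8 GB: SD 1.5 preferred; SDXL possible with warnings"
    return "sdxl", "8 GB+: full SDXL support with optimizations"
```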

⚡ Performance Optimization Levels

Conservative Mode

  • Basic attention slicing
  • Standard precision (fp16/fp32)
  • Minimal memory optimizations
  • Best for: Stable systems, first-time users

Balanced Mode (Default)

  • xFormers attention (if available)
  • Attention + VAE slicing
  • Automatic precision selection
  • Best for: Most users, good performance/stability balance

Aggressive Mode

  • All memory optimizations enabled
  • VAE tiling for large images
  • Maximum memory efficiency
  • Best for: Limited VRAM, large batch processing

Extreme Mode

  • CPU offloading enabled
  • Maximum memory savings
  • Slower but uses minimal VRAM
  • Best for: Very limited VRAM (<4GB)
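The four levels boil down to a table of feature flags. The mapping below is our reading of the level descriptions above; the key names are illustrative, not CompI's configuration schema:

```python
# Feature flags per optimization level (illustrative, inferred from the
# level descriptions; not CompI's actual configuration keys).
OPTIMIZATION_LEVELS = {
    "conservative": dict(xformers=False, attention_slicing=True,
                         vae_slicing=False, vae_tiling=False, cpu_offload=False),
    "balanced":     dict(xformers=True, attention_slicing=True,
                         vae_slicing=True, vae_tiling=False, cpu_offload=False),
    "aggressive":   dict(xformers=True, attention_slicing=True,
                         vae_slicing=True, vae_tiling=True, cpu_offload=False),
    "extreme":      dict(xformers=True, attention_slicing=True,
                         vae_slicing=True, vae_tiling=True, cpu_offload=True),
}
```

On a diffusers pipeline, these flags would map to calls such as `pipe.enable_attention_slicing()`, `pipe.enable_vae_slicing()`, `pipe.enable_vae_tiling()`, `pipe.enable_model_cpu_offload()`, and `pipe.enable_xformers_memory_efficient_attention()`.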

πŸ›‘οΈ Reliability Engine Strategies

Fallback Progression

```text
Strategy 1: Original settings (100% size, 100% steps)
Strategy 2: Reduced size      (75% size, 90% steps)
Strategy 3: Half size         (50% size, 80% steps)
Strategy 4: Minimal           (50% size, 60% steps)
Final:      CPU fallback if all GPU attempts fail
```
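The progression can be expressed as a small generator. The multiple-of-64 snapping is a common Stable Diffusion sizing convention that we assume here:

```python
FALLBACK_STEPS = [  # (size fraction, step fraction), tried in order
    (1.00, 1.00),   # Strategy 1: original settings
    (0.75, 0.90),   # Strategy 2: reduced size
    (0.50, 0.80),   # Strategy 3: half size
    (0.50, 0.60),   # Strategy 4: minimal
]

def fallback_plan(width: int, height: int, steps: int):
    """Yield progressively cheaper (width, height, steps) attempts.
    Dimensions are snapped down to multiples of 64 (a common SD convention)."""
    for size_f, step_f in FALLBACK_STEPS:
        w = max(64, int(width * size_f) // 64 * 64)
        h = max(64, int(height * size_f) // 64 * 64)
        yield w, h, max(1, int(steps * step_f))
```

A reliability loop would try each attempt in order, catching the OOM error and clearing the GPU cache between attempts, then fall back to CPU only if every GPU attempt fails.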

Error Classification

  • CUDA OOM: Triggers progressive fallback
  • Model Loading: Suggests alternative models
  • LoRA Errors: Disables LoRA and retries
  • General Errors: Logs and reports with context

📊 VRAM Monitoring System

Real-time Metrics

  • Total VRAM: Hardware capacity
  • Used VRAM: Currently allocated memory
  • Free VRAM: Available for new operations
  • Usage Percentage: Current utilization level

Smart Alerts

  • Green (0-60%): Optimal usage
  • Yellow (60-80%): Moderate usage, monitor closely
  • Red (80%+): High usage, optimization recommended
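The alert bands reduce to a single thresholding function; this sketch mirrors the colors above:

```python
def vram_alert(used_gb: float, total_gb: float) -> str:
    """Classify VRAM utilization into the green/yellow/red bands above."""
    pct = 100.0 * used_gb / total_gb
    if pct >= 80:
        return "red"     # high usage: optimization recommended
    if pct >= 60:
        return "yellow"  # moderate usage: monitor closely
    return "green"       # optimal usage
```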

Memory Management

  • Automatic Cache Clearing: Between batch generations
  • Memory Leak Detection: Identifies and resolves memory issues
  • Optimization Suggestions: Hardware-specific recommendations

📈 Performance Benchmarks

Generation Speed Comparison

```text
SD 1.5 (512x512, 20 steps):
  RTX 4090: ~15-25 seconds
  RTX 3080: ~25-35 seconds
  RTX 2080: ~45-60 seconds
  CPU: ~5-10 minutes

SDXL (1024x1024, 20 steps):
  RTX 4090: ~30-45 seconds
  RTX 3080: ~60-90 seconds
  RTX 2080: ~2-3 minutes (with optimizations)
  CPU: ~15-30 minutes
```

Memory Usage Patterns

```text
SD 1.5:
  Base: ~3.5GB VRAM
  + LoRA: ~3.7GB VRAM
  + Upscaler: ~5.5GB VRAM

SDXL:
  Base: ~6.5GB VRAM
  + LoRA: ~7.0GB VRAM
  + Upscaler: ~9.0GB VRAM
```

πŸ” Troubleshooting Guide

Common Issues & Solutions

"CUDA Out of Memory" Errors

  1. Enable OOM Auto-Retry: Automatic fallback handling
  2. Reduce Image Size: Use 512x512 instead of 1024x1024
  3. Lower Batch Size: Generate fewer images simultaneously
  4. Enable Aggressive Optimizations: Use VAE slicing/tiling
  5. Clear GPU Cache: Use sidebar "Clear GPU Cache" button

Slow Generation Speed

  1. Enable xFormers: Significant speed improvement if available
  2. Use Balanced Optimization: Good speed/quality trade-off
  3. Reduce Inference Steps: 15-20 steps often sufficient
  4. Check VRAM Usage: Ensure not hitting memory limits

Model Loading Failures

  1. Check Internet Connection: Models download on first use
  2. Verify Disk Space: Models require 2-7GB storage each
  3. Try Alternative Model: Switch between SD 1.5 and SDXL
  4. Clear Model Cache: Remove cached models and re-download

LoRA Loading Issues

  1. Verify Path: Ensure LoRA files exist at specified path
  2. Check Format: Use diffusers-compatible LoRA weights
  3. Model Compatibility: Ensure LoRA matches base model type
  4. Scale Adjustment: Try different LoRA scale values

🎯 Best Practices

πŸ“ Performance Optimization

  1. Start Conservative: Begin with balanced settings, adjust as needed
  2. Monitor VRAM: Keep usage below 80% for stability
  3. Batch Wisely: Use smaller batches on limited hardware
  4. Clear Cache Regularly: Prevent memory accumulation

🤖 Model Selection

  1. SD 1.5 for Speed: Faster generation, lower VRAM requirements
  2. SDXL for Quality: Higher resolution, better detail
  3. Match Hardware: Choose model based on available VRAM
  4. Test Compatibility: Verify model works with your use case

πŸ›‘οΈ Reliability

  1. Enable Auto-Retry: Let system handle OOM errors automatically
  2. Use Fallbacks: Allow progressive degradation for reliability
  3. Monitor Logs: Check run logs for patterns and issues
  4. Plan for Failures: Design workflows that handle generation failures

🚀 Integration with CompI Ecosystem

Universal Enhancement

Phase 3.E enhances ALL existing CompI components:

  • Ultimate Dashboard: Model switching and performance controls
  • Phase 2.A-2.E: Reliability and optimization for all multimodal phases
  • Phase 1.A-1.E: Enhanced foundation with professional features
  • Phase 3.D: Performance metrics in workflow management

Backward Compatibility

  • Graceful Degradation: Works on all hardware configurations
  • Default Settings: Optimal defaults for most users
  • Progressive Enhancement: Advanced features when available
  • Legacy Support: Maintains compatibility with existing workflows

🎉 Phase 3.E: Production-Grade CompI Complete

Phase 3.E completes CompI's evolution into a production-grade platform, combining professional performance management, intelligent reliability, and advanced model capabilities across every generation mode.

Key Benefits:

  • ✅ Professional Performance: Industry-standard optimization and monitoring
  • ✅ Intelligent Reliability: Automatic error handling and recovery
  • ✅ Advanced Model Management: Dynamic switching and LoRA integration
  • ✅ Production Ready: Suitable for commercial and professional use
  • ✅ Universal Enhancement: Improves all existing CompI features

CompI is now a complete, production-grade multimodal AI art generation platform! 🎨✨