⚙️ CompI Phase 3.E: Performance, Model Management & Reliability - Complete Guide
🎯 What Phase 3.E Delivers
Phase 3.E transforms CompI into a production-grade platform with professional performance management, intelligent reliability, and advanced model capabilities.
🤖 Model Manager
- Dynamic Model Switching: Switch between SD 1.5 and SDXL based on requirements
- Auto-Availability Checking: Intelligent detection of model compatibility and VRAM requirements
- Universal LoRA Support: Load and scale LoRA weights across all models and generation modes
- Smart Recommendations: Hardware-based model suggestions and optimization advice
⚡ Performance Controls
- xFormers Integration: Memory-efficient attention with automatic fallback
- Advanced Memory Optimization: Attention slicing, VAE slicing/tiling, CPU offloading
- Precision Control: Automatic dtype selection (fp16/bf16/fp32) based on hardware
- Batch Optimization: Memory-aware batch processing with intelligent sizing
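A minimal sketch of how these optimizations might be toggled on a diffusers-style pipeline. The `enable_*` method names are the real diffusers ones; the helper function itself is illustrative, and the try/except mirrors the "automatic fallback" behavior described above:

```python
def enable_memory_optimizations(pipe):
    """Try each optimization in turn; fall back when one is unsupported.

    Returns the names of the optimizations that were actually enabled.
    """
    enabled = []
    # Prefer xFormers attention; fall back to attention slicing if missing
    try:
        pipe.enable_xformers_memory_efficient_attention()
        enabled.append("xformers")
    except Exception:
        pipe.enable_attention_slicing()
        enabled.append("attention_slicing")
    # VAE slicing/tiling reduce decode-time memory spikes where supported
    for name in ("enable_vae_slicing", "enable_vae_tiling"):
        method = getattr(pipe, name, None)
        if method is not None:
            method()
            enabled.append(name[len("enable_"):])
    return enabled
```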
📊 VRAM Monitoring
- Real-time Tracking: Live GPU memory usage monitoring and alerts
- Usage Analytics: Memory usage patterns and optimization suggestions
- Threshold Warnings: Automatic alerts when approaching memory limits
- Cache Management: Intelligent GPU cache clearing and memory cleanup
🛡️ Reliability Engine
- OOM-Safe Generation: Automatic retry with progressive fallback strategies
- Intelligent Fallbacks: Reduce size β reduce steps β CPU fallback progression
- Error Classification: Smart error detection and appropriate response strategies
- Graceful Degradation: Maintain functionality even under resource constraints
📦 Batch Processing
- Seed-Controlled Batches: Deterministic seed sequences for reproducible results
- Memory-Aware Batching: Automatic batch size optimization based on available VRAM
- Progress Tracking: Detailed progress monitoring with per-image status
- Failure Recovery: Continue batch processing even if individual images fail
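The seed-controlled batches above can be sketched as a deterministic seed sequence; the helper name and stride parameter are illustrative, not CompI's actual API:

```python
def seed_sequence(base_seed, count, stride=1):
    """Deterministic seeds for a batch: base, base+stride, base+2*stride, ...

    Any single image can later be reproduced by re-running alone with
    the seed recorded for it.
    """
    return [base_seed + i * stride for i in range(count)]

# Each seed would drive its own generator at generation time, e.g.:
#   generator = torch.Generator("cuda").manual_seed(seed)
#   image = pipe(prompt, generator=generator).images[0]
```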
🔍 Upscaler Integration
- Latent Upscaler: Optional 2x upscaling using Stable Diffusion Latent Upscaler
- Graceful Degradation: Clean fallback when upscaler unavailable
- Memory Management: Intelligent memory allocation for upscaling operations
- Quality Enhancement: Professional-grade image enhancement capabilities
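The "graceful degradation" behavior can be sketched as below. The `loader` callable is injected so the heavy model download stays outside the helper; the function name is hypothetical, while `StableDiffusionLatentUpscalePipeline` and the `stabilityai/sd-x2-latent-upscaler` checkpoint are the real diffusers names:

```python
def load_optional_upscaler(loader):
    """Build the upscaler if possible; return None so callers can skip it.

    `loader` is any zero-argument callable, e.g. a lambda wrapping
    StableDiffusionLatentUpscalePipeline.from_pretrained(
        "stabilityai/sd-x2-latent-upscaler").
    """
    try:
        return loader()
    except Exception:
        # Graceful degradation: generation proceeds without 2x upscaling
        return None
```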
🚀 Quick Start Guide
1. Launch Phase 3.E
# Method 1: Using launcher script (recommended)
python run_phase3e_performance_manager.py
# Method 2: Direct Streamlit launch
streamlit run src/ui/compi_phase3e_performance_manager.py --server.port 8505
2. System Requirements Check
The launcher automatically checks:
- GPU Setup: CUDA availability and VRAM capacity
- Dependencies: Required and optional packages
- Model Support: SD 1.5 and SDXL availability
- Performance Features: xFormers and upscaler support
3. Access the Interface
- URL: http://localhost:8505
- Interface: Professional Streamlit dashboard with real-time monitoring
- Sidebar: Live VRAM monitoring and system status
🎨 Professional Workflow
Step 1: Model Selection
- Choose Base Model: SD 1.5 (fast, compatible) or SDXL (high quality, more VRAM)
- Select Generation Mode: txt2img or img2img
- Check Compatibility: System automatically validates model/mode combinations
- Review VRAM Requirements: See memory requirements and availability status
Step 2: LoRA Integration (Optional)
- Enable LoRA: Toggle LoRA support
- Specify Path: Enter path to LoRA weights (diffusers format)
- Set Scale: Adjust LoRA influence (0.1-2.0)
- Verify Status: Check LoRA loading status and compatibility
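The steps above can be sketched as a small helper. `load_lora_weights` is the real diffusers method for diffusers-format LoRA weights; the helper itself and the returned kwargs are illustrative, and the exact scale-passing API varies across diffusers versions:

```python
def attach_lora(pipe, lora_path, scale=1.0):
    """Load diffusers-format LoRA weights; report failure instead of crashing.

    Returns (ok, cross_attention_kwargs); in older diffusers versions the
    kwargs are passed to the pipeline call to scale the LoRA influence.
    """
    try:
        pipe.load_lora_weights(lora_path)
        return True, {"scale": scale}
    except Exception:
        return False, {}

# Usage at generation time (older diffusers API):
#   ok, kwargs = attach_lora(pipe, "path/to/lora", scale=0.8)
#   image = pipe(prompt, cross_attention_kwargs=kwargs).images[0]
```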
Step 3: Performance Optimization
- Choose Optimization Level: Conservative, Balanced, Aggressive, or Extreme
- Monitor VRAM: Watch real-time memory usage in sidebar
- Adjust Settings: Fine-tune individual optimization features
- Enable Reliability: Configure OOM retry and CPU fallback options
Step 4: Generation
- Single Images: Generate individual images with full control
- Batch Processing: Create multiple images with seed sequences
- Monitor Progress: Track generation progress and memory usage
- Review Results: Analyze generation statistics and performance metrics
🔧 Advanced Features
🤖 Model Manager Deep Dive
Model Compatibility Matrix
SD 1.5:
- ✅ txt2img (512x512 optimal)
- ✅ img2img (all strengths)
- ✅ ControlNet (full support)
- ✅ LoRA (universal compatibility)
- 💾 VRAM: 4+ GB recommended
SDXL:
- ✅ txt2img (1024x1024 optimal)
- ✅ img2img (limited support)
- ⚠️ ControlNet (requires special handling)
- ✅ LoRA (SDXL-compatible weights only)
- 💾 VRAM: 8+ GB recommended
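The matrix above can be encoded for programmatic checks; the dictionary layout and function name are illustrative, not CompI's internal representation:

```python
# Minimum recommended VRAM and per-mode support notes from the matrix above
MIN_VRAM_GB = {"sd15": 4, "sdxl": 8}
COMPATIBILITY = {
    "sd15": {"txt2img": "full", "img2img": "full",
             "controlnet": "full", "lora": "universal"},
    "sdxl": {"txt2img": "full", "img2img": "limited",
             "controlnet": "special handling", "lora": "sdxl weights only"},
}

def supports(model, mode):
    """Compatibility note for a model/mode pair, or None if unsupported."""
    return COMPATIBILITY.get(model, {}).get(mode)
```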
Automatic Model Selection Logic
- VRAM < 6GB: Recommends SD 1.5 only
- VRAM 6-8GB: SD 1.5 preferred, SDXL with warnings
- VRAM 8GB+: Full SDXL support with optimizations
- CPU Mode: SD 1.5 only with aggressive optimizations
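The selection logic above maps directly to a small decision function; this is a sketch of the tiers, not CompI's actual implementation:

```python
def recommend_model(vram_gb, has_gpu=True):
    """Map available VRAM to a model recommendation, mirroring the tiers above."""
    if not has_gpu:
        return "sd15", "CPU mode: SD 1.5 with aggressive optimizations"
    if vram_gb < 6:
        return "sd15", "SD 1.5 only"
    if vram_gb < 8:
        return "sd15", "SD 1.5 preferred; SDXL possible with warnings"
    return "sdxl", "Full SDXL support with optimizations"
```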
⚡ Performance Optimization Levels
Conservative Mode
- Basic attention slicing
- Standard precision (fp16/fp32)
- Minimal memory optimizations
- Best for: Stable systems, first-time users
Balanced Mode (Default)
- xFormers attention (if available)
- Attention + VAE slicing
- Automatic precision selection
- Best for: Most users, good performance/stability balance
Aggressive Mode
- All memory optimizations enabled
- VAE tiling for large images
- Maximum memory efficiency
- Best for: Limited VRAM, large batch processing
Extreme Mode
- CPU offloading enabled
- Maximum memory savings
- Slower but uses minimal VRAM
- Best for: Very limited VRAM (<4GB)
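The four levels can be expressed as a simple flag mapping; the flag names and lookup function are illustrative, assuming the features listed above:

```python
# Each level enables a superset of the previous one's memory features
OPTIMIZATION_LEVELS = {
    "conservative": {"attention_slicing"},
    "balanced": {"attention_slicing", "xformers", "vae_slicing"},
    "aggressive": {"attention_slicing", "xformers", "vae_slicing",
                   "vae_tiling"},
    "extreme": {"attention_slicing", "xformers", "vae_slicing",
                "vae_tiling", "cpu_offload"},
}

def settings_for(level):
    """Flags for a level; unknown names fall back to the 'balanced' default."""
    return OPTIMIZATION_LEVELS.get(level, OPTIMIZATION_LEVELS["balanced"])
```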
🛡️ Reliability Engine Strategies
Fallback Progression
Strategy 1: Original settings (100% size, 100% steps)
Strategy 2: Reduced size (75% size, 90% steps)
Strategy 3: Half size (50% size, 80% steps)
Strategy 4: Minimal (50% size, 60% steps)
Final: CPU fallback if all GPU attempts fail
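The progression above can be sketched as a retry loop. The fractions match the strategies listed; the function signature and injected `generate` callable are illustrative, not CompI's actual API:

```python
# (size_fraction, steps_fraction) per attempt, matching the progression above
FALLBACKS = [(1.0, 1.0), (0.75, 0.9), (0.5, 0.8), (0.5, 0.6)]

def generate_with_fallback(generate, width, height, steps):
    """Retry `generate` at progressively smaller settings after OOM errors.

    `generate(width, height, steps)` is any callable that raises
    RuntimeError on CUDA OOM.  Returns (result, attempt_index); if every
    attempt fails, re-raises so the caller can try a CPU fallback.
    """
    last_error = None
    for attempt, (size_f, steps_f) in enumerate(FALLBACKS):
        w = int(width * size_f) // 8 * 8   # SD dims must be multiples of 8
        h = int(height * size_f) // 8 * 8
        try:
            return generate(w, h, max(1, int(steps * steps_f))), attempt
        except RuntimeError as err:
            last_error = err  # real code would also clear the CUDA cache here
    raise last_error
```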
Error Classification
- CUDA OOM: Triggers progressive fallback
- Model Loading: Suggests alternative models
- LoRA Errors: Disables LoRA and retries
- General Errors: Logs and reports with context
📊 VRAM Monitoring System
Real-time Metrics
- Total VRAM: Hardware capacity
- Used VRAM: Currently allocated memory
- Free VRAM: Available for new operations
- Usage Percentage: Current utilization level
Smart Alerts
- Green (0-60%): Optimal usage
- Yellow (60-80%): Moderate usage, monitor closely
- Red (80%+): High usage, optimization recommended
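The alert bands above reduce to a small classifier; the function name is illustrative, while `torch.cuda.mem_get_info()` is the real PyTorch call for the raw numbers:

```python
def vram_alert(used_bytes, total_bytes):
    """Classify utilization into the green/yellow/red bands above.

    On CUDA the inputs can come from torch.cuda.mem_get_info(), which
    returns (free_bytes, total_bytes); used = total - free.
    """
    pct = 100.0 * used_bytes / total_bytes
    if pct < 60:
        return pct, "green"
    if pct < 80:
        return pct, "yellow"
    return pct, "red"
```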
Memory Management
- Automatic Cache Clearing: Between batch generations
- Memory Leak Detection: Identifies and resolves memory issues
- Optimization Suggestions: Hardware-specific recommendations
📈 Performance Benchmarks
Generation Speed Comparison
SD 1.5 (512x512, 20 steps):
RTX 4090: ~15-25 seconds
RTX 3080: ~25-35 seconds
RTX 2080: ~45-60 seconds
CPU: ~5-10 minutes
SDXL (1024x1024, 20 steps):
RTX 4090: ~30-45 seconds
RTX 3080: ~60-90 seconds
RTX 2080: ~2-3 minutes (with optimizations)
CPU: ~15-30 minutes
Memory Usage Patterns
SD 1.5:
Base: ~3.5GB VRAM
+ LoRA: ~3.7GB VRAM
+ Upscaler: ~5.5GB VRAM
SDXL:
Base: ~6.5GB VRAM
+ LoRA: ~7.0GB VRAM
+ Upscaler: ~9.0GB VRAM
🔍 Troubleshooting Guide
Common Issues & Solutions
"CUDA Out of Memory" Errors
- Enable OOM Auto-Retry: Automatic fallback handling
- Reduce Image Size: Use 512x512 instead of 1024x1024
- Lower Batch Size: Generate fewer images simultaneously
- Enable Aggressive Optimizations: Use VAE slicing/tiling
- Clear GPU Cache: Use sidebar "Clear GPU Cache" button
Slow Generation Speed
- Enable xFormers: Significant speed improvement if available
- Use Balanced Optimization: Good speed/quality trade-off
- Reduce Inference Steps: 15-20 steps often sufficient
- Check VRAM Usage: Ensure not hitting memory limits
Model Loading Failures
- Check Internet Connection: Models download on first use
- Verify Disk Space: Models require 2-7GB storage each
- Try Alternative Model: Switch between SD 1.5 and SDXL
- Clear Model Cache: Remove cached models and re-download
LoRA Loading Issues
- Verify Path: Ensure LoRA files exist at specified path
- Check Format: Use diffusers-compatible LoRA weights
- Model Compatibility: Ensure LoRA matches base model type
- Scale Adjustment: Try different LoRA scale values
🎯 Best Practices
🚀 Performance Optimization
- Start Conservative: Begin with balanced settings, adjust as needed
- Monitor VRAM: Keep usage below 80% for stability
- Batch Wisely: Use smaller batches on limited hardware
- Clear Cache Regularly: Prevent memory accumulation
🤖 Model Selection
- SD 1.5 for Speed: Faster generation, lower VRAM requirements
- SDXL for Quality: Higher resolution, better detail
- Match Hardware: Choose model based on available VRAM
- Test Compatibility: Verify model works with your use case
🛡️ Reliability
- Enable Auto-Retry: Let system handle OOM errors automatically
- Use Fallbacks: Allow progressive degradation for reliability
- Monitor Logs: Check run logs for patterns and issues
- Plan for Failures: Design workflows that handle generation failures
🔗 Integration with CompI Ecosystem
Universal Enhancement
Phase 3.E enhances ALL existing CompI components:
- Ultimate Dashboard: Model switching and performance controls
- Phase 2.A-2.E: Reliability and optimization for all multimodal phases
- Phase 1.A-1.E: Enhanced foundation with professional features
- Phase 3.D: Performance metrics in workflow management
Backward Compatibility
- Graceful Degradation: Works on all hardware configurations
- Default Settings: Optimal defaults for most users
- Progressive Enhancement: Advanced features when available
- Legacy Support: Maintains compatibility with existing workflows
🎉 Phase 3.E: Production-Grade CompI Complete
With Phase 3.E in place, CompI operates as a production-grade platform: performance is professionally managed, failures are handled automatically, and models can be switched and extended on demand.
Key Benefits:
- ✅ Professional Performance: Industry-standard optimization and monitoring
- ✅ Intelligent Reliability: Automatic error handling and recovery
- ✅ Advanced Model Management: Dynamic switching and LoRA integration
- ✅ Production Ready: Suitable for commercial and professional use
- ✅ Universal Enhancement: Improves all existing CompI features
CompI is now a complete, production-grade multimodal AI art generation platform! 🎨✨