---
title: DP-SGD Interactive Playground
emoji: 🛡️
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
---
# DP-SGD Interactive Playground

An interactive web application for exploring Differentially Private Stochastic Gradient Descent (DP-SGD) training. This tool helps users understand the privacy-utility trade-offs in privacy-preserving machine learning through realistic simulations and visualizations.
## 🚀 Recent Improvements (v2.0)

### Enhanced Chart Visualization

- **Clearer dual-axis charts**: Improved color coding and styling to distinguish accuracy (green, solid line) from loss (red, dashed line)
- **Better scaling**: Separate colored axes with appropriate ranges (0-100% for accuracy, 0-3 for loss)
- **Enhanced tooltips**: More informative hover information with better formatting
- **Visual differentiation**: Added point styles, line weights, and backgrounds for clarity
### Realistic DP-SGD Training Data

- **Research-based accuracy ranges**:
  - ε = 1: 60-72% accuracy (high privacy)
  - ε = 2-3: 75-85% accuracy (balanced)
  - ε = 8: 85-90% accuracy (lower privacy)
- **Consistent training progress**: Final metrics now match the training chart's progression
- **Realistic learning curves**: Exponential improvement with noise-dependent variation
- **Proper privacy degradation**: Higher noise multipliers significantly reduce performance
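The "exponential improvement with noise-dependent variation" described above can be sketched as a small simulation. The function name, constants, and curve shape below are illustrative assumptions, not the app's actual implementation:

```python
import numpy as np

def mock_accuracy_curve(noise_multiplier, epochs, seed=0):
    """Sketch of a mock DP-SGD learning curve: exponential approach to a
    noise-dependent accuracy ceiling, plus small per-epoch jitter.
    All constants here are illustrative, not taken from the app."""
    rng = np.random.default_rng(seed)
    # Higher noise lowers the achievable accuracy ceiling.
    ceiling = 0.90 - 0.12 * max(noise_multiplier - 0.8, 0.0)
    ceiling = max(ceiling, 0.55)
    t = np.arange(1, epochs + 1)
    curve = ceiling * (1.0 - np.exp(-t / 3.0))
    # Noisier training also produces a bumpier curve.
    jitter = rng.normal(0.0, 0.01 * noise_multiplier, size=epochs)
    return np.clip(curve + jitter, 0.0, 1.0)
```

Because the final chart point and the reported final metrics come from the same curve, the "consistent training progress" property above falls out for free in a design like this.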
### Improved Parameter Recommendations

- **Noise multiplier guidance**: Optimal range σ = 0.8-1.5 for good trade-offs
- **Batch size recommendations**: ≥ 128 for DP-SGD stability
- **Learning rate advice**: ≤ 0.02 for noisy training environments
- **Epochs guidance**: 8-20 epochs to balance convergence against privacy cost
### Dynamic Privacy-Utility Display

- **Real-time privacy budget**: Shows the calculated ε value for the current parameters
- **Context-aware assessments**: Recommendations adapt to the achieved accuracy
- **Educational messaging**: Helps users understand what constitutes a good or poor trade-off
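A real-time ε display like the one above can be approximated with standard Rényi DP (RDP) composition for the Gaussian mechanism. The sketch below is a simplified assumption of how such an estimate might work: it ignores subsampling amplification (which the app's RDP accountant may include), so it overestimates ε; the function name and α grid are hypothetical:

```python
import math

def epsilon_from_rdp(noise_multiplier, steps, delta=1e-5):
    """Rough (ε, δ) estimate via RDP composition of the Gaussian mechanism.
    steps ≈ epochs * (dataset_size / batch_size). Ignores subsampling
    amplification, so this is a conservative upper bound."""
    best = float("inf")
    # Search a grid of Rényi orders α > 1 and keep the tightest conversion.
    for alpha in (1.0 + k / 10.0 for k in range(1, 1000)):
        rdp = steps * alpha / (2.0 * noise_multiplier ** 2)
        eps = rdp + math.log(1.0 / delta) / (alpha - 1.0)
        best = min(best, eps)
    return best
```

The qualitative behavior matches the UI: more noise means a smaller ε (stronger privacy), and more training steps spend more of the budget.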
## Features

- **Interactive Parameter Tuning**: Adjust clipping norm, noise multiplier, batch size, learning rate, and epochs
- **Real-time Training**: Choose between a mock simulation and actual MNIST training
- **Multiple Visualizations**:
  - Training progress (accuracy/loss over epochs/iterations)
  - Gradient clipping visualization
  - Privacy budget tracking
- **Smart Recommendations**: Get suggestions for improving your privacy-utility trade-off
- **Educational Content**: Learn DP-SGD concepts through interactive exploration
## Quick Start

### Prerequisites

- Python 3.8+
- pip or conda
### Installation

1. Clone the repository:

   ```bash
   git clone <repository-url>
   cd DPSGD
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Run the application:

   ```bash
   python3 run.py
   ```

4. Open your browser and navigate to `http://127.0.0.1:5000`.
### Using the Application

1. **Set Parameters**: Use the sliders to adjust the DP-SGD parameters
2. **Choose Training Mode**: Select mock simulation (fast) or real MNIST training
3. **Run Training**: Click "Run Training" to see the results
4. **Analyze Results**:
   - View training progress in the interactive charts
   - Check the final metrics (accuracy, loss, privacy budget)
   - Read the personalized recommendations
5. **Experiment**: Try the "Use Optimal Parameters" button for research-backed settings
## Understanding the Results

### Chart Interpretation

- **Green solid line**: Model accuracy (left y-axis, 0-100%)
- **Red dashed line**: Training loss (right y-axis, 0-3)
- **Privacy budget (ε)**: Lower values mean stronger privacy protection
- **Consistent metrics**: Training progress matches the final results
### Recommended Parameter Ranges

- **Clipping norm (C)**: 1.0-2.0 (balances privacy and utility)
- **Noise multiplier (σ)**: 0.8-1.5 (avoid σ > 2.0 for usable models)
- **Batch size**: 128+ (larger batches improve DP-SGD stability)
- **Learning rate**: 0.01-0.02 (conservative rates work better with noise)
- **Epochs**: 8-20 (balances convergence against privacy cost)
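As a concrete starting point, the ranges above could be captured in a single settings object. The dict name and key names below are hypothetical, chosen only to mirror the parameter list; the app's actual parameter names may differ:

```python
# Illustrative defaults drawn from the recommended ranges above.
RECOMMENDED_PARAMS = {
    "clipping_norm": 1.0,     # C: 1.0-2.0
    "noise_multiplier": 1.1,  # σ: 0.8-1.5
    "batch_size": 256,        # ≥ 128 for stability
    "learning_rate": 0.01,    # ≤ 0.02 under noisy gradients
    "epochs": 15,             # 8-20 to limit privacy cost
}
```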
### Privacy-Utility Trade-offs

- **ε < 1**: Very strong privacy; expect 60-70% accuracy
- **ε = 2-4**: Good privacy-utility balance; expect 75-85% accuracy
- **ε > 8**: Weaker privacy; expect 85-90% accuracy
## Technical Details

### Architecture

- **Backend**: Flask with TensorFlow/Keras for real training
- **Frontend**: Vanilla JavaScript with Chart.js for visualizations
- **Training**: Supports both mock simulation and real DP-SGD on MNIST
### Algorithms

- **Real training**: A simplified DP-SGD implementation with gradient clipping and Gaussian noise
- **Mock training**: A research-based simulation reflecting observed DP-SGD behavior patterns
- **Privacy calculation**: RDP-based privacy budget estimation
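The core of the "gradient clipping and Gaussian noise" step can be sketched in a few lines of NumPy. This is a minimal sketch of the general DP-SGD update, not the app's exact code; the function name and signature are assumptions:

```python
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm, noise_multiplier, rng):
    """One simplified DP-SGD update direction: clip each example's gradient
    to L2 norm clip_norm, average, then add Gaussian noise whose scale is
    noise_multiplier * clip_norm divided by the batch size."""
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # Scale down any gradient whose norm exceeds clip_norm.
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    mean = np.mean(clipped, axis=0)
    noise = rng.normal(
        0.0, noise_multiplier * clip_norm / len(per_example_grads),
        size=mean.shape,
    )
    return mean + noise
```

Clipping bounds each example's influence on the update (the sensitivity), which is what lets the Gaussian noise give a formal privacy guarantee via the RDP accounting mentioned above.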
### Research Basis

The simulation parameters and accuracy ranges are informed by recent DP-SGD research:

- "TAN without a burn: Scaling Laws of DP-SGD" (2023)
- "Unlocking High-Accuracy Differentially Private Image Classification through Scale" (2022)
- "Differentially Private Generation of Small Images" (2020)
## Contributing

We welcome contributions! Areas for improvement include:

- Additional datasets beyond MNIST
- More sophisticated privacy accounting methods
- Enhanced visualizations
- Better mobile responsiveness
## License

This project is licensed under the MIT License; see the LICENSE file for details.
## Acknowledgments

- The TensorFlow Privacy team for the reference DP-SGD implementation
- The research community for advances in privacy-preserving ML
- Chart.js for its excellent visualization capabilities