---
title: DP-SGD Interactive Playground
emoji: 🛡️
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
---

# DP-SGD Interactive Playground

An interactive web application for exploring Differentially Private Stochastic Gradient Descent (DP-SGD) training. This tool helps users understand the privacy-utility trade-offs in privacy-preserving machine learning through realistic simulations and visualizations.

## 🚀 Recent Improvements (v2.0)

### Enhanced Chart Visualization

- **Clearer dual-axis charts**: Improved color coding and styling to distinguish accuracy (green, solid line) from loss (red, dashed line)
- **Better scaling**: Separate colored axes with appropriate ranges (0-100% for accuracy, 0-3 for loss)
- **Enhanced tooltips**: More informative hover information with better formatting
- **Visual differentiation**: Added point styles, line weights, and backgrounds for clarity

### Realistic DP-SGD Training Data

- **Research-based accuracy ranges**:
  - ε = 1: 60-72% accuracy (high privacy)
  - ε = 2-3: 75-85% accuracy (balanced)
  - ε = 8: 85-90% accuracy (lower privacy)
- **Consistent training progress**: Final metrics now match the training chart progression
- **Realistic learning curves**: Exponential improvement with noise-dependent variation
- **Proper privacy degradation**: Higher noise multipliers significantly reduce performance
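The "exponential improvement with noise-dependent variation" used by the mock mode can be sketched as follows. This is an illustrative model only; the function name, constants, and curve shape are assumptions for exposition, not the app's actual code:

```python
import numpy as np

def mock_accuracy_curve(noise_multiplier, epochs, seed=0):
    """Illustrative mock learning curve: an exponential rise toward a
    ceiling that drops as the noise multiplier grows."""
    rng = np.random.default_rng(seed)
    # Hypothetical ceiling: sigma = 1.0 tops out near 80%, roughly in line
    # with the accuracy ranges listed above; clipped to a plausible band.
    ceiling = np.clip(0.92 - 0.12 * noise_multiplier, 0.55, 0.92)
    t = np.arange(1, epochs + 1)
    curve = ceiling * (1.0 - np.exp(-t / 3.0))
    # Noise-dependent jitter around the smooth curve.
    curve += rng.normal(0.0, 0.005 * noise_multiplier, size=epochs)
    return np.clip(curve, 0.0, 1.0)
```

A larger `noise_multiplier` lowers the final accuracy the curve converges to, mirroring the privacy degradation bullet above.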

### Improved Parameter Recommendations

- **Noise multiplier guidance**: Optimal range σ = 0.8-1.5 for good trade-offs
- **Batch size recommendations**: ≥128 for DP-SGD stability
- **Learning rate advice**: ≤0.02 for noisy training environments
- **Epochs guidance**: 8-20 epochs for a good balance of convergence and privacy cost

### Dynamic Privacy-Utility Display

- **Real-time privacy budget**: Shows calculated ε values based on the actual parameters
- **Context-aware assessments**: Different recommendations based on achieved accuracy
- **Educational messaging**: Helps users understand what constitutes a good or poor trade-off

## Features

- **Interactive Parameter Tuning**: Adjust clipping norm, noise multiplier, batch size, learning rate, and epochs
- **Real-time Training**: Choose between a mock simulation and actual MNIST training
- **Multiple Visualizations**:
  - Training progress (accuracy/loss over epochs/iterations)
  - Gradient clipping visualization
  - Privacy budget tracking
- **Smart Recommendations**: Get suggestions for improving your privacy-utility trade-off
- **Educational Content**: Learn about DP-SGD concepts through interactive exploration

## Quick Start

### Prerequisites

- Python 3.8+
- pip or conda

### Installation

1. Clone the repository:

   ```bash
   git clone <repository-url>
   cd DPSGD
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Run the application:

   ```bash
   python3 run.py
   ```

4. Open your browser and navigate to http://127.0.0.1:5000

## Using the Application

1. **Set Parameters**: Use the sliders to adjust DP-SGD parameters
2. **Choose Training Mode**: Select between mock simulation (fast) or real MNIST training
3. **Run Training**: Click "Run Training" to see results
4. **Analyze Results**:
   - View training progress in the interactive charts
   - Check final metrics (accuracy, loss, privacy budget)
   - Read personalized recommendations
5. **Experiment**: Try the "Use Optimal Parameters" button for research-backed settings

## Understanding the Results

### Chart Interpretation

- **Green solid line**: Model accuracy (left y-axis, 0-100%)
- **Red dashed line**: Training loss (right y-axis, 0-3)
- **Privacy budget (ε)**: Lower values mean stronger privacy protection
- **Consistent metrics**: Training progress matches the final results

### Recommended Parameter Ranges

- **Clipping norm (C)**: 1.0-2.0 (balances privacy and utility)
- **Noise multiplier (σ)**: 0.8-1.5 (avoid σ > 2.0 for usable models)
- **Batch size**: 128+ (larger batches help with DP-SGD stability)
- **Learning rate**: 0.01-0.02 (conservative rates work better with noise)
- **Epochs**: 8-20 (balance convergence against privacy cost)
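These ranges can be encoded in a small validation helper that flags out-of-range settings before a run. The names below are hypothetical and not part of the app's API:

```python
# Hypothetical encoding of the recommended ranges listed above.
RECOMMENDED_RANGES = {
    "clipping_norm": (1.0, 2.0),
    "noise_multiplier": (0.8, 1.5),
    "batch_size": (128, float("inf")),
    "learning_rate": (0.01, 0.02),
    "epochs": (8, 20),
}

def out_of_range(params):
    """Return the names of parameters falling outside the recommended ranges."""
    return [name for name, (lo, hi) in RECOMMENDED_RANGES.items()
            if name in params and not (lo <= params[name] <= hi)]
```

For example, `out_of_range({"noise_multiplier": 2.5, "batch_size": 256})` flags only the noise multiplier.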

### Privacy-Utility Trade-offs

- **ε < 1**: Very strong privacy; expect 60-70% accuracy
- **ε = 2-4**: Good privacy-utility balance; expect 75-85% accuracy
- **ε > 8**: Weaker privacy; expect 85-90% accuracy
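For intuition on how ε scales with the noise multiplier and training length, here is a deliberately simplified estimate based on RDP composition of the Gaussian mechanism. It ignores privacy amplification by subsampling, so it overstates ε relative to a full accountant; treat it as an upper-bound sketch, not the app's calculation:

```python
import math

def epsilon_upper_bound(noise_multiplier, steps, delta=1e-5):
    """Convert composed Renyi DP of the Gaussian mechanism to (eps, delta).

    Per step, the Gaussian mechanism satisfies RDP(alpha) = alpha / (2 sigma^2);
    RDP composes additively over steps and converts to (eps, delta)-DP via
    eps = RDP(alpha) + log(1/delta) / (alpha - 1), minimized over alpha > 1.
    """
    best = float("inf")
    for i in range(1, 1000):  # grid search over alpha in (1, 101)
        alpha = 1.0 + i / 10.0
        rdp = steps * alpha / (2.0 * noise_multiplier ** 2)
        eps = rdp + math.log(1.0 / delta) / (alpha - 1.0)
        best = min(best, eps)
    return best
```

More noise or fewer steps lowers ε, matching the trade-offs listed above.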

## Technical Details

### Architecture

- **Backend**: Flask with TensorFlow/Keras for real training
- **Frontend**: Vanilla JavaScript with Chart.js for visualizations
- **Training**: Supports both mock simulation and real DP-SGD on MNIST

### Algorithms

- **Real training**: Implements simplified DP-SGD with gradient clipping and Gaussian noise
- **Mock training**: Research-based simulation reflecting actual DP-SGD behavior patterns
- **Privacy calculation**: RDP-based privacy budget estimation
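The clipping-plus-noise step described above can be sketched in NumPy. This is the generic DP-SGD update recipe, not the app's exact implementation; all names here are illustrative:

```python
import numpy as np

def dp_sgd_step(per_example_grads, params, clip_norm, noise_multiplier, lr, rng):
    """One DP-SGD update: clip each example's gradient to clip_norm,
    sum, add Gaussian noise with std noise_multiplier * clip_norm,
    average over the batch, then take a gradient step."""
    clipped = [g * min(1.0, clip_norm / max(np.linalg.norm(g), 1e-12))
               for g in per_example_grads]
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=params.shape)
    noisy_mean = (np.sum(clipped, axis=0) + noise) / len(clipped)
    return params - lr * noisy_mean
```

With `noise_multiplier=0` this reduces to plain SGD on clipped gradients, which is a handy sanity check when experimenting.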

## Research Basis

The simulation parameters and accuracy ranges are based on recent DP-SGD research:

- "TAN without a burn: Scaling Laws of DP-SGD" (2023)
- "Unlocking High-Accuracy Differentially Private Image Classification through Scale" (2022)
- "Differentially Private Generation of Small Images" (2020)

## Contributing

We welcome contributions! Areas for improvement:

- Additional datasets beyond MNIST
- More sophisticated privacy accounting methods
- Enhanced visualizations
- Better mobile responsiveness

## License

This project is licensed under the MIT License; see the LICENSE file for details.

## Acknowledgments

- The TensorFlow Privacy team for the DP-SGD implementation
- The research community for advances in privacy-preserving ML
- Chart.js for excellent visualization capabilities