---
title: DP-SGD Interactive Playground
emoji: 🛡️
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
---
# DP-SGD Interactive Playground
An interactive web application for exploring Differentially Private Stochastic Gradient Descent (DP-SGD) training. This tool helps users understand the privacy-utility trade-off at the heart of privacy-preserving machine learning through realistic simulations and visualizations.
## 🚀 Recent Improvements (v2.0)
### Enhanced Chart Visualization
- **Clearer dual-axis charts**: Improved color coding and styling to distinguish accuracy (green, solid line) from loss (red, dashed line)
- **Better scaling**: Separate colored axes with appropriate ranges (0-100% for accuracy, 0-3 for loss)
- **Enhanced tooltips**: More informative hover information with better formatting
- **Visual differentiation**: Added point styles, line weights, and backgrounds for clarity
### Realistic DP-SGD Training Data
- **Research-based accuracy ranges**:
- ε=1: 60-72% accuracy (high privacy)
- ε=2-3: 75-85% accuracy (balanced)
- ε=8: 85-90% accuracy (lower privacy)
- **Consistent training progress**: Final metrics now match training chart progression
- **Realistic learning curves**: Exponential improvement with noise-dependent variation
- **Proper privacy degradation**: Higher noise multipliers significantly impact performance
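The simulated curves described above can be pictured as exponential saturation toward a ceiling that shrinks, and jitters more, as the noise multiplier grows. A minimal sketch of that idea (function name and constants are illustrative, not the app's actual code):

```python
import math
import random

def mock_accuracy_curve(noise_multiplier, epochs, seed=0):
    """Illustrative DP-SGD learning curve: exponential improvement
    toward a ceiling that drops as the noise multiplier grows."""
    rng = random.Random(seed)
    # Ceiling falls from ~0.90 (low noise) toward a 0.60 floor (high noise).
    ceiling = max(0.60, 0.90 - 0.10 * (noise_multiplier - 0.8))
    curve = []
    for epoch in range(1, epochs + 1):
        base = ceiling * (1 - math.exp(-0.4 * epoch))
        jitter = rng.gauss(0, 0.01 * noise_multiplier)  # noisier runs wobble more
        curve.append(min(ceiling, max(0.0, base + jitter)))
    return curve
```

Because the curve is generated from the same ceiling used for the final metric, the training chart and the reported final accuracy stay consistent by construction.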
### Improved Parameter Recommendations
- **Noise multiplier guidance**: Optimal range σ = 0.8-1.5 for good trade-offs
- **Batch size recommendations**: ≥128 for DP-SGD stability
- **Learning rate advice**: ≤0.02 for noisy training environments
- **Epochs guidance**: 8-20 epochs for good convergence vs privacy cost
### Dynamic Privacy-Utility Display
- **Real-time privacy budget**: Shows calculated ε values based on actual parameters
- **Context-aware assessments**: Different recommendations based on achieved accuracy
- **Educational messaging**: Helps users understand what constitutes good/poor trade-offs
## Features
- **Interactive Parameter Tuning**: Adjust clipping norm, noise multiplier, batch size, learning rate, and epochs
- **Real-time Training**: Choose between mock simulation or actual MNIST training
- **Multiple Visualizations**:
- Training progress (accuracy/loss over epochs/iterations)
- Gradient clipping visualization
- Privacy budget tracking
- **Smart Recommendations**: Get suggestions for improving your privacy-utility trade-off
- **Educational Content**: Learn about DP-SGD concepts through interactive exploration
## Quick Start
### Prerequisites
- Python 3.8+
- pip or conda
### Installation
1. Clone the repository:
```bash
git clone <repository-url>
cd DPSGD
```
2. Install dependencies:
```bash
pip install -r requirements.txt
```
3. Run the application:
```bash
python3 run.py
```
4. Open your browser and navigate to `http://127.0.0.1:5000`
### Using the Application
1. **Set Parameters**: Use the sliders to adjust DP-SGD parameters
2. **Choose Training Mode**: Select between mock simulation (fast) or real MNIST training
3. **Run Training**: Click "Run Training" to see results
4. **Analyze Results**:
- View training progress in the interactive charts
- Check final metrics (accuracy, loss, privacy budget)
- Read personalized recommendations
5. **Experiment**: Try the "Use Optimal Parameters" button for research-backed settings
## Understanding the Results
### Chart Interpretation
- **Green solid line**: Model accuracy (left y-axis, 0-100%)
- **Red dashed line**: Training loss (right y-axis, 0-3)
- **Privacy Budget (ε)**: Lower values = stronger privacy protection
- **Consistent metrics**: Training progress matches final results
### Recommended Parameter Ranges
- **Clipping Norm (C)**: 1.0-2.0 (balance between privacy and utility)
- **Noise Multiplier (σ)**: 0.8-1.5 (avoid σ > 2.0 for usable models)
- **Batch Size**: 128+ (larger batches help with DP-SGD stability)
- **Learning Rate**: 0.01-0.02 (conservative rates work better with noise)
- **Epochs**: 8-20 (balance convergence vs privacy cost)
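The ranges above can be encoded as a small validation helper; the following is a sketch only, and the app's real recommendation logic may differ:

```python
def check_params(clip_norm, noise_multiplier, batch_size, lr, epochs):
    """Return a list of warnings for values outside the recommended ranges."""
    warnings = []
    if not 1.0 <= clip_norm <= 2.0:
        warnings.append("clipping norm C outside recommended 1.0-2.0")
    if noise_multiplier > 2.0:
        warnings.append("sigma > 2.0 usually yields unusable models")
    elif not 0.8 <= noise_multiplier <= 1.5:
        warnings.append("noise multiplier outside recommended 0.8-1.5")
    if batch_size < 128:
        warnings.append("batch size < 128 can destabilize DP-SGD")
    if not 0.01 <= lr <= 0.02:
        warnings.append("learning rate outside recommended 0.01-0.02")
    if not 8 <= epochs <= 20:
        warnings.append("epochs outside recommended 8-20")
    return warnings
```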
### Privacy-Utility Trade-offs
- **ε < 1**: Very strong privacy, expect 60-70% accuracy
- **ε = 2-4**: Good privacy-utility balance, expect 75-85% accuracy
- **ε > 8**: Weaker privacy, expect 85-90% accuracy
## Technical Details
### Architecture
- **Backend**: Flask with TensorFlow/Keras for real training
- **Frontend**: Vanilla JavaScript with Chart.js for visualizations
- **Training**: Supports both mock simulation and real DP-SGD with MNIST
### Algorithms
- **Real Training**: Implements simplified DP-SGD with gradient clipping and Gaussian noise
- **Mock Training**: Research-based simulation reflecting actual DP-SGD behavior patterns
- **Privacy Calculation**: RDP-based privacy budget estimation
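The two core operations named above, per-example gradient clipping and Gaussian noise addition, follow the textbook DP-SGD step. A NumPy sketch of that step (not the app's actual TensorFlow code):

```python
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm, noise_multiplier, rng):
    """One DP-SGD aggregation step: clip each per-example gradient to
    L2 norm <= clip_norm, average, then add Gaussian noise with std
    noise_multiplier * clip_norm (scaled by the batch size)."""
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # Scale down only if the gradient exceeds the clipping norm.
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    mean_grad = np.mean(clipped, axis=0)
    noise = rng.normal(
        0.0,
        noise_multiplier * clip_norm / len(per_example_grads),
        size=mean_grad.shape,
    )
    return mean_grad + noise
```

The noisy averaged gradient is then applied as an ordinary SGD update; privacy cost accrues per step and is tracked by the accountant.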
### Research Basis
The simulation parameters and accuracy ranges are based on recent DP-SGD research:
- "TAN without a burn: Scaling Laws of DP-SGD" (2023)
- "Unlocking High-Accuracy Differentially Private Image Classification through Scale" (2022)
- "Differentially Private Generation of Small Images" (2020)
## Contributing
We welcome contributions! Areas for improvement:
- Additional datasets beyond MNIST
- More sophisticated privacy accounting methods
- Enhanced visualizations
- Better mobile responsiveness
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgments
- TensorFlow Privacy team for DP-SGD implementation
- Research community for privacy-preserving ML advances
- Chart.js for excellent visualization capabilities