---
title: PerplexityViewer
emoji: 📈
colorFrom: gray
colorTo: blue
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: gpl-3.0
short_description: Simple inspection of perplexity using color-gradients
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# PerplexityViewer 📈

A Gradio-based web application for visualizing text perplexity with color-coded gradients. Useful for understanding how confident a language model is about different parts of your text.

## Features

- **Dual Model Support**: Works with both decoder models (GPT, DialoGPT) and encoder models (BERT, RoBERTa)
- **Interactive Visualization**: Color-coded per-token perplexity using spaCy's displaCy
- **Configurable Analysis**: Adjustable MLM probability settings
- **Real-time Processing**: Instant analysis with cached models for faster subsequent runs
- **Multiple Model Types**:
  - **Decoder Models**: Calculate true perplexity for causal language models
  - **Encoder Models**: Calculate pseudo-perplexity using masked language modeling

## How It Works

- **Red tokens**: High perplexity (the model is uncertain about this token)
- **Green tokens**: Low perplexity (the model is confident about this token)
- **Gradient colors**: Show varying degrees of model confidence

## Installation

1. Clone this repository or download the files
2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

## Quick Start

### Option 1: Using the startup script (recommended)

```bash
python run.py
```

### Option 2: Direct launch

```bash
python app.py
```

### Option 3: With dependency installation and testing

```bash
python run.py --install --test
```

## Usage

1. **Enter your text** in the input box
2. **Select a model** from the dropdown, or enter a custom HuggingFace model name
3. **Choose the model type**:
   - **Decoder**: For GPT-like models (true perplexity)
   - **Encoder**: For BERT-like models (pseudo-perplexity via MLM)
4. **Adjust settings** (optional)
5. **Click "Analyze"** to see the results

## Supported Models

### Decoder Models (Causal LM)

- `gpt2`, `distilgpt2`
- `microsoft/DialoGPT-small`, `microsoft/DialoGPT-medium`
- `openai-gpt`
- Any HuggingFace causal language model

### Encoder Models (Masked LM)

- `bert-base-uncased`, `bert-base-cased`
- `distilbert-base-uncased`
- `roberta-base`
- `albert-base-v2`
- Any HuggingFace masked language model

## Understanding the Results

### Perplexity Interpretation

- **Lower perplexity**: The model is more confident (the text is more predictable)
- **Higher perplexity**: The model is less confident (the text is more surprising)

### Color Coding

- **Green**: Low perplexity (< 2.0), very predictable
- **Yellow/Orange**: Medium perplexity (2.0-10.0), somewhat predictable
- **Red**: High perplexity (> 10.0), surprising or difficult to predict

## Technical Details

### Decoder Models (True Perplexity)

- Uses next-token prediction to calculate perplexity
- Formula: `PPL = exp(average_cross_entropy_loss)`
- Each token's perplexity reflects how well the model predicted it given the preceding context

### Encoder Models (Pseudo-Perplexity)

- Uses masked language modeling (MLM)
- Masks each token individually and measures prediction confidence
- Pseudo-perplexity approximates true perplexity for bidirectional models
- All content tokens are analyzed for comprehensive results

## Testing

Run the test suite to verify everything works:

```bash
python test_app.py
```

Or use the startup script with testing:

```bash
python run.py --test
```

## Configuration

The app uses sensible defaults but can be customized via `config.py`:

- Default model lists
- Processing settings
- Visualization colors and settings
- UI configuration

## Requirements

- Python 3.7+
- PyTorch
- Transformers
- Gradio 4.0+
- spaCy
- pandas
- numpy

## GPU Support

The app automatically uses GPU acceleration when available and falls back to CPU processing otherwise.

## Troubleshooting

### Common Issues

1. **Model loading errors**: Ensure you have an internet connection for first-time model downloads
2. **Memory issues**: Try smaller models such as `distilgpt2` or `distilbert-base-uncased`
3. **CUDA out of memory**: Reduce the text length or use CPU-only mode
4. **Encoder models are slow**: This is normal; each token is masked and analyzed individually for accuracy
5. **Single analysis**: The app performs one comprehensive analysis per run (no iterations needed)

### Getting Help

If you encounter issues:

1. Check the console output for error messages
2. Try running the test suite: `python test_app.py`
3. Ensure all dependencies are installed: `pip install -r requirements.txt`

## Examples

Try these example texts to see the app in action:

- **"The quick brown fox jumps over the lazy dog."** (common phrase, should show low perplexity)
- **"Quantum entanglement defies classical intuition."** (technical content, may show higher perplexity)
- **"Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo."** (grammatically complex, interesting perplexity patterns)
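The formula from Technical Details, `PPL = exp(average_cross_entropy_loss)`, can be illustrated with a small self-contained sketch. The function names below are illustrative and not part of the app's code, and the per-token losses are stand-in numbers rather than real model outputs:

```python
import math

def perplexity(token_losses):
    """Sequence perplexity: PPL = exp(average cross-entropy loss, in nats).

    For decoder models the losses come from next-token prediction; for
    encoder models they come from masking each token in turn, giving the
    pseudo-perplexity described above.
    """
    return math.exp(sum(token_losses) / len(token_losses))

def token_perplexities(token_losses):
    """Per-token perplexity (exp of each token's own loss); this is the
    value the viewer maps onto the green-to-red color gradient."""
    return [math.exp(loss) for loss in token_losses]

# Stand-in losses: two confident tokens and one surprising token.
losses = [0.1, 0.5, 3.0]
print(round(perplexity(losses), 2))                       # 3.32
print([round(p, 2) for p in token_perplexities(losses)])  # [1.11, 1.65, 20.09]
```

Under these conventions, a token shaded deep red in the viewer corresponds to a per-token loss well above the sequence average.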