---
title: PerplexityViewer
emoji: 📈
colorFrom: gray
colorTo: blue
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: gpl-3.0
short_description: Simple inspection of perplexity using color-gradients
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# PerplexityViewer 📈

A Gradio-based web application for visualizing text perplexity with color-coded gradients. Useful for understanding how confident a language model is about different parts of your text.

## Features

- **Dual Model Support**: Works with both decoder models (GPT, DialoGPT) and encoder models (BERT, RoBERTa)
- **Interactive Visualization**: Color-coded per-token perplexity using spaCy's displaCy
- **Configurable Analysis**: Adjustable MLM probability settings
- **Real-time Processing**: Instant analysis with cached models for faster subsequent runs
- **Multiple Model Types**:
  - **Decoder Models**: Calculate true perplexity for causal language models
  - **Encoder Models**: Calculate pseudo-perplexity using masked language modeling

## How It Works

- **Red tokens**: High perplexity (the model is uncertain about this token)
- **Green tokens**: Low perplexity (the model is confident about this token)
- **Gradient colors**: Show varying degrees of model confidence

## Installation

1. Clone this repository or download the files
2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

## Quick Start

### Option 1: Using the startup script (recommended)

```bash
python run.py
```

### Option 2: Direct launch

```bash
python app.py
```

### Option 3: With dependency installation and testing

```bash
python run.py --install --test
```

## Usage

1. **Enter your text** in the input box
2. **Select a model** from the dropdown, or enter a custom HuggingFace model name
3. **Choose the model type**:
   - **Decoder**: For GPT-like models (true perplexity)
   - **Encoder**: For BERT-like models (pseudo-perplexity via MLM)
4. **Adjust settings** (optional)
5. **Click "Analyze"** to see the results

## Supported Models

### Decoder Models (Causal LM)

- `gpt2`, `distilgpt2`
- `microsoft/DialoGPT-small`, `microsoft/DialoGPT-medium`
- `openai-gpt`
- Any HuggingFace causal language model

### Encoder Models (Masked LM)

- `bert-base-uncased`, `bert-base-cased`
- `distilbert-base-uncased`
- `roberta-base`
- `albert-base-v2`
- Any HuggingFace masked language model

## Understanding the Results

### Perplexity Interpretation

- **Lower perplexity**: The model is more confident (the text is more predictable)
- **Higher perplexity**: The model is less confident (the text is more surprising)

### Color Coding

- **Green**: Low perplexity (< 2.0), very predictable
- **Yellow/Orange**: Medium perplexity (2.0-10.0), somewhat predictable
- **Red**: High perplexity (> 10.0), surprising or difficult to predict

## Technical Details

### Decoder Models (True Perplexity)

- Uses next-token prediction to calculate perplexity
- Formula: `PPL = exp(average_cross_entropy_loss)`
- Each token's perplexity reflects how well the model predicted it given the preceding context

### Encoder Models (Pseudo-Perplexity)

- Uses masked language modeling (MLM)
- Masks each token individually and measures prediction confidence
- Pseudo-perplexity approximates true perplexity for bidirectional models
- All content tokens are analyzed for comprehensive results

## Testing

Run the test suite to verify everything works:

```bash
python test_app.py
```

Or use the startup script with testing:

```bash
python run.py --test
```

## Configuration

The app uses sensible defaults but can be customized via `config.py`:

- Default model lists
- Processing settings
- Visualization colors and settings
- UI configuration

## Requirements

- Python 3.7+
- PyTorch
- Transformers
- Gradio 4.0+
- spaCy
- pandas
- numpy

## GPU Support

The app automatically uses GPU acceleration when available and falls back to CPU processing otherwise.

## Troubleshooting

### Common Issues

1. **Model loading errors**: Ensure you have an internet connection for first-time model downloads
2. **Memory issues**: Try smaller models such as `distilgpt2` or `distilbert-base-uncased`
3. **CUDA out of memory**: Reduce the text length or use CPU-only mode
4. **Encoder models are slow**: This is normal; each token is masked and analyzed individually for accuracy
5. **Single analysis**: The app performs one comprehensive analysis per run (no iterations needed)

### Getting Help

If you encounter issues:

1. Check the console output for error messages
2. Try running the test suite: `python test_app.py`
3. Ensure all dependencies are installed: `pip install -r requirements.txt`

## Examples

Try these example texts to see the app in action:

- **"The quick brown fox jumps over the lazy dog."** (common phrase, should show low perplexity)
- **"Quantum entanglement defies classical intuition."** (technical content, may show higher perplexity)
- **"Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo."** (grammatically complex, interesting perplexity patterns)
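The formula from Technical Details, `PPL = exp(average_cross_entropy_loss)`, can be illustrated with a small self-contained sketch. The function names below are illustrative and not part of the app's code, and the per-token losses are stand-in numbers rather than real model outputs:

```python
import math

def perplexity(token_losses):
    """Sequence perplexity: PPL = exp(average cross-entropy loss, in nats).

    For decoder models the losses come from next-token prediction; for
    encoder models they come from masking each token in turn, giving the
    pseudo-perplexity described above.
    """
    return math.exp(sum(token_losses) / len(token_losses))

def token_perplexities(token_losses):
    """Per-token perplexity (exp of each token's own loss); this is the
    value the viewer maps onto the green-to-red color gradient."""
    return [math.exp(loss) for loss in token_losses]

# Stand-in losses: two confident tokens and one surprising token.
losses = [0.1, 0.5, 3.0]
print(round(perplexity(losses), 2))                       # 3.32
print([round(p, 2) for p in token_perplexities(losses)])  # [1.11, 1.65, 20.09]
```

Under these conventions, a token shaded deep red in the viewer corresponds to a per-token loss well above the sequence average.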