---
title: PerplexityViewer
emoji: 📈
colorFrom: gray
colorTo: blue
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: gpl-3.0
short_description: Simple inspection of perplexity using color-gradients
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

PerplexityViewer πŸ“ˆ

A Gradio-based web application for visualizing text perplexity using color-coded gradients. Perfect for understanding how confident language models are about different parts of your text.

Features

  • Dual Model Support: Works with both decoder models (GPT, DialoGPT) and encoder models (BERT, RoBERTa)
  • Interactive Visualization: Color-coded per-token perplexity using spaCy's displaCy (see the rendering sketch after this list)
  • Configurable Analysis: Adjustable iterations and MLM probability settings
  • Real-time Processing: Instant analysis with cached models for faster subsequent runs
  • Multiple Model Types:
    • Decoder Models: Calculate true perplexity for causal language models
    • Encoder Models: Calculate pseudo-perplexity using masked language modeling
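
The per-token coloring can be produced with displaCy's manual entity renderer, roughly as sketched below. The token scores, bucket labels, and hex colors here are invented for illustration and are not necessarily the app's exact scheme.

```python
from spacy import displacy

# Example per-token perplexities (made up for illustration).
text = "The quick brown fox"
scored_tokens = [("The", 3.1), ("quick", 1.4), ("brown", 12.7), ("fox", 2.0)]

# Turn each token into a manual "entity" labeled by its perplexity bucket.
ents, offset = [], 0
for tok, ppl in scored_tokens:
    start = text.index(tok, offset)
    end = start + len(tok)
    label = "LOW" if ppl <= 2.0 else "MED" if ppl < 10.0 else "HIGH"
    ents.append({"start": start, "end": end, "label": label})
    offset = end

colors = {"LOW": "#2ecc71", "MED": "#f39c12", "HIGH": "#e74c3c"}
html = displacy.render(
    {"text": text, "ents": ents, "title": None},
    style="ent",
    manual=True,
    options={"colors": colors},
)
print(html)  # HTML snippet with each token wrapped in a colored highlight
```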

How It Works

  • Red tokens: High perplexity (model is uncertain about this token)
  • Green tokens: Low perplexity (model is confident about this token)
  • Gradient colors: Show varying degrees of model confidence

Installation

  1. Clone this repository or download the files
  2. Install dependencies:
    pip install -r requirements.txt
    

Quick Start

Option 1: Using the startup script (recommended)

python run.py

Option 2: Direct launch

python app.py

Option 3: With dependency installation and testing

python run.py --install --test

Usage

  1. Enter your text in the input box
  2. Select a model from the dropdown or enter a custom HuggingFace model name
  3. Choose model type:
    • Decoder: For GPT-like models (true perplexity)
    • Encoder: For BERT-like models (pseudo-perplexity via MLM)
  4. Adjust the optional analysis settings as needed
  5. Click "Analyze" to see the results

Supported Models

Decoder Models (Causal LM)

  • gpt2, distilgpt2
  • microsoft/DialoGPT-small, microsoft/DialoGPT-medium
  • openai-gpt
  • Any HuggingFace causal language model

Encoder Models (Masked LM)

  • bert-base-uncased, bert-base-cased
  • distilbert-base-uncased
  • roberta-base
  • albert-base-v2
  • Any HuggingFace masked language model

Understanding the Results

Perplexity Interpretation

  • Lower perplexity: Model is more confident (text is more predictable)
  • Higher perplexity: Model is less confident (text is more surprising)

Color Coding

  • Green: Low perplexity (≀ 2.0) - very predictable
  • Yellow/Orange: Medium perplexity (2.0-10.0) - somewhat predictable
  • Red: High perplexity (≥ 10.0) - surprising or difficult to predict (see the mapping sketch below)
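
A minimal mapping from these thresholds to display colors might look like the following; the exact hex values are an assumption, not the app's actual palette.

```python
def perplexity_to_color(ppl: float) -> str:
    """Map a token's perplexity to a display color (illustrative thresholds and colors)."""
    if ppl <= 2.0:
        return "#2ecc71"   # green: very predictable
    if ppl < 10.0:
        return "#f39c12"   # yellow/orange: somewhat predictable
    return "#e74c3c"       # red: surprising or hard to predict
```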

Technical Details

Decoder Models (True Perplexity)

  • Uses next-token prediction to calculate perplexity
  • Formula: PPL = exp(average_cross_entropy_loss)
  • Each token's perplexity is based on how well the model predicted it given the previous context (see the sketch below)
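
A self-contained sketch of this calculation is shown below; distilgpt2 is used purely as an example model, and the code is illustrative rather than the app's implementation.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
model = AutoModelForCausalLM.from_pretrained("distilgpt2")
model.eval()

enc = tokenizer("The quick brown fox jumps over the lazy dog.", return_tensors="pt")
input_ids = enc["input_ids"]

with torch.no_grad():
    logits = model(**enc).logits             # (1, seq_len, vocab_size)

# Token i is predicted from tokens < i: shift logits left, labels right.
shift_logits = logits[:, :-1, :]
shift_labels = input_ids[:, 1:]
losses = F.cross_entropy(
    shift_logits.reshape(-1, shift_logits.size(-1)),
    shift_labels.reshape(-1),
    reduction="none",
)

token_ppl = torch.exp(losses)                # per-token perplexity
sentence_ppl = torch.exp(losses.mean())      # PPL = exp(average cross-entropy loss)

tokens = tokenizer.convert_ids_to_tokens(input_ids[0].tolist())[1:]
for tok, ppl in zip(tokens, token_ppl.tolist()):
    print(f"{tok!r:>12}  {ppl:8.2f}")
print(f"sentence perplexity: {sentence_ppl.item():.2f}")
```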

Encoder Models (Pseudo-Perplexity)

  • Uses masked language modeling (MLM)
  • Masks each token individually and measures prediction confidence
  • Pseudo-perplexity approximates true perplexity for bidirectional models
  • All content tokens are analyzed for comprehensive results (see the sketch below)
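
A minimal, standalone version of this procedure could look like the sketch below; distilbert-base-uncased is just an example model, and the app's actual implementation may differ.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("distilbert-base-uncased")
model.eval()

input_ids = tokenizer("The quick brown fox jumps over the lazy dog.",
                      return_tensors="pt")["input_ids"][0]

nlls = []
with torch.no_grad():
    for i in range(1, input_ids.size(0) - 1):       # skip [CLS] and [SEP]
        masked = input_ids.clone()
        masked[i] = tokenizer.mask_token_id          # mask one token at a time
        logits = model(masked.unsqueeze(0)).logits[0, i]
        log_probs = torch.log_softmax(logits, dim=-1)
        nlls.append(-log_probs[input_ids[i]])        # NLL of the original token

pseudo_ppl = torch.exp(torch.stack(nlls).mean())
print(f"pseudo-perplexity: {pseudo_ppl.item():.2f}")
```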

Testing

Run the test suite to verify everything works:

python test_app.py

Or use the startup script with testing:

python run.py --test

Configuration

The app uses sensible defaults but can be customized via config.py (sketched below):

  • Default model lists
  • Processing settings
  • Visualization colors and settings
  • UI configuration
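
For orientation, a config module of this kind might look roughly like the following; the names and values are hypothetical, so check config.py itself for the real settings.

```python
# config.py (hypothetical sketch; the actual file may use different names and values)

# Default model lists
DEFAULT_DECODER_MODELS = ["gpt2", "distilgpt2", "microsoft/DialoGPT-small"]
DEFAULT_ENCODER_MODELS = ["bert-base-uncased", "distilbert-base-uncased", "roberta-base"]

# Processing settings
MAX_TEXT_LENGTH = 512            # illustrative cap on input tokens

# Visualization colors for low / medium / high perplexity
COLOR_LOW = "#2ecc71"
COLOR_MEDIUM = "#f39c12"
COLOR_HIGH = "#e74c3c"

# UI configuration
APP_TITLE = "PerplexityViewer"
```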

Requirements

  • Python 3.7+
  • PyTorch
  • Transformers
  • Gradio 4.0+
  • spaCy
  • pandas
  • numpy

GPU Support

The app automatically uses GPU acceleration when available, falling back to CPU processing otherwise.
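
The usual PyTorch pattern for this (a sketch, not necessarily the app's exact code) is:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Pick the GPU when one is available, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
model = AutoModelForCausalLM.from_pretrained("distilgpt2").to(device)

inputs = tokenizer("Hello world", return_tensors="pt").to(device)
with torch.no_grad():
    logits = model(**inputs).logits
```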

Troubleshooting

Common Issues

  1. Model loading errors: Ensure you have an internet connection for first-time model downloads
  2. Memory issues: Try smaller models like distilgpt2 or distilbert-base-uncased
  3. CUDA out of memory: Reduce text length or use CPU-only mode
  4. Encoder models are slow: This is normal - each token is masked and analyzed individually for accuracy
  5. Single analysis: The app performs one comprehensive analysis per run (no iteration setting is needed)

Getting Help

If you encounter issues:

  1. Check the console output for error messages
  2. Try running the test suite: python test_app.py
  3. Ensure all dependencies are installed: pip install -r requirements.txt

Examples

Try these example texts to see the app in action:

  • "The quick brown fox jumps over the lazy dog." (Common phrase - should show low perplexity)
  • "Quantum entanglement defies classical intuition." (Technical content - may show higher perplexity)
  • "Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo." (Grammatically complex - interesting perplexity patterns)