---
title: PerplexityViewer
emoji: 📈
colorFrom: gray
colorTo: blue
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: gpl-3.0
short_description: Simple inspection of perplexity using color-gradients
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

PerplexityViewer πŸ“ˆ

A Gradio-based web application for visualizing text perplexity using color-coded gradients. Perfect for understanding how confident language models are about different parts of your text.

Features

  • Dual Model Support: Works with both decoder models (GPT, DialoGPT) and encoder models (BERT, RoBERTa)
  • Interactive Visualization: Color-coded per-token perplexity using spaCy's displaCy (see the rendering sketch after this list)
  • Configurable Analysis: Adjustable iterations and MLM probability settings
  • Real-time Processing: Instant analysis with cached models for faster subsequent runs
  • Multiple Model Types:
    • Decoder Models: Calculate true perplexity for causal language models
    • Encoder Models: Calculate pseudo-perplexity using masked language modeling
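
The per-token coloring can be produced with displaCy's manual entity renderer, roughly as sketched below. The token scores, bucket labels, and hex colors here are invented for illustration and are not necessarily the app's exact scheme.

```python
from spacy import displacy

# Example per-token perplexities (made up for illustration).
text = "The quick brown fox"
scored_tokens = [("The", 3.1), ("quick", 1.4), ("brown", 12.7), ("fox", 2.0)]

# Turn each token into a manual "entity" labeled by its perplexity bucket.
ents, offset = [], 0
for tok, ppl in scored_tokens:
    start = text.index(tok, offset)
    end = start + len(tok)
    label = "LOW" if ppl <= 2.0 else "MED" if ppl < 10.0 else "HIGH"
    ents.append({"start": start, "end": end, "label": label})
    offset = end

colors = {"LOW": "#2ecc71", "MED": "#f39c12", "HIGH": "#e74c3c"}
html = displacy.render(
    {"text": text, "ents": ents, "title": None},
    style="ent",
    manual=True,
    options={"colors": colors},
)
print(html)  # HTML snippet with each token wrapped in a colored highlight
```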

How It Works

  • Red tokens: High perplexity (model is uncertain about this token)
  • Green tokens: Low perplexity (model is confident about this token)
  • Gradient colors: Show varying degrees of model confidence

Installation

  1. Clone this repository or download the files
  2. Install dependencies:
    pip install -r requirements.txt
    

Quick Start

Option 1: Using the startup script (recommended)

python run.py

Option 2: Direct launch

python app.py

Option 3: With dependency installation and testing

python run.py --install --test

Usage

  1. Enter your text in the input box
  2. Select a model from the dropdown or enter a custom HuggingFace model name
  3. Choose model type:
    • Decoder: For GPT-like models (true perplexity)
    • Encoder: For BERT-like models (pseudo-perplexity via MLM)
  4. Adjust the optional analysis settings as needed
  5. Click "Analyze" to see the results

Supported Models

Decoder Models (Causal LM)

  • gpt2, distilgpt2
  • microsoft/DialoGPT-small, microsoft/DialoGPT-medium
  • openai-gpt
  • Any HuggingFace causal language model

Encoder Models (Masked LM)

  • bert-base-uncased, bert-base-cased
  • distilbert-base-uncased
  • roberta-base
  • albert-base-v2
  • Any HuggingFace masked language model

Understanding the Results

Perplexity Interpretation

  • Lower perplexity: Model is more confident (text is more predictable)
  • Higher perplexity: Model is less confident (text is more surprising)

Color Coding

  • Green: Low perplexity (≀ 2.0) - very predictable
  • Yellow/Orange: Medium perplexity (2.0-10.0) - somewhat predictable
  • Red: High perplexity (≥ 10.0) - surprising or difficult to predict (see the mapping sketch below)
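
A minimal mapping from these thresholds to display colors might look like the following; the exact hex values are an assumption, not the app's actual palette.

```python
def perplexity_to_color(ppl: float) -> str:
    """Map a token's perplexity to a display color (illustrative thresholds and colors)."""
    if ppl <= 2.0:
        return "#2ecc71"   # green: very predictable
    if ppl < 10.0:
        return "#f39c12"   # yellow/orange: somewhat predictable
    return "#e74c3c"       # red: surprising or hard to predict
```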

Technical Details

Decoder Models (True Perplexity)

  • Uses next-token prediction to calculate perplexity
  • Formula: PPL = exp(average_cross_entropy_loss)
  • Each token's perplexity is based on how well the model predicted it given the previous context (see the sketch below)
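
A self-contained sketch of this calculation is shown below; distilgpt2 is used purely as an example model, and the code is illustrative rather than the app's implementation.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
model = AutoModelForCausalLM.from_pretrained("distilgpt2")
model.eval()

enc = tokenizer("The quick brown fox jumps over the lazy dog.", return_tensors="pt")
input_ids = enc["input_ids"]

with torch.no_grad():
    logits = model(**enc).logits             # (1, seq_len, vocab_size)

# Token i is predicted from tokens < i: shift logits left, labels right.
shift_logits = logits[:, :-1, :]
shift_labels = input_ids[:, 1:]
losses = F.cross_entropy(
    shift_logits.reshape(-1, shift_logits.size(-1)),
    shift_labels.reshape(-1),
    reduction="none",
)

token_ppl = torch.exp(losses)                # per-token perplexity
sentence_ppl = torch.exp(losses.mean())      # PPL = exp(average cross-entropy loss)

tokens = tokenizer.convert_ids_to_tokens(input_ids[0].tolist())[1:]
for tok, ppl in zip(tokens, token_ppl.tolist()):
    print(f"{tok!r:>12}  {ppl:8.2f}")
print(f"sentence perplexity: {sentence_ppl.item():.2f}")
```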

Encoder Models (Pseudo-Perplexity)

  • Uses masked language modeling (MLM)
  • Masks each token individually and measures prediction confidence
  • Pseudo-perplexity approximates true perplexity for bidirectional models
  • All content tokens are analyzed for comprehensive results (see the sketch below)
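
A minimal, standalone version of this procedure could look like the sketch below; distilbert-base-uncased is just an example model, and the app's actual implementation may differ.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("distilbert-base-uncased")
model.eval()

input_ids = tokenizer("The quick brown fox jumps over the lazy dog.",
                      return_tensors="pt")["input_ids"][0]

nlls = []
with torch.no_grad():
    for i in range(1, input_ids.size(0) - 1):       # skip [CLS] and [SEP]
        masked = input_ids.clone()
        masked[i] = tokenizer.mask_token_id          # mask one token at a time
        logits = model(masked.unsqueeze(0)).logits[0, i]
        log_probs = torch.log_softmax(logits, dim=-1)
        nlls.append(-log_probs[input_ids[i]])        # NLL of the original token

pseudo_ppl = torch.exp(torch.stack(nlls).mean())
print(f"pseudo-perplexity: {pseudo_ppl.item():.2f}")
```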

Testing

Run the test suite to verify everything works:

python test_app.py

Or use the startup script with testing:

python run.py --test

Configuration

The app uses sensible defaults but can be customized via config.py (sketched below):

  • Default model lists
  • Processing settings
  • Visualization colors and settings
  • UI configuration
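
For orientation, a config module of this kind might look roughly like the following; the names and values are hypothetical, so check config.py itself for the real settings.

```python
# config.py (hypothetical sketch; the actual file may use different names and values)

# Default model lists
DEFAULT_DECODER_MODELS = ["gpt2", "distilgpt2", "microsoft/DialoGPT-small"]
DEFAULT_ENCODER_MODELS = ["bert-base-uncased", "distilbert-base-uncased", "roberta-base"]

# Processing settings
MAX_TEXT_LENGTH = 512            # illustrative cap on input tokens

# Visualization colors for low / medium / high perplexity
COLOR_LOW = "#2ecc71"
COLOR_MEDIUM = "#f39c12"
COLOR_HIGH = "#e74c3c"

# UI configuration
APP_TITLE = "PerplexityViewer"
```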

Requirements

  • Python 3.7+
  • PyTorch
  • Transformers
  • Gradio 4.0+
  • spaCy
  • pandas
  • numpy

GPU Support

The app automatically uses GPU acceleration when available, falling back to CPU processing otherwise.
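
The usual PyTorch pattern for this (a sketch, not necessarily the app's exact code) is:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Pick the GPU when one is available, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
model = AutoModelForCausalLM.from_pretrained("distilgpt2").to(device)

inputs = tokenizer("Hello world", return_tensors="pt").to(device)
with torch.no_grad():
    logits = model(**inputs).logits
```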

Troubleshooting

Common Issues

  1. Model loading errors: Ensure you have an internet connection for first-time model downloads
  2. Memory issues: Try smaller models like distilgpt2 or distilbert-base-uncased
  3. CUDA out of memory: Reduce text length or use CPU-only mode
  4. Encoder models are slow: This is normal - each token is masked and analyzed individually for accuracy
  5. Single analysis: The app performs one comprehensive analysis per run (no iteration setting is needed)

Getting Help

If you encounter issues:

  1. Check the console output for error messages
  2. Try running the test suite: python test_app.py
  3. Ensure all dependencies are installed: pip install -r requirements.txt

Examples

Try these example texts to see the app in action:

  • "The quick brown fox jumps over the lazy dog." (Common phrase - should show low perplexity)
  • "Quantum entanglement defies classical intuition." (Technical content - may show higher perplexity)
  • "Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo." (Grammatically complex - interesting perplexity patterns)