metadata
title: Vibe Check Translations
emoji: 📈
colorFrom: gray
colorTo: green
sdk: gradio
sdk_version: 5.38.2
app_file: app.py
pinned: false
short_description: A/B test translations

Translation A/B Testing App

A Gradio app for comparing translation quality between different model configurations through A/B testing.

Features

  • Language Selection: Choose from available languages in the S3 bucket
  • Side-by-Side Comparison: Compare translations from "few-shots" vs "no-few-shots" configurations
  • Randomized Presentation: The order of configurations is randomized to avoid bias
  • Progress Tracking: Shows current progress through the dataset
  • Results Summary: Displays final vote counts and percentages
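
The randomized presentation can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: the function names `assign_sides` and `resolve_vote` are hypothetical, and the only details taken from this README are the two configuration names and the left/right layout.

```python
import random

# Configuration names as given in this README.
CONFIGS = ("few-shots", "no-few-shots")

def assign_sides(rng: random.Random) -> dict[str, str]:
    """Place the two configurations on the left and right in random order."""
    left, right = rng.sample(CONFIGS, k=2)
    return {"left": left, "right": right}

def resolve_vote(sides: dict[str, str], choice: str) -> str:
    """Translate a 'left' or 'right' click back into the configuration it hid."""
    return sides[choice]

rng = random.Random()          # pass a seed here for reproducible ordering
sides = assign_sides(rng)      # e.g. {"left": "no-few-shots", "right": "few-shots"}
winner = resolve_vote(sides, "left")
```

Keeping the side-to-configuration mapping for each comparison is what lets a vote be attributed to the right configuration after the order has been shuffled.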

Setup

  1. Install dependencies:
pip install -r requirements.txt
  2. Configure AWS credentials (for S3 access):
export AWS_ACCESS_KEY_ID=your_key
export AWS_SECRET_ACCESS_KEY=your_secret
# or use AWS CLI: aws configure
  3. Run the app:
python app.py

The app will be available at http://localhost:7860

Usage

  1. Select Language: Choose a language from the dropdown menu
  2. Load Data: Click "Load Data" to fetch translation pairs from S3
  3. Compare Translations:
    • Original text is shown at the top
    • Two translations (A and B) are shown side by side
    • Click "Choose Left" or "Choose Right" to select the better translation
  4. View Results: After all comparisons, see the final vote counts
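
As an illustration of the final summary, the vote counts and percentages could be computed along these lines. The `summarize` helper is hypothetical, and it assumes each vote has already been resolved to a configuration name.

```python
from collections import Counter

CONFIGS = ("few-shots", "no-few-shots")

def summarize(votes: list[str]) -> dict[str, tuple[int, float]]:
    """Return (count, percentage) per configuration; percentages are 0.0 when there are no votes."""
    counts = Counter(votes)
    total = len(votes)
    return {
        cfg: (counts[cfg], 100.0 * counts[cfg] / total if total else 0.0)
        for cfg in CONFIGS
    }

summary = summarize(["few-shots", "few-shots", "no-few-shots", "few-shots"])
# summary == {"few-shots": (3, 75.0), "no-few-shots": (1, 25.0)}
```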

Data Source

The app loads translation data from s3://fineweb-multilingual-v1/experiments/translations/vibe-checks/ with the following structure:

  • {language}_Latn/few-shots.jsonl - Translations with few-shot examples
  • {language}_Latn/no-few-shots.jsonl - Translations without few-shot examples

Each JSONL file contains documents with:

  • text: Original text to translate
  • id: Unique document identifier
  • inference_results: Array with translation results
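
The layout above suggests a simple way to build comparison pairs: derive the two object keys for a language, parse each JSONL file, and join the documents on id. The sketch below uses only the standard library and made-up sample lines. It assumes both files share the same ids, leaves the actual S3 download out, and treats `inference_results` as opaque because its element shape is not described here. The helper names are hypothetical.

```python
import json

BUCKET = "fineweb-multilingual-v1"
PREFIX = "experiments/translations/vibe-checks"
CONFIGS = ("few-shots", "no-few-shots")

def object_keys(language: str) -> dict[str, str]:
    """Map each configuration to its object key inside BUCKET."""
    return {cfg: f"{PREFIX}/{language}_Latn/{cfg}.jsonl" for cfg in CONFIGS}

def parse_jsonl(text: str) -> dict[str, dict]:
    """Parse JSONL text into a dict keyed by document id, skipping blank lines."""
    docs = {}
    for line in text.splitlines():
        if line.strip():
            doc = json.loads(line)
            docs[doc["id"]] = doc
    return docs

def pair_by_id(few: dict[str, dict], plain: dict[str, dict]) -> list[dict]:
    """Join the two configurations on id, keeping only ids present in both."""
    return [
        {
            "id": doc_id,
            "text": doc["text"],
            "few-shots": doc["inference_results"],
            "no-few-shots": plain[doc_id]["inference_results"],
        }
        for doc_id, doc in few.items()
        if doc_id in plain
    ]

# Made-up sample lines standing in for the downloaded file contents.
few_text = (
    '{"id": "doc-1", "text": "Hello", "inference_results": ["A"]}\n'
    '{"id": "doc-2", "text": "Bye", "inference_results": ["C"]}\n'
)
plain_text = '{"id": "doc-1", "text": "Hello", "inference_results": ["B"]}\n'
pairs = pair_by_id(parse_jsonl(few_text), parse_jsonl(plain_text))
```

Here `doc-2` is dropped because it appears in only one file; whether the real app skips or reports such documents is not stated in this README.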