---
title: Vibe Check Translations
emoji: 📈
colorFrom: gray
colorTo: green
sdk: gradio
sdk_version: 5.38.2
app_file: app.py
pinned: false
short_description: A/B test translations
---
# Translation A/B Testing App
A Gradio app for comparing translation quality between different model configurations through A/B testing.
## Features
- **Language Selection**: Choose from the languages available in the S3 bucket
- **Side-by-Side Comparison**: Compare translations from "few-shots" vs "no-few-shots" configurations
- **Randomized Presentation**: The order of configurations is randomized to avoid bias
- **Progress Tracking**: Shows current progress through the dataset
- **Results Summary**: Displays final vote counts and percentages
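
The randomized presentation can be sketched as below. This is a minimal illustration under assumptions, not the code in `app.py`: the function name `assign_sides` and the keys of the returned dictionary are hypothetical.

```python
import random


def assign_sides(few_shots_text, no_few_shots_text, rng=random):
    """Randomly place the two translations on the left and right.

    Returns the displayed texts plus a hidden side-to-configuration
    mapping, so a vote for a side can be attributed afterwards.
    Hypothetical helper, not taken from app.py.
    """
    pair = [("few-shots", few_shots_text), ("no-few-shots", no_few_shots_text)]
    rng.shuffle(pair)  # in-place shuffle decides which configuration goes left
    (left_cfg, left_text), (right_cfg, right_text) = pair
    return {
        "left_text": left_text,
        "right_text": right_text,
        "side_to_config": {"left": left_cfg, "right": right_cfg},
    }
```

Keeping the side-to-configuration mapping out of the UI is what prevents the rater from learning which configuration produced which translation.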
## Setup
1. Install dependencies:
```bash
pip install -r requirements.txt
```
2. Configure AWS credentials (for S3 access):
```bash
export AWS_ACCESS_KEY_ID=your_key
export AWS_SECRET_ACCESS_KEY=your_secret
# or use AWS CLI: aws configure
```
3. Run the app:
```bash
python app.py
```
The app will be available at `http://localhost:7860`.
## Usage
1. **Select Language**: Choose a language from the dropdown menu
2. **Load Data**: Click "Load Data" to fetch translation pairs from S3
3. **Compare Translations**:
   - Original text is shown at the top
   - Two translations (A and B) are shown side by side
   - Click "Choose Left" or "Choose Right" to select the better translation
4. **View Results**: After all comparisons, see the final vote counts
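
The final vote counts and percentages could be computed along these lines. This is a sketch under assumptions: it supposes each vote has already been mapped back to a configuration name, and `summarize_votes` and its output layout are hypothetical rather than the app's actual code.

```python
from collections import Counter

# The two configurations compared by the app.
CONFIGS = ("few-shots", "no-few-shots")


def summarize_votes(votes):
    """Return vote counts and percentages per configuration.

    `votes` is a list of configuration names, one per comparison.
    Hypothetical helper, not taken from app.py.
    """
    counts = Counter(votes)
    total = sum(counts[cfg] for cfg in CONFIGS)
    summary = {}
    for cfg in CONFIGS:
        # Guard against division by zero when no votes were cast.
        percent = 100.0 * counts[cfg] / total if total else 0.0
        summary[cfg] = {"votes": counts[cfg], "percent": round(percent, 1)}
    return summary
```

For example, three votes for `few-shots` and one for `no-few-shots` give 75.0% and 25.0% respectively.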
## Data Source
The app loads translation data from `s3://fineweb-multilingual-v1/experiments/translations/vibe-checks/` with the following structure:
- `{language}_Latn/few-shots.jsonl` - Translations with few-shot examples
- `{language}_Latn/no-few-shots.jsonl` - Translations without few-shot examples
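
As an illustration of this layout, the two file locations for a language can be derived from the base path. The helper name `translation_uris` and the language code `fra` in the usage note are assumptions made for the example; the codes actually present in the bucket are not listed here.

```python
BASE_URI = "s3://fineweb-multilingual-v1/experiments/translations/vibe-checks/"
CONFIGS = ("few-shots", "no-few-shots")


def translation_uris(language):
    """Build the S3 URI of each configuration file for one language.

    Hypothetical helper: it only formats strings and does not check
    that the objects exist in the bucket.
    """
    folder = f"{language}_Latn"
    return {cfg: f"{BASE_URI}{folder}/{cfg}.jsonl" for cfg in CONFIGS}
```

Calling `translation_uris("fra")` would yield one URI ending in `fra_Latn/few-shots.jsonl` and one ending in `fra_Latn/no-few-shots.jsonl`.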
Each JSONL file contains documents with:
- `text`: Original text to translate
- `id`: Unique document identifier
- `inference_results`: Array with translation results
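
A minimal sketch of reading two such files and pairing their documents by `id` follows. It assumes the file contents are already available as strings and that both files share document ids. The inner layout of `inference_results` is not specified above, so the sketch passes it through untouched. The names `parse_jsonl` and `pair_by_id` are hypothetical.

```python
import json


def parse_jsonl(text):
    """Parse JSONL content into a dict keyed by document id."""
    docs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        docs[record["id"]] = record
    return docs


def pair_by_id(few_shots_text, no_few_shots_text):
    """Pair documents present in both files (assumes matching ids)."""
    few = parse_jsonl(few_shots_text)
    no_few = parse_jsonl(no_few_shots_text)
    pairs = []
    for doc_id, record in few.items():
        if doc_id not in no_few:
            continue  # skip documents missing from the other file
        pairs.append({
            "id": doc_id,
            "text": record["text"],
            # Passed through as-is; the element structure is not assumed.
            "few-shots": record["inference_results"],
            "no-few-shots": no_few[doc_id]["inference_results"],
        })
    return pairs
```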