---
title: MobileCLIP Image Classifier
emoji: 📸
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit
---

# 📸 MobileCLIP-B Image Classifier

Zero-shot image classification powered by Apple's MobileCLIP-B model, served through an interactive Gradio web interface. The application classifies images in real time against a dynamic set of text labels, with admin-managed label updates and optional persistence on the Hugging Face Hub.

## 🎯 Key Features

### Core Capabilities

- **🖼️ Zero-Shot Classification**: Upload any image and classify it without retraining the model
- **🏷️ Dynamic Label Management**: Add, remove, and update classification labels on the fly
- **📊 Interactive Results**: Visual confidence scores with sortable data tables
- **⚡ Optimized Performance**: Sub-30ms inference on GPU with re-parameterized MobileOne blocks
- **🔒 Secure Admin Panel**: Token-protected label management interface
- **☁️ Hub Persistence**: Optional versioned label storage on the Hugging Face Hub

### API Access

- **REST API**: Fully accessible via Gradio's automatic API endpoints
- **Base64 Support**: Direct base64 image input for backend integration
- **Batch Processing**: Efficient handling of multiple classification requests

## 🏗️ Architecture

### Components

- **`app.py`**: Main Gradio interface with public/admin tabs and API endpoints
- **`handler.py`**: Core model management, inference logic, and label operations
- **`reparam.py`**: MobileOne re-parameterization for optimized inference
- **`items.json`**: Default label catalog with metadata

### Model Details

- **Architecture**: MobileCLIP-B with re-parameterized MobileOne image encoder
- **Text Encoder**: Optimized CLIP text transformer
- **Embedding Cache**: Pre-computed text embeddings for fast inference
- **Device Support**: Automatic GPU/CPU detection with float16 optimization

## 🚀 Quick Start

### Environment Variables

Configure in your Space Settings → Variables and secrets:

| Variable | Description | Required |
|----------|-------------|----------|
| `ADMIN_TOKEN` | Secret token for admin operations | Yes (for admin) |
| `HF_LABEL_REPO` | Hub dataset for label storage (e.g., `user/labels`) | No |
| `HF_WRITE_TOKEN` | Token with write permissions to the dataset repo | No |
| `HF_READ_TOKEN` | Token with read permissions (defaults to the write token) | No |

### Usage Examples

#### Web Interface

1. Navigate to the Space URL
2. Upload an image in the Classification tab
3. Adjust top-k results (default: 10)
4. View ranked predictions with confidence scores

#### API Usage

**Standard Classification:**

```python
import requests

# Open the file in a context manager so the handle is closed after the upload
with open("photo.jpg", "rb") as f:
    response = requests.post(
        "YOUR_SPACE_URL/api/classify_image",
        files={"image": f},
        data={"top_k": 5},
    )
response.raise_for_status()
results = response.json()
```

**Base64 Input:**

```python
import base64
import requests

# Encode the raw image bytes as a base64 string for the JSON payload
with open("photo.jpg", "rb") as f:
    img_base64 = base64.b64encode(f.read()).decode()

response = requests.post(
    "YOUR_SPACE_URL/api/classify_base64",
    json={"image": img_base64, "top_k": 10},
)
response.raise_for_status()
results = response.json()
```

## 🔧 Admin Operations

### Label Management

Authenticated admins can perform the following operations:

#### Add Labels

```json
{
  "op": "upsert_labels",
  "token": "YOUR_ADMIN_TOKEN",
  "items": [
    {"id": 100, "name": "bicycle", "prompt": "a photo of a bicycle"},
    {"id": 101, "name": "airplane", "prompt": "a photo of an airplane"}
  ]
}
```

#### Reload Specific Version

```json
{
  "op": "reload_labels",
  "token": "YOUR_ADMIN_TOKEN",
  "version": 5
}
```

#### Remove Labels

```json
{
  "op": "remove_labels",
  "token": "YOUR_ADMIN_TOKEN",
  "ids": [100, 101]
}
```

### Label Deduplication

- Automatic case-insensitive name deduplication
- Prevents duplicate entries (e.g., "cat", "Cat", and "CAT" are treated as the same label)
- ID-based deduplication for consistent label management

## 📦 Hub Integration

When configured with `HF_LABEL_REPO` and tokens, the system automatically:

1. **Saves Snapshots**: Each label update creates a versioned snapshot
   - `snapshots/v{N}/embeddings.safetensors`: Pre-computed text embeddings
   - `snapshots/v{N}/meta.json`: Label metadata and model info
   - `snapshots/latest.json`: Points to the current version
2. **Loads on Startup**: Fetches the latest snapshot or a specified version
3. **Fallback**: Uses the local `items.json` if the Hub is unavailable

## 🎨 Default Label Catalog

The bundled `items.json` includes 50+ kid-friendly objects with:

- Unique IDs and display names
- CLIP-optimized prompts
- Category metadata
- Fun facts and rarity ratings

Categories include animals, toys, food, vehicles, nature, and everyday objects.

## ⚡ Performance Optimization

- **GPU Acceleration**: Automatic CUDA detection with float16 inference
- **CPU Fallback**: Graceful degradation with float32 precision
- **Embedding Cache**: Pre-computed text embeddings updated on label changes
- **Re-parameterization**: MobileOne blocks optimized for inference speed
- **Batch Processing**: Efficient matrix operations for multi-label scoring

## 🔐 Security Considerations

- **Token Protection**: Admin operations require `ADMIN_TOKEN`
- **Private Datasets**: Keep label repos private for sensitive applications
- **Input Validation**: Automatic sanitization of uploaded images
- **Memory Management**: Images are processed and discarded after inference

## 📄 License

- **Model Weights**: Apple Sample Code License (ASCL)
- **Interface Code**: MIT License

## 🤝 Contributing

Contributions are welcome! Areas for improvement:

- Additional label management features
- Performance optimizations
- Extended API capabilities
- Multi-language support

## 📚 Resources

- [MobileCLIP Paper](https://arxiv.org/abs/2311.17049)
- [OpenCLIP Library](https://github.com/mlfoundations/open_clip)
- [Gradio Documentation](https://gradio.app/docs)
- [Hugging Face Spaces](https://huggingface.co/spaces)
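
The label deduplication rules listed under Admin Operations (case-insensitive names, ID-based upserts) can be illustrated with a short sketch. This is not the code in `handler.py`: the function name `merge_labels` and the policy of skipping an item whose name collides with a different ID are assumptions made for the example.

```python
from typing import Dict, List


def merge_labels(catalog: List[dict], items: List[dict]) -> List[dict]:
    """Hypothetical helper: merge `items` into `catalog` without duplicates.

    Assumed policy (not confirmed by the Space's source):
    - an item whose ID already exists replaces that entry;
    - an item whose name matches another ID's name, ignoring case, is skipped.
    """
    # Index by ID; dicts keep insertion order, so existing labels stay in place.
    by_id: Dict[int, dict] = {entry["id"]: dict(entry) for entry in catalog}

    for item in items:
        key = item["name"].strip().casefold()
        name_taken = any(
            other_id != item["id"] and other["name"].strip().casefold() == key
            for other_id, other in by_id.items()
        )
        if name_taken:
            continue  # "cat", "Cat", and "CAT" count as the same label
        by_id[item["id"]] = dict(item)

    return list(by_id.values())


catalog = [{"id": 1, "name": "cat", "prompt": "a photo of a cat"}]
merged = merge_labels(
    catalog,
    [
        {"id": 2, "name": "CAT", "prompt": "a photo of a CAT"},      # skipped
        {"id": 1, "name": "Cat", "prompt": "a photo of a kitten"},   # replaces ID 1
        {"id": 3, "name": "dog", "prompt": "a photo of a dog"},      # added
    ],
)
print([entry["id"] for entry in merged])
```

Comparing with `casefold()` rather than `lower()` is a deliberate choice here so that non-ASCII label names compare sensibly.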
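
The embedding cache and multi-label scoring mentioned under Performance Optimization come down to comparing one image embedding against pre-computed text embeddings. Below is a minimal pure-Python sketch of that idea, so it runs without a GPU or the model. The function names, the toy 2-D vectors, and the logit scale of 100 (a common CLIP convention) are assumptions, not values taken from this Space.

```python
import math
from typing import Dict, List, Sequence, Tuple


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def build_text_cache(text_embs: Dict[str, Sequence[float]]) -> Dict[str, List[float]]:
    """Normalize label embeddings once; rebuild only when labels change."""
    return {name: l2_normalize(emb) for name, emb in text_embs.items()}


def rank_labels(
    image_emb: Sequence[float],
    text_cache: Dict[str, List[float]],
    top_k: int = 10,
    logit_scale: float = 100.0,  # assumed value, see lead-in
) -> List[Tuple[str, float]]:
    """Return the top_k (label, probability) pairs for one image embedding."""
    img = l2_normalize(image_emb)
    logits = {
        name: logit_scale * sum(a * b for a, b in zip(img, emb))
        for name, emb in text_cache.items()
    }
    # Softmax with the maximum subtracted for numerical stability.
    peak = max(logits.values())
    exps = {name: math.exp(value - peak) for name, value in logits.items()}
    total = sum(exps.values())
    ranked = sorted(
        ((name, e / total) for name, e in exps.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:top_k]


cache = build_text_cache({"cat": [1.0, 0.0], "dog": [0.0, 1.0], "car": [-1.0, 0.0]})
ranked = rank_labels([0.9, 0.1], cache, top_k=2)
print([name for name, _ in ranked])
```

In a real deployment the same computation would typically be a single matrix multiplication on the GPU; the loop form is used here only to keep the example dependency-free.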