---
title: MobileCLIP Image Classifier
emoji: 📸
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit
---

# 📸 MobileCLIP-B Image Classifier

Zero-shot image classification powered by Apple's MobileCLIP-B model, served through an interactive Gradio web interface. This application enables real-time image classification against a dynamic set of text labels, with support for admin-managed label updates and optional Hugging Face Hub persistence.

🎯 Key Features

Core Capabilities

  • πŸ–ΌοΈ Zero-Shot Classification: Upload any image for instant classification without model retraining
  • 🏷️ Dynamic Label Management: Add, remove, and update classification labels on-the-fly
  • πŸ“Š Interactive Results: Visual confidence scores with sortable data tables
  • ⚑ Optimized Performance: Sub-30ms inference on GPU with re-parameterized MobileOne blocks
  • πŸ”’ Secure Admin Panel: Token-protected label management interface
  • ☁️ Hub Persistence: Optional versioned label storage on Hugging Face Hub

API Access

  • REST API: Fully accessible via Gradio's automatic API endpoints
  • Base64 Support: Direct base64 image input for backend integration
  • Batch Processing: Efficient handling of multiple classification requests

## 🏗️ Architecture

### Components

- `app.py`: Main Gradio interface with public/admin tabs and API endpoints
- `handler.py`: Core model management, inference logic, and label operations
- `reparam.py`: MobileOne re-parameterization for optimized inference
- `items.json`: Default label catalog with metadata

### Model Details

- **Architecture**: MobileCLIP-B with re-parameterized MobileOne image encoder
- **Text Encoder**: Optimized CLIP text transformer
- **Embedding Cache**: Pre-computed text embeddings for fast inference
- **Device Support**: Automatic GPU/CPU detection with float16 optimization
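
The embedding cache mentioned above is what makes inference cheap: scoring reduces to one matrix multiply against pre-computed text embeddings. The snippet below is a minimal, self-contained sketch of that idea; the tensor shapes, label names, and the `classify` helper are illustrative placeholders, not the actual `handler.py` API.

```python
import torch

# Hypothetical pre-computed cache: one L2-normalized text embedding per label.
# In this Space, the cache is rebuilt whenever the label set changes.
dim, label_names = 512, ["cat", "dog", "bicycle"]
text_embeddings = torch.nn.functional.normalize(torch.randn(len(label_names), dim), dim=-1)

def classify(image_embedding: torch.Tensor, top_k: int = 3):
    """Rank labels by cosine similarity between one image and all cached text embeddings."""
    image_embedding = torch.nn.functional.normalize(image_embedding, dim=-1)
    logits = 100.0 * image_embedding @ text_embeddings.T   # (1, num_labels)
    probs = logits.softmax(dim=-1).squeeze(0)
    scores, indices = probs.topk(min(top_k, len(label_names)))
    return [(label_names[i], float(s)) for i, s in zip(indices.tolist(), scores.tolist())]

print(classify(torch.randn(1, dim)))
```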

## 🚀 Quick Start

### Environment Variables

Configure in your Space **Settings → Variables and secrets**:

| Variable | Description | Required |
| --- | --- | --- |
| `ADMIN_TOKEN` | Secret token for admin operations | Yes (for admin) |
| `HF_LABEL_REPO` | Hub dataset for label storage (e.g., `user/labels`) | No |
| `HF_WRITE_TOKEN` | Token with write permissions to dataset repo | No |
| `HF_READ_TOKEN` | Token with read permissions (defaults to write token) | No |
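
These variables are typically read once at startup. The snippet below is a hedged sketch of that pattern, including the read-token fallback noted in the table; the actual `handler.py` may organize this differently.

```python
import os

# Illustrative startup configuration (not the actual handler.py code).
ADMIN_TOKEN = os.environ.get("ADMIN_TOKEN")          # required for admin operations
HF_LABEL_REPO = os.environ.get("HF_LABEL_REPO")      # e.g. "user/labels"
HF_WRITE_TOKEN = os.environ.get("HF_WRITE_TOKEN")
# The read token falls back to the write token when it is not set.
HF_READ_TOKEN = os.environ.get("HF_READ_TOKEN") or HF_WRITE_TOKEN

# Hub persistence is optional: only enabled when a label repo is configured.
HUB_ENABLED = bool(HF_LABEL_REPO and HF_WRITE_TOKEN)
```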

### Usage Examples

#### Web Interface

  1. Navigate to the Space URL
  2. Upload an image in the Classification tab
  3. Adjust top-k results (default: 10)
  4. View ranked predictions with confidence scores

#### API Usage

**Standard Classification:**

```python
import requests

# Send an image file and the desired number of results.
with open("photo.jpg", "rb") as f:
    response = requests.post(
        "YOUR_SPACE_URL/api/classify_image",
        files={"image": f},
        data={"top_k": 5},
    )
results = response.json()
```

**Base64 Input:**

```python
import base64
import requests

with open("photo.jpg", "rb") as f:
    img_base64 = base64.b64encode(f.read()).decode()

response = requests.post(
    "YOUR_SPACE_URL/api/classify_base64",
    json={
        "image": img_base64,
        "top_k": 10
    }
)
results = response.json()
```
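
If you prefer Gradio's official Python client, the same call can be made through `gradio_client`. This is a hedged sketch: the endpoint name `/classify_image` and the argument order are assumptions based on the interface described above, so check the Space's "Use via API" page for the exact signature.

```python
from gradio_client import Client, handle_file

# Assumed endpoint name and argument order; verify against the Space's API docs.
client = Client("YOUR_SPACE_URL")      # or "username/space-name"
result = client.predict(
    handle_file("photo.jpg"),          # image input
    5,                                 # top_k
    api_name="/classify_image",
)
print(result)
```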

## 🔧 Admin Operations

### Label Management

Authenticated admins can perform the following operations:

#### Add Labels

```json
{
  "op": "upsert_labels",
  "token": "YOUR_ADMIN_TOKEN",
  "items": [
    {"id": 100, "name": "bicycle", "prompt": "a photo of a bicycle"},
    {"id": 101, "name": "airplane", "prompt": "a photo of an airplane"}
  ]
}
```

#### Reload Specific Version

```json
{
  "op": "reload_labels",
  "token": "YOUR_ADMIN_TOKEN",
  "version": 5
}
```

#### Remove Labels

```json
{
  "op": "remove_labels",
  "token": "YOUR_ADMIN_TOKEN",
  "ids": [100, 101]
}
```
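
These payloads can also be sent from a script. The snippet below is a hedged sketch that assumes a single admin route, written here as the placeholder `/api/admin`; check the Space's "Use via API" page for the route that `app.py` actually exposes.

```python
import requests

# "/api/admin" is an assumed placeholder for the admin endpoint.
payload = {
    "op": "upsert_labels",
    "token": "YOUR_ADMIN_TOKEN",
    "items": [
        {"id": 100, "name": "bicycle", "prompt": "a photo of a bicycle"},
    ],
}
response = requests.post("YOUR_SPACE_URL/api/admin", json=payload)
response.raise_for_status()
print(response.json())
```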

### Label Deduplication

- Automatic case-insensitive name deduplication
- Prevents duplicate entries (e.g., "cat", "Cat", and "CAT" are treated as the same label)
- ID-based deduplication for consistent label management
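
The rules above amount to keeping at most one entry per ID and one per lowercased name. A minimal illustrative sketch (not the actual `handler.py` logic):

```python
# Keep at most one entry per id and one per case-insensitive name.
def dedupe(items: list[dict]) -> list[dict]:
    seen_ids, seen_names, result = set(), set(), []
    for item in items:
        name_key = item["name"].strip().lower()   # "cat", "Cat", "CAT" collapse
        if item["id"] in seen_ids or name_key in seen_names:
            continue
        seen_ids.add(item["id"])
        seen_names.add(name_key)
        result.append(item)
    return result

print(dedupe([
    {"id": 1, "name": "cat"},
    {"id": 2, "name": "Cat"},   # dropped: duplicate name (case-insensitive)
    {"id": 1, "name": "dog"},   # dropped: duplicate id
]))
```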

## 📦 Hub Integration

When configured with `HF_LABEL_REPO` and tokens, the system automatically:

1. **Saves Snapshots**: Each label update creates versioned snapshots
   - `snapshots/v{N}/embeddings.safetensors`: Pre-computed text embeddings
   - `snapshots/v{N}/meta.json`: Label metadata and model info
   - `snapshots/latest.json`: Points to current version
2. **Loads on Startup**: Fetches latest snapshot or specified version
3. **Fallback**: Uses local `items.json` if Hub unavailable
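
The snapshot layout above can be consumed with the standard `huggingface_hub` and `safetensors` libraries. This is a hedged sketch: the repo id and token are placeholders, and the `"version"` key in `latest.json` is an assumption based on the description above.

```python
import json
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

repo_id = "user/labels"   # HF_LABEL_REPO
token = "hf_..."          # HF_READ_TOKEN (or the write token)

# Resolve the current version from the pointer file.
latest_path = hf_hub_download(repo_id, "snapshots/latest.json",
                              repo_type="dataset", token=token)
with open(latest_path) as f:
    version = json.load(f)["version"]   # assumed key name

# Fetch the versioned embeddings and metadata.
emb_path = hf_hub_download(repo_id, f"snapshots/v{version}/embeddings.safetensors",
                           repo_type="dataset", token=token)
meta_path = hf_hub_download(repo_id, f"snapshots/v{version}/meta.json",
                            repo_type="dataset", token=token)

embeddings = load_file(emb_path)        # dict of tensors
with open(meta_path) as f:
    meta = json.load(f)
```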

## 🎨 Default Label Catalog

The bundled `items.json` includes 50+ kid-friendly objects with:

- Unique IDs and display names
- CLIP-optimized prompts
- Category metadata
- Fun facts and rarity ratings

Categories include animals, toys, food, vehicles, nature, and everyday objects.
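
For orientation, a single catalog entry might look like the example below. The `category`, `fun_fact`, and `rarity` field names are guesses based on the description above, not the actual `items.json` schema.

```python
import json

# Hypothetical entry shape; only id/name/prompt appear elsewhere in this README.
example_item = {
    "id": 1,
    "name": "cat",
    "prompt": "a photo of a cat",
    "category": "animals",
    "fun_fact": "Cats spend most of the day asleep.",
    "rarity": "common",
}

# The bundled catalog is read when no Hub snapshot is available.
with open("items.json", encoding="utf-8") as f:
    catalog = json.load(f)
```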

⚑ Performance Optimization

  • GPU Acceleration: Automatic CUDA detection with float16 inference
  • CPU Fallback: Graceful degradation with float32 precision
  • Embedding Cache: Pre-computed text embeddings updated on label changes
  • Re-parameterization: MobileOne blocks optimized for inference speed
  • Batch Processing: Efficient matrix operations for multi-label scoring
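
The device and precision selection described in the first two bullets follows a common PyTorch pattern; the sketch below is illustrative rather than the actual `handler.py` code.

```python
import torch

# Pick the fastest available device and a matching precision.
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

def prepare(model: torch.nn.Module) -> torch.nn.Module:
    """Move a model to the detected device with the matching precision."""
    return model.to(device=device, dtype=dtype).eval()

print(f"Running on {device} with {dtype}")
```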

## 🔐 Security Considerations

- **Token Protection**: Admin operations require `ADMIN_TOKEN`
- **Private Datasets**: Keep label repos private for sensitive applications
- **Input Validation**: Automatic sanitization of uploaded images
- **Memory Management**: Images processed and discarded after inference
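
The admin-token check can be made resistant to timing attacks by comparing tokens in constant time. This is a hedged sketch, not necessarily how `app.py` implements the check.

```python
import hmac
import os

# Constant-time comparison avoids leaking token prefixes through timing.
# Illustrative only; app.py may implement this differently.
def is_admin(token: str) -> bool:
    expected = os.environ.get("ADMIN_TOKEN", "")
    return bool(expected) and hmac.compare_digest(token, expected)
```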

## 📄 License

- **Model Weights**: Apple Sample Code License (ASCL)
- **Interface Code**: MIT License

🀝 Contributing

Contributions welcome! Areas for improvement:

  • Additional label management features
  • Performance optimizations
  • Extended API capabilities
  • Multi-language support

## 📚 Resources