---
title: MobileCLIP Image Classifier
emoji: 📸
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit
---

# 📸 MobileCLIP-B Image Classifier

Zero-shot image classification powered by Apple's MobileCLIP-B model, served through an interactive Gradio web interface. This application enables real-time image classification against a dynamic set of text labels, with support for admin-managed label updates and optional Hugging Face Hub persistence.

🎯 Key Features

Core Capabilities

  • πŸ–ΌοΈ Zero-Shot Classification: Upload any image for instant classification without model retraining
  • 🏷️ Dynamic Label Management: Add, remove, and update classification labels on-the-fly
  • πŸ“Š Interactive Results: Visual confidence scores with sortable data tables
  • ⚑ Optimized Performance: Sub-30ms inference on GPU with re-parameterized MobileOne blocks
  • πŸ”’ Secure Admin Panel: Token-protected label management interface
  • ☁️ Hub Persistence: Optional versioned label storage on Hugging Face Hub

API Access

  • REST API: Fully accessible via Gradio's automatic API endpoints
  • Base64 Support: Direct base64 image input for backend integration
  • Batch Processing: Efficient handling of multiple classification requests

## 🏗️ Architecture

### Components

- `app.py`: Main Gradio interface with public/admin tabs and API endpoints
- `handler.py`: Core model management, inference logic, and label operations
- `reparam.py`: MobileOne re-parameterization for optimized inference
- `items.json`: Default label catalog with metadata

### Model Details

- **Architecture**: MobileCLIP-B with re-parameterized MobileOne image encoder
- **Text Encoder**: Optimized CLIP text transformer
- **Embedding Cache**: Pre-computed text embeddings for fast inference
- **Device Support**: Automatic GPU/CPU detection with float16 optimization
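
The embedding cache mentioned above is what makes inference cheap: scoring reduces to one matrix multiply against pre-computed text embeddings. The snippet below is a minimal, self-contained sketch of that idea; the tensor shapes, label names, and the `classify` helper are illustrative placeholders, not the actual `handler.py` API.

```python
import torch

# Hypothetical pre-computed cache: one L2-normalized text embedding per label.
# In this Space, the cache is rebuilt whenever the label set changes.
dim, label_names = 512, ["cat", "dog", "bicycle"]
text_embeddings = torch.nn.functional.normalize(torch.randn(len(label_names), dim), dim=-1)

def classify(image_embedding: torch.Tensor, top_k: int = 3):
    """Rank labels by cosine similarity between one image and all cached text embeddings."""
    image_embedding = torch.nn.functional.normalize(image_embedding, dim=-1)
    logits = 100.0 * image_embedding @ text_embeddings.T   # (1, num_labels)
    probs = logits.softmax(dim=-1).squeeze(0)
    scores, indices = probs.topk(min(top_k, len(label_names)))
    return [(label_names[i], float(s)) for i, s in zip(indices.tolist(), scores.tolist())]

print(classify(torch.randn(1, dim)))
```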

## 🚀 Quick Start

### Environment Variables

Configure in your Space **Settings → Variables and secrets**:

| Variable | Description | Required |
| --- | --- | --- |
| `ADMIN_TOKEN` | Secret token for admin operations | Yes (for admin) |
| `HF_LABEL_REPO` | Hub dataset for label storage (e.g., `user/labels`) | No |
| `HF_WRITE_TOKEN` | Token with write permissions to dataset repo | No |
| `HF_READ_TOKEN` | Token with read permissions (defaults to write token) | No |
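
These variables are typically read once at startup. The snippet below is a hedged sketch of that pattern, including the read-token fallback noted in the table; the actual `handler.py` may organize this differently.

```python
import os

# Illustrative startup configuration (not the actual handler.py code).
ADMIN_TOKEN = os.environ.get("ADMIN_TOKEN")          # required for admin operations
HF_LABEL_REPO = os.environ.get("HF_LABEL_REPO")      # e.g. "user/labels"
HF_WRITE_TOKEN = os.environ.get("HF_WRITE_TOKEN")
# The read token falls back to the write token when it is not set.
HF_READ_TOKEN = os.environ.get("HF_READ_TOKEN") or HF_WRITE_TOKEN

# Hub persistence is optional: only enabled when a label repo is configured.
HUB_ENABLED = bool(HF_LABEL_REPO and HF_WRITE_TOKEN)
```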

### Usage Examples

#### Web Interface

  1. Navigate to the Space URL
  2. Upload an image in the Classification tab
  3. Adjust top-k results (default: 10)
  4. View ranked predictions with confidence scores

#### API Usage

**Standard Classification:**

```python
import requests

# Send an image file and the desired number of results.
with open("photo.jpg", "rb") as f:
    response = requests.post(
        "YOUR_SPACE_URL/api/classify_image",
        files={"image": f},
        data={"top_k": 5},
    )
results = response.json()
```

**Base64 Input:**

```python
import base64
import requests

with open("photo.jpg", "rb") as f:
    img_base64 = base64.b64encode(f.read()).decode()

response = requests.post(
    "YOUR_SPACE_URL/api/classify_base64",
    json={
        "image": img_base64,
        "top_k": 10
    }
)
results = response.json()
```
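
If you prefer Gradio's official Python client, the same call can be made through `gradio_client`. This is a hedged sketch: the endpoint name `/classify_image` and the argument order are assumptions based on the interface described above, so check the Space's "Use via API" page for the exact signature.

```python
from gradio_client import Client, handle_file

# Assumed endpoint name and argument order; verify against the Space's API docs.
client = Client("YOUR_SPACE_URL")      # or "username/space-name"
result = client.predict(
    handle_file("photo.jpg"),          # image input
    5,                                 # top_k
    api_name="/classify_image",
)
print(result)
```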

## 🔧 Admin Operations

### Label Management

Authenticated admins can perform the following operations:

#### Add Labels

```json
{
  "op": "upsert_labels",
  "token": "YOUR_ADMIN_TOKEN",
  "items": [
    {"id": 100, "name": "bicycle", "prompt": "a photo of a bicycle"},
    {"id": 101, "name": "airplane", "prompt": "a photo of an airplane"}
  ]
}
```

#### Reload Specific Version

```json
{
  "op": "reload_labels",
  "token": "YOUR_ADMIN_TOKEN",
  "version": 5
}
```

#### Remove Labels

```json
{
  "op": "remove_labels",
  "token": "YOUR_ADMIN_TOKEN",
  "ids": [100, 101]
}
```
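
These payloads can also be sent from a script. The snippet below is a hedged sketch that assumes a single admin route, written here as the placeholder `/api/admin`; check the Space's "Use via API" page for the route that `app.py` actually exposes.

```python
import requests

# "/api/admin" is an assumed placeholder for the admin endpoint.
payload = {
    "op": "upsert_labels",
    "token": "YOUR_ADMIN_TOKEN",
    "items": [
        {"id": 100, "name": "bicycle", "prompt": "a photo of a bicycle"},
    ],
}
response = requests.post("YOUR_SPACE_URL/api/admin", json=payload)
response.raise_for_status()
print(response.json())
```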

### Label Deduplication

- Automatic case-insensitive name deduplication
- Prevents duplicate entries (e.g., "cat", "Cat", and "CAT" are treated as the same label)
- ID-based deduplication for consistent label management
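
The rules above amount to keeping at most one entry per ID and one per lowercased name. A minimal illustrative sketch (not the actual `handler.py` logic):

```python
# Keep at most one entry per id and one per case-insensitive name.
def dedupe(items: list[dict]) -> list[dict]:
    seen_ids, seen_names, result = set(), set(), []
    for item in items:
        name_key = item["name"].strip().lower()   # "cat", "Cat", "CAT" collapse
        if item["id"] in seen_ids or name_key in seen_names:
            continue
        seen_ids.add(item["id"])
        seen_names.add(name_key)
        result.append(item)
    return result

print(dedupe([
    {"id": 1, "name": "cat"},
    {"id": 2, "name": "Cat"},   # dropped: duplicate name (case-insensitive)
    {"id": 1, "name": "dog"},   # dropped: duplicate id
]))
```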

## 📦 Hub Integration

When configured with `HF_LABEL_REPO` and tokens, the system automatically:

1. **Saves Snapshots**: Each label update creates versioned snapshots
   - `snapshots/v{N}/embeddings.safetensors`: Pre-computed text embeddings
   - `snapshots/v{N}/meta.json`: Label metadata and model info
   - `snapshots/latest.json`: Points to current version
2. **Loads on Startup**: Fetches latest snapshot or specified version
3. **Fallback**: Uses local `items.json` if Hub unavailable
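
The snapshot layout above can be consumed with the standard `huggingface_hub` and `safetensors` libraries. This is a hedged sketch: the repo id and token are placeholders, and the `"version"` key in `latest.json` is an assumption based on the description above.

```python
import json
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

repo_id = "user/labels"   # HF_LABEL_REPO
token = "hf_..."          # HF_READ_TOKEN (or the write token)

# Resolve the current version from the pointer file.
latest_path = hf_hub_download(repo_id, "snapshots/latest.json",
                              repo_type="dataset", token=token)
with open(latest_path) as f:
    version = json.load(f)["version"]   # assumed key name

# Fetch the versioned embeddings and metadata.
emb_path = hf_hub_download(repo_id, f"snapshots/v{version}/embeddings.safetensors",
                           repo_type="dataset", token=token)
meta_path = hf_hub_download(repo_id, f"snapshots/v{version}/meta.json",
                            repo_type="dataset", token=token)

embeddings = load_file(emb_path)        # dict of tensors
with open(meta_path) as f:
    meta = json.load(f)
```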

## 🎨 Default Label Catalog

The bundled `items.json` includes 50+ kid-friendly objects with:

- Unique IDs and display names
- CLIP-optimized prompts
- Category metadata
- Fun facts and rarity ratings

Categories include animals, toys, food, vehicles, nature, and everyday objects.
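
For orientation, a single catalog entry might look like the example below. The `category`, `fun_fact`, and `rarity` field names are guesses based on the description above, not the actual `items.json` schema.

```python
import json

# Hypothetical entry shape; only id/name/prompt appear elsewhere in this README.
example_item = {
    "id": 1,
    "name": "cat",
    "prompt": "a photo of a cat",
    "category": "animals",
    "fun_fact": "Cats spend most of the day asleep.",
    "rarity": "common",
}

# The bundled catalog is read when no Hub snapshot is available.
with open("items.json", encoding="utf-8") as f:
    catalog = json.load(f)
```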

⚑ Performance Optimization

  • GPU Acceleration: Automatic CUDA detection with float16 inference
  • CPU Fallback: Graceful degradation with float32 precision
  • Embedding Cache: Pre-computed text embeddings updated on label changes
  • Re-parameterization: MobileOne blocks optimized for inference speed
  • Batch Processing: Efficient matrix operations for multi-label scoring
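
The device and precision selection described in the first two bullets follows a common PyTorch pattern; the sketch below is illustrative rather than the actual `handler.py` code.

```python
import torch

# Pick the fastest available device and a matching precision.
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

def prepare(model: torch.nn.Module) -> torch.nn.Module:
    """Move a model to the detected device with the matching precision."""
    return model.to(device=device, dtype=dtype).eval()

print(f"Running on {device} with {dtype}")
```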

## 🔐 Security Considerations

- **Token Protection**: Admin operations require `ADMIN_TOKEN`
- **Private Datasets**: Keep label repos private for sensitive applications
- **Input Validation**: Automatic sanitization of uploaded images
- **Memory Management**: Images processed and discarded after inference
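
The admin-token check can be made resistant to timing attacks by comparing tokens in constant time. This is a hedged sketch, not necessarily how `app.py` implements the check.

```python
import hmac
import os

# Constant-time comparison avoids leaking token prefixes through timing.
# Illustrative only; app.py may implement this differently.
def is_admin(token: str) -> bool:
    expected = os.environ.get("ADMIN_TOKEN", "")
    return bool(expected) and hmac.compare_digest(token, expected)
```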

## 📄 License

- **Model Weights**: Apple Sample Code License (ASCL)
- **Interface Code**: MIT License

🀝 Contributing

Contributions welcome! Areas for improvement:

  • Additional label management features
  • Performance optimizations
  • Extended API capabilities
  • Multi-language support

## 📚 Resources