---
title: MedSigLIP Smart Filter
emoji: 🧠
colorFrom: indigo
colorTo: blue
sdk: gradio
sdk_version: 4.44.1
app_file: app.py
pinned: false
---
# MedSigLIP Smart Medical Classifier

## v2 Update
- Added CT, Ultrasound, and Musculoskeletal label banks
- Introduced Smart Modality Router v2 with hybrid detection (filename + color + MedMNIST)
- Enabled caching and batch inference to reduce CPU load by 70%
- Improved response time for large label sets
Zero-shot image classification for medical imagery powered by google/medsiglip-448 with automatic label filtering by modality. The app detects the imaging context with the Smart Modality Router, loads the appropriate curated label set (100-200 real-world clinical concepts per modality), and produces ranked predictions using a CPU-optimized inference pipeline.
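As an illustration of the core scoring step only, the snippet below is a minimal, standalone sketch of zero-shot classification with `google/medsiglip-448` through the standard `transformers` `AutoProcessor`/`AutoModel` interface. It bypasses the app's routing, caching, and batching; the two labels and the `example.png` path are placeholders, not part of the project.

```python
import os

import torch
from PIL import Image
from transformers import AutoModel, AutoProcessor

MODEL_ID = "google/medsiglip-448"
token = os.environ.get("HF_TOKEN")  # token must grant access to the gated checkpoint

processor = AutoProcessor.from_pretrained(MODEL_ID, token=token)
model = AutoModel.from_pretrained(MODEL_ID, token=token).eval()

# Placeholder label bank; the app loads 100-200 phrases from labels/*.json instead.
labels = ["a chest x-ray showing pleural effusion", "a normal chest x-ray"]

image = Image.open("example.png").convert("RGB")  # placeholder input image
inputs = processor(text=labels, images=image, padding="max_length", return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits_per_image  # shape: (1, num_labels)

probs = logits.softmax(dim=-1)[0]
for label, p in sorted(zip(labels, probs.tolist()), key=lambda x: -x[1]):
    print(f"{p:.3f}  {label}")
```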
## Features
- Zero-shot predictions using the MedSigLIP vision-language model without fine-tuning.
- Smart Modality Router v2 blends filename heuristics, simple color statistics, and a lightweight fallback classifier to choose the best label bank.
- CT, Ultrasound, Musculoskeletal, chest X-ray, brain MRI, fundus, histopathology, skin, cardiovascular, and general label libraries curated from MedSigLIP prompts and clinical references.
- CPU-optimized inference with single model load, float32 execution on CPU, capped torch threads via `psutil`, cached results, and batched label scoring.
- Automatic image downscaling to 448×448 before scoring to keep memory usage predictable.
- Gradio interface ready for local execution or deployment to Hugging Face Spaces (verified on Gradio 4.44.1+, API disabled by default to avoid schema bugs).
## Project Structure

```
medsiglip-smart-filter/
|-- app.py
|-- requirements.txt
|-- README.md
|-- labels/
|   |-- chest_labels.json
|   |-- brain_labels.json
|   |-- skin_labels.json
|   |-- pathology_labels.json
|   |-- cardio_labels.json
|   |-- eye_labels.json
|   |-- general_labels.json
|   |-- ct_labels.json
|   |-- ultrasound_labels.json
|   `-- musculoskeletal_labels.json
`-- utils/
    |-- modality_router.py
    `-- cache_manager.py
```
## Prerequisites
- Python 3.9 or newer (recommended).
- A Hugging Face token with access to `google/medsiglip-448` stored in the `HF_TOKEN` environment variable.
- Around 18 GB of RAM for comfortable CPU inference with large label sets.
## Local Quickstart
- Clone or copy the project folder.
- Create and activate a Python virtual environment (optional but recommended).
- Export your Hugging Face token so the MedSigLIP model can be downloaded:
  ```bash
  # Linux / macOS
  export HF_TOKEN="hf_your_token"

  # Windows PowerShell
  $Env:HF_TOKEN = "hf_your_token"
  ```
- Install dependencies:
  ```bash
  pip install -r requirements.txt
  ```
- Launch the Gradio app:
  ```bash
  python app.py
  ```
- Open the provided URL (default `http://127.0.0.1:7860`) and upload a medical image. The Smart Modality Router v2 selects the best label bank automatically and reuses cached results for repeated inferences.
## Smart Modality Routing (v2.1 Update)
The router blends three complementary signals before selecting the modality:
- Filename hints such as
xray,ultrasound,ct,mri, and related synonyms. - Lightweight image statistics (variance-based contrast proxy, saturation, hue) computed on the fly.
- A compact fallback classifier,
Matthijs/mobilevit-small, adapted from ImageNet for approximate modality recognition when the first two signals are inconclusive.
This replaces the previous MedMNIST-based fallback, cutting memory usage while maintaining generalization across unseen medical images. The resulting modality key is mapped to the appropriate label file:
| Detected modality | Label file |
|---|---|
| `xray` | `labels/chest_labels.json` |
| `mri` | `labels/brain_labels.json` |
| `ct` | `labels/ct_labels.json` |
| `ultrasound` | `labels/ultrasound_labels.json` |
| `musculoskeletal` | `labels/musculoskeletal_labels.json` |
| `pathology` | `labels/pathology_labels.json` |
| `skin` | `labels/skin_labels.json` |
| `eye` | `labels/eye_labels.json` |
| `cardio` | `labels/cardio_labels.json` |
| (fallback) | `labels/general_labels.json` |
Each label file contains 100-200 modality-specific diagnostic phrases reflecting real-world terminology from MedSigLIP prompts and reputable references (Radiopaedia, ophthalmology and dermatology atlases, musculoskeletal imaging guides, etc.).
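As a rough illustration of how the three routing signals described above might combine, here is a short sketch. The function names, keyword table, and thresholds are assumptions for readability and do not mirror the actual code in `utils/modality_router.py`.

```python
# Illustrative routing sketch; names and thresholds are assumptions,
# not the actual implementation in utils/modality_router.py.
import numpy as np
from PIL import Image

FILENAME_HINTS = {
    "xray": "xray", "cxr": "xray", "ct": "ct", "mri": "mri",
    "ultrasound": "ultrasound", "fundus": "eye", "derm": "skin",
}

def fallback_modality(image: Image.Image) -> str:
    # Stub; see the Model Reference Update section for a loading sketch.
    return "general"

def route_modality(filename: str, image: Image.Image) -> str:
    name = filename.lower()

    # 1) Filename hints win when present.
    for hint, modality in FILENAME_HINTS.items():
        if hint in name:
            return modality

    # 2) Simple color statistics: near-grayscale, high-contrast images are
    #    treated as radiographs; saturated images lean toward skin/fundus/pathology.
    saturation = np.asarray(image.convert("HSV"), dtype=np.float32)[..., 1].mean() / 255.0
    contrast = np.asarray(image.convert("L"), dtype=np.float32).var()
    if saturation < 0.05 and contrast > 500:
        return "xray"

    # 3) Otherwise defer to the compact fallback classifier; "general" if it is unsure.
    return fallback_modality(image)
```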
## Performance Considerations

- Loads the MedSigLIP processor and model once at startup, keeps the model in `eval()` mode, and limits PyTorch threading with `torch.set_num_threads(min(psutil.cpu_count(logical=False), 4))`.
- Leverages the `cached_inference` utility (LRU cache of five items) to reuse results for repeated requests without re-running the full forward pass.
- Downscales incoming images to 448×448 prior to tokenization and splits label scoring into batches of 50, applying softmax over the concatenated logits before returning the top five predictions (sketched after this list).
- Executes the transformer in float32 for deterministic CPU inference while still supporting GPU acceleration when available.
- Avoids `transformers.pipeline()` to retain full control over preprocessing, batching, and device placement.
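The sketch below shows one way the thread cap, 448×448 downscaling, batching, and softmax-over-concatenated-logits steps could fit together. `score_labels` and `BATCH_SIZE` are illustrative names rather than the exact code in `app.py`, and the real app additionally wraps this call in the `cached_inference` LRU cache.

```python
# Illustrative sketch of the CPU-side scoring loop; function and constant
# names are assumptions, not the exact code in app.py.
import psutil
import torch
from PIL import Image

# Cap PyTorch to at most four physical cores.
torch.set_num_threads(min(psutil.cpu_count(logical=False) or 1, 4))

BATCH_SIZE = 50  # labels scored per forward pass

def score_labels(model, processor, image: Image.Image, labels: list[str]):
    image = image.convert("RGB").resize((448, 448))  # keep memory predictable
    logits = []
    with torch.no_grad():
        for start in range(0, len(labels), BATCH_SIZE):
            batch = labels[start:start + BATCH_SIZE]
            inputs = processor(text=batch, images=image,
                               padding="max_length", return_tensors="pt")
            logits.append(model(**inputs).logits_per_image[0])
    # Softmax over the concatenated logits, then keep the top five predictions.
    probs = torch.cat(logits).softmax(dim=-1)
    top = torch.topk(probs, k=min(5, len(labels)))
    return [(labels[i], probs[i].item()) for i in top.indices.tolist()]
```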
## Deploy to Hugging Face Spaces

- Create a new Space (Gradio template) named `medsiglip-smart-filter`.
- Push the project files to the Space repository (via `git` or the web UI).
- In Settings -> Repository Secrets, add `HF_TOKEN` with your Hugging Face access token so the model and auxiliary router weights can be downloaded during build.
- The default `python app.py` launch honors `SERVER_PORT`, `SERVER_NAME`, `GRADIO_SHARE`, and `GRADIO_QUEUE` if set by the Space runner (see the sketch after this list).
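A hedged sketch of how the launch call might read those environment variables is shown below; the exact handling in `app.py` may differ, and the UI body is elided.

```python
# Possible shape of the launch block in app.py; exact handling may differ.
import os

import gradio as gr

def build_demo() -> gr.Blocks:
    with gr.Blocks(title="MedSigLIP Smart Filter") as demo:
        gr.Markdown("Upload a medical image for ranked zero-shot predictions.")
        # ... image input, prediction output, and classify callback elided ...
    return demo

if __name__ == "__main__":
    demo = build_demo()
    if os.getenv("GRADIO_QUEUE", "true").lower() == "true":
        demo.queue()
    demo.launch(
        server_name=os.getenv("SERVER_NAME", "0.0.0.0"),
        server_port=int(os.getenv("SERVER_PORT", "7860")),
        share=os.getenv("GRADIO_SHARE", "false").lower() == "true",
        show_api=False,  # API disabled by default to avoid schema bugs
    )
```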
## Model Reference Update

- Removed: `poloclub/medmnist-v2` (model no longer available on Hugging Face).
- Added: `Matthijs/mobilevit-small`, a ~20 MB transformer that fits comfortably under 100 MB VRAM.
- Purpose: Acts as a lightweight fallback that assists the filename and color heuristics without impacting CPU throughput (see the sketch after this list).
- Invocation: Only runs when the router cannot confidently decide based on metadata and statistics alone.
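For reference, here is a lazy-loading sketch of that fallback path. The `IMAGENET_TO_MODALITY` mapping and the `fallback_modality` name are placeholders, not the router's actual logic.

```python
# Lazy fallback sketch; the mapping and helper name are placeholders,
# not the router's actual logic.
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

FALLBACK_ID = "Matthijs/mobilevit-small"
IMAGENET_TO_MODALITY: dict[str, str] = {}  # populate with coarse keyword mappings
_fallback = None  # loaded on first use only

def fallback_modality(image: Image.Image) -> str:
    global _fallback
    if _fallback is None:
        _fallback = (
            AutoImageProcessor.from_pretrained(FALLBACK_ID),
            AutoModelForImageClassification.from_pretrained(FALLBACK_ID).eval(),
        )
    processor, model = _fallback
    inputs = processor(images=image.convert("RGB"), return_tensors="pt")
    with torch.no_grad():
        label_id = model(**inputs).logits.argmax(-1).item()
    label = model.config.id2label[label_id].lower()
    return IMAGENET_TO_MODALITY.get(label, "general")
```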
## Notes

- The label libraries are stored as UTF-8 JSON arrays for straightforward editing and community contributions.
- When adding new modalities, drop a new `<modality>_labels.json` file into `labels/` and extend the router alias logic in `app.py` if the modality name and file name differ (a sketch follows below).
- `scikit-image` and `timm` are included in `requirements.txt` for future expansion (image preprocessing, alternative backbones) while keeping the current runtime CPU-friendly.
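To make the extension step concrete, here is a small hypothetical example that registers a "dental" label bank. The file name follows the existing convention, but the example phrases and the alias-dict name are illustrative rather than variables actually used in `app.py`.

```python
# Hypothetical "dental" modality; the alias-dict name below is illustrative.
import json
from pathlib import Path

dental_labels = [
    "a dental panoramic radiograph with periapical lucency",
    "a dental bitewing radiograph showing interproximal caries",
    "a normal dental panoramic radiograph",
]

# Write the new label bank as a UTF-8 JSON array, matching the existing files.
Path("labels/dental_labels.json").write_text(
    json.dumps(dental_labels, ensure_ascii=False, indent=2), encoding="utf-8"
)

# In app.py, point the router's output name at the new file if they differ:
MODALITY_TO_LABEL_FILE = {"dental": "labels/dental_labels.json"}  # illustrative
```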