---
title: MedSigLIP Smart Filter
emoji: 🧠
colorFrom: indigo
colorTo: blue
sdk: gradio
sdk_version: 4.44.1
app_file: app.py
pinned: false
---

MedSigLIP Smart Medical Classifier

v2 Update:

  • Added CT, Ultrasound, and Musculoskeletal label banks
  • Introduced Smart Modality Router v2 with hybrid detection (filename + color + MedMNIST)
  • Enabled caching and batch inference to reduce CPU load by 70%
  • Improved response time for large label sets

Zero-shot image classification for medical imagery powered by google/medsiglip-448 with automatic label filtering by modality. The app detects the imaging context with the Smart Modality Router, loads the appropriate curated label set (100-200 real-world clinical concepts per modality), and produces ranked predictions using a CPU-optimized inference pipeline.

Features

  • Zero-shot predictions using the MedSigLIP vision-language model without fine-tuning.
  • Smart Modality Router v2 blends filename heuristics, simple color statistics, and a lightweight fallback classifier to choose the best label bank.
  • CT, Ultrasound, Musculoskeletal, chest X-ray, brain MRI, fundus, histopathology, skin, cardiovascular, and general label libraries curated from MedSigLIP prompts and clinical references.
  • CPU-optimized inference with single model load, float32 execution on CPU, capped torch threads via psutil, cached results, and batched label scoring.
  • Automatic image downscaling to 448×448 before scoring to keep memory usage predictable.
  • Gradio interface ready for local execution or deployment to Hugging Face Spaces (verified on Gradio 4.44.1+, API disabled by default to avoid schema bugs).
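
As a sketch of the downscaling step (the exact resize policy in app.py may differ), the target size could be computed by capping the longer edge at 448 while preserving aspect ratio:

```python
def target_size(width, height, cap=448):
    """Hypothetical helper: scale the longer edge down to `cap`,
    preserving aspect ratio, so memory use stays predictable before
    the processor produces its final 448x448 input."""
    scale = cap / max(width, height)
    if scale >= 1:  # image already small enough, leave it alone
        return width, height
    return round(width * scale), round(height * scale)
```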

Project Structure

medsiglip-smart-filter/
|-- app.py
|-- requirements.txt
|-- README.md
|-- labels/
|   |-- chest_labels.json
|   |-- brain_labels.json
|   |-- skin_labels.json
|   |-- pathology_labels.json
|   |-- cardio_labels.json
|   |-- eye_labels.json
|   |-- general_labels.json
|   |-- ct_labels.json
|   |-- ultrasound_labels.json
|   `-- musculoskeletal_labels.json
`-- utils/
    |-- modality_router.py
    `-- cache_manager.py

Prerequisites

  • Python 3.9 or newer (recommended).
  • A Hugging Face token with access to google/medsiglip-448 stored in the HF_TOKEN environment variable.
  • Around 18 GB of RAM for comfortable CPU inference with large label sets.

Local Quickstart

  1. Clone or copy the project folder.
  2. Create and activate a Python virtual environment (optional but recommended).
  3. Export your Hugging Face token so the MedSigLIP model can be downloaded:
    # Linux / macOS
    export HF_TOKEN="hf_your_token"
    
    # Windows PowerShell
    $Env:HF_TOKEN = "hf_your_token"
    
  4. Install dependencies:
    pip install -r requirements.txt
    
  5. Launch the Gradio app:
    python app.py
    
  6. Open the provided URL (default http://127.0.0.1:7860) and upload a medical image. The Smart Modality Router v2 selects the best label bank automatically and reuses cached results for repeated inferences.

Smart Modality Routing (v2.1 Update)

The router blends three complementary signals before selecting the modality:

  • Filename hints such as xray, ultrasound, ct, mri, and related synonyms.
  • Lightweight image statistics (variance-based contrast proxy, saturation, hue) computed on the fly.
  • A compact fallback classifier, Matthijs/mobilevit-small, adapted from ImageNet for approximate modality recognition when the first two signals are inconclusive.
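
The first two signals can be sketched as follows (illustrative names and thresholds; the real logic lives in utils/modality_router.py and likely covers many more synonyms):

```python
import re

# Hypothetical hint table mapping filename tokens to modality keys.
FILENAME_HINTS = {
    "xray": "xray", "cxr": "xray", "radiograph": "xray",
    "ct": "ct", "mri": "mri", "ultrasound": "ultrasound",
    "fundus": "eye", "derm": "skin",
}

def route_by_filename(filename):
    """Return a modality guess from filename tokens, or None."""
    tokens = re.split(r"[^a-z0-9]+", filename.lower())
    for hint, modality in FILENAME_HINTS.items():
        if hint in tokens:
            return modality
    return None

def looks_grayscale(mean_saturation, threshold=0.05):
    """Color-statistics signal: low saturation suggests radiology imagery."""
    return mean_saturation < threshold
```

Only when both signals come back empty or contradictory does the router hand off to the fallback classifier.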

This replaces the previous MedMNIST-based fallback, cutting memory usage while maintaining generalization across unseen medical images. The resulting modality key is mapped to the appropriate label file:

| Detected modality | Label file |
| --- | --- |
| xray | labels/chest_labels.json |
| mri | labels/brain_labels.json |
| ct | labels/ct_labels.json |
| ultrasound | labels/ultrasound_labels.json |
| musculoskeletal | labels/musculoskeletal_labels.json |
| pathology | labels/pathology_labels.json |
| skin | labels/skin_labels.json |
| eye | labels/eye_labels.json |
| cardio | labels/cardio_labels.json |
| (fallback) | labels/general_labels.json |

Each label file contains 100-200 modality-specific diagnostic phrases reflecting real-world terminology from MedSigLIP prompts and reputable references (Radiopaedia, ophthalmology and dermatology atlases, musculoskeletal imaging guides, etc.).
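
The mapping above could be expressed as a plain dictionary with the general bank as fallback (`label_file_for` and `load_labels` are illustrative names, not necessarily those used in app.py):

```python
import json
from pathlib import Path

# Mirrors the routing table; filenames follow the labels/ layout.
LABEL_FILES = {
    "xray": "chest_labels.json",
    "mri": "brain_labels.json",
    "ct": "ct_labels.json",
    "ultrasound": "ultrasound_labels.json",
    "musculoskeletal": "musculoskeletal_labels.json",
    "pathology": "pathology_labels.json",
    "skin": "skin_labels.json",
    "eye": "eye_labels.json",
    "cardio": "cardio_labels.json",
}

def label_file_for(modality):
    """Unknown modalities fall back to the general label bank."""
    return LABEL_FILES.get(modality, "general_labels.json")

def load_labels(modality, labels_dir="labels"):
    path = Path(labels_dir) / label_file_for(modality)
    return json.loads(path.read_text(encoding="utf-8"))
```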

Performance Considerations

  • Loads the MedSigLIP processor and model once at startup, keeps the model in eval() mode, and limits PyTorch threading with torch.set_num_threads(min(psutil.cpu_count(logical=False), 4)).
  • Leverages the cached_inference utility (LRU cache of five items) to reuse results for repeated requests without re-running the full forward pass.
  • Downscales incoming images to 448×448 before preprocessing and splits label scoring into batches of 50, applying softmax over the concatenated logits before returning the top five predictions.
  • Executes the transformer in float32 for deterministic CPU inference while still supporting GPU acceleration when available.
  • Avoids transformers.pipeline() to retain full control over preprocessing, batching, and device placement.
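
The batching and ranking steps can be sketched without the model itself (a minimal sketch, assuming per-label logits are collected across batches; the real pipeline uses torch tensors):

```python
import math

def batched(items, size=50):
    """Yield successive label batches of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def softmax(logits):
    """Numerically stable softmax over the concatenated per-label logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(labels, logits, k=5):
    """Rank labels by probability and keep the top-k predictions."""
    ranked = sorted(zip(labels, softmax(logits)),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:k]
```

The same shape applies whether the logits come from one forward pass or from fifty-label chunks scored separately, since softmax is applied only after concatenation.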

Deploy to Hugging Face Spaces

  1. Create a new Space (Gradio template) named medsiglip-smart-filter.
  2. Push the project files to the Space repository (via git or the web UI).
  3. In Settings -> Repository Secrets, add HF_TOKEN with your Hugging Face access token so the model and auxiliary router weights can be downloaded during build.
  4. The default python app.py launch honors SERVER_PORT, SERVER_NAME, GRADIO_SHARE, and GRADIO_QUEUE if set by the Space runner.
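
Reading those runner variables might look like this (a hypothetical helper; app.py may wire them into demo.launch() differently):

```python
import os

def launch_config():
    """Collect Space runner overrides with sensible local defaults."""
    return {
        "server_name": os.environ.get("SERVER_NAME", "0.0.0.0"),
        "server_port": int(os.environ.get("SERVER_PORT", "7860")),
        "share": os.environ.get("GRADIO_SHARE", "").lower() in ("1", "true"),
    }
```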

Model Reference Update

  • Removed: poloclub/medmnist-v2 (model no longer available on Hugging Face).
  • Added: Matthijs/mobilevit-small, a ~20 MB transformer that fits comfortably under 100 MB VRAM.
  • Purpose: Acts as a lightweight fallback that assists the filename and color heuristics without impacting CPU throughput.
  • Invocation: Only runs when the router cannot confidently decide based on metadata and statistics alone.
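
That gating could be decided as follows (illustrative decision logic; the actual confidence criteria in the router may be more nuanced):

```python
def choose_modality(filename_guess, stats_guess, fallback_fn):
    """Run the heavy fallback classifier only when the cheap signals
    are missing or disagree with each other."""
    if filename_guess and stats_guess in (None, filename_guess):
        return filename_guess
    if stats_guess and filename_guess is None:
        return stats_guess
    return fallback_fn()  # last resort: invoke the MobileViT fallback
```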

Notes

  • The label libraries are stored as UTF-8 JSON arrays for straightforward editing and community contributions.
  • When adding new modalities, drop a new <modality>_labels.json file into labels/ and extend the router alias logic in app.py if the modality name and file name differ.
  • scikit-image and timm are included in requirements.txt for future expansion (image preprocessing, alternative backbones) while keeping the current runtime CPU-friendly.
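
When the modality name and file name differ, the alias step mentioned above could be as small as a lookup table (names here are illustrative, not the actual identifiers in app.py):

```python
# Hypothetical alias table: maps alternate modality names to the
# canonical key used for the <modality>_labels.json lookup.
MODALITY_ALIASES = {
    "msk": "musculoskeletal",
    "sono": "ultrasound",
    "cxr": "xray",
}

def canonical_modality(name):
    """Canonical keys pass through unchanged; aliases are rewritten."""
    return MODALITY_ALIASES.get(name, name)
```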