---
title: MedSigLIP Smart Filter
emoji: "🧠"
colorFrom: indigo
colorTo: blue
sdk: gradio
sdk_version: "4.44.1"
app_file: app.py
pinned: false
---
# MedSigLIP Smart Medical Classifier
## v2 Update
- Added CT, Ultrasound, and Musculoskeletal label banks
- Introduced Smart Modality Router v2 with hybrid detection (filename + color + MedMNIST)
- Enabled caching and batch inference to reduce CPU load by 70%
- Improved response time for large label sets
Zero-shot image classification for medical imagery powered by **google/medsiglip-448** with automatic label filtering by modality. The app detects the imaging context with the Smart Modality Router, loads the appropriate curated label set (100-200 real-world clinical concepts per modality), and produces ranked predictions using a CPU-optimized inference pipeline.
## Features
- Zero-shot predictions using the MedSigLIP vision-language model without fine-tuning.
- Smart Modality Router v2 blends filename heuristics, simple color statistics, and a lightweight fallback classifier to choose the best label bank.
- CT, Ultrasound, Musculoskeletal, chest X-ray, brain MRI, fundus, histopathology, skin, cardiovascular, and general label libraries curated from MedSigLIP prompts and clinical references.
- CPU-optimized inference with single model load, float32 execution on CPU, capped torch threads via `psutil`, cached results, and batched label scoring.
- Automatic image downscaling to 448×448 before scoring to keep memory usage predictable.
- Gradio interface ready for local execution or deployment to Hugging Face Spaces (verified on Gradio 4.44.1+, API disabled by default to avoid schema bugs).
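The downscaling step above can be sketched as follows. This is an illustrative helper, not the app's actual code; the function name `prepare_image` is an assumption, but the 448×448 target matches the resolution MedSigLIP expects.

```python
from PIL import Image

TARGET_SIZE = (448, 448)  # resolution expected by google/medsiglip-448

def prepare_image(image: Image.Image) -> Image.Image:
    """Normalize to RGB and downscale to 448x448 so memory use stays
    predictable regardless of the uploaded image's original size."""
    if image.mode != "RGB":
        image = image.convert("RGB")
    if image.size != TARGET_SIZE:
        image = image.resize(TARGET_SIZE, Image.BILINEAR)
    return image
```

Resizing before the processor runs keeps peak memory bounded even for large DICOM exports saved as PNG.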
## Project Structure
```
medsiglip-smart-filter/
|-- app.py
|-- requirements.txt
|-- README.md
|-- labels/
| |-- chest_labels.json
| |-- brain_labels.json
| |-- skin_labels.json
| |-- pathology_labels.json
| |-- cardio_labels.json
| |-- eye_labels.json
| |-- general_labels.json
| |-- ct_labels.json
| |-- ultrasound_labels.json
| `-- musculoskeletal_labels.json
`-- utils/
|-- modality_router.py
`-- cache_manager.py
```
## Prerequisites
- Python 3.9 or newer (recommended).
- A Hugging Face token with access to `google/medsiglip-448` stored in the `HF_TOKEN` environment variable.
- Around 18 GB of RAM for comfortable CPU inference with large label sets.
## Local Quickstart
1. **Clone or copy** the project folder.
2. **Create and activate** a Python virtual environment (optional but recommended).
3. **Export your Hugging Face token** so the MedSigLIP model can be downloaded:
```bash
# Linux / macOS
export HF_TOKEN="hf_your_token"
# Windows PowerShell
$Env:HF_TOKEN = "hf_your_token"
```
4. **Install dependencies**:
```bash
pip install -r requirements.txt
```
5. **Launch the Gradio app**:
```bash
python app.py
```
6. Open the provided URL (default `http://127.0.0.1:7860`) and upload a medical image. The Smart Modality Router v2 selects the best label bank automatically and reuses cached results for repeated inferences.
## Smart Modality Routing (v2.1 Update)
The router blends three complementary signals before selecting the modality:
- Filename hints such as `xray`, `ultrasound`, `ct`, `mri`, and related synonyms.
- Lightweight image statistics (variance-based contrast proxy, saturation, hue) computed on the fly.
- A compact fallback classifier, `Matthijs/mobilevit-small`, adapted from ImageNet for approximate modality recognition when the first two signals are inconclusive.
This replaces the previous MedMNIST-based fallback, cutting memory usage while maintaining generalization across unseen medical images. The resulting modality key is mapped to the appropriate label file:
| Detected modality | Label file |
| --- | --- |
| `xray` | `labels/chest_labels.json` |
| `mri` | `labels/brain_labels.json` |
| `ct` | `labels/ct_labels.json` |
| `ultrasound` | `labels/ultrasound_labels.json` |
| `musculoskeletal` | `labels/musculoskeletal_labels.json` |
| `pathology` | `labels/pathology_labels.json` |
| `skin` | `labels/skin_labels.json` |
| `eye` | `labels/eye_labels.json` |
| `cardio` | `labels/cardio_labels.json` |
| *(fallback)* | `labels/general_labels.json` |
Each label file contains 100-200 modality-specific diagnostic phrases reflecting real-world terminology from MedSigLIP prompts and reputable references (Radiopaedia, ophthalmology and dermatology atlases, musculoskeletal imaging guides, etc.).
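A minimal sketch of the filename-hint signal and the modality-to-label-file mapping described above. The hint keywords and function names here are illustrative assumptions; the real logic lives in `utils/modality_router.py` and covers more modalities and synonyms.

```python
import os

# Illustrative subset of filename hints; the actual router covers all
# ten modalities and more synonyms.
FILENAME_HINTS = {
    "xray": ("xray", "x-ray", "cxr", "chest"),
    "mri": ("mri", "brain"),
    "ct": ("ct",),
    "ultrasound": ("ultrasound", "sono"),
}

LABEL_FILES = {
    "xray": "labels/chest_labels.json",
    "mri": "labels/brain_labels.json",
    "ct": "labels/ct_labels.json",
    "ultrasound": "labels/ultrasound_labels.json",
}

def route_by_filename(path: str) -> str:
    """Return a modality key from filename hints, or 'general' if none match."""
    name = os.path.basename(path).lower()
    for modality, hints in FILENAME_HINTS.items():
        if any(hint in name for hint in hints):
            return modality
    return "general"

def label_file_for(modality: str) -> str:
    """Map a detected modality to its label bank, defaulting to general."""
    return LABEL_FILES.get(modality, "labels/general_labels.json")
```

In the full router this signal is only one vote; it is combined with the color statistics and, when needed, the fallback classifier before a label file is chosen.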
## Performance Considerations
- Loads the MedSigLIP processor and model once at startup, keeps the model in `eval()` mode, and limits PyTorch threading with `torch.set_num_threads(min(psutil.cpu_count(logical=False), 4))`.
- Leverages the `cached_inference` utility (LRU cache of five items) to reuse results for repeated requests without re-running the full forward pass.
- Downscales incoming images to 448×448 before they reach the processor and splits label scoring into batches of 50, applying softmax over the concatenated logits before returning the top five predictions.
- Executes the transformer in float32 for deterministic CPU inference while still supporting GPU acceleration when available.
- Avoids `transformers.pipeline()` to retain full control over preprocessing, batching, and device placement.
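The batch-then-softmax pattern from the list above can be sketched like this. Here `score_fn` is a stand-in for the MedSigLIP forward pass over a batch of label prompts (the real app also caps threads with `torch.set_num_threads(min(psutil.cpu_count(logical=False), 4))` at startup); the function name and signature are assumptions for illustration.

```python
import torch

def batched_scores(score_fn, labels, batch_size=50, top_k=5):
    """Score labels in batches, softmax over the concatenated logits,
    and return the top-k (label, probability) pairs.

    score_fn(batch) stands in for the MedSigLIP forward pass and must
    return a 1-D tensor of logits, one per label prompt in the batch.
    """
    chunks = [
        score_fn(labels[i:i + batch_size])
        for i in range(0, len(labels), batch_size)
    ]
    logits = torch.cat(chunks)                      # one logit per label
    probs = torch.softmax(logits, dim=0)            # normalize across ALL labels
    top = torch.topk(probs, k=min(top_k, len(labels)))
    return [(labels[i], probs[i].item()) for i in top.indices.tolist()]
```

Applying softmax over the concatenated logits, rather than per batch, is what keeps the probabilities comparable across a 100–200 item label bank.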
## Deploy to Hugging Face Spaces
1. Create a new Space (Gradio template) named `medsiglip-smart-filter`.
2. Push the project files to the Space repository (via `git` or the web UI).
3. In **Settings -> Repository Secrets**, add `HF_TOKEN` with your Hugging Face access token so the model and auxiliary router weights can be downloaded during build.
4. The default `python app.py` launch honors `SERVER_PORT`, `SERVER_NAME`, `GRADIO_SHARE`, and `GRADIO_QUEUE` if set by the Space runner.
## Model Reference Update
- Removed: `poloclub/medmnist-v2` (model no longer available on Hugging Face).
- Added: `Matthijs/mobilevit-small`, a ~20 MB transformer that fits comfortably under 100 MB VRAM.
- Purpose: Acts as a lightweight fallback that assists the filename and color heuristics without impacting CPU throughput.
- Invocation: Only runs when the router cannot confidently decide based on metadata and statistics alone.
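The confidence-gated invocation described above can be sketched as a simple decision cascade. All names here are illustrative; the actual gating lives in `utils/modality_router.py` and `fallback_classifier` stands in for a call into `Matthijs/mobilevit-small`.

```python
def route(filename_signal, color_signal, fallback_classifier):
    """Confidence-gated routing: run the fallback model only when the
    cheap signals disagree or abstain (signals are modality strings or
    None when a signal abstains)."""
    if filename_signal and filename_signal == color_signal:
        return filename_signal          # both cheap signals agree
    if filename_signal and color_signal is None:
        return filename_signal          # single confident signal
    if color_signal and filename_signal is None:
        return color_signal
    return fallback_classifier()        # inconclusive: pay for the model
```

Because most uploads carry a usable filename or distinctive color statistics, the fallback transformer stays cold on the common path, which is what keeps CPU throughput unaffected.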
## Notes
- The label libraries are stored as UTF-8 JSON arrays for straightforward editing and community contributions.
- When adding new modalities, drop a new `<modality>_labels.json` file into `labels/` and extend the router alias logic in `app.py` if the modality name and file name differ.
- `scikit-image` and `timm` are included in `requirements.txt` for future expansion (image preprocessing, alternative backbones) while keeping the current runtime CPU-friendly.
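Loading a label bank from the JSON layout described above might look like the sketch below. The loader name and fallback behavior are assumptions for illustration; only the `<modality>_labels.json` naming convention and the UTF-8 JSON array format come from this README.

```python
import json
from pathlib import Path

def load_labels(modality: str, labels_dir: str = "labels") -> list:
    """Load the UTF-8 JSON array for a modality, e.g. labels/ct_labels.json,
    falling back to the general bank when no modality file exists."""
    path = Path(labels_dir) / f"{modality}_labels.json"
    if not path.exists():
        path = Path(labels_dir) / "general_labels.json"
    return json.loads(path.read_text(encoding="utf-8"))
```

A new modality then only needs its JSON file dropped into `labels/` (plus a router alias in `app.py` if the modality key and file name differ).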