---
title: MedSigLIP Smart Filter
emoji: "🧠"
colorFrom: indigo
colorTo: blue
sdk: gradio
sdk_version: "4.44.1"
app_file: app.py
pinned: false
---
|
|
|
|
|
# MedSigLIP Smart Medical Classifier
|
|
|
|
|
v2 Update:

- Added CT, Ultrasound, and Musculoskeletal label banks
- Introduced Smart Modality Router v2 with hybrid detection (filename + color + MedMNIST)
- Enabled caching and batch inference to reduce CPU load by 70%
- Improved response time for large label sets
|
|
|
|
|
Zero-shot image classification for medical imagery powered by **google/medsiglip-448** with automatic label filtering by modality. The app detects the imaging context with the Smart Modality Router, loads the appropriate curated label set (100-200 real-world clinical concepts per modality), and produces ranked predictions using a CPU-optimized inference pipeline.
|
|
|
|
|
|
|
|
## Features

- Zero-shot predictions using the MedSigLIP vision-language model without fine-tuning.
- Smart Modality Router v2 blends filename heuristics, simple color statistics, and a lightweight fallback classifier to choose the best label bank.
- CT, Ultrasound, Musculoskeletal, chest X-ray, brain MRI, fundus, histopathology, skin, cardiovascular, and general label libraries curated from MedSigLIP prompts and clinical references.
- CPU-optimized inference with a single model load, float32 execution on CPU, torch threads capped via `psutil`, cached results (see the caching sketch below), and batched label scoring.
- Automatic image downscaling to 448×448 before scoring to keep memory usage predictable.
- Gradio interface ready for local execution or deployment to Hugging Face Spaces (verified on Gradio 4.44.1+; the API is disabled by default to avoid schema bugs).
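A minimal sketch of how result caching keyed on image content could look. `InferenceCache` and its method names are illustrative assumptions, not the actual API of `utils/cache_manager.py`, which may differ in naming and eviction details:

```python
# Illustrative LRU cache keyed on image bytes and modality; the shipped
# utils/cache_manager.py may differ in naming and eviction details.
import hashlib
from collections import OrderedDict


class InferenceCache:
    """Keep the five most recent prediction lists, evicting the oldest."""

    def __init__(self, maxsize: int = 5):
        self.maxsize = maxsize
        self._store: OrderedDict[str, list] = OrderedDict()

    @staticmethod
    def make_key(image_bytes: bytes, modality: str) -> str:
        return f"{modality}:{hashlib.sha256(image_bytes).hexdigest()}"

    def get(self, key: str):
        if key not in self._store:
            return None
        self._store.move_to_end(key)  # mark as most recently used
        return self._store[key]

    def put(self, key: str, predictions: list) -> None:
        self._store[key] = predictions
        self._store.move_to_end(key)
        if len(self._store) > self.maxsize:
            self._store.popitem(last=False)  # drop the least recently used
```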
|
|
|
|
|
|
|
|
## Project Structure

```
medsiglip-smart-filter/
|-- app.py
|-- requirements.txt
|-- README.md
|-- labels/
|   |-- chest_labels.json
|   |-- brain_labels.json
|   |-- skin_labels.json
|   |-- pathology_labels.json
|   |-- cardio_labels.json
|   |-- eye_labels.json
|   |-- general_labels.json
|   |-- ct_labels.json
|   |-- ultrasound_labels.json
|   `-- musculoskeletal_labels.json
`-- utils/
    |-- modality_router.py
    `-- cache_manager.py
```
|
|
|
|
|
|
|
|
## Prerequisites

- Python 3.9 or newer (recommended).
- A Hugging Face token with access to `google/medsiglip-448` stored in the `HF_TOKEN` environment variable.
- Around 18 GB of RAM for comfortable CPU inference with large label sets.
|
|
|
|
|
|
|
|
## Local Quickstart

1. **Clone or copy** the project folder.
2. **Create and activate** a Python virtual environment (optional but recommended).
3. **Export your Hugging Face token** so the MedSigLIP model can be downloaded (an optional access check is sketched after this list):

   ```bash
   # Linux / macOS
   export HF_TOKEN="hf_your_token"

   # Windows PowerShell
   $Env:HF_TOKEN = "hf_your_token"
   ```

4. **Install dependencies**:

   ```bash
   pip install -r requirements.txt
   ```

5. **Launch the Gradio app**:

   ```bash
   python app.py
   ```

6. Open the provided URL (default `http://127.0.0.1:7860`) and upload a medical image. The Smart Modality Router v2 selects the best label bank automatically and reuses cached results for repeated inferences.
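Optionally, you can confirm that the token grants access to the gated model before launching. This is a convenience check assumed here, not part of `app.py`:

```python
# Optional access check (not part of app.py): model_info raises an error
# if HF_TOKEN is missing or lacks access to the gated repository.
import os

from huggingface_hub import model_info

model_info("google/medsiglip-448", token=os.environ["HF_TOKEN"])
print("Token can access google/medsiglip-448")
```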
|
|
|
|
|
|
|
|
## Smart Modality Routing (v2.1 Update)

The router blends three complementary signals before selecting the modality (a condensed sketch follows this list):

- Filename hints such as `xray`, `ultrasound`, `ct`, `mri`, and related synonyms.
- Lightweight image statistics (a variance-based contrast proxy, saturation, and hue) computed on the fly.
- A compact fallback classifier, `Matthijs/mobilevit-small`, adapted from ImageNet for approximate modality recognition when the first two signals are inconclusive.
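The sketch below condenses the three-signal flow. All names (`route_modality`, `FILENAME_HINTS`, `fallback_classifier`) and thresholds are illustrative assumptions; the real logic lives in `utils/modality_router.py` and may differ:

```python
# Condensed routing sketch; names and thresholds are illustrative and the
# real implementation in utils/modality_router.py may differ.
from pathlib import Path

import numpy as np
from PIL import Image

FILENAME_HINTS = {
    "xray": "xray", "cxr": "xray", "ct": "ct", "mri": "mri",
    "ultrasound": "ultrasound", "us_": "ultrasound",
}


def fallback_classifier(path: str) -> str:
    """Placeholder for the MobileViT fallback (see Model Reference Update)."""
    return "general"


def route_modality(path: str) -> str:
    stem = Path(path).stem.lower()
    for hint, modality in FILENAME_HINTS.items():  # signal 1: filename
        if hint in stem:
            return modality
    rgb = np.asarray(Image.open(path).convert("RGB"), dtype=np.float32)
    saturation = (rgb.max(axis=-1) - rgb.min(axis=-1)).mean()
    contrast = rgb.mean(axis=-1).var()             # variance-based contrast proxy
    if saturation < 5.0 and contrast > 1000.0:     # signal 2: gray, high contrast
        return "xray"                              # crude grayscale default
    return fallback_classifier(path)               # signal 3: MobileViT fallback
```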
|
|
|
|
|
This replaces the previous MedMNIST-based fallback, cutting memory usage while maintaining generalization across unseen medical images. The resulting modality key is mapped to the appropriate label file:
|
|
|
|
|
| Detected modality | Label file |
| --- | --- |
| `xray` | `labels/chest_labels.json` |
| `mri` | `labels/brain_labels.json` |
| `ct` | `labels/ct_labels.json` |
| `ultrasound` | `labels/ultrasound_labels.json` |
| `musculoskeletal` | `labels/musculoskeletal_labels.json` |
| `pathology` | `labels/pathology_labels.json` |
| `skin` | `labels/skin_labels.json` |
| `eye` | `labels/eye_labels.json` |
| `cardio` | `labels/cardio_labels.json` |
| *(fallback)* | `labels/general_labels.json` |
|
|
|
|
|
Each label file contains 100-200 modality-specific diagnostic phrases reflecting real-world terminology from MedSigLIP prompts and reputable references (Radiopaedia, ophthalmology and dermatology atlases, musculoskeletal imaging guides, etc.).
|
|
|
|
|
|
|
|
## Performance Considerations

- Loads the MedSigLIP processor and model once at startup, keeps the model in `eval()` mode, and limits PyTorch threading with `torch.set_num_threads(min(psutil.cpu_count(logical=False), 4))`.
- Leverages the `cached_inference` utility (an LRU cache of five items) to reuse results for repeated requests without re-running the full forward pass.
- Downscales incoming images to 448×448 before preprocessing and splits label scoring into batches of 50, applying softmax over the concatenated logits before returning the top five predictions.
- Executes the transformer in float32 for deterministic CPU inference while still supporting GPU acceleration when available.
- Avoids `transformers.pipeline()` to retain full control over preprocessing, batching, and device placement. The sketch below ties these pieces together.
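A minimal sketch of the batched scoring path described above, assuming the standard `transformers` Auto classes; `score_labels` and `BATCH_SIZE` are illustrative names, not the actual `app.py` wiring:

```python
# Minimal sketch of batched zero-shot scoring; the model id, 448x448 target,
# batch size of 50, and thread cap come from this README. Helper names are
# illustrative, not the app's actual API.
import os

import psutil
import torch
from PIL import Image
from transformers import AutoModel, AutoProcessor

MODEL_ID = "google/medsiglip-448"
BATCH_SIZE = 50  # label batch size used by the app

torch.set_num_threads(min(psutil.cpu_count(logical=False), 4))

processor = AutoProcessor.from_pretrained(MODEL_ID, token=os.getenv("HF_TOKEN"))
model = AutoModel.from_pretrained(MODEL_ID, token=os.getenv("HF_TOKEN"))
model.eval()  # single load at startup, inference mode only


def score_labels(image: Image.Image, labels: list[str], top_k: int = 5):
    image = image.convert("RGB").resize((448, 448))  # keep memory predictable
    logits = []
    with torch.no_grad():
        for start in range(0, len(labels), BATCH_SIZE):
            batch = labels[start:start + BATCH_SIZE]
            inputs = processor(text=batch, images=image,
                               padding="max_length", return_tensors="pt")
            out = model(**inputs)
            logits.append(out.logits_per_image[0])  # shape: (len(batch),)
    probs = torch.cat(logits).softmax(dim=0)  # softmax over all labels at once
    top = probs.topk(min(top_k, len(labels)))
    return [(labels[i], float(p)) for p, i in zip(top.values, top.indices)]
```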
|
|
|
|
|
|
|
|
## Deploy to Hugging Face Spaces

1. Create a new Space (Gradio template) named `medsiglip-smart-filter`.
2. Push the project files to the Space repository (via `git` or the web UI).
3. In **Settings -> Repository Secrets**, add `HF_TOKEN` with your Hugging Face access token so the model and auxiliary router weights can be downloaded during build.
4. The default `python app.py` launch honors `SERVER_PORT`, `SERVER_NAME`, `GRADIO_SHARE`, and `GRADIO_QUEUE` if the Space runner sets them (see the sketch after this list).
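A hypothetical tail of `app.py` showing how the launch could honor these variables; `demo` here is a stand-in interface, and the `GRADIO_QUEUE` handling is an assumption about how that flag is interpreted:

```python
# Hypothetical launch snippet; `demo` is a stand-in interface and the
# GRADIO_QUEUE interpretation is an assumption, not the app's exact code.
import os

import gradio as gr

demo = gr.Interface(fn=lambda image: "placeholder",
                    inputs=gr.Image(), outputs="text")

if os.getenv("GRADIO_QUEUE", "true").lower() == "true":
    demo.queue()  # enable request queuing when the runner asks for it

demo.launch(
    server_name=os.getenv("SERVER_NAME", "0.0.0.0"),
    server_port=int(os.getenv("SERVER_PORT", "7860")),
    share=os.getenv("GRADIO_SHARE", "false").lower() == "true",
    show_api=False,  # API disabled by default to avoid schema bugs
)
```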
|
|
|
|
|
|
|
|
## Model Reference Update

- Removed: `poloclub/medmnist-v2` (model no longer available on Hugging Face).
- Added: `Matthijs/mobilevit-small`, a ~20 MB transformer that fits comfortably under 100 MB of VRAM.
- Purpose: acts as a lightweight fallback that assists the filename and color heuristics without impacting CPU throughput.
- Invocation: only runs when the router cannot confidently decide based on metadata and statistics alone (a hedged call sketch follows this list).
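A hedged sketch of invoking the fallback with the standard `transformers` Auto classes; `coarse_hint` and the wrapper shape are assumptions, and the returned ImageNet label still has to be mapped to a modality by the router's heuristics:

```python
# Hedged sketch of the fallback invocation; wrapper names are illustrative.
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

FALLBACK_ID = "Matthijs/mobilevit-small"
fallback_proc = AutoImageProcessor.from_pretrained(FALLBACK_ID)
fallback_net = AutoModelForImageClassification.from_pretrained(FALLBACK_ID)
fallback_net.eval()


def coarse_hint(image: Image.Image) -> str:
    """Return the top ImageNet label as a coarse cue for modality mapping."""
    inputs = fallback_proc(images=image, return_tensors="pt")
    with torch.no_grad():
        logits = fallback_net(**inputs).logits
    return fallback_net.config.id2label[int(logits.argmax(-1))]
```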
|
|
|
|
|
|
|
|
## Notes

- The label libraries are stored as UTF-8 JSON arrays for straightforward editing and community contributions (see the example after this list).
- When adding new modalities, drop a new `<modality>_labels.json` file into `labels/` and extend the router alias logic in `app.py` if the modality name and file name differ.
- `scikit-image` and `timm` are included in `requirements.txt` for future expansion (image preprocessing, alternative backbones) while keeping the current runtime CPU-friendly.
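As a concrete illustration of the format, a new bank is just a JSON array of diagnostic phrases. The file name and entries below are invented for the example, not copied from the shipped banks:

```python
# Invented example entries; real banks hold 100-200 curated phrases each.
import json

spine_labels = [
    "normal lumbar spine MRI",
    "lumbar disc herniation",
    "degenerative spinal stenosis",
]
with open("labels/spine_labels.json", "w", encoding="utf-8") as f:
    json.dump(spine_labels, f, ensure_ascii=False, indent=2)
```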
|
|
|