---
title: NAF Zero-Shot Feature Upsampling
emoji: 🎯
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.0.1
app_file: app.py
pinned: false
license: apache-2.0
---
# 🎯 NAF: Zero-Shot Feature Upsampling via Neighborhood Attention Filtering

This Space demonstrates **NAF (Neighborhood Attention Filtering)**, a method for upsampling features from Vision Foundation Models to any resolution without model-specific training.
## 🚀 Features

- **Universal Upsampling**: Works with any Vision Foundation Model (DINOv2, DINOv3, RADIO, DINO, SigLIP, etc.)
- **Arbitrary Resolutions**: Upsample features to any target resolution while maintaining aspect ratio
- **Zero-Shot**: No model-specific training or fine-tuning required
- **Interactive Demo**: Upload your own images or try sample images from various domains
## 🎨 How to Use

1. **Upload an Image**: Click "Upload Your Image" or select from sample images
2. **Choose a Model**: Select a Vision Foundation Model from the dropdown
3. **Set Resolution**: Choose the target resolution for the upsampled features (64-512)
4. **Click "Upsample Features"**: See the comparison between low- and high-resolution features
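
You can also drive the demo programmatically with `gradio_client`. The sketch below is a minimal, unverified example: the Space id, endpoint name, and argument order are assumptions, so check the "Use via API" link on the Space page for the actual signature.

```python
# Minimal sketch of calling the demo via gradio_client (names are assumptions).
from gradio_client import Client, handle_file

client = Client("valeoai/NAF")        # assumed Space id
result = client.predict(
    handle_file("my_image.jpg"),      # input image
    "DINOv2",                         # assumed model-dropdown value
    256,                              # assumed target resolution
    api_name="/upsample_features",    # assumed endpoint name
)
print(result)                         # e.g. path to the rendered comparison
```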
## 📊 Visualization

The output shows three panels:

- **Left**: Your input image
- **Center**: Low-resolution features from the backbone (PCA visualization)
- **Right**: High-resolution features upsampled by NAF

Features are visualized with PCA, mapping the first three principal components to the RGB channels.
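
The PCA-to-RGB mapping works roughly as follows; this is a minimal sketch of the standard technique, not the demo's exact code.

```python
# Sketch: project a (H, W, C) feature map onto its first 3 principal
# components and normalize them into an RGB image.
import numpy as np
from sklearn.decomposition import PCA

def features_to_rgb(feats: np.ndarray) -> np.ndarray:
    h, w, c = feats.shape
    flat = feats.reshape(-1, c)                      # (H*W, C)
    comps = PCA(n_components=3).fit_transform(flat)  # (H*W, 3)
    comps -= comps.min(axis=0)                       # shift each channel to >= 0
    comps /= comps.max(axis=0) + 1e-8                # scale each channel to [0, 1]
    return (comps.reshape(h, w, 3) * 255).astype(np.uint8)
```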
## 🔬 Supported Models

- **DINOv3**: Latest self-supervised vision models
- **RADIO v2.5**: High-performance vision backbones
- **DINOv2**: Self-supervised learning with registers
- **DINO**: Original self-supervised ViT
- **SigLIP**: Contrastive vision-language models
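
For reference, here is one way to extract the kind of low-resolution patch features NAF upsamples, using DINOv2 from `torch.hub`; the demo may load its backbones differently.

```python
# Sketch: extract low-resolution patch features from DINOv2 via torch.hub.
import torch

model = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14").eval()
img = torch.randn(1, 3, 224, 224)       # stand-in for a preprocessed image
with torch.no_grad():
    tokens = model.forward_features(img)["x_norm_patchtokens"]  # (1, 256, 384)
feats = tokens.reshape(1, 16, 16, 384)  # 224 / 14 = 16 patches per side
print(feats.shape)                      # torch.Size([1, 16, 16, 384])
```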
## 📚 Learn More

- **Paper**: [NAF: Zero-Shot Feature Upsampling via Neighborhood Attention Filtering](https://arxiv.org/abs/2501.01535)
- **Code**: [GitHub Repository](https://github.com/valeoai/NAF)
- **Organization**: [Valeo.ai](https://www.valeo.com/en/valeo-ai/)
## 💡 Use Cases

NAF enables better feature representations for:

- Dense prediction tasks (segmentation, depth estimation)
- High-resolution visual understanding
- Feature matching and correspondence
- Vision-language alignment
## ⚙️ Technical Details

- **Input**: Images up to 512px (aspect ratio is preserved)
- **Processing**: Backbone feature extraction → NAF upsampling
- **Output**: High-resolution features at the target resolution
- **Device**: Runs on CPU (free tier) or GPU (faster inference)
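
Putting the steps together, the pipeline looks roughly like this. `naf.upsample` is a hypothetical name used for illustration; see the GitHub repository for the actual API.

```python
# Hypothetical sketch of the processing pipeline; `naf.upsample` is an
# illustrative name, not the repository's actual API.
import torch

def run_pipeline(image: torch.Tensor, backbone, naf, target_hw=(256, 256)):
    """image: (1, 3, H, W) tensor, longest side <= 512."""
    with torch.no_grad():
        lr_feats = backbone(image)                                # (1, C, h, w) low-res features
        hr_feats = naf.upsample(lr_feats, image, size=target_hw)  # guided upsampling (hypothetical)
    return hr_feats                                               # (1, C, *target_hw)
```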
## 🤗 Citation

If you use NAF in your research, please cite:
```bibtex
@article{chambon2025naf,
  title={NAF: Zero-Shot Feature Upsampling via Neighborhood Attention Filtering},
  author={Chambon, Lucas and others},
  journal={arXiv preprint arXiv:2501.01535},
  year={2025}
}
```
## 📄 License

This demo is released under the Apache 2.0 license.