Generate image segmentation overlays and maps
Generate 3D models from images
Annotate and describe images with text prompts
A tiny vision language model