Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models
Paper • arXiv:2404.07973
In this task, the model must identify the object in a region mentioned in a query. We use the validation split of the LVIS dataset.
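As a rough illustration of how such a region-level classification task might be scored, here is a minimal sketch. Everything in it is an assumption made for illustration and not taken from the source: the `RegionExample` record, the `(x1, y1, x2, y2)` box format, the prompt wording in `build_query`, the exact-match scoring rule, and the `predict` callable that stands in for the model.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

# Hypothetical region format: (x1, y1, x2, y2) in pixel coordinates.
Box = Tuple[float, float, float, float]


@dataclass
class RegionExample:
    """One evaluation item: an image, a region, and its ground-truth category."""
    image_id: str
    box: Box
    label: str


def build_query(box: Box) -> str:
    """Format a query that mentions the region.

    The wording is a placeholder; the source does not specify a prompt template.
    """
    x1, y1, x2, y2 = box
    return f"What is the object in the region [{x1:g}, {y1:g}, {x2:g}, {y2:g}]?"


def region_classification_accuracy(
    examples: Sequence[RegionExample],
    predict: Callable[[str, str], str],
) -> float:
    """Fraction of examples whose predicted name matches the label.

    `predict(image_id, query)` stands in for the model and returns a category
    name. Matching is case-insensitive and ignores surrounding whitespace,
    which is an assumed scoring rule.
    """
    if not examples:
        raise ValueError("no examples to score")
    correct = 0
    for ex in examples:
        answer = predict(ex.image_id, build_query(ex.box))
        if answer.strip().lower() == ex.label.lower():
            correct += 1
    return correct / len(examples)
```

In practice `examples` would be built from annotated regions in the validation split, and `predict` would wrap a call to the model under evaluation.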