Top-Down Compression: Revisit Efficient Vision Token Projection for Visual Instruction Tuning Paper • 2505.11945 • Published May 17
Focus on Neighbors and Know the Whole: Towards Consistent Dense Multiview Text-to-Image Generator for 3D Creation Paper • 2408.13149 • Published Aug 23, 2024