google/owlv2-base-patch16-ensemble • Zero-Shot Object Detection • Updated Oct 31, 2024 • 864k • 88
Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion • Paper • 2412.04424 • Published Dec 5, 2024 • 59
Video • Collection • Stability AI's suite of image-to-video models • 5 items • Updated 18 days ago • 70
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model • Paper • 2409.01704 • Published Sep 3, 2024 • 83
ReconX: Reconstruct Any Scene from Sparse Views with Video Diffusion Model • Paper • 2408.16767 • Published Aug 29, 2024 • 30