OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows Paper • 2412.01169 • Published Dec 2, 2024
InstructAny2Pix: Flexible Visual Editing via Multimodal Instruction Following Paper • 2312.06738 • Published Dec 11, 2023
Hierarchical Open-vocabulary Universal Image Segmentation Paper • 2307.00764 • Published Jul 3, 2023
Mamba-ND: Selective State Space Modeling for Multi-Dimensional Data Paper • 2402.05892 • Published Feb 8, 2024
xT: Nested Tokenization for Larger Context in Large Images Paper • 2403.01915 • Published Mar 4, 2024
Aligning Diffusion Models by Optimizing Human Utility Paper • 2404.04465 • Published Apr 6, 2024
Scale-MAE: A Scale-Aware Masked Autoencoder for Multiscale Geospatial Representation Learning Paper • 2212.14532 • Published Dec 30, 2022
DataComp-LM: In search of the next generation of training sets for language models Paper • 2406.11794 • Published Jun 17, 2024
Leaving Reality to Imagination: Robust Classification via Generated Datasets Paper • 2302.02503 • Published Feb 5, 2023
ClimateLearn: Benchmarking Machine Learning for Weather and Climate Modeling Paper • 2307.01909 • Published Jul 4, 2023
Peering Through Preferences: Unraveling Feedback Acquisition for Aligning Large Language Models Paper • 2308.15812 • Published Aug 30, 2023
MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts Paper • 2310.02255 • Published Oct 3, 2023
Dynosaur: A Dynamic Growth Paradigm for Instruction-Tuning Data Curation Paper • 2305.14327 • Published May 23, 2023
ConTextual: Evaluating Context-Sensitive Text-Rich Visual Reasoning in Large Multimodal Models Paper • 2401.13311 • Published Jan 24, 2024