Item-Language Model for Conversational Recommendation Paper • 2406.02844 • Published Jun 5, 2024 • 12
PLaD: Preference-based Large Language Model Distillation with Pseudo-Preference Pairs Paper • 2406.02886 • Published Jun 5, 2024 • 11
Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms Paper • 2406.02900 • Published Jun 5, 2024 • 14
Searching Priors Makes Text-to-Video Synthesis Better Paper • 2406.03215 • Published Jun 5, 2024 • 14
LiveSpeech: Low-Latency Zero-shot Text-to-Speech via Autoregressive Modeling of Audio Discrete Codes Paper • 2406.02897 • Published Jun 5, 2024 • 16
Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion Paper • 2406.03184 • Published Jun 5, 2024 • 22
Audio Mamba: Bidirectional State Space Model for Audio Representation Learning Paper • 2406.03344 • Published Jun 5, 2024 • 21
PosterLLaVa: Constructing a Unified Multi-modal Layout Generator with LLM Paper • 2406.02884 • Published Jun 5, 2024 • 18
Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration Paper • 2406.01014 • Published Jun 3, 2024 • 34
Block Transformer: Global-to-Local Language Modeling for Fast Inference Paper • 2406.02657 • Published Jun 4, 2024 • 40
V-Express: Conditional Dropout for Progressive Training of Portrait Video Generation Paper • 2406.02511 • Published Jun 4, 2024 • 11
CamCo: Camera-Controllable 3D-Consistent Image-to-Video Generation Paper • 2406.02509 • Published Jun 4, 2024 • 10
RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots Paper • 2406.02523 • Published Jun 4, 2024 • 12
Guiding a Diffusion Model with a Bad Version of Itself Paper • 2406.02507 • Published Jun 4, 2024 • 17
I4VGen: Image as Stepping Stone for Text-to-Video Generation Paper • 2406.02230 • Published Jun 4, 2024 • 18
Seed-TTS: A Family of High-Quality Versatile Speech Generation Models Paper • 2406.02430 • Published Jun 4, 2024 • 34