Papers
arxiv:2412.18416

Muse: A Multimodal Conversational Recommendation Dataset with Scenario-Grounded User Profiles

Published on Dec 24, 2024
Authors:
,
,
,
,
,

Abstract

Current conversational recommendation systems focus predominantly on text. However, real-world recommendation settings are generally multimodal, causing a significant gap between existing research and practical applications. To address this issue, we propose Muse, the first multimodal conversational recommendation dataset. Muse comprises 83,148 utterances from 7,000 conversations centered around the Clothing domain. Each conversation contains comprehensive multimodal interactions, rich elements, and natural dialogues. Data in Muse are automatically synthesized by a multi-agent framework powered by multimodal large language models (MLLMs). It innovatively derives user profiles from real-world scenarios rather than depending on manual design and history data for better scalability, and then it fulfills conversation simulation and optimization. Both human and LLM evaluations demonstrate the high quality of conversations in Muse. Additionally, fine-tuning experiments on three MLLMs demonstrate Muse's learnable patterns for recommendations and responses, confirming its value for multimodal conversational recommendation. Our dataset and codes are available at https://anonymous.4open.science/r/Muse-0086.

Community

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2412.18416 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2412.18416 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2412.18416 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.