arxiv:2501.05122
Flo Schneider
floschne
AI & ML interests
Large Vision-Language Models, Cross-modal Retrieval
Recent Activity
authored a paper about 6 hours ago: Why do LLaVA Vision-Language Models Reply to Images in English?
authored a paper about 6 hours ago: M5 -- A Diverse Benchmark to Assess the Performance of Large Multimodal Models Across Multilingual and Multicultural Vision-Language Tasks
authored a paper about 6 hours ago: Multilingual and Explainable Text Detoxification with Parallel Corpora
Organizations
models
None public yet
datasets
14
floschne/wismir3 • 301k • 65
floschne/xflickrco_1k • 8k • 32 • 1
floschne/xflickrco • 16k • 55 • 1
floschne/xgqa_1k • 8k • 35
floschne/xvnli • 5.82k • 32
floschne/xgqa • 77.3k • 64
floschne/xm3600_1k • 64
floschne/xm3600 • 59 • 5
floschne/m5b_vlod • 1.42k • 31
floschne/m5b_vgr • 1.43k • 32