---
license: mit
---

# Probing Visual Language Priors in VLMs

## ImageDPO Finetuned Model

This page provides the **ImageDPO** finetuned checkpoint for LLaVA-v1.5-7B used in [Probing Visual Language Priors in VLMs](https://arxiv.org/abs/2501.00569). We provide the **merged model weights**, ready for direct use.

## Usage

First, install the [LLaVA-v1.5 codebase](https://github.com/LLaVA-VL/LLaVA-Plus-Codebase). Then run the following command to try the model:

```bash
python -m llava.eval.run_llava \
    --model-path ViLP/LLaVA-v1.5-7b-ImageDPO \
    --image-file 'images/llava_logo.png' \
    --query 'Please caption this image.' \
    --conv-mode llava_v1
```

## Citation Information

If you find this resource helpful, please consider citing the ***ViLP*** paper:

```bibtex
@article{luo2024probing,
  title={Probing Visual Language Priors in VLMs},
  author={Luo, Tiange and Cao, Ang and Lee, Gunhee and Johnson, Justin and Lee, Honglak},
  journal={arXiv preprint arXiv:2501.00569},
  year={2024}
}
```
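
## Programmatic Usage (Sketch)

The command in the Usage section can also be driven from Python. The snippet below is a minimal sketch, not an official API of this checkpoint: it assumes the installed codebase exposes `llava.eval.run_llava.eval_model` and that it reads the attributes listed here, as the upstream LLaVA `run_llava.py` does. The helper names `build_args` and `caption` are hypothetical, and the decoding values (`temperature`, `num_beams`, `max_new_tokens`) are assumed defaults that may need adjusting for your installed version.

```python
from types import SimpleNamespace

# Checkpoint name taken from the Usage command on this page.
MODEL_PATH = "ViLP/LLaVA-v1.5-7b-ImageDPO"


def build_args(image_file, query, model_path=MODEL_PATH):
    """Bundle the options from the CLI example into an attribute bag.

    Hypothetical helper: the attribute names mirror the CLI flags and are
    assumed to match what eval_model reads in your installed codebase.
    """
    return SimpleNamespace(
        model_path=model_path,
        model_base=None,        # merged weights, so no separate base model
        query=query,
        conv_mode="llava_v1",   # same conversation template as the CLI example
        image_file=image_file,
        sep=",",                # assumed separator for multiple image files
        temperature=0,          # assumed value: greedy decoding
        top_p=None,
        num_beams=1,
        max_new_tokens=512,
    )


def caption(image_file, query="Please caption this image."):
    """Run the model on one image (hypothetical wrapper around eval_model)."""
    # Imported lazily so build_args can be inspected without LLaVA installed.
    from llava.eval.run_llava import eval_model

    eval_model(build_args(image_file, query))
```

With the codebase installed, calling `caption("images/llava_logo.png")` should behave like the command-line example, provided `eval_model` accepts these attributes in your version.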