---
license: mit
---

# Probing Visual Language Priors in VLMs


## ImageDPO Finetuned Model  

This page provides the **ImageDPO**-finetuned LLaVA-v1.5-7B checkpoint used in [Probing Visual Language Priors in VLMs](https://arxiv.org/abs/2501.00569). We release the **merged model weights**, ready for use.

## Usage  

First, install the [LLaVA-v1.5 codebase](https://github.com/LLaVA-VL/LLaVA-Plus-Codebase).    

Then run the following command to try the model:

```bash
python -m llava.eval.run_llava \
    --model-path ViLP/LLaVA-v1.5-7b-ImageDPO \
    --image-file 'images/llava_logo.png' \
    --query 'Please caption this image.' \
    --conv-mode llava_v1
```
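
To run the same command from a script, for example over several images, it can be wrapped in Python. This is a minimal sketch, assuming the LLaVA codebase is installed in the active Python environment. The helper names `build_run_llava_command` and `run_llava` are hypothetical and not part of the LLaVA codebase; only the module name and flags shown in the command above are used.

```python
import shlex
import subprocess
import sys

# Checkpoint name taken from the command above.
MODEL_PATH = "ViLP/LLaVA-v1.5-7b-ImageDPO"


def build_run_llava_command(image_file, query, model_path=MODEL_PATH, conv_mode="llava_v1"):
    """Return the argument list for one `llava.eval.run_llava` call (hypothetical helper)."""
    return [
        sys.executable, "-m", "llava.eval.run_llava",
        "--model-path", model_path,
        "--image-file", image_file,
        "--query", query,
        "--conv-mode", conv_mode,
    ]


def run_llava(image_file, query):
    """Run the CLI once (hypothetical helper); raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_run_llava_command(image_file, query), check=True)


# Print the command for inspection; call run_llava(...) to actually execute it.
command = build_run_llava_command("images/llava_logo.png", "Please caption this image.")
print(shlex.join(command))
```

Calling `run_llava` in a loop starts a new process per image, so the model is reloaded each time. That is fine for a quick check but slow for large batches.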


## Citation Information

Please consider citing the ***ViLP*** paper if you find our resource helpful!

```bibtex
@article{luo2024probing,
      title={Probing Visual Language Priors in VLMs},
      author={Luo, Tiange and Cao, Ang and Lee, Gunhee and Johnson, Justin and Lee, Honglak},
      journal={arXiv preprint arXiv:2501.00569},
      year={2024}
}
```