Commit 06fb61a (parent: 61db6a0): Update README.md

README.md CHANGED
@@ -14,7 +14,7 @@ You can use the raw model for masked language modeling, but it's mostly intended
 Note that this model is primarily aimed at being fine-tuned on tasks such as visuo-linguistic sequence classification or visual question answering. We fine-tuned this model on a multi-translated version of the visual question answering task, [VQA v2](https://visualqa.org/challenge.html). Since Conceptual-12M is a dataset scraped from the internet, it will contain some biases, which will also affect all fine-tuned versions of this model.

 ### How to use❓
-You can use this model directly with a pipeline for masked language modeling:
+You can use this model directly with a pipeline for masked language modeling. You will need to clone the model from [here](https://github.com/gchhablani/multilingual-vqa). An example of usage is shown below:
 ```python
 >>> from torchvision.io import read_image
 >>> import numpy as np
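The python snippet in the hunk is truncated after its two imports. As a rough illustration only (not the repository's actual API), a typical continuation would convert the image tensor returned by `torchvision.io.read_image` into a normalized numpy array before handing it to a vision-language pipeline; the shape, dtype, and normalization below are assumptions for the sketch, and a random array stands in for a real image:

```python
import numpy as np

# Illustrative stand-in for an image loaded with torchvision.io.read_image,
# which returns a (channels, height, width) uint8 tensor. The 224x224 size
# is an assumption, not something the README specifies.
image = np.random.randint(0, 256, size=(3, 224, 224), dtype=np.uint8)

# Common preprocessing before feeding a vision-language model:
# scale pixel values to [0, 1] and move channels last.
pixels = image.astype(np.float32) / 255.0
pixels = np.transpose(pixels, (1, 2, 0))  # now (height, width, channels)

print(pixels.shape)  # (224, 224, 3)
```

The exact preprocessing (resizing, mean/std normalization, channel order) depends on the checkpoint, so consult the linked repository for the real pipeline.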