Update README.md
Browse files
README.md
CHANGED
@@ -50,6 +50,11 @@ The model has 20b parameters (3 experts, 8b each, 8b active parameters duri
|
|
50 |
```bash
|
51 |
pip install transformers -U
|
52 |
```
|
|
|
|
|
|
|
|
|
|
|
53 |
|
54 |
```python
|
55 |
import torch
|
@@ -115,7 +120,7 @@ The image of ants climbing over a vertical surface highlights their ability to a
|
|
115 |
|
116 |
## Make an Idefics-2-MoE model from scratch using several pre-trained models
|
117 |
|
118 |
-
Download .py files that implement the
|
119 |
|
120 |
```bash
|
121 |
pip install huggingface_hub
|
@@ -168,10 +173,10 @@ DEVICE='cuda'
|
|
168 |
|
169 |
model_id_1='lamm-mit/Cephalo-Idefics-2-vision-8b-beta'
|
170 |
|
171 |
-
model_1 = Idefics2ForConditionalGeneration.from_pretrained(
|
172 |
-
|
173 |
-
|
174 |
-
|
175 |
)
|
176 |
processor = AutoProcessor.from_pretrained(
|
177 |
f"{model_id_1}",
|
@@ -188,18 +193,18 @@ Now, load the rest of the models:
|
|
188 |
```python
|
189 |
model_id_2='HuggingFaceM4/idefics2-8b-chatty'
|
190 |
|
191 |
-
model_2 = Idefics2ForConditionalGeneration.from_pretrained(
|
192 |
-
|
193 |
-
|
194 |
-
|
195 |
)
|
196 |
|
197 |
model_id_3='HuggingFaceM4/idefics2-8b'
|
198 |
|
199 |
-
model_3 = Idefics2ForConditionalGeneration.from_pretrained(
|
200 |
-
|
201 |
-
|
202 |
-
|
203 |
)
|
204 |
```
|
205 |
Put on device:
|
|
|
50 |
```bash
|
51 |
pip install transformers -U
|
52 |
```
|
53 |
+
Install FlashAttention-2
|
54 |
+
|
55 |
+
```bash
|
56 |
+
pip install flash-attn --no-build-isolation
|
57 |
+
```
|
58 |
|
59 |
```python
|
60 |
import torch
|
|
|
120 |
|
121 |
## Make an Idefics-2-MoE model from scratch using several pre-trained models
|
122 |
|
123 |
+
Download .py files that implement the Idefics-2 Mixture-of-Expert Vision model:
|
124 |
|
125 |
```bash
|
126 |
pip install huggingface_hub
|
|
|
173 |
|
174 |
model_id_1='lamm-mit/Cephalo-Idefics-2-vision-8b-beta'
|
175 |
|
176 |
+
model_1 = Idefics2ForConditionalGeneration.from_pretrained( model_id_1,
|
177 |
+
torch_dtype=torch.bfloat16, #if your GPU allows
|
178 |
+
_attn_implementation="flash_attention_2", #make sure Flash Attention 2 is installed
|
179 |
+
trust_remote_code=True,
|
180 |
)
|
181 |
processor = AutoProcessor.from_pretrained(
|
182 |
f"{model_id_1}",
|
|
|
193 |
```python
|
194 |
model_id_2='HuggingFaceM4/idefics2-8b-chatty'
|
195 |
|
196 |
+
model_2 = Idefics2ForConditionalGeneration.from_pretrained( model_id_2,
|
197 |
+
torch_dtype=torch.bfloat16, #if your GPU allows
|
198 |
+
_attn_implementation="flash_attention_2", #make sure Flash Attention 2 is installed
|
199 |
+
trust_remote_code=True,
|
200 |
)
|
201 |
|
202 |
model_id_3='HuggingFaceM4/idefics2-8b'
|
203 |
|
204 |
+
model_3 = Idefics2ForConditionalGeneration.from_pretrained( model_id_3,
|
205 |
+
torch_dtype=torch.bfloat16, #if your GPU allows
|
206 |
+
_attn_implementation="flash_attention_2", #make sure Flash Attention 2 is installed
|
207 |
+
trust_remote_code=True,
|
208 |
)
|
209 |
```
|
210 |
Put on device:
|