michaeldinzinger committed on
Commit 513644b · 1 Parent(s): f23f5c5

Update README.md with text 2

Files changed (1)
1. README.md +36 -6
README.md CHANGED
@@ -10790,8 +10790,10 @@ license: mit
  <h4 align="center">
  <p>
  <a href="#acknowledgement">Acknowledgement</a>
- <a href=#this-model>This Model</a> |
+ <a href=#combination-of-embedding-models>Combination of Embedding Models</a> |
  <a href=#usage>Usage</a> |
+ <a href=#citation>Citation</a> |
+ <a href=#license>License</a>
  <p>
  </h4>
 
@@ -10803,12 +10805,37 @@ Furthermore, we want to acknowledge the team of Marqo, who has worked on the ide
 
  ## Combination of Embedding Models
 
- - Embedding models become more powerful and applicable in many use cases, but the next big challenge is to make them more efficient in terms of resource consumption.
- - Our ambition is to experiment with combining two models to see whether we can achieve better performance with fewer resources. Early results have shown that models that differ from one another can complement each other and lead to better results. The selection of models is crucial for a good combination, and diversity (in terms of MTEB performance, architecture, training data, etc.) is an important part of it.
- - What kind of combination do we use? We combine the embeddings of two models by concatenating them, the most straightforward combination technique. Before concatenation, it is important to normalize the embeddings so that they are on the same scale.
+ ### Overview
+ Embedding models have become increasingly powerful and applicable across various use cases. However, the next significant challenge lies in making them more efficient in terms of resource consumption. Our goal is to experiment with combining two embedding models to achieve better performance with fewer resources.
 
- - We have combined the [Snowflake/snowflake-arctic-embed-m-v1.5](https://huggingface.co/Snowflake/snowflake-arctic-embed-m-v1.5) and [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) models to create this model. The combined model produces an embedding with 1152 dimensions (768 + 384) and has a total of 142M parameters (109M + 33M).
- - This model combination performs well on the MTEB Leaderboard and is a good starting point for further experiments. However, we are aware that combining models is a complex topic, and searching only for combinations that can climb the leaderboard is of limited value. Still, it is remarkable that the mere concatenation of two models raises the average nDCG@10 on the MTEB English Retrieval benchmark from 55.14 to 56.5, a climb of several leaderboard spots that is otherwise achieved only with extensive engineering effort. Furthermore, it is interesting that the combination presented by the [Chimera model](https://huggingface.co/Marqo/marqo-chimera-arctic-bge-m) performs significantly worse on the leaderboard, even though its constituent models are, on their own, more potent than the pair combined in this repository. The reasons might be manifold: the different number of model parameters, differences in the training process, or simply how well two models complement each other on the specific tasks of the benchmark. In any case, we look forward to further experiments and discussions on this topic.
+ ### Key Insights
+ 1. **Diversity Matters**: Initial findings suggest that models with differing characteristics can complement each other, resulting in improved outcomes. To design an effective combination, the diversity of the models (in terms of MTEB performance, architecture, training data, etc.) is crucial.
+ 2. **Combination Technique**:
+    - We combine the embeddings of two models using the most straightforward approach: concatenation (a sketch follows the diff below).
+    - Prior to concatenation, we normalize the embeddings to ensure they are on the same scale. This step is vital for coherent and meaningful results.
+
+ ### Implementation
+ We combined the following models:
+ - **[Snowflake/snowflake-arctic-embed-m-v1.5](https://huggingface.co/Snowflake/snowflake-arctic-embed-m-v1.5)**
+ - **[BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5)**
+
+ #### Model Details
+ - **Output Embedding Dimensions**: 1152 (768 + 384)
+ - **Total Parameters**: 142M (109M + 33M)
+
+ ### Results
+ This combination demonstrates notable performance on the **MTEB Leaderboard** and offers a promising foundation for further experimentation:
+ - **Performance Improvement**: The average nDCG@10 on the MTEB English Retrieval benchmark increased from **55.14 to 56.5**, climbing several leaderboard spots, a gain that otherwise often requires extensive engineering effort.
+ - **Comparison with the Chimera Model**: Interestingly, the **[Chimera model](https://huggingface.co/Marqo/marqo-chimera-arctic-bge-m)** performs significantly worse on the leaderboard, even though its constituent models are individually more potent. This raises questions about:
+   - The role of parameter count.
+   - Differences in training processes.
+   - How effectively two models complement each other on the specific benchmark tasks.
+
+ ### Future Directions
+ While the results are promising, we acknowledge the complexity of model combinations and the importance of looking beyond leaderboard rankings. That a technique as simple as concatenating embeddings yields tangible gains underscores the potential for further exploration in this area.
+
+ We look forward to conducting additional experiments and engaging in discussions to deepen our understanding of effective model combinations.
 
  ## Usage
 
@@ -10935,3 +10962,6 @@ mteb==1.12.94
  copyright = {Creative Commons Attribution 4.0 International}
  }
  ```
+
+ ## License
+ Note that Arctic M (v1.5) is licensed under [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) and BGE Small (en; v1.5) under the [MIT](https://opensource.org/licenses/MIT) license. Please refer to the licenses of the original models for more details.
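Below is a minimal sketch of the normalize-then-concatenate technique described in the new README text above. It assumes the `sentence-transformers` and `numpy` packages; the `combined_embed` helper is illustrative, not code from this repository.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# The two models named in the README; each is loaded as a standalone encoder.
arctic = SentenceTransformer("Snowflake/snowflake-arctic-embed-m-v1.5")
bge = SentenceTransformer("BAAI/bge-small-en-v1.5")

def combined_embed(texts: list[str]) -> np.ndarray:
    """Hypothetical helper: L2-normalize each model's output, then concatenate."""
    # normalize_embeddings=True rescales every vector to unit length,
    # so both parts contribute on the same scale after concatenation.
    emb_a = arctic.encode(texts, normalize_embeddings=True)  # shape (n, 768)
    emb_b = bge.encode(texts, normalize_embeddings=True)     # shape (n, 384)
    # Concatenating along the feature axis yields (n, 1152), matching
    # the dimensionality stated in the README (768 + 384).
    return np.concatenate([emb_a, emb_b], axis=1)

vectors = combined_embed(["How do I combine two embedding models?"])
print(vectors.shape)  # -> (1, 1152)
```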
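A brief note on why the normalization step matters: if both component embeddings are unit-length, the cosine similarity of two concatenated vectors works out to the plain average of the two models' cosine similarities, so neither model dominates the retrieval score. A small self-contained check with synthetic vectors (not actual model outputs):

```python
import numpy as np

rng = np.random.default_rng(0)
unit = lambda v: v / np.linalg.norm(v)

# Synthetic stand-ins for a query/document pair under each model.
qa, da = unit(rng.normal(size=768)), unit(rng.normal(size=768))
qb, db = unit(rng.normal(size=384)), unit(rng.normal(size=384))

q, d = np.concatenate([qa, qb]), np.concatenate([da, db])
cos_combined = q @ d / (np.linalg.norm(q) * np.linalg.norm(d))

# q . d = qa . da + qb . db, and ||q|| = ||d|| = sqrt(2), so the
# combined cosine equals the mean of the per-model cosines.
print(np.isclose(cos_combined, (qa @ da + qb @ db) / 2))  # -> True
```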
10790
  <h4 align="center">
10791
  <p>
10792
  <a href="#acknowledgement">Acknowledgement</a>
10793
+ <a href=#combination-of-embedding-models>Combination of Embedding Models</a> |
10794
  <a href=#usage>Usage</a> |
10795
+ <a href=#citation>Citation</a> |
10796
+ <a href=#license>License</a>
10797
  <p>
10798
  </h4>
10799
 
 
10805
 
10806
  ## Combination of Embedding Models
10807
 
10808
+ ### Overview
10809
+ Embedding models have become increasingly powerful and applicable across various use cases. However, the next significant challenge lies in enhancing their efficiency in terms of resource consumption. Our goal is to experiment with combining two embedding models to achieve better performance with fewer resources.
 
10810
 
10811
+ ### Key Insights
10812
+ 1. **Diversity Matters**: Initial findings suggest that combining models with differing characteristics can complement each other, resulting in improved outcomes. To design an effective combination, the diversity of the models—evaluated by factors like MTEB performance, architecture, and training data—is crucial.
10813
+ 2. **Combination Technique**:
10814
+ - We combine the embeddings of two models using the most straightforward approach: concatenation.
10815
+ - Prior to concatenation, we normalize the embeddings to ensure they are on the same scale. This step is vital for achieving coherent and meaningful results.
10816
+
10817
+ ### Implementation
10818
+ We combined the following models:
10819
+ - **[Snowflake/snowflake-arctic-embed-m-v1.5](https://huggingface.co/Snowflake/snowflake-arctic-embed-m-v1.5)**
10820
+ - **[BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5)**
10821
+
10822
+ #### Model Details
10823
+ - **Output Embedding Dimensions**: 1152 (768 + 384)
10824
+ - **Total Parameters**: 142M (109M + 33M)
10825
+
10826
+ ### Results
10827
+ This combination demonstrated notable performance on the **MTEB Leaderboard**, offering a promising foundation for further experimentation:
10828
+ - **Performance Improvement**: The average nDCG@10 on the MTEB English Retrieval benchmark increased from **55.14 to 56.5**, climbing several spots on the leaderboard—a feat often requiring extensive engineering efforts.
10829
+ - **Comparison with Chimera Model**:
10830
+ Interestingly, the **[Chimera model](https://huggingface.co/Marqo/marqo-chimera-arctic-bge-m)**, which employs more potent models individually, performs worse on the leaderboard. This raises questions about:
10831
+ - The role of parameter count.
10832
+ - Differences in training processes.
10833
+ - How effectively two models complement each other for specific benchmark tasks.
10834
+
10835
+ ### Future Directions
10836
+ While the results are promising, we acknowledge the complexity of model combinations and the importance of focusing on more than leaderboard rankings. The simplicity of concatenating embeddings yielding tangible gains emphasizes the potential for further exploration in this area.
10837
+
10838
+ We look forward to conducting additional experiments and engaging in discussions to deepen our understanding of effective model combinations.
10839
 
10840
  ## Usage
10841
 
 
10962
  copyright = {Creative Commons Attribution 4.0 International}
10963
  }
10964
  ```
10965
+
10966
+ ## License
10967
+ Notice that Arctic M (v1.5) is licensed under [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) and BGE Small (en; v1.5) is licensed under [MIT](https://opensource.org/licenses/MIT) license. Please refer to the licenses of the original models for more details.