Slower inference compared to non-ModernBERT model

#10
by hveigz - opened

Hello, great work on these models! Still, I'm wondering why I'm getting slower inference on a Mac M1 Pro (both on CPU and MPS) with this model compared to Alibaba-NLP/gte-multilingual-reranker-base. Shouldn't this ModernBERT version be much faster?
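To make the comparison concrete, a small timing harness helps rule out measurement noise: warm up first, then take the median over several runs. The sketch below is a minimal, self-contained version; the `dummy_scoring` function is a placeholder standing in for each reranker's forward pass (you would wrap the actual model call for both models on the same device):

```python
import time
from statistics import median

def benchmark(fn, *args, warmup=3, runs=10):
    """Run a few warmup passes, then return the median latency in milliseconds."""
    for _ in range(warmup):
        fn(*args)  # warmup: amortize one-time costs (compilation, caches)
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(*args)
        times.append((time.perf_counter() - start) * 1000.0)
    return median(times)

# Placeholder workload standing in for a reranker's scoring call.
def dummy_scoring(pairs):
    return [len(q) + len(d) for q, d in pairs]

pairs = [("what is a panda?", "The giant panda is a bear species.")] * 8
latency_ms = benchmark(dummy_scoring, pairs)
print(f"median latency: {latency_ms:.3f} ms")
```

Running the same harness against both models, with identical batch sizes, sequence lengths, and device placement, would show whether the gap is in the model itself or in setup overhead.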
