Slower inference compared to non-ModernBERT model
#10 opened by hveigz
Hello, great work on these models! Still, I'm wondering why I'm getting slower inference on a Mac M1 Pro (both on CPU and MPS) with this model compared to Alibaba-NLP/gte-multilingual-reranker-base. Shouldn't this ModernBERT version be much faster?
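To make the comparison concrete, a minimal timing harness like the one below could be used to measure average per-call latency for each model. This is only a sketch: the commented-out `transformers` usage shows how the models would typically be loaded, and the model identifier for the ModernBERT reranker is a placeholder, since the thread doesn't name it explicitly.

```python
import time

def time_inference(fn, n_warmup=2, n_runs=10):
    """Return the average wall-clock latency (seconds) of fn() over
    n_runs, after n_warmup untimed warmup calls."""
    for _ in range(n_warmup):
        fn()
    start = time.perf_counter()
    for _ in range(n_runs):
        fn()
    return (time.perf_counter() - start) / n_runs

# Hypothetical usage for comparing the two rerankers (model loading
# elided; "<modernbert-reranker>" is a placeholder, not a real ID):
#
# from transformers import AutoTokenizer, AutoModelForSequenceClassification
# import torch
#
# def make_scorer(model_id, device):
#     tok = AutoTokenizer.from_pretrained(model_id)
#     model = AutoModelForSequenceClassification.from_pretrained(
#         model_id, trust_remote_code=True
#     ).to(device).eval()
#     batch = tok([("query", "candidate passage")],
#                 padding=True, truncation=True, return_tensors="pt").to(device)
#     def score():
#         with torch.no_grad():
#             model(**batch)
#     return score
#
# for device in ("cpu", "mps"):
#     for model_id in ("Alibaba-NLP/gte-multilingual-reranker-base",
#                      "<modernbert-reranker>"):
#         print(device, model_id, time_inference(make_scorer(model_id, device)))
```

Running both models through the same harness on the same device and batch would at least rule out measurement noise as the explanation.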