Slower inference compared to non-ModernBERT model

#10
by hveigz - opened

Hello, great work on these models! Still, I'm wondering why I'm getting slower inference on a Mac M1 Pro (both on CPU and MPS) with this model compared to Alibaba-NLP/gte-multilingual-reranker-base. Shouldn't this ModernBERT version be much faster?
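To make the comparison concrete, a small timing harness helps rule out measurement noise: warm up first, then take the median over several runs. The sketch below is a minimal, self-contained version; the `dummy_scoring` function is a placeholder standing in for each reranker's forward pass (you would wrap the actual model call for both models on the same device):

```python
import time
from statistics import median

def benchmark(fn, *args, warmup=3, runs=10):
    """Run a few warmup passes, then return the median latency in milliseconds."""
    for _ in range(warmup):
        fn(*args)  # warmup: amortize one-time costs (compilation, caches)
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(*args)
        times.append((time.perf_counter() - start) * 1000.0)
    return median(times)

# Placeholder workload standing in for a reranker's scoring call.
def dummy_scoring(pairs):
    return [len(q) + len(d) for q, d in pairs]

pairs = [("what is a panda?", "The giant panda is a bear species.")] * 8
latency_ms = benchmark(dummy_scoring, pairs)
print(f"median latency: {latency_ms:.3f} ms")
```

Running the same harness against both models, with identical batch sizes, sequence lengths, and device placement, would show whether the gap is in the model itself or in setup overhead.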
