ModernBERT-base for Extractive QA
This is a single-model solution for SQuAD-like extractive QA based on ModernBERT (Warner et al., 2024). ModernBERT is an up-to-date drop-in replacement for BERT-like language models. It is an encoder-only, pre-norm Transformer with GeGLU activations, pre-trained with Masked Language Modeling (MLM) on sequences of up to 1,024 tokens over a corpus of 2 trillion tokens of English text and code. ModernBERT adopts many recent best practices, e.g., an increased masking rate, pre-normalization, and no bias terms, and it appears to offer the best NLU performance among base-sized encoder-only models such as BERT, RoBERTa, and DeBERTa. The available ModernBERT implementation also uses Flash Attention, which makes it substantially faster than the older implementations of those models; for example, ModernBERT-base runs roughly 3-4x faster than DeBERTa-V3-base.
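The model can be loaded with the standard Hugging Face question-answering pipeline. The snippet below is a minimal usage sketch; the question/context pair is made up for illustration and is not from SQuAD.

```python
from transformers import pipeline

# Load the fine-tuned extractive QA model (assumes a transformers version with ModernBERT support).
qa = pipeline("question-answering", model="kiddothe2b/ModernBERT-base-squad2")

# Illustrative example: the model extracts an answer span from the context,
# or returns a low-confidence span when the question is unanswerable (SQuAD v2 style).
result = qa(
    question="How many tokens was ModernBERT pre-trained on?",
    context="ModernBERT was pre-trained with masked language modeling "
            "on 2 trillion tokens of English text and code.",
)
print(result)  # {'score': ..., 'start': ..., 'end': ..., 'answer': ...}
```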
Model tree for kiddothe2b/ModernBERT-base-squad2
- Base model: answerdotai/ModernBERT-base
- Dataset used to train kiddothe2b/ModernBERT-base-squad2: squad_v2
Evaluation results
- Exact Match on the squad_v2 validation set (self-reported): 81.294
- F1 on the squad_v2 validation set (self-reported): 84.485