---
license: mit
---
# Model Card for FIRST
<!-- Provide a quick summary of what the model is/does. -->
FIRST is a language model trained for listwise reranking that leverages the output logits of the first generated identifier to directly produce a ranked ordering of candidates. Built on the Zephyr-7B-β model, FIRST undergoes single-stage fine-tuning on a converted alphabetic version of the RankZephyr dataset (i.e., RankGPT-4 reorderings of OpenAI's Ada2 orderings for 5k queries). More details can be found in the paper.
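The single-token scoring idea can be sketched as follows. This is a minimal illustration of the mechanism described above, not the authors' code: the function name, the toy token ids, and the use of NumPy in place of the model's real logit tensor are all assumptions.

```python
import numpy as np

def rank_from_first_token_logits(first_token_logits, identifier_token_ids):
    """Rank candidates using only the logits at the FIRST decoding step.

    Instead of generating a full permutation string, read off the logit of
    each candidate's identifier token (e.g. 'A', 'B', 'C', ... in an
    alphabetic labelling scheme) and sort candidates by that score.

    first_token_logits: 1-D array of vocabulary logits at decoding step 1.
    identifier_token_ids: token id of each candidate's identifier.
    Returns candidate indices ordered from most to least relevant.
    """
    scores = first_token_logits[identifier_token_ids]
    return list(np.argsort(-scores))

# Toy example: 4 candidates labelled A-D, with made-up logits over a
# pretend 32k-token vocabulary. The ids below are hypothetical.
logits = np.zeros(32000)
ids = [317, 350, 315, 360]  # hypothetical token ids for 'A', 'B', 'C', 'D'
logits[317], logits[350], logits[315], logits[360] = 1.2, 3.4, 0.5, 2.0
print(rank_from_first_token_logits(logits, ids))  # candidate 1 ranks first
```

In practice the logits would come from a forward pass of the fine-tuned Zephyr-7B-β checkpoint over a prompt listing the labelled candidates; the point of the sketch is that a single decoding step suffices to induce a full ordering.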
### Model Description
<!-- Provide a longer summary of what this model is. -->
- **Model type:** Language model fine-tuned from Zephyr-7B-β on listwise reranking data.
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** [HuggingFaceH4/zephyr-7b-beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta)
### Model Sources
<!-- Provide the basic links for the model. -->
- **Repository:** [https://github.com/gangiswag/llm-reranker](https://github.com/gangiswag/llm-reranker)
- **Paper:** [https://arxiv.org/abs/2406.15657](https://arxiv.org/abs/2406.15657)
### Evaluations
<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
At the time of release, FIRST demonstrates superior performance across a variety of reranking datasets. The table below provides a detailed performance comparison (nDCG@10) against other LLM rerankers on the BEIR benchmark; more details can be found in the paper.
| Reranker | Training Data | Avg. | Climate-FEVER | DBPedia | FEVER | FiQA | HotpotQA | MS MARCO | NFCorpus | NQ | SciDocs | SciFact | TREC-COVID |
|---------------|----------------|-------|---------------|---------|-------|-------|-----------|----------|----------|-------|----------|----------|------------|
| Rank Vicuna | GPT 3.5 | 50.7 | **28.2** | 50.0 | 81.0 | 35.9 | 73.5 | 36.7 | 33.1 | 58.6 | 18.4 | 70.5 | 71.3 |
| Rank Zephyr | GPT 3.5 + 4 | 53.7 | 25.6 | 50.0 | 80.1 | **42.2** | 71.6 | 42.7 | **37.7** | 65.6 | **20.5** | **76.7** | 78.4 |
| **FIRST** | GPT-4 | **54.3** | 26.7 | **50.9**| **81.7**| **42.2** | **74.2** | **44.4** | 37.4 | **66.4**| 20.4 | 74.6 | **78.8** |
## Citation
<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
If you find FIRST useful for your work, please consider citing our paper:
```bibtex
@article{reddy2024first,
title={FIRST: Faster Improved Listwise Reranking with Single Token Decoding},
author={Reddy, Revanth Gangi and Doo, JaeHyeok and Xu, Yifei and Sultan, Md Arafat and Swain, Deevya and Sil, Avirup and Ji, Heng},
journal={arXiv preprint arXiv:2406.15657},
year={2024}
}
```