---
license: mit
---

# Model Card for FIRST

<!-- Provide a quick summary of what the model is/does. -->
FIRST is a language model trained for listwise reranking that leverages the output logits of the first generated identifier to directly produce a ranked ordering of the candidates. Built on the Zephyr-7B-β model, FIRST undergoes single-stage fine-tuning on a converted alphabetic version of the RankZephyr dataset (i.e., RankGPT-4 reorderings of OpenAI's Ada2 orderings for 5k queries). More details can be found in the paper.
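
As a rough illustration of the single-token idea, the sketch below ranks a handful of candidates by reading identifier logits at the first decoding step. The model id, prompt wording, and identifier-to-token mapping here are illustrative assumptions, not the official templates; the exact prompts used for training and evaluation are in the repository linked under Model Sources.

```python
# Minimal sketch of FIRST-style single-token listwise reranking, assuming a
# transformers-compatible checkpoint. MODEL_ID is a hypothetical placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "path/to/FIRST-checkpoint"  # placeholder; substitute the real repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)
model.eval()

query = "what causes rainbows"
passages = [
    "Rainbows are caused by refraction and dispersion of sunlight in droplets.",
    "The capital of France is Paris.",
    "A prism splits white light into a spectrum of colors.",
]

# Label each candidate with an alphabetic identifier, mirroring the converted
# alphabetic RankZephyr data, and prompt for a ranking.
labels = [chr(ord("A") + i) for i in range(len(passages))]
listing = "\n".join(f"[{l}] {p}" for l, p in zip(labels, passages))
prompt = (
    "Rank the following passages by their relevance to the query.\n"
    f"Query: {query}\n{listing}\nRanking:"
)

inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    # Logits at the last input position predict the FIRST generated token.
    next_token_logits = model(**inputs).logits[0, -1]

# Rank candidates by the logit each identifier receives at that one position;
# no further decoding is needed. Mapping identifiers to single token ids may
# need adjustment for the actual tokenizer and template.
label_ids = [tokenizer.convert_tokens_to_ids(l) for l in labels]
order = sorted(range(len(passages)), key=lambda i: -next_token_logits[label_ids[i]].item())
print("Ranking:", [labels[i] for i in order])
```

Because the full ordering is read off a single forward pass rather than decoded token by token, this is where the "single token decoding" speedup in the paper's title comes from.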

### Model Description

<!-- Provide a longer summary of what this model is. -->
- **Model type:** A Zephyr-7B-β model fine-tuned on listwise reranking data
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** [HuggingFaceH4/zephyr-7b-beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta)

### Model Sources

<!-- Provide the basic links for the model. -->

- **Repository:** [https://github.com/gangiswag/llm-reranker](https://github.com/gangiswag/llm-reranker)
- **Paper:** [https://arxiv.org/abs/2406.15657](https://arxiv.org/abs/2406.15657)


### Evaluations

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
At the time of release, FIRST demonstrates superior performance across a variety of reranking datasets. The table below provides a detailed comparison against other LLM rerankers on the BEIR benchmark; more details can be found in the paper.
| Reranker      | Training Data  | Avg.  | Climate-FEVER | DBPedia | FEVER | FiQA  | HotpotQA | MS MARCO | NFCorpus | NQ    | SciDocs | SciFact | TREC-COVID |
|---------------|----------------|-------|---------------|---------|-------|-------|-----------|----------|----------|-------|----------|----------|------------|
| RankVicuna    | GPT-3.5        | 50.7  | **28.2**      | 50.0    | 81.0  | 35.9  | 73.5     | 36.7     | 33.1     | 58.6  | 18.4    | 70.5    | 71.3       |
| RankZephyr    | GPT-3.5 + GPT-4 | 53.7 | 25.6          | 50.0    | 80.1  | **42.2** | 71.6  | 42.7     | **37.7** | 65.6  | **20.5** | **76.7** | 78.4      |
| **FIRST**     | GPT-4          | **54.3**  | 26.7          | **50.9**| **81.7**| **42.2**  | **74.2**  | **44.4** | 37.4     | **66.4**| 20.4     | 74.6     | **78.8**  |

## Citation

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
If you find FIRST useful for your work, please consider citing our paper:
```bibtex
@article{reddy2024first,
  title={FIRST: Faster Improved Listwise Reranking with Single Token Decoding},
  author={Reddy, Revanth Gangi and Doo, JaeHyeok and Xu, Yifei and Sultan, Md Arafat and Swain, Deevya and Sil, Avirup and Ji, Heng},
  journal={arXiv preprint arXiv:2406.15657},
  year={2024}
}
```