Retrieval-augmented Large Language Models for Financial Time Series Forecasting
Abstract
Stock movement prediction, a fundamental task in financial time-series forecasting, requires identifying and retrieving critical influencing factors from vast amounts of time-series data. However, existing text-trained or numeric similarity-based retrieval methods fall short in handling complex financial analysis. To address this, we propose the first retrieval-augmented generation (RAG) framework for financial time-series forecasting, featuring three key innovations: a fine-tuned 1B-parameter large language model (StockLLM) as the backbone, a novel candidate selection method leveraging LLM feedback, and a training objective that maximizes similarity between queries and historically significant sequences. This enables our retriever, FinSeer, to uncover meaningful patterns while minimizing noise in complex financial data. We also construct new datasets integrating financial indicators and historical stock prices to train FinSeer and ensure robust evaluation. Experimental results demonstrate that our RAG framework outperforms bare StockLLM and random retrieval, highlighting its effectiveness, while FinSeer surpasses existing retrieval methods, achieving 8% higher accuracy on BIGDATA22 and retrieving more impactful sequences. This work underscores the importance of tailored retrieval models in financial forecasting and provides a novel framework for future research.
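The abstract names two mechanisms without giving formulas: candidate selection driven by LLM feedback, and a training objective that raises the similarity between a query and historically significant sequences. The sketch below is one plausible reading, not the paper's implementation. It assumes embeddings are plain lists of floats, that LLM feedback arrives as one score per candidate (the hypothetical `llm_scores`), and that the objective is an InfoNCE-style contrastive loss over cosine similarity. All function names and the `temperature` parameter are illustrative assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_candidates(candidates, llm_scores, n_pos=1, n_neg=2):
    """Split candidates into positives and negatives.

    Assumption: llm_scores[i] is a hypothetical feedback score saying how
    much candidate i helped the LLM predict correctly. The highest-scored
    candidates become positives, the lowest-scored become negatives.
    """
    ranked = sorted(zip(candidates, llm_scores), key=lambda p: p[1], reverse=True)
    positives = [c for c, _ in ranked[:n_pos]]
    negatives = [c for c, _ in ranked[-n_neg:]]
    return positives, negatives


def contrastive_loss(query, positives, negatives, temperature=0.1):
    """InfoNCE-style loss (an assumed stand-in for the paper's objective).

    The loss shrinks as the query moves closer to the positives and
    further from the negatives.
    """
    pos = sum(math.exp(cosine(query, p) / temperature) for p in positives)
    neg = sum(math.exp(cosine(query, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


def retrieve(query, candidates, k=2):
    """Return the k candidates most similar to the query."""
    return sorted(candidates, key=lambda c: cosine(query, c), reverse=True)[:k]


# Toy 2-D embeddings, made up for illustration only.
query = [1.0, 0.0]
candidates = [[1.0, 0.1], [0.0, 1.0], [-1.0, 0.0]]
llm_scores = [0.9, 0.4, 0.1]

positives, negatives = select_candidates(candidates, llm_scores)
loss_aligned = contrastive_loss(query, positives, negatives)
loss_misaligned = contrastive_loss([-1.0, 0.0], positives, negatives)
top = retrieve(query, candidates, k=1)
```

In this toy setup, a query embedding that points toward the positive sequence yields a much smaller loss than one pointing toward a negative, which is the behavior a retriever trained this way would be optimized for.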
Community
We will soon release the embedder along with the code at https://huggingface.co/TheFinAI.
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- TimeRAF: Retrieval-Augmented Foundation model for Zero-shot Time Series Forecasting (2024)
- Time-VLM: Exploring Multimodal Vision-Language Models for Augmented Time Series Forecasting (2025)
- Unveiling the Potential of Text in High-Dimensional Time Series Forecasting (2025)
- TimeRAG: BOOSTING LLM Time Series Forecasting via Retrieval-Augmented Generation (2024)
- The Tabular Foundation Model TabPFN Outperforms Specialized Time Series Forecasting Models Based on Simple Features (2025)
- MAIN-RAG: Multi-Agent Filtering Retrieval-Augmented Generation (2024)
- Time Series Language Model for Descriptive Caption Generation (2025)