Update README.md
README.md
CHANGED
@@ -138,15 +138,18 @@ We provide our training datasets:
 
 Please refer to our blog and research paper for more technical details of Satori.
 - [Blog](https://satori-reasoning.github.io/blog/satori/)
-- [Paper](https://
+- [Paper](https://arxiv.org/pdf/2502.02508)
 
 # **Citation**
 If you find our model and data helpful, please cite our paper:
 ```
-@
-
-
-
-
+@misc{shen2025satorireinforcementlearningchainofactionthought,
+title={Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search},
+author={Maohao Shen and Guangtao Zeng and Zhenting Qi and Zhang-Wei Hong and Zhenfang Chen and Wei Lu and Gregory Wornell and Subhro Das and David Cox and Chuang Gan},
+year={2025},
+eprint={2502.02508},
+archivePrefix={arXiv},
+primaryClass={cs.CL},
+url={https://arxiv.org/abs/2502.02508},
 }
 ```