chaoscodes committed on
Commit 64e0325 · verified · 1 Parent(s): 3055553

Update README.md

Files changed (1)
  1. README.md +9 -6
README.md CHANGED
@@ -138,15 +138,18 @@ We provide our training datasets:
 
  Please refer to our blog and research paper for more technical details of Satori.
  - [Blog](https://satori-reasoning.github.io/blog/satori/)
- - [Paper](https://satori-reasoning.github.io/blog/satori/)
+ - [Paper](https://arxiv.org/pdf/2502.02508)
 
  # **Citation**
  If you find our model and data helpful, please cite our paper:
  ```
- @article{TBD,
- title={Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search},
- author={Maohao Shen and Guangtao Zeng and Zhenting Qi and Zhang-Wei Hong and Zhenfang Chen and Wei Lu and Gregory Wornell and Subhro Das and David Cox and Chuang Gan},
- journal={arXiv preprint arXiv: TBD},
- year={2025}
+ @misc{shen2025satorireinforcementlearningchainofactionthought,
+ title={Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search},
+ author={Maohao Shen and Guangtao Zeng and Zhenting Qi and Zhang-Wei Hong and Zhenfang Chen and Wei Lu and Gregory Wornell and Subhro Das and David Cox and Chuang Gan},
+ year={2025},
+ eprint={2502.02508},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL},
+ url={https://arxiv.org/abs/2502.02508},
  }
  ```