jieliu
/

Storm-7B

Text Generation

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

jieliu commited on Jun 17, 2024

Commit

2a37ce4

·

verified ·

1 Parent(s): 2a2de07

Update README.md

Files changed (1) hide show

README.md +3 -3

README.md CHANGED Viewed

@@ -33,18 +33,18 @@ Recent studies show that DPO benefits from iterative training with online prefer
 ## Performance
 Our 7B model achieves a **50.5%** length-controlled win rate against GPT-4 Preview on AlpacaEval 2.0.
 <p align="center">
-    <img src="https://cdn-uploads.huggingface.co/production/uploads/639be86b59473c6ae02ef9c4/Tj_a1QntAxkhy2SXbOdmT.png" width="30%">
 </p>
 Our model's LC win rate improves over iterations without significantly changing the response length, indicating better alignment with human values without length bias. The final trained model (iteration 3) achieves a 50.5% LC win rate, making it the first open-source model to surpass the baseline model GPT-4 Preview.
 In addition to regular decoding, we also test beam search and best-of-n sampling on top of our trained model. Beam search over our trained model shows a 5% improvement over regular decoding, Best-of-n sampling with Starling-RM-34B achieves 61.6% LC Win rate and outperforms GPT-4 Omni.
 <p align="center">
-    <img src="https://cdn-uploads.huggingface.co/production/uploads/639be86b59473c6ae02ef9c4/GGa28vaREaVq099MPdqcP.png" width="50%">
 </p>
 We observe no significant degradation in traditional NLP tasks from the Huggingface Open LLM Leaderboard.
 <p align="center">
-    <img src="https://cdn-uploads.huggingface.co/production/uploads/639be86b59473c6ae02ef9c4/8KEm_Ladg7Kqko8mC63SN.png" width="50%">
 </p>

 ## Performance
 Our 7B model achieves a **50.5%** length-controlled win rate against GPT-4 Preview on AlpacaEval 2.0.
 <p align="center">
+    <img src="https://cdn-uploads.huggingface.co/production/uploads/639be86b59473c6ae02ef9c4/Tj_a1QntAxkhy2SXbOdmT.png" width="60%">
 </p>
 Our model's LC win rate improves over iterations without significantly changing the response length, indicating better alignment with human values without length bias. The final trained model (iteration 3) achieves a 50.5% LC win rate, making it the first open-source model to surpass the baseline model GPT-4 Preview.
 In addition to regular decoding, we also test beam search and best-of-n sampling on top of our trained model. Beam search over our trained model shows a 5% improvement over regular decoding, Best-of-n sampling with Starling-RM-34B achieves 61.6% LC Win rate and outperforms GPT-4 Omni.
 <p align="center">
+    <img src="https://cdn-uploads.huggingface.co/production/uploads/639be86b59473c6ae02ef9c4/GGa28vaREaVq099MPdqcP.png" width="100%">
 </p>
 We observe no significant degradation in traditional NLP tasks from the Huggingface Open LLM Leaderboard.
 <p align="center">
+    <img src="https://cdn-uploads.huggingface.co/production/uploads/639be86b59473c6ae02ef9c4/8KEm_Ladg7Kqko8mC63SN.png" width="100%">
 </p>