CodeI/O Collection Collection for CodeI/O @ https://codei-o.github.io/ • 15 items • Updated 25 days ago • 6
VersaPRM Collection Collection of VersaPRMs using various training configurations • 8 items • Updated 30 days ago • 1
SEABO: A Simple Search-Based Method for Offline Imitation Learning Paper • 2402.03807 • Published Feb 6, 2024
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling Paper • 2502.06703 • Published 28 days ago • 142
PEARL: Zero-shot Cross-task Preference Alignment and Robust Reward Learning for Robotic Manipulation Paper • 2306.03615 • Published Jun 6, 2023
A Large Language Model-Driven Reward Design Framework via Dynamic Feedback for Reinforcement Learning Paper • 2410.14660 • Published Oct 18, 2024
RAT: Adversarial Attacks on Deep Reinforcement Agents for Targeted Behaviors Paper • 2412.10713 • Published Dec 14, 2024
yxsllgz-uts-org/Math_Consistency-Probability-Qwen2.5-Math-72B-Instruct-style2 Viewer • Updated Feb 4 • 6.82k • 55
yxsllgz-uts-org/Math_Consistency-Probability-Qwen2.5-Math-72B-Instruct-style1 Viewer • Updated Feb 4 • 6.82k • 54
yxsllgz-uts-org/Math_Consistency-Probability-Qwen2.5-7B-Instruct-style2 Viewer • Updated Jan 9 • 6.82k • 93
yxsllgz-uts-org/Math_Consistency-Probability-Qwen2.5-7B-Instruct-style1 Viewer • Updated Jan 9 • 6.82k • 102
yxsllgz-uts-org/Math_Consistency-Probability-Qwen2.5-Math-7B-Instruct-style2 Viewer • Updated Jan 9 • 6.82k • 87
yxsllgz-uts-org/Math_Consistency-Probability-Qwen2.5-Math-7B-Instruct-style1 Viewer • Updated Jan 9 • 6.82k • 108
yxsllgz-uts-org/Math_Consistency-Probability-Qwen2.5-Math-1.5B-Instruct-style2 Viewer • Updated Jan 9 • 6.82k • 123
yxsllgz-uts-org/Math_Consistency-Probability-Qwen2.5-Math-1.5B-Instruct-style1 Viewer • Updated Jan 9 • 6.82k • 80
yxsllgz-uts-org/Math_Consistency-Probability-Qwen2.5-1.5B-Instruct-style2 Viewer • Updated Jan 9 • 6.82k • 121
yxsllgz-uts-org/Math_Consistency-Probability-Qwen2.5-1.5B-Instruct-style1 Viewer • Updated Jan 9 • 6.82k • 104