niklasm222/Qwen2.5-3B-Instruct-1K_subset-GRPO-gsm8k-prolog-prover-v1 • Text Generation • Updated 2 days ago • 21
niklasm222/Qwen2.5-3B-Instruct-GRPO-3K-gsm8k-prolog-numerical • Text Generation • Updated 7 days ago • 1
niklasm222/Qwen2.5-3B-Instruct-GRPO-500-gsm8k-prolog-numerical • Text Generation • Updated 7 days ago • 6