niklasm222/Qwen2.5-3B-Instruct-1K_subset-GRPO-gsm8k-prolog-prover-v1 • Text Generation • Updated 1 day ago • 21
evoreign/GRPO-vllm-Meta-Llama-3.1-8B-Instruct-indonesian-legal-finetune • Text Generation • Updated about 22 hours ago • 2