Qwen2.5-3B-Instruct-GRPO-basic-sampling_temp_05 / pytorch_model.bin.index.json

Commit History

Trained with Unsloth
1fc0051
verified

kenhktsui committed on