Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning
Paper • 2503.07572
transformers in dedicated releases! v4.49.0-SmolVLM-2 and v4.49.0-SigLIP-2.
log_completions=True
log_completions_hub_repo='your-username/repo-name'