Hugging Face
Models: 535
Sort: Trending
Active filters: grpo
mesbahuddin1989/SmolLM2-135M-Instruct-GRPO • Text Generation • Updated 5 days ago • 4
Bradley/Qwen-2.5-7B-Simple-RL • Text Generation • Updated 4 days ago • 101
AdamLucek/Qwen2.5-3B-Instruct-GRPO-2K-GSM8K • Text Generation • Updated 4 days ago • 22
L0rsch/Llama3.1_GRPO_float16 • Text Generation • Updated 5 days ago • 1
araziziml/Qwen2-0.5B-GRPO • Text Generation • Updated 5 days ago • 3
mradermacher/Neeru_RL-GGUF • Updated 5 days ago • 236
Jlonge4/phi4-r1-merge • Text Generation • Updated 5 days ago • 14
caijanfeng/Qwen-2.5-7B-Simple-RL • Text Generation • Updated 3 days ago • 6
shuheikurita/Qwen2.5-1.5B-Open-R1-GRPO • Text Generation • Updated 4 days ago
Typiiing/Qwen-2.5-3B-Simple-RL • Text Generation • Updated 4 days ago • 4
xingzhou422/Qwen2.5-1.5B-Open-R1-GRPO • Text Generation • Updated 39 minutes ago • 1
classtag/20250215081122 • Updated 4 days ago
caijanfeng/Qwen2.5-1.5B-Open-R1-GRPO • Text Generation • Updated 4 days ago
taozihuahua/Qwen2.5-1.5B-Open-R1-GRPO • Text Generation • Updated 2 days ago • 1
mradermacher/Qwen-2.5-3B-Simple-RL-GGUF • Updated 2 days ago • 292
StevenTse7340/LoRA-1 • Text Generation • Updated 4 days ago
seaside2003/Qwen-2.5-7B-Simple-RL • Text Generation • Updated 3 days ago
saemin21/DeepSeek-R1-Zero-Qwen-7B-GRPO-Non-Instruct • Text Generation • Updated 2 days ago • 3
classtag/20250215171308 • Updated 4 days ago
GuiHaokun/Qwen-2.5-7B-Simple-RL • Text Generation • Updated 4 days ago • 1
shuheikurita/Qwen-2.5-7B-Simple-RL • Text Generation • Updated 2 days ago
Jlonge4/phi4-r1-guard-v1 • Text Generation • Updated 4 days ago • 3
CoexistAI/deep_ft6_grp_16bit • Text Generation • Updated about 15 hours ago • 40
lzy337/Qwen2.5-1.5B-Open-R1-GRPO • Text Generation • Updated 2 days ago • 1
JZLearnAI/Qwen25_3B_R0 • Updated 4 days ago • 40
reantverveatt/lama • Updated 3 days ago • 100
grounded-ai/phi4-r1-guard • Text Generation • Updated 3 days ago • 56
likewendy/Qwen2.5-3B-sex-GPRO-float16 • Text Generation • Updated 3 days ago • 22
wenyl/DeepSeek-R1-Distill-Qwen-1.5B-GRPO-0.1 • Text Generation • Updated 3 days ago • 8
Mingsmilet/Qwen-2.5-7B-Simple-GRPO • Text Generation • Updated 2 days ago • 2