Hugging Face
Models: 591
Sort: Trending
Active filters: grpo
likewendy/Qwen2.5-3B-sex-GPRO-float16 • Text Generation • Updated 5 days ago • 23
wenyl/DeepSeek-R1-Distill-Qwen-1.5B-GRPO-0.1 • Text Generation • Updated 5 days ago • 8
Mingsmilet/Qwen-2.5-7B-Simple-GRPO • Text Generation • Updated 4 days ago • 3
wenyl/DeepSeek-R1-Distill-Qwen-1.5B-GRPO-0.0 • Text Generation • Updated 5 days ago • 6
andrewsiah/Qwen-2.5-1.5B-Instruct-Datamix • Text Generation • Updated 5 days ago
caijanfeng/Qwen2.5-7B-Open-R1-GRPO • Text Generation • Updated 1 day ago • 2
junqin/Qwen-2.5-7B-Simple-RL • Text Generation • Updated 5 days ago
SvalTek/ColdBrew-test-4bit • Text Generation • Updated 5 days ago • 14
xyj787878/Qwen2.5-0.5B-GRPO-kuakua • Reinforcement Learning • Updated 5 days ago • 5
mradermacher/Qwen2.5-3B-sex-GPRO-float16-GGUF • Updated 5 days ago • 575
wenyl/DeepSeek-R1-Distill-Qwen-1.5B-GRPO-0.4 • Text Generation • Updated 4 days ago • 5
Kabster/Llama3.2_MedRes_16bit_v1_5k_alpaca • Text Generation • Updated 5 days ago • 13
MaziyarPanahi/falcon3-3b-reasoning-v0.1 • Text Generation • Updated 4 days ago • 3
SpaceGhost/SpaceGhost-8B-GRPO • Updated 3 days ago
saemin21/Qwen-2.5-1.5B-Simple-RL • Text Generation • Updated 4 days ago • 1
jiaying0220/Qwen2.5-3B-GRPO-2_15_25 • Text Generation • Updated 3 days ago • 3
Emilio407/Dolphin3.0-Qwen2.5-0.5B-GRPO-V1 • Text Generation • Updated 1 day ago • 4
changjiakawhi/Qwen2.5-1.5B-Open-R1-Distill-GRPO • Text Generation • Updated about 14 hours ago
saemin21/Qwen-2.5-1.5B-Simple-RL-Non-Instruct • Text Generation • Updated 4 days ago • 3
WSX/Qwen2.5-1.5B-Open-R1-GRPO-FC • Text Generation • Updated about 13 hours ago • 4
RRoy233/Qwen2.5-1.5B-Open-R1-GRPO-inter • Text Generation • Updated 1 day ago • 15
mradermacher/Qwen2.5-0.5B-GRPO-kuakua-GGUF • Updated 4 days ago • 254
solarcloud/Qwen2-0.5B-GRPO-test • Updated 4 days ago
mradermacher/falcon3-3b-reasoning-v0.1-GGUF • Updated 4 days ago • 246
kekema19/Qwen-2.5-7B-Simple-RL • Text Generation • Updated about 22 hours ago • 3
lzy337/Qwen-2.5-7B-Simple-RL • Text Generation • Updated 3 days ago • 1
ddd123da/Qwen-2.5-7B-Simple-RL • Text Generation • Updated 4 days ago • 2
luoxiaojun1992/Qwen2.5-3B-Instruct-gsm8k-merged_16bit • Text Generation • Updated 4 days ago
zzzch/Qwen2.5-0.5B-Open-R1-GRPO • Text Generation • Updated 4 days ago • 5
ztt0821/Qwen2.5-1.5B-Open-R1-GRPO • Text Generation • Updated 2 days ago