jiayi

mrzjy

AI & ML interests

Gamer. AI Engineer

Recent Activity

posted an update 7 days ago

A very small project: Introducing CreativeTinyZero: https://huggingface.co/mrzjy/Qwen2.5-1.5B-GRPO-Creative-Ad-Generation

Unlike the impressive DeepSeek-R1(-Zero), this project focuses on a pure reinforcement learning (RL) experiment applied to an open-domain task: creative advertisement generation.

Objective:
- To investigate the feasibility of applying R1-like methods to an open-domain task without a verifiable ground-truth reward, while at least demonstrating its potential.
- To explore whether <think> and <answer> rewards can be explicitly designed to provide strong guidance through RL based on human prior knowledge.

Note:
- Our goal is not to induce self-reflective thinking, but to align with human thought processes purely through RL, without any supervised fine-tuning (SFT) on any constructed dataset.

Despite its small size, the resulting 1.5B-GRPO model demonstrates intriguing generative capabilities, though it's still far from perfect.
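To make the idea of explicit <think> and <answer> rewards a little more concrete, here is a minimal sketch in Python. It is not the project's actual reward code, which the post does not describe. The function names `format_reward` and `group_relative_advantages`, the tag-checking rule, and the `eps` constant are all assumptions made for illustration. The first function gives a rule-based reward when a completion has the expected tag layout; the second shows the group-wise reward normalization commonly associated with GRPO, where each sample is scored relative to the other samples drawn for the same prompt.

```python
import re
from statistics import mean, pstdev

# Assumed layout (hypothetical): one <think> block followed by one <answer> block.
_FORMAT = re.compile(
    r"\s*<think>(.+?)</think>\s*<answer>(.+?)</answer>\s*", re.DOTALL
)
_TAGS = ("<think>", "</think>", "<answer>", "</answer>")


def format_reward(completion: str) -> float:
    """Return 1.0 if the completion follows the assumed tag layout, else 0.0."""
    if _FORMAT.fullmatch(completion) is None:
        return 0.0
    # Reject repeated or nested tags that the regex alone would let through.
    if any(completion.count(tag) != 1 for tag in _TAGS):
        return 0.0
    return 1.0


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of samples for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps (an assumed constant) avoids division by zero when all rewards match.
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    samples = [
        "<think>Target busy parents.</think><answer>Dinner in 10 minutes.</answer>",
        "Dinner in 10 minutes.",
    ]
    rewards = [format_reward(s) for s in samples]
    print(rewards, group_relative_advantages(rewards))
```

In an open-domain task like ad generation, a format check of this kind would only be one component; any reward for the quality of the reasoning or of the ad itself would have to come from additional hand-designed rules or a judge, which this sketch does not attempt.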

updated a model 7 days ago

mrzjy/Qwen2.5-1.5B-GRPO-Creative-Ad-Generation

published a dataset 7 days ago

mrzjy/creative-ad-prompts-zh

mrzjy's activity

New activity in mrzjy/splash-art-gacha-collection-10k about 2 months ago

Dataset Viewer issue: JobManagerCrashedError

#1 opened about 2 months ago by

New activity in mrzjy/honkai_impact_3rd_chinese_dialogue_corpus 2 months ago

for the dataset

#1 opened 2 months ago by

New activity in mrzjy/fanfiction_meta 5 months ago

[bot] Conversion to Parquet

#1 opened 5 months ago by

parquet-converter

New activity in mrzjy/Chinese_interactive_novels_3k 7 months ago

[bot] Conversion to Parquet

#1 opened 7 months ago by

parquet-converter