S P Sharan

Syzygianinfern0

https://spsharan.com/

AI & ML interests

LLMs, multimodal research, robotics

Syzygianinfern0's activity

upvoted a paper 3 days ago

s1: Simple test-time scaling

Paper • 2501.19393 • Published 7 days ago • 88

reacted to di-zhang-fdu's post with 👍 3 months ago

Post

6401

LLaMA-O1: Open Large Reasoning Model Frameworks For Training, Inference and Evaluation With PyTorch and HuggingFace
Large Reasoning Models powered by Monte Carlo Tree Search (MCTS), Self-Play Reinforcement Learning, PPO, AlphaGo Zero's dual policy paradigm and Large Language Models!
https://github.com/SimpleBerry/LLaMA-O1/

What will happen when you compound MCTS ❤ LLM ❤ Self-Play ❤ RLHF?
Just a little bite of strawberry!🍓

Past related works:
LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning (2410.02884)
Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B (2406.07394)

upvoted a paper 6 months ago

Transformer Explainer: Interactive Learning of Text-Generative Models

Paper • 2408.04619 • Published Aug 8, 2024 • 157

upvoted a paper 10 months ago

Jamba: A Hybrid Transformer-Mamba Language Model

Paper • 2403.19887 • Published Mar 28, 2024 • 107

upvoted 2 papers over 1 year ago

ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs

Paper • 2307.16789 • Published Jul 31, 2023 • 99

AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning

Paper • 2307.04725 • Published Jul 10, 2023 • 64

authored a paper over 1 year ago

Outline, Then Details: Syntactically Guided Coarse-To-Fine Code Generation

Paper • 2305.00909 • Published Apr 28, 2023

upvoted 5 papers over 1 year ago

Kosmos-2: Grounding Multimodal Large Language Models to the World

Paper • 2306.14824 • Published Jun 26, 2023 • 34

Deep Language Networks: Joint Prompt Training of Stacked LLMs using Variational Inference

Paper • 2306.12509 • Published Jun 21, 2023 • 14

Textbooks Are All You Need

Paper • 2306.11644 • Published Jun 20, 2023 • 142

SayTap: Language to Quadrupedal Locomotion

Paper • 2306.07580 • Published Jun 13, 2023 • 7

Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation

Paper • 2306.07954 • Published Jun 13, 2023 • 112

New activity in Tribbiani/vicuna-7b almost 2 years ago

Is this an untouched vicuna weight?

#1 opened almost 2 years ago by Syzygianinfern0

liked a Space almost 2 years ago

Multimodal-CoT

New activity in codeparrot/github-code about 2 years ago

Request: DOI

#4 opened about 2 years ago by Syzygianinfern0