Steffen Röcker PRO

sroecker

https://x.com/sroecker

AI & ML interests

Local models

Recent Activity

liked a model 2 days ago

NovaSky-AI/Sky-T1-32B-Preview

liked a model 2 days ago

EleutherAI/sae-llama-3.1-8b-64x

liked a model 2 days ago

Goodfire/Llama-3.1-8B-Instruct-SAE-l19

sroecker's activity

upvoted a paper 4 days ago

KaLM-Embedding: Superior Training Data Brings A Stronger Embedding Model

Paper • 2501.01028 • Published 11 days ago • 10

upvoted a collection 4 days ago

KaLM-embedding

5 items • Updated 10 days ago • 19

upvoted an article 4 days ago

Synthetic Data Generation with FastData and Hugging Face

Article • 5 days ago • 12

upvoted an article 6 days ago

Fine-tune a SmolLM on domain-specific synthetic data from a LLM

Article • 10 days ago • 29

upvoted a collection 8 days ago

Dolphin 3.0

Dolphin 3.0 is the next generation of the Dolphin series of instruct-tuned models, designed to be the ultimate general-purpose local model. • 7 items • Updated 8 days ago • 52

upvoted an article 8 days ago

Upgrading Kokoro: natural TTS for short bursts

Article • Nov 22, 2024 • 18

upvoted an article 10 days ago

🐺🐦‍⬛ LLM Comparison/Test: DeepSeek-V3, QVQ-72B-Preview, Falcon3 10B, Llama 3.3 70B, Nemotron 70B in my updated MMLU-Pro CS benchmark

Article • 10 days ago • 37

upvoted a collection 10 days ago

GLiNER

Knowledgator GLiNER models for information extraction • 8 items • Updated Dec 9, 2024 • 9

upvoted an article 12 days ago

Fine-tune ModernBERT for text classification using synthetic data

Article • 14 days ago • 22

upvoted a collection 13 days ago

4chan data (public)

1 item • Updated 14 days ago • 1

upvoted a paper 13 days ago

YuLan-Mini: An Open Data-efficient Language Model

Paper • 2412.17743 • Published 20 days ago • 61

upvoted a collection 13 days ago

YuLan-Mini

A highly capable 2.4B lightweight LLM using only 1T pre-training data with all details. • 5 items • Updated 15 days ago • 10

upvoted an article 18 days ago

🌁#81: Key AI Concepts to Follow in 2025

Article • 20 days ago • 24

upvoted a collection 24 days ago

Granite 3.1 Language Models

A series of language models with 128K context length, trained by IBM and licensed under Apache 2.0. • 8 items • Updated 26 days ago • 48

upvoted a paper 24 days ago

Qwen2.5 Technical Report

Paper • 2412.15115 • Published 24 days ago • 339

upvoted a collection 26 days ago

Smol but mighty

A collection of smol but mighty models • 10 items • Updated 25 days ago • 4

upvoted 4 collections about 1 month ago

LLaMat

Foundational Large Language Models for Materials Research • 6 items • Updated Dec 13, 2024 • 3

DeepSeek-VL2

4 items • Updated 26 days ago • 36

InternVL2.5

Better than InternVL 2.0 • 18 items • Updated 3 days ago • 79

Bad Data Toolbox

PleIAs collection of models for the data processing of challenging document and data sources. • 5 items • Updated Jul 18, 2024 • 15