Hugging Face Science

Recent Activity

science's activity

fdaudens
posted an update about 2 hours ago
Yes, DeepSeek R1's release is impressive. But the real story is what happened in just 7 days after:

- Original release: 8 models, 540K downloads. Just the beginning...

- The community turned those open-weight models into 550+ new models on Hugging Face. Total downloads? 2.5M, nearly 5x the originals.

The reason? DeepSeek models are open-weight, letting anyone build on top of them. Interesting to note that the community focused on quantized versions for better efficiency & accessibility. They want models that use less memory, run faster, and are more energy-efficient.

When you empower builders, innovation explodes. For everyone. 🚀

The most popular community model? @bartowski's DeepSeek-R1-Distill-Qwen-32B-GGUF version, with 1M downloads alone.
eliebak
updated a Space 1 day ago
lewtun
posted an update 2 days ago
We are reproducing the full DeepSeek R1 data and training pipeline so everybody can use their recipe. Instead of doing it in secret, we can do it together in the open!

🧪 Step 1: replicate the R1-Distill models by distilling a high-quality reasoning corpus from DeepSeek-R1.

🧠 Step 2: replicate the pure RL pipeline that DeepSeek used to create R1-Zero. This will involve curating new, large-scale datasets for math, reasoning, and code.

🔥 Step 3: show we can go from base model -> SFT -> RL via multi-stage training.

Follow along: https://github.com/huggingface/open-r1
m-ric
posted an update 3 days ago
Today we make the biggest release in smolagents so far: we enable vision models, which allows building powerful web browsing agents! 🥳

Our agents can now casually open up a web browser and navigate it by scrolling, clicking elements on the webpage, and going back, just like a user would.

The demo below shows Claude-3.5-Sonnet browsing GitHub for the task: "Find how many commits the author of the current top trending repo did over last year."
Hi @mlabonne !

Go try it out, it's the most cracked agentic stuff I've seen in a while 🤯 (well, along with OpenAI's Operator, which beat us by one day)

For more detail, read our announcement blog 👉 https://huggingface.co/blog/smolagents-can-see
The code for the web browser example is here 👉 https://github.com/huggingface/smolagents/blob/main/examples/vlm_web_browser.py
andito
posted an update 4 days ago
Introducing the world's smallest vision language model!

We're thrilled to share SmolVLM (256M & 500M), the smallest Visual Language Models ever built. Think: running on <1GB of GPU memory. You can fine-tune it on your laptop and run it on your toaster!

Why It's Game-Changing:
- Outperforms Larger Models: Even the 256M model surpasses our SOTA 80B-parameter model from just 17 months ago. Over 300x reduction!
- Mighty Efficiency: The 256M version delivers 80% of our 2.2B model's performance, and the 500M version hits 90%.
- Lightning-Fast Search: SmolVLM integrates with ColPali for state-of-the-art retrieval speeds, on par with models 10x bigger. That means cheaper, faster indexing and real-world impact.

What's New Under the Hood:
- New Vision Encoder: Smaller overall size (400M -> 93M), but with higher resolution.
- Higher Pixels/Token: 4096 vs. 1820, for more efficient image processing.
- Smart Tokenization: Faster training and a performance boost.

Check our blog: https://huggingface.co/blog/smolervlm
The models: HuggingFaceTB/smolvlm-256m-and-500m-6791fafc5bb0ab8acc960fb0
The demo: HuggingFaceTB/SmolVLM-256M-Demo
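
The size and pixels-per-token figures quoted in the SmolVLM post can be checked with a little arithmetic. The sketch below is only illustrative: the 512x512 image size is an assumption chosen for the example (it does not come from the post), and `image_tokens` is a hypothetical helper, not part of any SmolVLM API.

```python
import math

# Figures quoted in the post.
OLD_SOTA_PARAMS = 80e9        # 80B-parameter model from 17 months earlier
SMALL_PARAMS = 256e6          # SmolVLM-256M
NEW_PIXELS_PER_TOKEN = 4096
OLD_PIXELS_PER_TOKEN = 1820


def image_tokens(width: int, height: int, pixels_per_token: int) -> int:
    """Hypothetical helper: tokens needed to cover an image at a given pixels/token rate."""
    return math.ceil(width * height / pixels_per_token)


# How many times smaller the 256M model is than the 80B one.
size_reduction = OLD_SOTA_PARAMS / SMALL_PARAMS
print(f"Parameter reduction: {size_reduction:.1f}x")

# Assumed example image size; not a figure from the post.
w, h = 512, 512
new_tokens = image_tokens(w, h, NEW_PIXELS_PER_TOKEN)
old_tokens = image_tokens(w, h, OLD_PIXELS_PER_TOKEN)
print(f"{w}x{h} image: {new_tokens} tokens at 4096 px/token vs {old_tokens} at 1820 px/token")
```

For that assumed image, the higher pixels-per-token rate works out to 64 tokens instead of 145, which is one way to read the "more efficient image processing" claim.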
fdaudens
posted an update 6 days ago
fdaudens
posted an update 7 days ago
Reminder: Don't. Use. ChatGPT. As. A. Calculator. Seriously. 🤖

Loved listening to @sasha on Hard Fork. It really made me think.

A few takeaways that hit home:
- Individual culpability only gets you so far. The real priority: demanding accountability and transparency from companies.
- Evaluate if generative AI is the right tool for certain tasks (like search) before using it.

Curious about the full conversation? https://www.nytimes.com/2025/01/17/podcasts/hardfork-tiktok-rednote-environment.html. Give it a listen, it's worth it!
ariG23498
posted an update 8 days ago
ariG23498
posted an update 11 days ago
m-ric
posted an update 11 days ago
MiniMax's new MoE LLM reaches Claude-Sonnet level with 4M tokens context length 💥

This work from Chinese startup @MiniMax-AI introduces a novel architecture that achieves state-of-the-art performance while handling context windows up to 4 million tokens, roughly 20x longer than current models. The key was combining lightning attention, mixture of experts (MoE), and a careful hybrid approach.

Key insights:

🏗️ MoE with novel hybrid attention:
‣ Mixture of Experts with 456B total parameters (45.9B activated per token)
‣ Combines Lightning attention (linear complexity) for most layers and traditional softmax attention every 8 layers

🏆 Outperforms leading models across benchmarks while offering vastly longer context:
‣ Competitive with GPT-4/Claude-3.5-Sonnet on most tasks
‣ Can efficiently handle 4M token contexts (vs 256K for most other LLMs)

🔬 Technical innovations enable efficient scaling:
‣ Novel expert parallel and tensor parallel strategies cut communication overhead in half
‣ Improved linear attention sequence parallelism, multi-level padding and other optimizations achieve 75% GPU utilization (that's really high; utilization is generally around 50%)

🎯 Thorough training strategy:
‣ Careful data curation and quality control by using a smaller preliminary version of their LLM as a judge!

Overall, not only is the model impressive, but the technical paper is also really interesting!
It has lots of insights, including a great comparison showing how an MoE with 2B activated parameters (24B total) far outperforms a 7B model for the same amount of FLOPs.

Read it in full here 👉 MiniMax-01: Scaling Foundation Models with Lightning Attention (2501.08313)
Model here (commercial use allowed under 100M monthly users) 👉 MiniMaxAI/MiniMax-Text-01
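
To make the hybrid attention layout in the MiniMax post concrete, here is a minimal sketch of a layer schedule with one softmax-attention layer in every group of 8, plus the share of parameters activated per token. It rests on assumptions: the post only says "every 8 layers", so placing the softmax layer last in each group is a guess, `attention_schedule` is a hypothetical helper, and the depth of 16 is a toy value rather than the model's real layer count.

```python
# Figures quoted in the post.
TOTAL_PARAMS_B = 456.0    # total parameters, in billions
ACTIVE_PARAMS_B = 45.9    # parameters activated per token, in billions
SOFTMAX_EVERY = 8         # one softmax-attention layer per group of 8


def attention_schedule(num_layers: int, softmax_every: int = SOFTMAX_EVERY) -> list[str]:
    """Hypothetical helper: label each layer 'lightning' or 'softmax'.

    Assumption: the softmax layer is the last one in each group of `softmax_every`.
    """
    return [
        "softmax" if (i + 1) % softmax_every == 0 else "lightning"
        for i in range(num_layers)
    ]


# num_layers=16 is an assumed toy depth, not the model's real layer count.
schedule = attention_schedule(16)
print(schedule)
print(f"Softmax layers: {schedule.count('softmax')} of {len(schedule)}")

# Share of the MoE's parameters that take part in each token's forward pass.
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
print(f"Parameters activated per token: {active_fraction:.1%}")
```

With the quoted figures, roughly a tenth of the parameters are active per token, and seven out of every eight layers use the linear-complexity attention.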
m-ric
posted an update 12 days ago
We've just released smolagents v1.3.0 🚀, and it comes with a major feature: you can now log agent runs using OpenTelemetry to inspect them afterwards! 📊

IMO, this interactive format makes big multi-step runs much easier to inspect than endless console logs.

Setup is very easy: just a few lines of code.

Find a tutorial here 👉 https://huggingface.co/docs/smolagents/tutorials/inspect_runs
fdaudens
posted an update 12 days ago
AI agents are coming. But who's in control?

@meg , one of the best researchers in AI ethics, makes a critical point about autonomy: fully autonomous systems carry unknowable risks because they operate on computer logic rather than human logic.

The solution? Build systems that support & assist rather than override human decisions.

I highly recommend reading the blog post written by Meg, @evijit, @sasha, and @giadap. They define different levels of agent autonomy & provide a values-based analysis of risks, benefits, and uses of AI agents to help you make better decisions.

👉 https://huggingface.co/blog/ethics-soc-7

fdaudens
posted an update 14 days ago
🔥 The AI Agent hype is real! This blog post deep dives into everything you need to know before deploying them: from key definitions to practical recommendations. A must-read for anyone building the future of autonomous systems.

📊 Key insight: A clear table breaking down the 5 levels of AI agents, from simple processors to fully autonomous systems. Essential framework for understanding where your agent stands on the autonomy spectrum.

⚖️ Deep analysis of 15 core values reveals critical trade-offs: accuracy, privacy, safety, equity & more. The same features that make agents powerful can make them risky. Understanding these trade-offs is crucial for responsible deployment.

🎯 6 key recommendations for the road ahead:
- Create rigorous evaluation protocols
- Study societal effects
- Understand ripple effects
- Improve transparency
- Open source can make a positive difference
- Monitor base model evolution

Read the blog post: https://huggingface.co/blog/ethics-soc-7 Brilliant work by @meg @evijit @sasha @giadap
meg
posted an update 14 days ago
💫...And we're live!💫 Seasonal newsletter from ethicsy folks at Hugging Face, exploring the ethics of "AI Agents"
https://huggingface.co/blog/ethics-soc-7
Our analyses found:
- There's a spectrum of "agent"-ness
- *Safety* is a key issue, leading to many other value-based concerns
Read for details & what to do next!
With @evijit , @giadap , and @sasha
m-ric
posted an update 15 days ago
OS-Genesis: new research paper proposes a novel training data generation method for Claude-Computer-Use-like agents, with impressive results! 🔥

The main bottleneck in building GUI agents is finding training data.
GUI agent trajectories are not easy to come by. Crowdsourcing trajectories, then manually annotating them, could be an option, but it's hard to do at scale.

You could use synthetic data generation (ask 1000s of small existing GUI agents to solve tasks, keep only successful runs). But then it's hard to come up with many high-level tasks.

➡️ Well, a novel technique was just published that creates a promising new paradigm for synthetic data generation: Shanghai AI Lab researchers propose OS-Genesis, a novel way to create training data for GUI agents that flips the traditional approach on its head. Instead of starting with predefined tasks and having humans or machines execute them, OS-Genesis first explores the interface naturally, then derives meaningful tasks from those interactions.

🔍 Exploration-driven vs task-driven approach:
‣ Instead of starting with tasks, OS-Genesis first explores GUIs by clicking and interacting
‣ It then reverse-engineers high-level tasks from successful interaction patterns
‣ This leads to more natural and diverse training data than predefined tasks

🎯 Novel reward model for trajectory quality:
‣ Rather than discarding incomplete trajectories, OS-Genesis scores them based on coherence and completion
‣ This preserves valuable partial successes that would otherwise be wasted

🏆 Superior results across environments:
‣ Nearly doubles performance on AndroidWorld (9.8% → 17.4%)

By the way, this field of GUI agents is still in its infancy, so you can still make a difference with "low-cost" setups: their paper gets SOTA results with only 8xA100!

Read the paper here 👉 OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis (2412.19723)
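
The explore-first, label-later flow described in the OS-Genesis post can be sketched as a toy loop. This is not the paper's implementation: the UI element names, the `Trajectory` class, and all three functions are invented for illustration. A real system would drive an actual GUI and use models for both task synthesis and scoring, and the note about using the score as a sampling weight is an assumption.

```python
import random
from dataclasses import dataclass


@dataclass
class Trajectory:
    actions: list[str]
    task: str = ""
    score: float = 0.0


# Toy "GUI": every name here is invented for illustration.
UI_ELEMENTS = ["open_settings", "toggle_wifi", "open_browser", "type_url", "press_back"]


def explore(rng: random.Random, steps: int) -> Trajectory:
    """Step 1: interact with the interface without any predefined task."""
    return Trajectory(actions=[rng.choice(UI_ELEMENTS) for _ in range(steps)])


def synthesize_task(traj: Trajectory) -> str:
    """Step 2: derive a task description from what was actually done.

    Stand-in for a model call: joins the distinct actions in order.
    """
    return "Task: " + " then ".join(dict.fromkeys(traj.actions))


def reward(traj: Trajectory) -> float:
    """Step 3: stand-in scorer in (0, 1]; fewer repeated actions counts as more coherent."""
    return len(set(traj.actions)) / len(traj.actions)


def build_dataset(n: int, steps: int = 4, seed: int = 0) -> list[Trajectory]:
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        traj = explore(rng, steps)
        traj.task = synthesize_task(traj)
        traj.score = reward(traj)
        # Nothing is discarded; the score could later serve as a sampling weight (assumption).
        data.append(traj)
    return data


dataset = build_dataset(5)
for t in dataset:
    print(f"{t.score:.2f}  {t.task}")
```

The point of the design is in `build_dataset`: trajectories with a low score stay in the data with a smaller weight instead of being filtered out, which mirrors the "partial successes" idea in the post.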
BrigitteTousi
posted an update 18 days ago
Community fine-tuned models are more carbon efficient than the models they are derived from! 🥳🌿

@alozowski @clefourrier @SaylorTwift @albertvillanova evaluated CO₂ emissions associated with model inference for over 3000 models on the Open LLM Leaderboard. Interesting trends and new insights emerged...👀

Blog Post: https://huggingface.co/blog/leaderboard-emissions-analysis

Leaderboard: open-llm-leaderboard/open_llm_leaderboard
m-ric
posted an update 20 days ago
Since I published it on GitHub a few days ago,
Hugging Face's new agentic library smolagents has gathered nearly 4k stars 🤯

➡️ But we are just getting started on agents, so we are hiring an ML Engineer to join me and double down on this effort!

The plan is to build GUI agents: agents that can act on your computer with mouse & keyboard, like Claude Computer Use.

We will make it work better, and fully open. ✨

Sounds like something you'd like to do? Apply here 👉 https://apply.workable.com/huggingface/j/AF1D4E3FEB/