AI & ML interests

Merging models

merge-crew's activity

mlabonne
posted an update 11 days ago
🆕 LLM Course 2025 edition!

I updated the LLM Scientist roadmap and added a ton of new information and references. It covers training, datasets, evaluation, quantization, and new trends like test-time compute scaling.

The LLM Course has been incredibly popular (41.3k stars!) and I've been touched to receive many, many messages about how it helped people in their careers.

I know how difficult this stuff can be, so I'm super proud of the impact it had. I want to keep updating it in 2025, especially with the LLM Engineer roadmap.

Thanks everyone, hope you'll enjoy it!

💻 LLM Course: https://huggingface.co/blog/mlabonne/llm-course
mlabonne
posted an update 7 months ago
Large models are surprisingly bad storytellers.

I asked 8 LLMs to "Tell me a bedtime story about bears and waffles."

Claude 3.5 Sonnet and GPT-4o gave me the worst stories: no conflict, no moral, zero creativity.

In contrast, smaller models were quite creative and wrote stories involving talking waffle trees and bears ostracized for their love of waffles.

Here you can see a comparison between Claude 3.5 Sonnet and NeuralDaredevil-8B-abliterated. They both start with a family of bears but quickly diverge in terms of personality, conflict, etc.

I mapped it to the hero's journey to have some kind of framework. Prompt engineering can definitely help here, but it's still disappointing that the larger models don't create better stories right off the bat.

Do you know why smaller models outperform the frontier models here?
mlabonne
posted an update 8 months ago
โœ‚๏ธ Uncensor any LLM with abliteration

I wrote an article about abliteration and how NeuralDaredevil-8B was created. Beyond removing alignment, I believe it's an interesting technique with a lot of potential: it changes a model's behavior directly in weight space, like fine-tuning but without any retraining.

In this article, we see how it works, implement it in Google Colab, and heal the abliterated model to recover the performance it loses in the process. The final model is uncensored and high-quality, with the highest MMLU score among 8B models on the Open LLM Leaderboard.

https://huggingface.co/blog/mlabonne/abliteration