Open-Source AI Meetup

community

AI & ML interests

Open science and open source

Recent Activity

SFEvent's activity

ehristoforu 
posted an update 22 days ago
view post
Post
2967
✒️ Ultraset - all-in-one dataset for SFT training in Alpaca format.
fluently-sets/ultraset

❓ Ultraset is a comprehensive dataset for training Large Language Models (LLMs) using the SFT (instruction-based Fine-Tuning) method. This dataset consists of over 785 thousand entries in eight languages, including English, Russian, French, Italian, Spanish, German, Chinese, and Korean.

🤯 Ultraset solves the problem faced by users when selecting an appropriate dataset for LLM training. It combines various types of data required to enhance the model's skills in areas such as text writing and editing, mathematics, coding, biology, medicine, finance, and multilingualism.

🤗 For effective use of the dataset, it is recommended to utilize only the "instruction," "input," and "output" columns and train the model for 1-3 epochs. The dataset does not include DPO or Instruct data, making it suitable for training various types of LLM models.

❇️ Ultraset is an excellent tool to improve your language model's skills in diverse knowledge areas.
julien-c 
posted an update about 1 month ago
view post
Post
8242
After some heated discussion 🔥, we clarify our intent re. storage limits on the Hub

TL;DR:
- public storage is free, and (unless blatant abuse) unlimited. We do ask that you consider upgrading to PRO and/or Enterprise Hub if possible
- private storage is paid above a significant free tier (1TB if you have a paid account, 100GB otherwise)

docs: https://huggingface.co/docs/hub/storage-limits

We optimize our infrastructure continuously to scale our storage for the coming years of growth in Machine learning, to the benefit of the community 🔥

cc: @reach-vb @pierric @victor and the HF team
·
julien-c 
posted an update about 1 month ago
view post
Post
2509
wow 😮

INTELLECT-1 is the first collaboratively trained 10 billion parameter language model trained from scratch on 1 trillion tokens of English text and code.

PrimeIntellect/INTELLECT-1-Instruct
ehristoforu 
posted an update 6 months ago
view post
Post
4120
😏 Hello from Project Fluently Team!

✨ Finally we can give you some details about Supple Diffusion. We worked on it for a long time and we have little left, we apologize that we had to increase the work time.

🛠️ Some technical information. The first version will be the Small version (there will also be Medium, Large, Huge, possibly Tiny), it will be based on the SD1 architecture, that is, one text encoder, U-net, VAE. Now about each component, the first is a text encoder, it will be a CLIP model (perhaps not CLIP-L-path14), CLIP was specially retrained by us in order to achieve the universality of the model in understanding completely different styles and to simplify the prompt as much as possible. Next, we did U-net, U-net in a rather complicated way, first we trained different parts (types) of data with different U-nets, then we carried out merging using different methods, then we trained DPO and SPO using methods, and then we looked at the remaining shortcomings and further trained model, details will come later. We left VAE the same as in SD1 architecture.

🙌 Compatibility. Another goal of the Supple model series is full compatibility with Auto1111 and ComfyUI already at the release stage, the model is fully supported by these interfaces and the diffusers library and does not require adaptation, your usual Sampling methods are also compatible, such as DPM++ 2M Karras, DPM++ SDE and others.

🧐 Today, without demo images (there wasn’t much time), final work is underway on the model and we are already preparing to develop the Medium version, the release of the Small version will most likely be in mid-August or earlier.

😻 Feel free to ask your questions in the comments below the post, we will be happy to answer them, have a nice day!
  • 1 reply
·
mrm8488 
posted an update 7 months ago
view post
Post
4862
🚨Exciting news for the Multilingual Synthetic Data Community!🚨

I’ve taken inspiration from the MAGPIE paper on Llama-3-8B-instruct and extended its capabilities. Here’s what’s new!

🗞 The MAGPIE paper showcased that if you use the instruction-tuned version (Llama-3-8B-instruct) to generate synthetic instructions and then fine-tune the base version (Llama-3-8B) on this dataset, you can improve even the it-tuned version

🤔 While reading a script by Sebastian Raschka, PhD, I wondered: Could these advancements be replicated in other languages? Specifically, could they benefit non-English datasets?

🎉 And the answer is YES! At least for Spanish. I've successfully adapted the techniques for Spanish, proving the model's flexibility and multilingual capabilities.

👩‍💻 To make this accessible, I created a basic script (heavily inspired by the Sebastian Raschka one) that allows you to generate similar datasets using ollama models (initially phi and llama3) automatically and upload it to the Hugging Face Hub!
[Script](https://gist.github.com/mrm8488/4650a5e3cc45523798a527a3446eb312)


🔍 Explore the datasets 📚 generated using our new script!

- [Llama-3-8B](https://huggingface.co/datasets/mrm8488/dataset_llama3_5000_samples_es_4231_filtered)
- [Phi-3-medium](https://huggingface.co/datasets/mrm8488/dataset_phi3-medium_5000_samples_es_3906_filtered)
- [Phi-3-mini](https://huggingface.co/datasets/mrm8488/dataset_phi3_5000_samples_es_3282_filtered)


Note: These datasets have basic filtering. Apply additional quality filters before using them to fine-tune large language models.

Inspiration and base script:
https://github.com/rasbt/LLMs-from-scratch/blob/main/ch07/05_dataset-generation/llama3-ollama.ipynb
https://www.linkedin.com/feed/update/urn:li:activity:7210982019751661568/
·
ehristoforu 
posted an update 7 months ago
view post
Post
6365
🤗 Hello from the Project Fluently team!

🥏 We are ready to announce a new series of Supple Diffusion models, these are new generation diffusion models (about 1-2 weeks left before release).

🦾 The new series aims to take diffusion models to the next level, with performance and versatility as the main goal.

🧐 How will our models be better than others? Firstly, we worked on the CLIP models, now they understand your requests better, it will become easier to process. Secondly, we trained the models with high quality, even better than all our previous ones. Thirdly, you won’t have to keep 20 models on your disk; only 4-6 will be enough.

🗺️ Roadmap:
1. Create Supple Diffusion Small
2. Creating Supple Diffusion Medium
3. Create Supple Diffusion Large

🎆 Our models are universal for realism, and for cartoons, and for anime, and for caricatures.

💖 The project really needs your support and your recommendations and reviews, please do not hesitate to write comments under this post, thank you!

🖼️ Below are demo images made with the pre-release version of Supple Diffusion Small.
·
ehristoforu 
posted an update 7 months ago
view post
Post
3862
🦾 Hello, I present Visionix Alpha - a new hyper-realistic model based on SDXL. The main difference from all existing realism models is the attention to detail, that is, I improved not only hyperrealism, but also the overall aesthetics, anatomy, the beauty of nature, and more, and the model also has the most different faces. This model is suitable not only for realistic photos, but also for generating 2.5d anime, realistic cartoons and more.

🤗 Model on HF: ehristoforu/Visionix-alpha
🥏 Model on CivitAI: https://civitai.com/models/505719
🪄 Playground (with base and inpaint model): ehristoforu/Visionix-Playground

✏️ Inpaint version on HF: ehristoforu/Visionix-alpha-inpainting
🖋️ Inpaint version on CivitAI: https://civitai.com/models/505719?modelVersionId=563519
  • 1 reply
·
ehristoforu 
posted an update 7 months ago
ehristoforu 
posted an update 7 months ago
ehristoforu 
posted an update 7 months ago
ehristoforu 
posted an update 7 months ago
ehristoforu 
posted an update 8 months ago
view post
Post
2130
😐 Hello, there are a couple of interesting things. The first is that I will soon release several pretty cool SDXL models, the second is a little sad, I conducted long-term tests of training and merging of XL models and realized that XL will not improve soon, the architecture will not allow us to continue pushing realism and other interesting things into it, the entire community has brought XL closer to the maximum ideal on its architecture.
julien-c 
posted an update 8 months ago
view post
Post
5184
Hey it was good meeting you yesterday @MaziyarPanahi 🔥

thanks @mishig for setting this up

Let's make the Hub as useful as possible for the community ❤️
  • 1 reply
·
ehristoforu 
posted an update 8 months ago
view post
Post
2877
🤗 SDXL Flash

✨️ Introducing the new fast model SDXL Flash (Mini), we learned that all fast XL models work fast, but the quality decreases, and we also made a fast model, but it is not as fast as LCM, Turbo, Lightning and Hyper, but the quality is higher. Below you will see the study with steps and cfg.

🚀 Features of mini model:
It weighs less, consumes less video memory and other resources, and the quality has not dropped much.

👑 Our faster than regular model is better in quality than the coolest modern models such as JuggernautXL X, FluentlyXL v4 and others.

SDXL Flash: sd-community/sdxl-flash
SDXL Flash Mini: sd-community/sdxl-flash-mini
·
mrm8488 
posted an update 8 months ago
view post
Post
5731
Working on a concept GPT-2 (small) that uses KANs instead of MLPs.
The ckpt and training code will be soon on the hub.
·
ehristoforu 
posted an update 9 months ago
pcuenq 
posted an update 9 months ago
view post
Post
4790
OpenELM in Core ML

Apple recently released a set of efficient LLMs in sizes varying between 270M and 3B parameters. Their quality, according to benchmarks, is similar to OLMo models of comparable size, but they required half the pre-training tokens because they use layer-wise scaling, where the number of attention heads increases in deeper layers.

I converted these models to Core ML, for use on Apple Silicon, using this script: https://gist.github.com/pcuenca/23cd08443460bc90854e2a6f0f575084. The converted models were uploaded to this community in the Hub for anyone that wants to integrate inside their apps: corenet-community/openelm-core-ml-6630c6b19268a5d878cfd194

The conversion was done with the following parameters:
- Precision: float32.
- Sequence length: fixed to 128.

With swift-transformers (https://github.com/huggingface/swift-transformers), I'm getting about 56 tok/s with the 270M on my M1 Max, and 6.5 with the largest 3B model. These speeds could be improved by converting to float16. However, there's some precision loss somewhere and generation doesn't work in float16 mode yet. I'm looking into this and will keep you posted! Or take a look at this issue if you'd like to help: https://github.com/huggingface/swift-transformers/issues/95

I'm also looking at optimizing inference using an experimental kv cache in swift-transformers. It's a bit tricky because the layers have varying number of attention heads, but I'm curious to see how much this feature can accelerate performance in this model family :)

Regarding the instruct fine-tuned models, I don't know the chat template that was used. The models use the Llama 2 tokenizer, but the Llama 2 chat template, or the default Alignment Handbook one that was used to train, are not recognized. Any ideas on this welcome!
·
julien-c 
posted an update 9 months ago
view post
Post
6883
text-generation-inference (TGI) is now fully open-source again!

Along with text-embeddings-inference.

We just switched both of those repos' license back to Apache 2. 🔥
julien-c 
posted an update 10 months ago
view post
Post
3172
Very glad to welcome @josefprusa , pioneer of 3D printing and open source hardware, founder of https://www.prusa3d.com/, to the HF Hub 👋

AI applied to 3D printing could be big.
  • 1 reply
·