Probing ViTs

non-profit
Activity Feed

AI & ML interests

We are interested in studying the representations learned by Vision Transformers.

Recent Activity

probing-vits's activity

sayakpaul 
posted an update about 4 hours ago
We have authored a post to go over the state of video generation in the Diffusers ecosystem 🧨

We cover the supported models, the optimization knobs our users can turn, fine-tuning, and more 🔥

5-6 GB for HunyuanVideo, the sky is the limit 🌌 🤗
https://huggingface.co/blog/video_gen
ariG23498 
posted an update 8 days ago
ariG23498 
posted an update 11 days ago
sayakpaul 
posted an update about 1 month ago
sayakpaul 
posted an update about 1 month ago
In the past seven days, the Diffusers team has shipped:

1. Two new video models
2. One new image model
3. Two new quantization backends
4. Three new fine-tuning scripts
5. Multiple fixes and library QoL improvements

Coffee on me if someone can guess 1 - 4 correctly.
sayakpaul 
posted an update about 2 months ago
Introducing a high-quality open-preference dataset to further this line of research for image generation.

Although preference data is an inseparable component of modern image generation, open preference datasets are a rarity!

So, we decided to work on one with the community!

Check it out here:
https://huggingface.co/blog/image-preferences
sayakpaul 
posted an update about 2 months ago
The Control family of Flux from @black-forest-labs should be discussed more!

It enables structural controls like ControlNets while being significantly less expensive to run!

So, we're working on a Control LoRA training script 🤗

It's still WIP, so go easy:
https://github.com/huggingface/diffusers/pull/10130
ariG23498 
posted an update about 2 months ago
sayakpaul 
posted an update about 2 months ago
sayakpaul 
posted an update 2 months ago
It's been a while since we shipped native quantization support in diffusers 🧨

We currently support bitsandbytes as the official backend, but using others like torchao is already very simple.

This post is just a reminder of what's possible:

1. Loading a model with a quantization config
2. Saving a model with quantization config
3. Loading a pre-quantized model
4. Offloading with enable_model_cpu_offload()
5. Training and loading LoRAs into quantized checkpoints

Docs:
https://huggingface.co/docs/diffusers/main/en/quantization/bitsandbytes
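
Items 1, 2, and 4 of the list above can be sketched roughly as follows. This is a minimal sketch, not the post's own code: the helper names (`nf4_config_kwargs`, `load_quantized_transformer`, `build_offloaded_pipeline`) are hypothetical, the NF4 settings are illustrative values, and the choice of a Flux checkpoint is an assumption. It presumes `torch`, `diffusers`, and `bitsandbytes` are installed and a CUDA GPU is available.

```python
def nf4_config_kwargs():
    """Illustrative keyword arguments for a 4-bit NF4 bitsandbytes config."""
    return {
        "load_in_4bit": True,
        "bnb_4bit_quant_type": "nf4",
        # Stored as a string here; resolved to a torch dtype at load time.
        "bnb_4bit_compute_dtype": "bfloat16",
    }


def load_quantized_transformer(model_id, save_dir=None):
    """Load a Flux transformer with a quantization config; optionally save it."""
    # Imported lazily so nf4_config_kwargs() stays usable without the GPU stack.
    import torch
    from diffusers import BitsAndBytesConfig, FluxTransformer2DModel

    kwargs = nf4_config_kwargs()
    kwargs["bnb_4bit_compute_dtype"] = getattr(torch, kwargs["bnb_4bit_compute_dtype"])
    quant_config = BitsAndBytesConfig(**kwargs)

    # Item 1: load a model with a quantization config.
    transformer = FluxTransformer2DModel.from_pretrained(
        model_id,
        subfolder="transformer",
        quantization_config=quant_config,
        torch_dtype=torch.bfloat16,
    )
    # Item 2: save it; the quantization config is stored with the weights.
    if save_dir is not None:
        transformer.save_pretrained(save_dir)
    return transformer


def build_offloaded_pipeline(model_id, transformer):
    """Plug the quantized transformer into a pipeline and enable CPU offload."""
    import torch
    from diffusers import FluxPipeline

    pipe = FluxPipeline.from_pretrained(
        model_id, transformer=transformer, torch_dtype=torch.bfloat16
    )
    # Item 4: keep only the active component on the GPU.
    pipe.enable_model_cpu_offload()
    return pipe
```

A call might look like `build_offloaded_pipeline(model_id, load_quantized_transformer(model_id))`, with `model_id` set to a Flux repository such as `black-forest-labs/FLUX.1-dev` (an assumed example; the post does not name a checkpoint). The linked docs remain the reference for the supported options.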
ariG23498 
posted an update 3 months ago
sayakpaul 
posted an update 4 months ago
I did a little experimentation on resizing pre-trained LoRAs on Flux. I explored two themes:

* Decrease the rank of a LoRA
* Increase the rank of a LoRA

The first one is helpful in reducing memory requirements if the LoRA is of a high rank, while the second one is merely an experiment. Another implication of this study is in the unification of LoRA ranks when you would like to torch.compile() them.

Check it out here:
sayakpaul/flux-lora-resizing
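
To make the two themes concrete, here is a minimal NumPy sketch. The post does not say how the resizing was done, so both techniques are assumptions: truncated SVD for decreasing the rank and zero-padding for increasing it. The function names and the shape convention (A is `(rank, in_features)`, B is `(out_features, rank)`, update is `B @ A`) are hypothetical choices for illustration.

```python
import numpy as np


def reduce_lora_rank(lora_a, lora_b, new_rank):
    """Approximate the update B @ A with a rank-`new_rank` pair via truncated SVD."""
    delta = lora_b @ lora_a  # (out_features, in_features)
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    # Split the kept singular values evenly between the two factors.
    sqrt_s = np.sqrt(s[:new_rank])
    new_b = u[:, :new_rank] * sqrt_s           # (out_features, new_rank)
    new_a = sqrt_s[:, None] * vt[:new_rank]    # (new_rank, in_features)
    return new_a, new_b


def increase_lora_rank(lora_a, lora_b, new_rank):
    """Zero-pad A and B up to `new_rank`; the product B @ A is unchanged."""
    pad = new_rank - lora_a.shape[0]
    if pad < 0:
        raise ValueError("new_rank must be at least the current rank")
    new_a = np.concatenate([lora_a, np.zeros((pad, lora_a.shape[1]))], axis=0)
    new_b = np.concatenate([lora_b, np.zeros((lora_b.shape[0], pad))], axis=1)
    return new_a, new_b
```

Padding every LoRA to a common rank in this way gives all adapters identical tensor shapes, which is the property the torch.compile() remark relies on. Materializing the full `B @ A` matrix is fine for a sketch but can be costly for large layers.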
ariG23498 
posted an update 5 months ago
sayakpaul 
posted an update 6 months ago
sayakpaul 
posted an update 6 months ago
Flux.1-Dev-like images, but in fewer steps.

Merging code (very simple), inference code, merged params: sayakpaul/FLUX.1-merged

Enjoy the Monday 🤗
sayakpaul 
posted an update 6 months ago
With larger and larger diffusion transformers coming up, it's becoming increasingly important to have some good quantization tools for them.

We present our findings from a series of experiments on quantizing different diffusion pipelines based on diffusion transformers.

We demonstrate excellent memory savings at the cost of some inference latency, which is expected to improve in the coming days.

Diffusers 🤝 Quanto ❤️

This was a juicy collaboration between @dacorvo and myself.

Check out the post to learn all about it:
https://huggingface.co/blog/quanto-diffusers