AI & ML interests

None defined yet.

Recent Activity

EuroPython2022's activity

awacke1
posted an update 16 days ago
Deep Research Evaluator was asked:
" design a coral defense mechanism that upon sensing say an acid that's causing coral reefs to have a carbon dioxide issue it develops... please create a plan and a design for this\n
"
It picks these three as best combined solution.

1. [Reef-insight: A framework for reef habitat mapping with clustering methods via remote sensing]...
2. Phone a friend: [Learning to Communicate and Collaborate in a Competitive Multi-Agent Setup to Clean the Ocean from Macroplastics]...
3. World Solve: [Dependence of Physiochemical Features on Marine Chlorophyll Analysis with Learning Techniques]


To design a system that allows coral reefs to respond to increased acidity in their environment, we can create a network of pH sensors and dispersal units that detect changes in pH levels and release a base solution to neutralize the acid. (A sketch of the control loop follows the list below.)

1. pH Sensors: The first component of the system would be a network of pH sensors placed strategically throughout the coral reef. These sensors would be small, durable, and able to withstand the harsh conditions of the ocean. They would be placed at various depths and locations within the reef to ensure accurate and comprehensive monitoring of pH levels.
2. Base Dispersal Units: Once the pH sensors detect a decrease in pH levels, they would trigger the base dispersal units to release a base solution into the water. These units would be strategically placed around the reef and would be able to release a controlled amount of base solution to neutralize the acidity in the water.
3. Water Dispersal Mechanism: The base dispersal units would be connected to a water dispersal mechanism that would allow the base solution to be distributed evenly around the reef. This could be achieved through a series of pipes or channels that would distribute the base solution in a controlled and targeted manner.
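Below is a minimal, hypothetical sketch of that control loop: poll each pH sensor and, when a reading drops below a threshold, trigger the paired dispersal unit. The sensor reads are simulated, and every name, threshold, and dose here is an illustrative assumption rather than a real hardware interface.

```python
import random
import time

PH_THRESHOLD = 8.0  # assumed lower bound for healthy reef water
DOSE_ML = 250       # assumed controlled dose of base solution per trigger

def read_ph(sensor_id: str) -> float:
    """Stand-in for a real pH probe driver; returns a simulated reading."""
    return random.uniform(7.6, 8.4)

def release_base(unit_id: str, milliliters: int) -> None:
    """Stand-in for a real dispersal-unit actuator."""
    print(f"[{unit_id}] releasing {milliliters} mL of base solution")

def monitor(sensor_to_unit: dict[str, str], cycles: int = 3) -> None:
    """Poll each sensor; trigger its paired dispersal unit when pH drops."""
    for _ in range(cycles):
        for sensor_id, unit_id in sensor_to_unit.items():
            if read_ph(sensor_id) < PH_THRESHOLD:
                release_base(unit_id, DOSE_ML)
        time.sleep(1)  # a real deployment would poll far less often

monitor({"sensor-north": "unit-north", "sensor-south": "unit-south"})
```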
awacke1
posted an update 25 days ago
akhaliq
posted an update about 1 month ago
Google drops Gemini 2.0 Flash Thinking

a new experimental model that unlocks stronger reasoning capabilities and shows its thoughts. The model plans (with thoughts visible), can solve complex problems with Flash speeds, and more

now available in anychat, try it out: akhaliq/anychat
reach-vb
posted an update about 2 months ago
VLMs are going through quite an open revolution, AND in on-device-friendly sizes:

1. Google DeepMind w/ PaliGemma2 - 3B, 10B & 28B: google/paligemma-2-release-67500e1e1dbfdd4dee27ba48

2. OpenGVLabs w/ InternVL 2.5 - 1B, 2B, 4B, 8B, 26B, 38B & 78B: https://huggingface.co/collections/OpenGVLab/internvl-25-673e1019b66e2218f68d7c1c

3. Qwen w/ Qwen 2 VL - 2B, 7B & 72B: Qwen/qwen2-vl-66cee7455501d7126940800d

4. Microsoft w/ FlorenceVL - 3B & 8B: https://huggingface.co/jiuhai

5. Moondream2 w/ 0.5B: https://huggingface.co/vikhyatk/

What a time to be alive! πŸ”₯
akhaliq
posted an update 2 months ago
QwQ-32B-Preview is now available in anychat

A reasoning model that is competitive with OpenAI o1-mini and o1-preview

try it out: akhaliq/anychat
akhaliq
posted an update 2 months ago
New model drop in anychat

allenai/Llama-3.1-Tulu-3-8B is now available

try it here: akhaliq/anychat
reach-vb
posted an update 2 months ago
Massive week for open AI/ML:

Mistral Pixtral & Instruct Large - ~123B, 128K context, multilingual, JSON + function calling & open weights
mistralai/Pixtral-Large-Instruct-2411
mistralai/Mistral-Large-Instruct-2411

Allen AI Tülu 70B & 8B - competitive with Claude 3.5 Haiku, beats all major open models like Llama 3.1 70B, Qwen 2.5, and Nemotron
allenai/tulu-3-models-673b8e0dc3512e30e7dc54f5
allenai/tulu-3-datasets-673b8df14442393f7213f372

LLaVA-o1 - a VLM capable of spontaneous, systematic reasoning, similar to GPT-o1; the 11B model outperforms Gemini 1.5 Pro, GPT-4o mini, and Llama 3.2 90B Vision
Xkev/Llama-3.2V-11B-cot

Black Forest Labs FLUX.1 Tools - four new state-of-the-art model checkpoints & 2 adapters for fill, depth, canny & redux, open weights
reach-vb/black-forest-labs-flux1-6743847bde9997dd26609817

Jina AI Jina CLIP v2 - general-purpose multilingual and multimodal (text & image) embedding model, 900M params, 512 x 512 resolution, matryoshka representations (1024 down to 64)
jinaai/jina-clip-v2

Apple AIM v2 & CoreML MobileCLIP - large-scale vision encoders that outperform CLIP and SigLIP, plus CoreML-optimised MobileCLIP models
apple/aimv2-6720fe1558d94c7805f7688c
apple/coreml-mobileclip

A lot more got released, like OpenScholar (OpenScholar/openscholar-v1-67376a89f6a80f448da411a6), SmolTalk (HuggingFaceTB/smoltalk), Hymba (nvidia/hymba-673c35516c12c4b98b5e845f), the Open ASR Leaderboard (hf-audio/open_asr_leaderboard), and much more.

Can't wait for the next week! πŸ€—
akhaliq
posted an update 2 months ago
anychat

supports ChatGPT, Gemini, Perplexity, Claude, Meta Llama, and Grok, all in one app

try it out here: akhaliq/anychat
reach-vb
posted an update 2 months ago
What a brilliant week for Open Source AI!

Qwen 2.5 Coder by Alibaba - 0.5B / 1.5B / 3B / 7B / 14B / 32B (Base + Instruct) code-generation LLMs, with 32B tackling giants like Gemini 1.5 Pro and Claude Sonnet
Qwen/qwen25-coder-66eaa22e6f99801bf65b0c2f

LLM2CLIP from Microsoft - Leverage LLMs to train ultra-powerful CLIP models! Boosts performance over the previous SOTA by ~17%
microsoft/llm2clip-672323a266173cfa40b32d4c

Athene v2 Chat & Agent by NexusFlow - SoTA general LLM fine-tuned from Qwen 2.5 72B; excels at chat + function calling / JSON / agents
Nexusflow/athene-v2-6735b85e505981a794fb02cc

Orca Agent Instruct by Microsoft - 1 million instruct pairs covering text editing, creative writing, coding, reading comprehension, etc - permissively licensed
microsoft/orca-agentinstruct-1M-v1

Ultravox by FixieAI - 70B/8B models approaching GPT-4o level; pick any LLM, train an adapter with Whisper as the audio encoder
reach-vb/ultravox-audio-language-model-release-67373b602af0a52b2a88ae71

JanusFlow 1.3B by DeepSeek - the next iteration of their unified multimodal LLM Janus, now with rectified flow
deepseek-ai/JanusFlow-1.3B

Common Corpus by PleIAs - 2,003,039,184,047 multilingual, commercially permissive, high-quality tokens!
PleIAs/common_corpus

I'm sure I missed a lot, can't wait for the next week!

Put down in comments what I missed! πŸ€—
awacke1
posted an update 2 months ago
πŸ•ŠοΈHopeπŸ•ŠοΈ and βš–οΈJusticeβš–οΈ AI
🚲 Stolen bike in Denver FOUND - Sometimes hope & justice DO prevail.

🎬 So I Created an AI+Art+Music tribute:
-🧠 AI App that Evaluates GPT-4o vs Claude:
awacke1/RescuerOfStolenBikes
https://x.com/Aaron_Wacker/status/1857640877986033980

#GPT #Claude #Huggingface
@OpenAI
@AnthropicAI
reach-vb
posted an update 3 months ago
Smol TTS models are here! OuteTTS-0.1-350M - Zero shot voice cloning, built on LLaMa architecture, CC-BY license! πŸ”₯

> Pure language modeling approach to TTS
> Zero-shot voice cloning
> LLaMa architecture w/ Audio tokens (WavTokenizer)
> BONUS: Works on-device w/ llama.cpp ⚑

Three-step approach to TTS (a rough sketch in code follows the list):

> Audio tokenization using WavTokenizer (75 tok per second)
> CTC forced alignment for word-to-audio token mapping
> Structured prompt creation w/ transcription, duration, audio tokens
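As a rough illustration of step 3, here is how a structured prompt might be assembled from the transcription, the CTC-derived word durations, and the audio tokens. The token and prompt formats below are assumptions for illustration, not OuteTTS's actual spec.

```python
def build_tts_prompt(words: list[str],
                     alignments: list[tuple[float, float]],
                     audio_tokens: list[list[int]]) -> str:
    """Assemble a structured prompt from the transcription, per-word
    durations (from CTC forced alignment), and WavTokenizer-style audio
    tokens (~75 tokens per second of audio)."""
    parts = []
    for word, (start, end), tokens in zip(words, alignments, audio_tokens):
        duration = end - start  # step 2: word-to-audio alignment
        token_str = "".join(f"<|a{t}|>" for t in tokens)  # assumed format
        parts.append(f"{word}<|dur:{duration:.2f}|>{token_str}")
    return "<|text_start|>" + " ".join(words) + "<|text_end|>" + "".join(parts)

# Toy example: two words with fake alignments and audio token ids.
print(build_tts_prompt(
    words=["hello", "world"],
    alignments=[(0.00, 0.42), (0.42, 0.90)],
    audio_tokens=[[101, 102, 103], [201, 202]],
))
```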

The model is extremely impressive for 350M parameters! Kudos to the OuteAI team on such a brilliant feat - I'd love to see this applied to larger data and smarter backbones like SmolLM 🤗

Check out the models here: OuteAI/outetts-6728aa71a53a076e4ba4817c
reach-vb
posted an update 3 months ago
Smol models ftw! AMD released AMD OLMo 1B - beats OpenELM and TinyLlama on MT-Bench and AlpacaEval - Apache 2.0 licensed 🔥

> Trained on 1.3 trillion tokens (Dolma 1.7) across 16 nodes, each with 4 MI250 GPUs

> Three checkpoints:

- AMD OLMo 1B: Pre-trained model
- AMD OLMo 1B SFT: Supervised fine-tuned on Tulu V2, OpenHermes-2.5, WebInstructSub, and Code-Feedback datasets
- AMD OLMo 1B SFT DPO: Aligned with human preferences using Direct Preference Optimization (DPO) on UltraFeedback dataset

Key Insights:
> Pre-trained with less than half the tokens of OLMo-1B
> Post-training steps include two-phase SFT and DPO alignment
> Data for SFT:
- Phase 1: Tulu V2
- Phase 2: OpenHermes-2.5, WebInstructSub, and Code-Feedback

> Model checkpoints on the Hub & Integrated with Transformers ⚑️
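As a minimal sketch of that Transformers integration (the exact repo id is assumed from the collection linked at the end of this post, so verify it on the Hub):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "amd/AMD-OLMo-1B-SFT-DPO"  # assumed repo id; check the collection
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Generate a short completion with the DPO-aligned checkpoint.
inputs = tokenizer("What is a smol language model?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```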

Congratulations & kudos to AMD on a brilliant smol model release! πŸ€—

amd/amd-olmo-6723e7d04a49116d8ec95070
awacke1
posted an update 3 months ago
Since 2022 I have been trying to understand how to support advancement of the two best python patterns for AI development which are:
1. Streamlit
2. Gradio

I rank them in this order because Streamlit had the timing advantage over Gradio: its library was stable and near-complete a year or two before GPT's training-data cutoff.

Nowadays, if you want generated code to be right on the first pass, the model needs a consistent picture of a library's method names so that no manual intervention is required on each try.

With GPT and Claude as my top two AI pair-programming models, I gravitate toward Streamlit: aside from common repeat errors around the cache and experimental functions that were not yet solidified circa 2022, its API consistency means generated code rarely needs human correction, and errors from stale training data are minimal.

Now I want the same consistency on the Gradio side. Why? Gradio lapped Streamlit with its Blocks paradigm and a free API for every app, features I feel change software engineering forever. (A minimal Blocks sketch is below.)
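For anyone who hasn't tried them, a minimal sketch of those two features follows; the greet function and the api_name are just an illustration:

```python
import gradio as gr

def greet(name: str) -> str:
    return f"Hello, {name}!"

with gr.Blocks() as demo:
    name = gr.Textbox(label="Name")
    out = gr.Textbox(label="Greeting")
    btn = gr.Button("Greet")
    # api_name exposes this event as a named endpoint, callable via gradio_client
    btn.click(greet, inputs=name, outputs=out, api_name="greet")

demo.launch()  # the running app documents its API under the "Use via API" link
```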

For a few months I thought BigCode would become the new best model thanks to its training corpus datasets, yet I never felt it reached the market as the next best AI coder.

I am curious about Gradio's future. If the two main models (GPT and Claude) pick up the last few years of changes, I could then code with AI without manual intervention. As it stands today, Gradio would be the better choice if you could get the best coding models to stop mistaking old syntax for current syntax, yet we do live in an imperfect world!

Is anyone using an AI pair-programming model that rocks with Gradio's latest syntax? I would like to code with a model that doesn't miss the advancements and syntax changes Gradio has made in the past few years. I'm trying Grok-2 as well.

My IDE coding love is HF. It's hands-down faster (100x) than other cloud paradigms. Any tips on the best models for Gradio coding?

--Aaron
reach-vb
posted an update 3 months ago
What a great day for Open Science! @AIatMeta released models, datasets, and code for many of its research artefacts! πŸ”₯

1. Meta Segment Anything Model 2.1: An updated checkpoint with improved results on visually similar objects, small objects and occlusion handling. A new developer suite will be added to make it easier for developers to build with SAM 2.

Model checkpoints: reach-vb/sam-21-6702d40defe7611a8bafa881

2. Layer Skip: Inference code and fine-tuned checkpoints demonstrating a new method for enhancing LLM performance.

Model checkpoints: facebook/layerskip-666b25c50c8ae90e1965727a

3. SALSA: New code enables researchers to benchmark AI-based attacks to validate security for post-quantum cryptography.

Repo: https://github.com/facebookresearch/LWE-benchmarking

4. Meta Lingua: A lightweight and self-contained codebase designed to train language models at scale.

Repo: https://github.com/facebookresearch/lingua

5. Meta Open Materials: New open source models and the largest dataset to accelerate AI-driven discovery of new inorganic materials.

Model checkpoints: fairchem/OMAT24

6. MEXMA: A new research paper and code for our novel pre-trained cross-lingual sentence encoder covering 80 languages.

Model checkpoint: facebook/MEXMA

7. Self-Taught Evaluator: a new method for generating synthetic preference data to train reward models without relying on human annotations.

Model checkpoint: facebook/Self-taught-evaluator-llama3.1-70B

8. Meta Spirit LM: An open-source language model for seamless speech and text integration.

Repo: https://github.com/facebookresearch/spiritlm
awacke1
posted an update 3 months ago
Today I got through a very difficult coding session with GPT-4o, which ended up solving integrations on a very large scale. So I decided to look a bit more into how its reasoners work. Below is a fun markdown emoji outline of what I learned today and what I'm pursuing.

Hope you enjoy! Cheers, Aaron.

Also, here are my favorite 4 spaces that I am working on:
1. GPT-4o: awacke1/GPT-4o-omni-text-audio-image-video
2. Claude: awacke1/AnthropicClaude3.5Sonnet-ACW
3. MS Graph M365: awacke1/MSGraphAPI
4. Azure Cosmos DB (now with Research AI!): awacke1/AzureCosmosDBUI

# πŸš€ OpenAI's O1 Models: A Quantum Leap in AI

## 1. πŸ€” From 🦜 to 🧠: O1's Evolution

- **Thinking AI**: O1 ponders before replying; GPT models just predict. πŸ’‘

## 2. πŸ“š AI Memory: πŸ’Ύ + 🧩 = 🧠

- **Embeddings & Tokens**: Words ➑️ vectors, building knowledge. πŸ“–

## 3. πŸ” Swift Knowledge Retrieval

- **Vector Search & Indexing**: O1 finds info fast, citing reliable sources. πŸ”ŽπŸ“–
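A toy illustration of sections 2 and 3: texts become vectors, and retrieval ranks them by cosine similarity. The 3-d vectors here are made up for illustration.

```python
import numpy as np

docs = ["coral reefs", "language models", "vector databases"]
doc_vecs = np.array([[0.9, 0.1, 0.0],
                     [0.1, 0.8, 0.3],
                     [0.0, 0.4, 0.9]])  # pretend embeddings

def search(query_vec: np.ndarray, k: int = 1) -> list[str]:
    """Rank documents by cosine similarity to the query vector."""
    sims = doc_vecs @ query_vec / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec))
    return [docs[i] for i in np.argsort(-sims)[:k]]

print(search(np.array([0.2, 0.7, 0.4])))  # -> ['language models']
```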

## 4. 🌳 Logic Trees with Mermaid Models

- **Flowchart Reasoning**: O1 structures thoughts like diagrams. 🎨🌐

## 5. πŸ’» Coding Mastery

- **Multilingual & Current**: Speaks many code languages, always up-to-date. πŸ’»πŸ”„

## 6. πŸ† Breaking Records

- **92.3% MMLU Score**: O1 outperforms humans, setting new AI standards. πŸ…

## 7. πŸ’‘ Versatile Applications

- **Ultimate Assistant**: From fixing code to advancing research. πŸ› οΈπŸ”¬

## 8. 🏁 Racing Toward AGI

- **OpenAI Leads**: O1 brings us closer to true AI intelligence. πŸš€

## 9. πŸ€– O1's Reasoning Pillars

- **🧠 Chain of Thought**: Step-by-step logic.
- **🎲 MCTS**: Simulates options, picks best path.
- **πŸ” Reflection**: Self-improves autonomously.
- **πŸ‹οΈβ€β™‚οΈ Reinforcement Learning**: Gets smarter over time.

---

*Stay curious, keep coding!* πŸš€
awacke1
posted an update 4 months ago
I have finally completed a full working Azure and Microsoft Graph API implementation that can use all the interesting MS AI features in M365 products to manage CRUD patterns across the Graph features of those products.

This app shows an initial implementation of security, authentication, and scopes, with access to Outlook, Calendar, Tasks, OneDrive, and other apps in a CRUD pattern, exposed as AI agent service skills to integrate with your AI workflow. (A minimal auth-and-read sketch follows.)
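Here is a minimal sketch, not the app's actual code, of that auth + scopes + access pattern using MSAL's device-code flow and a single Graph read; the client id and tenant are placeholders you must supply from your Azure app registration:

```python
import msal
import requests

CLIENT_ID = "<your-app-registration-client-id>"   # placeholder
AUTHORITY = "https://login.microsoftonline.com/<your-tenant-id>"
SCOPES = ["Mail.Read", "Calendars.Read", "Tasks.Read"]

app = msal.PublicClientApplication(CLIENT_ID, authority=AUTHORITY)
flow = app.initiate_device_flow(scopes=SCOPES)
print(flow["message"])  # tells the user where to enter the device code
token = app.acquire_token_by_device_flow(flow)

# Read (the R in CRUD) the five most recent Outlook messages.
resp = requests.get(
    "https://graph.microsoft.com/v1.0/me/messages?$top=5",
    headers={"Authorization": f"Bearer {token['access_token']}"},
)
for msg in resp.json().get("value", []):
    print(msg["subject"])
```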


Below are initial screens showing integration:

URL: awacke1/MSGraphAPI
Discussion: awacke1/MSGraphAPI#5

Best of AI on @Azure and @Microsoft on @HuggingFace:
https://huggingface.co/microsoft
https://www.microsoft.com/en-us/research/
---
Aaron
reach-vb
posted an update 4 months ago
Multimodal Ichigo Llama 3.1 - Real Time Voice AI πŸ”₯

> WhisperSpeech X Llama 3.1 8B
> Trained on 50K hours of speech (7 languages)
> Continually trained for 45 hrs on 10x A1000s
> MLS -> WhisperVQ tokens -> Llama 3.1
> Instruction tuned on 1.89M samples
> 70% speech, 20% transcription, 10% text
> Apache 2.0 licensed ⚑

Architecture:
> WhisperSpeech/ VQ for Semantic Tokens
> Llama 3.1 8B Instruct for Text backbone
> Early fusion (Chameleon)
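A conceptual sketch of the early-fusion idea: quantized audio codes are mapped into an extended vocabulary and share one token sequence with text before the backbone sees them. All ids and sizes below are made up for illustration.

```python
TEXT_VOCAB_SIZE = 128_000  # pretend Llama vocabulary size
AUDIO_CODEBOOK = 512       # pretend WhisperVQ codebook size

def audio_token_id(code: int) -> int:
    """Map a WhisperVQ code into the extended (text + audio) vocabulary."""
    assert 0 <= code < AUDIO_CODEBOOK
    return TEXT_VOCAB_SIZE + code

def fuse(text_ids: list[int], audio_codes: list[int]) -> list[int]:
    """Early fusion: one shared id sequence of speech then text."""
    return [audio_token_id(c) for c in audio_codes] + text_ids

print(fuse(text_ids=[15, 42, 7], audio_codes=[3, 200, 511]))
```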

I'm super bullish on HomeBrew/Jan and on early-fusion, audio-and-text, multimodal models!

(P.S. Play with the demo on Hugging Face: jan-hq/Ichigo-llama3.1-s-instruct)