Multimodal 💬
- We have released SmolVLM -- tiniest VLMs that come in 256M and 500M, with its retrieval models ColSmol for multimodal RAG 💗
- UI-TARS are new models by ByteDance to unlock agentic GUI control 🤯 in 2B, 7B and 72B
- Alibaba DAMO lab released VideoLlama3, new video LMs that come in 2B and 7B
- MiniMaxAI released MiniMax-VL-01, whose decoder is based on the MiniMax-Text-01 456B MoE model with long context
- Dataset: Yale released a new benchmark called MMVU
- Dataset: CAIS released Humanity's Last Exam (HLE), a new challenging MM benchmark
LLMs 📖
- DeepSeek-R1 & DeepSeek-R1-Zero: gigantic 660B reasoning models by DeepSeek, and six distilled dense models, on par with o1, with MIT license! 🤯
- Qwen2.5-Math-PRM: new math models by Qwen in 7B and 72B
- NVIDIA released AceMath and AceInstruct, a new family of models and their datasets (SFT and reward ones too!)
Audio 🗣️
- Llasa is a new speech synthesis model based on Llama that comes in 1B, 3B, and 8B
- TangoFlux is a new audio generation model trained from scratch and aligned with CRPO
Image/Video/3D Generation ⏯️
- Flex.1-alpha is a new 8B pre-trained diffusion model by ostris, similar to Flux
- Tencent released Hunyuan3D-2, a new model for 3D asset generation from images
Artificial Kuramoto Oscillatory Neurons (AKOrN) differ from traditional artificial neurons by oscillating, rather than just turning on or off. Each neuron is represented by a rotating vector on a sphere, influenced by its connections to other neurons. This behavior is based on the Kuramoto model, which describes how oscillators (like neurons) tend to synchronize, similar to pendulums swinging in unison.
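To make the synchronization idea concrete, here is a minimal sketch of the classic scalar-phase Kuramoto model, not of AKOrN itself. It assumes all-to-all coupling, Euler integration, and illustrative values for the coupling strength and frequency spread; the function names (`kuramoto_step`, `order_parameter`, `simulate`) are made up for this example.

```python
import numpy as np

def kuramoto_step(theta, omega, coupling, dt=0.05):
    """One Euler step of the classic Kuramoto model for phases theta (shape [N])."""
    # diff[i, j] = theta[j] - theta[i]: how far oscillator j is ahead of oscillator i
    diff = theta[None, :] - theta[:, None]
    # Each oscillator follows its natural frequency plus a pull toward the others
    dtheta = omega + (coupling / len(theta)) * np.sin(diff).sum(axis=1)
    return (theta + dt * dtheta) % (2 * np.pi)

def order_parameter(theta):
    """Length of the mean phase vector: near 0 = incoherent, 1 = fully synchronized."""
    return np.abs(np.exp(1j * theta).mean())

def simulate(n=50, coupling=2.0, steps=2000, seed=0):
    """Run the model from random phases; return the order parameter before and after."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0, 2 * np.pi, n)   # random initial phases
    omega = rng.normal(0.0, 0.1, n)        # assumed spread of natural frequencies
    r_start = order_parameter(theta)
    for _ in range(steps):
        theta = kuramoto_step(theta, omega, coupling)
    return r_start, order_parameter(theta)

if __name__ == "__main__":
    r_start, r_end = simulate()
    print(f"order parameter: {r_start:.2f} -> {r_end:.2f}")
```

With coupling this strong relative to the frequency spread, the order parameter should rise from a low value toward 1, which is the "pendulums swinging in unison" picture.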
Key points:
- Oscillating Neurons: Each AKOrN’s rotation is influenced by its connections, and they try to synchronize or oppose each other.
- Synchronization: When neurons synchronize, they "bind," allowing the network to represent complex concepts (e.g., "a blue square toy") by compressing information.
- Updating Mechanism: Neurons update their rotations based on connected neurons, input stimuli, and their natural frequency, using a Kuramoto update formula.
- Network Structure: AKOrNs can be used in various network layers, with iterative blocks combining Kuramoto layers and feature extraction modules.
- Reasoning: This model can perform reasoning tasks, like solving Sudoku puzzles, by adjusting neuron interactions.
- Advantages: AKOrNs offer robust feature binding, reasoning capabilities, resistance to adversarial data, and well-calibrated uncertainty estimation.

In summary, AKOrN's oscillatory neurons and synchronization mechanisms enable the network to learn, reason, and handle complex tasks like image classification and object discovery with enhanced robustness and flexibility.
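The updating mechanism from the key points (connected neurons, input stimuli, natural frequency) can be sketched roughly as follows. This is a hedged illustration, not the authors' implementation: it assumes each neuron is a unit vector, the natural frequency is an antisymmetric matrix that rotates it, and the combined pull from the stimulus and the neighbors is projected onto the sphere's tangent plane before an Euler step and renormalization. All names (`project_tangent`, `akorn_like_step`, `coherence`, `run_demo`) and all parameter values are hypothetical.

```python
import numpy as np

def project_tangent(x, y):
    """Row-wise, remove from y its component along x (x rows assumed unit-norm)."""
    return y - (x * y).sum(axis=1, keepdims=True) * x

def akorn_like_step(x, J, c, omega, dt=0.1):
    """One Euler step of a Kuramoto-style update for vector-valued neurons.

    x:     [N, D] unit vectors, one per neuron
    J:     [N, N] connection strengths between neurons
    c:     [N, D] input stimulus for each neuron
    omega: [N, D, D] antisymmetric matrices acting as natural frequencies
    """
    rotation = np.einsum("nij,nj->ni", omega, x)   # natural-frequency term
    drive = c + J @ x                              # stimulus plus pull from neighbors
    x_new = x + dt * (rotation + project_tangent(x, drive))
    # Renormalize so every neuron stays on the unit sphere
    return x_new / np.linalg.norm(x_new, axis=1, keepdims=True)

def coherence(x):
    """Length of the mean vector: close to 1 when all neurons point the same way."""
    return np.linalg.norm(x.mean(axis=0))

def run_demo(n=16, d=3, steps=200, seed=0):
    """Uniform positive coupling, no stimulus, no rotation: neurons should align."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n, d))
    x /= np.linalg.norm(x, axis=1, keepdims=True)
    J = np.ones((n, n)) / n
    c = np.zeros((n, d))
    omega = np.zeros((n, d, d))
    before = coherence(x)
    for _ in range(steps):
        x = akorn_like_step(x, J, c, omega)
    return before, coherence(x), x

if __name__ == "__main__":
    before, after, _ = run_demo()
    print(f"coherence: {before:.2f} -> {after:.2f}")
```

In this toy setting, positive entries in `J` pull neurons toward each other (synchronization), while negative entries would push them apart, which matches the "synchronize or oppose" behavior described above. In a trained network, `J`, `c`, and `omega` would be learned or computed from the input rather than fixed by hand.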