Kuldeep Singh Sidhu
singhsidhukuldeep
AI & ML interests
😃 TOP 3 on HuggingFace for posts 🤗 Seeking contributors for a completely open-source 🚀 Data Science platform! singhsidhukuldeep.github.io
Recent Activity
posted an update about 8 hours ago
While everyone is buzzing about DeepSeek AI's groundbreaking open-source R1 release, ByteDance has quietly launched something remarkable: Trae, an adaptive AI IDE that's redefining the development experience. And unlike competitors like Cursor, it's completely FREE!
Trae is a sophisticated development environment built on Microsoft's VSCode foundation (with a nice skin on top), offering unlimited free access to both OpenAI's GPT-4o and Anthropic's Claude-3.5-Sonnet models.
Technical Highlights:
- Real-time AI pair programming with comprehensive codebase understanding
- Natural language commands for code generation and project-level development
- Intelligent task decomposition for automated planning and execution
- Seamless VS Code and Cursor configuration compatibility
- Multi-language support with specialized optimization for English and Chinese interfaces
Currently available for macOS (Windows version in development), Trae is distributed through ByteDance's Singapore subsidiary, Spring (SG) Pte. What sets it apart is its ability to handle mixed-language workflows and enhanced localization features that address common pain points in existing IDEs.
The AI assistant can generate code snippets, optimize logic, and even create entire projects from scratch through natural language prompts. It also features an innovative AI Chat system accessible via keyboard shortcuts for real-time coding assistance.
For developers looking to enhance their productivity without breaking the bank, Trae offers enterprise-grade AI capabilities completely free during its initial release. This move by ByteDance signals a significant shift in the AI IDE landscape, challenging established players with a robust, accessible alternative.
Try it at trae.ai
posted an update 4 days ago
Exciting breakthrough in Text Embeddings: Introducing LENS (Lexicon-based EmbeddiNgS)!
A team of researchers from the University of Amsterdam, the University of Technology Sydney, and Tencent have developed a groundbreaking approach that outperforms dense embeddings on the Massive Text Embedding Benchmark (MTEB).
>> Key Technical Innovations:
- LENS consolidates vocabulary space through token embedding clustering, addressing the inherent redundancy in LLM tokenizers
- Implements bidirectional attention and innovative pooling strategies to unlock the full potential of LLMs
- Each dimension corresponds to token clusters instead of individual tokens, creating more coherent and compact embeddings
- Achieves competitive performance with just 4,000-8,000 dimensional embeddings, matching the size of dense counterparts
>> Under the Hood:
The framework applies KMeans clustering to token embeddings from the language modeling head, replacing original embeddings with cluster centroids. This reduces dimensionality while preserving semantic relationships.
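To make this concrete, here is a minimal sketch of that clustering step and of embedding a text against the resulting cluster table. This is not the authors' released code: the model choice ("gpt2"), the cluster count, and the mean-pooling choice are illustrative placeholders, and the sketch keeps the model's default causal attention for simplicity, whereas LENS swaps in bidirectional attention.

```python
# Minimal, illustrative sketch of a LENS-style pipeline (not the authors' code).
# Assumptions: a Hugging Face causal LM whose lm_head weights serve as the
# token-embedding table; "gpt2" and n_clusters=4000 are placeholders.
import torch
from sklearn.cluster import KMeans
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Token embeddings from the language modeling head: one row per vocab token.
lm_head_weights = model.lm_head.weight.detach().float().cpu().numpy()

# Cluster the vocabulary into a much smaller set of token groups, so each
# output dimension corresponds to a cluster rather than a single token.
n_clusters = 4000  # placeholder; the post cites 4,000-8,000 dimensions
kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
cluster_ids = torch.as_tensor(kmeans.fit_predict(lm_head_weights))
centroids = torch.as_tensor(kmeans.cluster_centers_).float()

# Each token's original embedding can now be replaced by its cluster
# centroid, consolidating redundant tokens while preserving semantics.
consolidated = centroids[cluster_ids]  # shape: (vocab_size, hidden_dim)

# Embed a text: pool the hidden states, then project onto the centroid
# table to get one weight per token cluster (a lexicon-based embedding).
inputs = tokenizer("lexicon-based embeddings", return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs, output_hidden_states=True).hidden_states[-1]
pooled = hidden.mean(dim=1).squeeze(0)  # mean pooling, one possible strategy
embedding = centroids @ pooled          # shape: (n_clusters,)
```

Because each of the n_clusters output dimensions is the affinity to a group of related tokens rather than to one opaque axis, the resulting vector is both compact and human-interpretable, which is the efficiency/interpretability trade-off the post highlights.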
>> Results:
- Outperforms dense embeddings on MTEB benchmark
- Achieves state-of-the-art performance when combined with dense embeddings on BEIR retrieval tasks
- Demonstrates superior performance across clustering, classification, and retrieval tasks
This work opens new possibilities for more efficient and interpretable text embeddings. The code will be available soon.
singhsidhukuldeep's activity
- Making LLMs lighter with AutoGPTQ and transformers
- LLM Comparison/Test: Llama 3 Instruct 70B + 8B HF/GGUF/EXL2 (20 versions tested and compared!)
- Train custom AI models with the trainer API and adapt them to 🤗