Collections

Collections including paper arxiv:2309.14402

about 16 hours ago

SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights

Paper • 2410.09008 • Published Oct 11, 2024 • 17
answerdotai/ModernBERT-base

Fill-Mask • Updated Jan 15 • 5.42M • 784
answerdotai/ModernBERT-large

Fill-Mask • Updated Jan 15 • 269k • 360
microsoft/phi-4

Text Generation • Updated 14 days ago • 535k • 1.88k

SciLitLLM: How to Adapt LLMs for Scientific Literature Understanding

Paper • 2408.15545 • Published Aug 28, 2024 • 35
Controllable Text Generation for Large Language Models: A Survey

Paper • 2408.12599 • Published Aug 22, 2024 • 65
To Code, or Not To Code? Exploring Impact of Code in Pre-training

Paper • 2408.10914 • Published Aug 20, 2024 • 42
Automated Design of Agentic Systems

Paper • 2408.08435 • Published Aug 15, 2024 • 39

"Physics of Language Models" series

Physics of Language Models: Part 1, Context-Free Grammar

Paper • 2305.13673 • Published May 23, 2023 • 7
Physics of Language Models: Part 3.2, Knowledge Manipulation

Paper • 2309.14402 • Published Sep 25, 2023 • 7
Physics of Language Models: Part 3.3, Knowledge Capacity Scaling Laws

Paper • 2404.05405 • Published Apr 8, 2024 • 10
Physics of Language Models: Part 3.1, Knowledge Storage and Extraction

Paper • 2309.14316 • Published Sep 25, 2023 • 8

Papers - CoT - Chain of Thought

Contrastive Decoding Improves Reasoning in Large Language Models

Paper • 2309.09117 • Published Sep 17, 2023 • 39
Chain-of-Thought Reasoning Without Prompting

Paper • 2402.10200 • Published Feb 15, 2024 • 105
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?

Paper • 2403.14624 • Published Mar 21, 2024 • 52
Chain of Thought Empowers Transformers to Solve Inherently Serial Problems

Paper • 2402.12875 • Published Feb 20, 2024 • 13

Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 55
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks

Paper • 2005.11401 • Published May 22, 2020 • 10
LoRA: Low-Rank Adaptation of Large Language Models

Paper • 2106.09685 • Published Jun 17, 2021 • 35
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness

Paper • 2205.14135 • Published May 27, 2022 • 13
