URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics Paper • 2501.04686 • Published 5 days ago • 45
Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability Paper • 2411.19943 • Published Nov 29, 2024 • 56
Large Language Models Can Self-Improve in Long-context Reasoning Paper • 2411.08147 • Published Nov 12, 2024 • 63
AutoConv: Automatically Generating Information-seeking Conversations with Large Language Models Paper • 2308.06507 • Published Aug 12, 2023 • 1
ToolBeHonest: A Multi-level Hallucination Diagnostic Benchmark for Tool-Augmented Large Language Models Paper • 2406.20015 • Published Jun 28, 2024 • 1
A Survey on the Honesty of Large Language Models Paper • 2409.18786 • Published Sep 27, 2024 • 32 • 3
ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation Paper • 2406.09961 • Published Jun 14, 2024 • 55
Unchosen Experts Can Contribute Too: Unleashing MoE Models' Power by Self-Contrast Paper • 2405.14507 • Published May 23, 2024
Specialist or Generalist? Instruction Tuning for Specific NLP Tasks Paper • 2310.15326 • Published Oct 23, 2023