jzhihui
's Collections
useful-paper
updated
PIA: Your Personalized Image Animator via Plug-and-Play Modules in
Text-to-Image Models
Paper
•
2312.13964
•
Published
•
18
LLM in a flash: Efficient Large Language Model Inference with Limited
Memory
Paper
•
2312.11514
•
Published
•
257
StreamDiffusion: A Pipeline-level Solution for Real-time Interactive
Generation
Paper
•
2312.12491
•
Published
•
69
LLaVA-φ: Efficient Multi-Modal Assistant with Small Language Model
Paper
•
2401.02330
•
Published
•
14
TinyLlama: An Open-Source Small Language Model
Paper
•
2401.02385
•
Published
•
90
Understanding LLMs: A Comprehensive Overview from Training to Inference
Paper
•
2401.02038
•
Published
•
62
SliceGPT: Compress Large Language Models by Deleting Rows and Columns
Paper
•
2401.15024
•
Published
•
69
OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on
E-Branchformer
Paper
•
2401.16658
•
Published
•
13
H2O-Danube-1.8B Technical Report
Paper
•
2401.16818
•
Published
•
17
MobileLLM: Optimizing Sub-billion Parameter Language Models for
On-Device Use Cases
Paper
•
2402.14905
•
Published
•
127
ChatMusician: Understanding and Generating Music Intrinsically with LLM
Paper
•
2402.16153
•
Published
•
56
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper
•
2402.17764
•
Published
•
606
Sora: A Review on Background, Technology, Limitations, and Opportunities
of Large Vision Models
Paper
•
2402.17177
•
Published
•
88
BurstAttention: An Efficient Distributed Attention Framework for
Extremely Long Sequences
Paper
•
2403.09347
•
Published
•
20
Transformer-Lite: High-efficiency Deployment of Large Language Models on
Mobile Phone GPUs
Paper
•
2403.20041
•
Published
•
34
TextCraftor: Your Text Encoder Can be Image Quality Controller
Paper
•
2403.18978
•
Published
•
13
MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with
Interleaved Visual-Textual Tokens
Paper
•
2404.03413
•
Published
•
25
Mixture-of-Depths: Dynamically allocating compute in transformer-based
language models
Paper
•
2404.02258
•
Published
•
104
Long-context LLMs Struggle with Long In-context Learning
Paper
•
2404.02060
•
Published
•
36
Social Skill Training with Large Language Models
Paper
•
2404.04204
•
Published
•
15
Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model
Paper
•
2404.04167
•
Published
•
12
MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation
Paper
•
2404.05674
•
Published
•
14
MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video
Understanding
Paper
•
2404.05726
•
Published
•
21
LLaVA-Gemma: Accelerating Multimodal Foundation Models with a Compact
Language Model
Paper
•
2404.01331
•
Published
•
25
EdgeFusion: On-Device Text-to-Image Generation
Paper
•
2404.11925
•
Published
•
21
How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study
Paper
•
2404.14047
•
Published
•
45
FlashSpeech: Efficient Zero-Shot Speech Synthesis
Paper
•
2404.14700
•
Published
•
29
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your
Phone
Paper
•
2404.14219
•
Published
•
254
OpenELM: An Efficient Language Model Family with Open-source Training
and Inference Framework
Paper
•
2404.14619
•
Published
•
126
Octopus v4: Graph of language models
Paper
•
2404.19296
•
Published
•
116
Mobile-Agent-v2: Mobile Device Operation Assistant with Effective
Navigation via Multi-Agent Collaboration
Paper
•
2406.01014
•
Published
•
31