SmallThinker: A Family of Efficient Large Language Models Natively Trained for Local Deployment
August 25, 2025 · Last updated on August 26, 2025 · 2 min · KKKZOZ
STI: Turbocharge NLP Inference at the Edge via Elastic Pipelining
August 25, 2025 · Last updated on August 26, 2025 · 2 min · KKKZOZ
EdgeMoE: Empowering Sparse Large Language Models on Mobile Devices
August 24, 2025 · Last updated on August 26, 2025 · 2 min · KKKZOZ
HeteroLLM: Accelerating Large Language Model Inference on Mobile SoCs with Heterogeneous AI Accelerators
August 24, 2025 · Last updated on August 26, 2025 · 4 min · KKKZOZ
A Survey of Resource-efficient LLM and Multimodal Foundation Models
August 21, 2025 · Last updated on August 26, 2025 · 3 min · KKKZOZ
H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models
August 21, 2025 · Last updated on August 26, 2025 · 1 min · KKKZOZ
LLM as a System Service on Mobile Devices
August 18, 2025 · Last updated on August 26, 2025 · 4 min · KKKZOZ
KV-Runahead: Scalable Causal LLM Inference by Parallel Key-Value Cache Generation
August 17, 2025 · Last updated on August 25, 2025 · 2 min · KKKZOZ
Ring Attention with Blockwise Transformers for Near-Infinite Context
August 17, 2025 · Last updated on August 25, 2025 · 7 min · KKKZOZ
Striped Attention: Faster Ring Attention for Causal Transformers
August 17, 2025 · Last updated on August 19, 2025 · 3 min · KKKZOZ
TPI-LLM: Serving 70B-scale LLMs Efficiently on Low-resource Mobile Devices
August 17, 2025 · Last updated on August 26, 2025 · 2 min · KKKZOZ
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
August 12, 2025 · Last updated on August 25, 2025 · 2 min · KKKZOZ
Git Essentials
August 8, 2025 · Last updated on August 25, 2025 · 12 min · KKKZOZ
LLM Generated Content
August 8, 2025 · Last updated on August 12, 2025 · 4 min · KKKZOZ
VSCode Essentials
August 6, 2025 · Last updated on August 10, 2025 · 4 min · KKKZOZ
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
August 4, 2025 · Last updated on August 26, 2025 · 3 min · KKKZOZ
Fast On-device LLM Inference with NPUs
August 4, 2025 · Last updated on August 19, 2025 · 3 min · KKKZOZ
LLM Preliminaries
August 4, 2025 · Last updated on August 25, 2025 · 12 min · KKKZOZ
[Pinned] Transactions Papers Index
August 1, 2025 · Last updated on August 3, 2025 · 1 min · KKKZOZ
[Pinned] Distributed Papers Index
August 1, 2025 · Last updated on August 3, 2025 · 1 min · KKKZOZ