LLM as a System Service on Mobile Devices
Intensive Reading

Author Info

Wangsong Yin (Google Scholar)
Mengwei Xu

Background

The paper first introduces LLMaaS, LLM as a system service on mobile devices: "The mobile OS exposes an LLM and its inference infrastructure as a system feature to mobile apps, akin to the location or notification services."

LLMaaS is motivated mainly by the following reasons:

- "LLMaaS needs only one copy of LLM weights in memory." Different apps should call the same system-maintained LLM rather than each loading its own copy.
- "A system-level LLM can be better customized for on-device accelerator and enjoy the performance gain over commodity hardware." Managing the model and running inference at the system level sits closer to the hardware, so it can make better use of the underlying hardware resources.

The core problem the paper sets out to solve is: How to efficiently manage the LLM contexts ...