news

Mar 21, 2024 Our paper “ServerlessLLM: Locality-Enhanced Serverless Inference for Large Language Models” has been accepted to OSDI 2024. A preprint is available on arXiv, and the code will be released soon. Stay tuned! :sparkles: :smile:
Jan 25, 2024 We released two new papers on arXiv! Check them out: ServerlessLLM and MoE-Infinity.