news
Mar 21, 2024 | Our paper “ServerlessLLM: Locality-Enhanced Serverless Inference for Large Language Models” has been accepted to OSDI 2024. A preprint is available on arXiv. Code will be released soon. Stay tuned! |
---|---|
Jan 25, 2024 | We released two new papers on arXiv! Check them out: ServerlessLLM and MoE-Infinity |