I am a Ph.D. student in Computer Science at The University of Edinburgh, supervised by Prof. Luo Mai. I received my B.Eng. degree in Computer Science and Technology from Sun Yat-sen University in June 2021. I was supervised by Prof. Di Wu at Sun Yat-sen University as a member of Yat-sen Honor School.

I study the intersection of machine learning and distributed systems. My goal is to build efficient systems for the large-scale deployment of machine learning models. My current research focuses on the efficient inference of large language models in serverless computing clusters.


May 16, 2024 I was selected as one of the ML and Systems Rising Stars! Thanks to everyone who has supported me along the way! I’ll be attending the workshop at NVIDIA’s headquarters in Santa Clara, CA, on July 15-16.
Mar 21, 2024 Our paper “ServerlessLLM: Locality-Enhanced Serverless Inference for Large Language Models” has been accepted to OSDI 2024. Preprint available on arXiv. Code will be released soon. Stay tuned!
Jan 25, 2024 We released two new papers on arXiv! Check them out: ServerlessLLM and MoE-Infinity.


  1. ServerlessLLM: Locality-Enhanced Serverless Inference for Large Language Models
    Yao Fu, Leyang Xue, Yeqi Huang, Andrei-Octavian Brabete, Dmitrii Ustiugov, Yuvraj Patel, and Luo Mai
    OSDI, 2024
  2. MoE-Infinity: Activation-Aware Expert Offloading for Efficient MoE Serving
    Leyang Xue, Yao Fu, Zhan Lu, Luo Mai, and Mahesh Marina
    arXiv preprint arXiv:2401.14361, 2024
  3. TorchOpt: An Efficient Library for Differentiable Optimization
    Jie Ren*, Xidong Feng*, Bo Liu*, Xuehai Pan*, Yao Fu, Luo Mai, and Yaodong Yang
    JMLR, 2023
  4. Optimizing the numbers of queries and replies in convex federated learning with differential privacy
    Yipeng Zhou, Xuezheng Liu, Yao Fu, Di Wu, Jessie Hui Wang, and Shui Yu
    IEEE Transactions on Dependable and Secure Computing, 2023
  5. Ekko: A Large-Scale Deep Learning Recommender System with Low-Latency Model Update
    Chijun Sima*, Yao Fu*, Man-Kit Sit, Liyi Guo, Xuri Gong, Feng Lin, Junyu Wu, Yongsheng Li, Haidong Rong, Pierre-Louis Aublin, and Luo Mai
    OSDI, 2022