publications

*co-primary authors

2024

  1. ServerlessLLM: Locality-Enhanced Serverless Inference for Large Language Models
    Yao Fu, Leyang Xue, Yeqi Huang, Andrei-Octavian Brabete, Dmitrii Ustiugov, Yuvraj Patel, and Luo Mai
    OSDI, 2024
  2. MoE-Infinity: Activation-Aware Expert Offloading for Efficient MoE Serving
    Leyang Xue, Yao Fu, Zhan Lu, Luo Mai, and Mahesh Marina
    arXiv preprint arXiv:2401.14361, 2024

2023

  1. TorchOpt: An Efficient Library for Differentiable Optimization
    Jie Ren*, Xidong Feng*, Bo Liu*, Xuehai Pan*, Yao Fu, Luo Mai, and Yaodong Yang
    JMLR, 2023
  2. Optimizing the numbers of queries and replies in convex federated learning with differential privacy
    Yipeng Zhou, Xuezheng Liu, Yao Fu, Di Wu, Jessie Hui Wang, and Shui Yu
    IEEE Transactions on Dependable and Secure Computing, 2023

2022

  1. Ekko: A Large-Scale Deep Learning Recommender System with Low-Latency Model Update
    Chijun Sima*, Yao Fu*, Man-Kit Sit, Liyi Guo, Xuri Gong, Feng Lin, Junyu Wu, Yongsheng Li, Haidong Rong, Pierre-Louis Aublin, and Luo Mai
    OSDI, 2022