Tao Luo

I am a final-year CS Ph.D. candidate at the University of Pennsylvania advised by Profs. Boon Thau Loo and Vincent Liu.

My research advances GPU scheduling for AI infrastructure, focusing on agentic reinforcement learning (RL) and LLM serving. During my research internship at Alibaba, I enabled multi-tenancy for the production RL framework ROLL, supporting both multi-LoRA and full fine-tuning. I also pioneered an AI coding methodology (Chinese/English) for AI infrastructure development.

Previously, at Columbia University (M.S.), I coined privacy budget scheduling, pioneering research on scheduling ML training under differential privacy constraints. I was advised by Prof. Asaf Cidon and collaborated broadly with Profs. Ethan Katz-Bassett, Ryan Stutsman, Mathias Lécuyer, and Roxana Geambasu.

Before academia, I developed quantitative investment algorithms in the financial industry. I hold a B.S. in Financial Mathematics from Southern University of Science and Technology, where I was a member of the founding cohort.

Selected Projects

Scheduling for Agentic RL @Alibaba/ROLL

  • Proposed a Partial Time-Sharing GPU scheduling algorithm for RL jobs.
  • Extended the scheduling logic to support multi-LoRA training.
  • Re-architected the system and scheduling logic for multi-tenant full fine-tuning.
  • Deployed in production (100B+ parameters, 1000+ GPUs) for Amap (高德地图), iFlow CLI, Qoder IDE, and Alibaba Ads.

GPU Multiplexing for Heterogeneous LLM Serving @UPenn

  • Eliminated head-of-line blocking via a novel LLM serving architecture, raising token throughput by 1.6×.
  • Engineered efficient multi-model KV cache management and robust NCCL concurrency controls.
  • Optimized sharding, replication, placement, and scheduling strategies.

Serving Multimodal LLMs via Shared Backbone @UPenn

  • Contributed GPU multiplexing implementations to the shared backbone architecture for multimodal LLM serving.
  • Proposed the system design and implementation details, drawing on expertise in vLLM and Ray.
  • Identified the applicability of Coflow scheduling, influencing the project’s scheduling approach.

Privacy Budget Scheduling in Machine Learning Training @Columbia

  • Scheduled more jobs than first-come-first-served (FCFS) under identical privacy budgets.
  • Proposed a dynamic algorithm, DPF (Dominant Private Block Fairness), based on DRF (Dominant Resource Fairness).
  • Developed rigorous proofs of the new algorithm's game-theoretic properties.

Honors & Service

  • Program Committee: ACM Symposium on Cloud Computing 2025
  • Manjushri Fellowship, University of Pennsylvania, 2021
  • Financial Risk Manager (FRM) Certification, 2015
  • China Merchants Bank Scholarship, 2012-2014
  • Pioneering Undergraduate Fellowship, 2011-2014
  • First Prize, China High School Biology Olympiad, 2010