Tao Luo

I am a CS Ph.D. candidate at the University of Pennsylvania (defending 2026), advised by Profs. Boon Thau Loo and Vincent Liu. I build agent infrastructure spanning LLM post-training (RL, LoRA fine-tuning), agent rollouts, and model serving.

I am seeking full-time industry roles starting in 2026.

At Alibaba, I designed and shipped Partial Overlapping, a runtime scheduling mechanism for asynchronous agentic RL in ROLL that expands agent rollouts onto idle training GPUs (3.5x rollout throughput; featured in the ROME technical report). I built it by leveraging coding agents (Claude Code, Codex) extensively, with zero human-written code, making it the first feature to ship this way in Alibaba’s flagship post-training framework. It now powers agentic RL post-training of multiple production agents at 100B+ parameters on thousands of GPUs. I also designed and built RLix, an orchestration layer for concurrent agentic RL pipelines (2.6x rollout throughput in SWE-agent RL training; 200+ GitHub stars, including those from NVIDIA, Google, xAI, Anthropic, ByteDance, and Zhipu AI). My work spans vLLM, Megatron-LM, and Ray.

My work has appeared at OSDI, SOSP, and SoCC. During my Ph.D. at Penn, I led ParaFlex, a multiplexed heterogeneous LLM serving system that eliminates head-of-line blocking via stage-aligned parallelism. Earlier, during my M.S. at Columbia University, I introduced Privacy Budget Scheduling and developed DPF, the first scheduling algorithm for ML training under differential privacy constraints.

Before academia, I spent ~4 years in quant investment, developing strategies and building infrastructure. I hold a B.S. in Financial Mathematics from Southern University of Science and Technology.

Selected Projects

Agentic RL Post-Training Infrastructure @Alibaba, DAMO Academy

  • Proposed and shipped Partial Overlapping, a runtime scheduling mechanism for asynchronous agentic RL that expands agent rollouts to idle training GPUs, improving rollout throughput by 3.5x (featured in the ROME technical report).
  • Leveraged coding agents extensively (Claude Code, Codex) to design, implement, and debug Partial Overlapping with zero human-written code (first high-priority feature in alibaba/ROLL shipped this way); featured in technical blogs (English/Chinese) as a case study for AI-assisted systems engineering.
  • Deployed in production for agentic RL post-training of models with hundreds of billions of parameters on thousands of GPUs, including Qoder IDE (coding agent), iFlow CLI (terminal agent), Amap (travel-planning agent), and Alimama (ads).
  • Extended Partial Overlapping to async multi-LoRA fine-tuning via per-adapter optimizers on a shared Megatron base model.
  • Designed and built RLix, an orchestration layer for concurrent agentic RL pipelines that enables elastic GPU sharing and higher cluster utilization with minimal changes to training recipes (2.6x rollout throughput in SWE-agent RL training; 200+ GitHub stars, including those from NVIDIA, Google, xAI, Anthropic, ByteDance, and Zhipu AI).

ParaFlex: Multiplexed Heterogeneous LLM Serving via Stage-Aligned Parallelism @University of Pennsylvania

  • Proposed a novel LLM serving architecture that eliminates head-of-line blocking and improves token throughput by 1.6x.
  • Built multi-model KV cache management and robust NCCL concurrency controls.
  • Optimized sharding, replication, placement, and scheduling algorithms for heterogeneous serving workloads.
  • SoCC’25 paper

Privacy Budget Scheduling in ML Training @Columbia University

  • Introduced Privacy Budget Scheduling and showed how to schedule 2x more jobs than FCFS under the same privacy budget.
  • Developed DPF (Dominant Private Block Fairness), the first scheduling algorithm for ML training under differential privacy constraints, derived from DRF, and proved its game-theoretic properties.
  • OSDI’21 paper

Honors & Service

  • Program Committee: ACM Symposium on Cloud Computing 2025
  • Manjushri Fellowship, University of Pennsylvania, 2021
  • Financial Risk Manager (FRM) Certification, 2015
  • China Merchants Bank Scholarship, 2012-2014
  • Pioneering Undergraduate Fellowship, 2011-2014
  • First Prize, China High School Biology Olympiad, 2010