Tao Luo
I am a CS Ph.D. candidate at the University of Pennsylvania (defending 2026), advised by Profs. Boon Thau Loo and Vincent Liu. I build agent infrastructure spanning LLM post-training (RL, LoRA fine-tuning), agent rollouts, and model serving.
I am seeking full-time industry roles starting in 2026.
At Alibaba, I designed and shipped Partial Overlapping, a runtime scheduling mechanism for asynchronous agentic RL in ROLL that expands agent rollouts to idle training GPUs (3.5x rollout throughput; featured in the ROME technical report). I built it by leveraging coding agents extensively (Claude Code, Codex) with zero human-written code, making it the first feature to ship this way in Alibaba’s flagship post-training framework. It now powers agentic RL post-training of multiple production agents at 100B+ parameters on thousands of GPUs. I also designed and built RLix, an orchestration layer for concurrent agentic RL pipelines (2.6x rollout throughput in SWE-agent RL training; 200+ GitHub stars, including those from NVIDIA, Google, xAI, Anthropic, ByteDance, Zhipu AI, etc.). My work spans vLLM, Megatron-LM, and Ray.
My work has appeared at OSDI, SOSP, and SoCC. During my Ph.D. at Penn, I led ParaFlex, a multiplexed heterogeneous LLM serving system that eliminates head-of-line blocking via stage-aligned parallelism. Earlier, during my M.S. at Columbia University, I introduced Privacy Budget Scheduling and developed DPF, the first scheduling algorithm for ML training under differential privacy constraints.
Before academia, I spent ~4 years in quant investment, developing strategies and building infrastructure. I hold a B.S. in Financial Mathematics from Southern University of Science and Technology.
Selected Projects
Agentic RL Post-Training Infrastructure @Alibaba, DAMO Academy
- Proposed and shipped Partial Overlapping, a runtime scheduling mechanism for asynchronous agentic RL that expands agent rollouts to idle training GPUs, improving rollout throughput by 3.5x (featured in the ROME technical report).
- Leveraged coding agents extensively (Claude Code, Codex) to design, implement, and debug Partial Overlapping with zero human-written code (first high-priority feature in alibaba/ROLL shipped this way); featured in technical blogs (English/Chinese) as a case study for AI-assisted systems engineering.
- Deployed in production for agentic RL post-training of models with hundreds of billions of parameters on thousands of GPUs, across products including Qoder IDE (coding agent), iFlow CLI (terminal agent), Amap (travel-planning agent), and Alimama (ads).
- Extended Partial Overlapping to async multi-LoRA fine-tuning via per-adapter optimizers on a shared Megatron base model.
- Designed and built RLix, an orchestration layer for concurrent agentic RL pipelines that enables elastic GPU sharing and higher cluster utilization with minimal changes to training recipes (2.6x rollout throughput in SWE-agent RL training; 200+ GitHub stars, including those from NVIDIA, Google, xAI, Anthropic, ByteDance, Zhipu AI, etc.).
ParaFlex: Multiplexed Heterogeneous LLM Serving via Stage-Aligned Parallelism @University of Pennsylvania
- Proposed a novel LLM serving architecture that eliminates head-of-line blocking and improves token throughput by 1.6x.
- Built multi-model KV cache management and robust NCCL concurrency controls.
- Optimized sharding, replication, placement, and scheduling algorithms for heterogeneous serving workloads.
- SoCC’25 paper
Privacy Budget Scheduling in ML Training @Columbia University
- Introduced Privacy Budget Scheduling and showed how to schedule 2x more jobs than FCFS under the same privacy budget.
- Developed DPF (Dominant Private Block Fairness), the first scheduling algorithm for ML training under differential privacy constraints, derived from DRF, and proved its game-theoretic properties.
- OSDI’21 paper
Honors & Service
- Program Committee: ACM Symposium on Cloud Computing 2025
- Manjushri Fellowship, University of Pennsylvania, 2021
- Financial Risk Manager (FRM) Certification, 2015
- China Merchants Bank Scholarship, 2012-2014
- Pioneering Undergraduate Fellowship, 2011-2014
- First Prize, China High School Biology Olympiad, 2010
