Oct 2024 – Present
Apple Batch, a managed service for batch compute powering data processing
and ML workloads. The service features cross-cloud and cross-region support, along with heterogeneous resource discovery and
management. Designed a serverless, cloud-native architecture for batch workloads.
Aug 2023 – Oct 2024
Batch Inference Service, supporting large-scale workloads for 150+ teams.
Designed and deployed on AWS EKS with a multi-cluster GPU infrastructure. Led API server design, implemented priority scheduling
and preemption using Apache YuniKorn for GPU resource management. Achieved an average GPU utilization of 80%+.
Apr 2022 – Aug 2023