Software Engineer, AIML, Apple

Oct 2024 – Present

  • Leading the development of Apple Batch, a managed batch compute service powering data processing and ML workloads. The service supports cross-cloud and cross-region execution, with heterogeneous resource discovery and management. Designed a serverless, cloud-native architecture for batch workloads.

Aug 2023 – Oct 2024

  • Built Apple’s first internal Batch Inference Service, supporting large-scale workloads for 150+ teams. Designed and deployed it on AWS EKS with a multi-cluster GPU infrastructure. Led API server design and implemented priority scheduling and preemption with Apache YuniKorn for GPU resource management. Achieved an average GPU utilization of 80%+.

Apr 2022 – Aug 2023

  • Scaled Apache Spark on Kubernetes at Apple, enabling large-scale batch processing across EKS and GKE. Worked on Batch Processing Gateway, leveraging Apache YuniKorn for resource management and scheduling. Used Karpenter for autoscaling, instance lifecycle management, and Kubernetes version upgrades.