Software Engineer, BigInsights, IBM

Jun 2011 - Sep 2017

This photo was taken at the gate of IBM China Development Lab in 2017.

I joined IBM in 2011, right after graduating from Peking University. I was one of the founding members of the BigInsights project, IBM's Apache Hadoop based big data platform. We started from scratch. In the first few years, I focused on building a cluster provisioning and monitoring system written in Java and shell. I began leading the installation backend team in the 3rd quarter of 2011. I took full ownership of this component and led a team of 3 to 4 people on production development. I worked closely with front-end engineers in Canada and other component owners in the U.S. on end-to-end integration. During that time, I engaged with more than 50 customers and handled many urgent issues, which gave me a great deal of experience. My work included:

A push-down pattern master2slave framework
A flexible and lightweight templating system
Cluster Upgrade/Rollback
CDH/HDP Overlays
Kerberos Support

In 2015, I moved on to become the open source Hadoop team lead. I led a 3-person team working on the IBM distribution of Hadoop. My responsibility was to ensure that the IBM distribution of Hadoop fully satisfied customers' requirements. I made many bug fixes as well as improvements, and I actively contributed them to the Apache community. In 2 years, I contributed more than 100 patches to HDFS, YARN and Hadoop Common.

Beginning in the 1st quarter of 2017, I started working on the Ozone project. This project is well known for its ambition of fixing the long-standing HDFS scalability issue by introducing a general block storage layer. My work there included:

Core APIs such as list key/bucket/volume
Metadata store
Async key deletion protocol
Garbage collection

In Sep 2017, I was selected as an Apache Hadoop Committer on account of my continued contributions to the project.