My previous research was on system optimizations for Graph Neural Networks (GNNs). I have developed an efficient GNN system that significantly outperforms popular frameworks both on a single GPU and on large-scale clusters. The system exploits software characteristics specific to GNNs, including the sparsity patterns of graph structures and mathematical equivalences in GNN computation. It also takes advantage of accelerator hardware features (e.g., GPU Tensor Cores) and cluster interconnects (e.g., RDMA). To keep the system general, I have used compiler techniques to apply these optimizations automatically to multiple user-defined models.
Recently, my research interest has gravitated towards Large Language Model (LLM) performance optimization, including model serving and fine-tuning.
B.Eng. in Computer Science and Technology, 2016-2020
PhD Student in Computer Science and Technology, 2020-