FastDecode: High-Throughput GPU-Efficient LLM Serving using Heterogeneous Pipelines

Publication
CoRR
Jiaao He
Ph.D. Student
Jidong Zhai
Professor
(Tenured Professor, Ph.D. Advisor)