Wenguang Chen
Large language model
MEPipe: Democratizing LLM Training with Memory-Efficient Slice-Level Pipeline Scheduling on Cost-Effective Accelerators
Training large language models (LLMs) typically requires costly GPUs, such as the NVIDIA A100 or H100, which possess substantial …
Zhenbo Sun, Shengqi Chen, Yuanwei Wang, Jian Sha, Guanyu Feng, Wenguang Chen