Building Semi-Elastic Virtual Clusters for Cost-Effective HPC Cloud Resource Provisioning

Abstract

Recent studies have found cloud environments increasingly appealing for executing HPC applications, including tightly coupled parallel simulations. At the same time, while public clouds offer elastic, on-demand resource provisioning and pay-as-you-go pricing, individual users setting up their own on-demand virtual clusters may not be able to take full advantage of common cost-saving opportunities, such as reserved instances. In this paper, we propose a Semi-Elastic Cluster (SEC) computing model that lets organizations reserve and dynamically resize a virtual cloud-based cluster. We present a set of integrated batch scheduling and resource scaling strategies uniquely enabled by SEC, as well as an online reserved instance provisioning algorithm based on job history. Our trace-driven simulation results show that this model achieves a 61.0 percent cost saving compared with individual users acquiring and managing cloud resources on their own, without causing longer average job wait time. Moreover, to exploit the advantages of different public clouds, we extend SEC to a multi-cloud environment, where SEC can achieve a lower cost than on any single cloud. We design and implement a prototype system of the SEC model and evaluate it in terms of management overhead and average job wait time. Experimental results show that the management overhead is negligible relative to the job wait time.

Publication
IEEE Transactions on Parallel and Distributed Systems
Jidong Zhai
Associate Professor
(Special Research Fellow, Ph.D. Advisor)
Wenguang Chen
Professor