Performance Analysis of Parallel Applications for HPC


Abstract

Performance analysis is essential for understanding the behavior of large-scale parallel applications on modern supercomputers. Current performance analysis techniques rely on either profiling or tracing. Profiling incurs low runtime overhead but misses information needed to identify underlying bottlenecks, while tracing incurs unacceptable overhead at large scales. In this book, we leverage static information, such as program structure and data dependences, extracted from source code and executable binaries to guide dynamic analysis, achieving the analyzability of tracing with the overhead of profiling. We apply this approach to many performance analysis tasks, including memory monitoring, communication analysis, scalability analysis, and noise detection.

Table of Contents (10 Chapters)

  • Background and Overview
  • Fast Communication Trace Collection
  • Structure-Based Communication Trace Compression
  • Informed Memory Access Monitoring
  • Graph Analysis for Scalability Analysis
  • Performance Prediction for Scalability Analysis
  • Lightweight Noise Detection
  • Production-Run Noise Detection
  • Domain-Specific Framework for Performance Analysis
  • Conclusion and Future Work

More details can be found here.