Dissertation Defense: Advances in High Performance Computing Through Concurrent Data Structures and Predictive Scheduling

Thursday, June 6, 2024, 10 a.m. to noon

Modern High Performance Computing (HPC) systems are made up of thousands of server-grade compute nodes linked through a high-speed network interconnect. Each node has tens or even hundreds of CPU cores, with counts continuing to grow on newer HPC clusters. As a result, software must make use of millions of cores per cluster. Fully leveraging these resources is difficult, and there is an active need for software that scales and fully utilizes the hardware. In this dissertation, we address this gap with a dual approach, considering both intra-node (within a single node) and inter-node (across nodes) concerns. To aid intra-node performance, we propose two novel concurrent data structures: a transactional vector and a persistent hash map. These designs are broadly applicable in any multi-core environment but are particularly useful in HPC, which commonly features a large number of cores per node. For inter-node performance, we propose a metrics-driven approach to improve scheduling quality, using predicted run times to backfill jobs more accurately and aggressively. We augment this approach with application input parameters to further improve the run time predictions. Improved scheduling reduces the number of idle nodes in an HPC cluster, maximizing job throughput. We find that our data structures outperform the prior state of the art while offering additional features. Our backfill technique likewise outperforms previous techniques in simulations, and our run time predictions are significantly more accurate than those of conventional approaches. Code for these works is freely available, and we plan to deploy these techniques more broadly on real HPC systems in the future.
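For readers unfamiliar with backfilling, the following is a minimal sketch of the classic EASY backfill heuristic driven by predicted run times. It is an illustration only, not the scheduler developed in the dissertation. The names `Job`, `easy_backfill`, and `predicted_runtime`, along with the example cluster, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    nodes: int                 # nodes requested
    predicted_runtime: float   # predicted run time in seconds (hypothetical field)


def easy_backfill(queue, free_nodes, running, now=0.0):
    """Return the names of jobs to start at time `now`.

    queue:   waiting jobs in priority order
    running: list of (predicted_end_time, nodes) for jobs already running
    """
    queue = list(queue)
    running = list(running)
    started = []

    # Start jobs from the head of the queue for as long as they fit.
    while queue and queue[0].nodes <= free_nodes:
        job = queue.pop(0)
        free_nodes -= job.nodes
        running.append((now + job.predicted_runtime, job.nodes))
        started.append(job.name)
    if not queue:
        return started

    # Reserve nodes for the blocked head job: find the "shadow time",
    # the earliest predicted time at which enough nodes will be free.
    head = queue[0]
    avail = free_nodes
    shadow_time = None
    for end_time, nodes in sorted(running):
        avail += nodes
        if avail >= head.nodes:
            shadow_time = end_time
            break
    if shadow_time is None:
        return started  # head job can never fit; this sketch simply stops
    extra = avail - head.nodes  # nodes the head job will not need at shadow time

    # Backfill later jobs only if they cannot delay the head job's reservation.
    for job in queue[1:]:
        if job.nodes > free_nodes:
            continue
        if now + job.predicted_runtime <= shadow_time:
            free_nodes -= job.nodes          # finishes before the reservation
            started.append(job.name)
        elif job.nodes <= extra:
            free_nodes -= job.nodes          # runs long, but only on spare nodes
            extra -= job.nodes
            started.append(job.name)
    return started


# Example: 10-node cluster, 6 nodes busy until t=100, 4 nodes free.
waiting = [Job("A", 8, 50), Job("B", 4, 200), Job("C", 2, 80), Job("D", 2, 300)]
print(easy_backfill(waiting, free_nodes=4, running=[(100.0, 6)]))
```

In this sketch, the shadow time and every backfill decision depend on the predicted run times. This is why more accurate predictions can allow a scheduler to fill idle nodes more aggressively without delaying the job at the head of the queue.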

Damian Dechev, Committee Chair.

Location:

Research 1: 103

Contact:

College of Graduate Studies, 407-823-2766, editor@ucf.edu

Calendar:

Graduate Thesis and Dissertation

Category:

Uncategorized/Other

Tags:

Graduate Computer Science defense