Data sheet
Public defense date: 2017/04/21 (Slides). Opponent: Prof. Ewa Deelman, USC Information Sciences Institute.
Publication Date: 2017/04/21. ORCID iD: 0000-0003-3315-8253. ISBN: 978-191-7601-693-0.
Institution: Distributed Systems group, Department of Computing Science at Umeå University.
Supervisors: Prof. Erik Elmroth, Umeå University. Dr. Lavanya Ramakrishnan, Lawrence Berkeley National Lab. Dr. P-O Östberg, Umeå University.
Summary
This thesis focuses on understanding which new scheduling models are, and will be, required for future HPC systems. It starts by presenting how workloads have evolved over the lifetime of recent and current systems, and it identifies new, specific workload challenges that affect scheduling performance (Paper 1). It then analyzes and proposes general scheduling models for HPC systems (Papers 2 and 3). Next, it presents the set of tools that we have developed to perform scheduling research (Paper 4). Finally, it presents a new scheduling algorithm for one of the identified challenges: efficient scheduling of workflows (Paper 5).
Open source projects resulting from this thesis
ScSF: a scheduling simulation framework that provides the community with tools to perform scheduling research: workload analysis, workload generation, simulation, and analysis of results. Download
WoAS: an implementation of a workflow-aware scheduling algorithm, integrated in Slurm, that provides short workflow turnaround times without over-allocating resources. Download
Peer-reviewed publications included in this thesis
Rodrigo Álvarez, G. P., Östberg, P. O., Elmroth, E., Antypas, K., Gerber, R., Ramakrishnan, L. Towards Understanding HPC Users and Systems: A NERSC Case Study. Journal of Parallel and Distributed Computing (JPDC), Volume 111, 2018, Pages 206-221, ISSN 0743-7315. Full Text
Rodrigo Álvarez, G. P., Östberg, P. O., Elmroth, E. (2014). Priority Operators for Fairshare Scheduling. 18th Workshop on Job Scheduling Strategies for Parallel Processing (JSSPP 2014), co-located with the IPDPS 2014 conference. Full Text
Rodrigo Álvarez, G. P., Östberg, P. O., Elmroth, E., Ramakrishnan, L. (2015, June). A2L2: An Application Aware Flexible HPC Scheduling Model for Low-Latency Allocation. In Proceedings of the 8th International Workshop on Virtualization Technologies in Distributed Computing (VTDC 2015) (pp. 11-19). ACM. Co-located with the HPDC 2015 conference. Full Text
Rodrigo Álvarez, G. P., Elmroth, E., Östberg, P. O., Ramakrishnan, L. ScSF: A Scheduling Simulation Framework. 21st Workshop on Job Scheduling Strategies for Parallel Processing (JSSPP 2017), co-located with the IPDPS 2017 conference. Full Text
Rodrigo Álvarez, G. P., Elmroth, E., Östberg, P. O., Ramakrishnan, L. Enabling Workflow-Aware Scheduling on HPC Systems. 26th International Symposium on High-Performance Parallel and Distributed Computing (HPDC 2017). Full Text