Data sheet


Public defense date: 2017/04/21 (Slides). Opponent: Prof. Ewa Deelman, USC Information Sciences Institute.
Publication Date: 2017/04/21. ORCID iD: 0000-0003-3315-8253. ISBN: 978-191-7601-693-0.
Institution: Distributed Systems group, Department of Computing Science at Umeå University.
Supervisors: Prof. Erik Elmroth, Umeå University. Dr. Lavanya Ramakrishnan, Lawrence Berkeley National Lab. Dr. P-O Östberg, Umeå University.

Summary


This thesis focuses on understanding which new scheduling models are, and will be, required for future HPC systems. It starts by presenting how workloads have evolved over the lifetime of recent and current systems (Paper 1), and identifies specific new workload challenges that affect scheduling performance (Paper 1). It then analyzes and proposes general scheduling models for HPC systems (Papers 2 and 3). Next, it presents the set of tools that we have developed to perform scheduling research (Paper 4). Finally, it presents a new scheduling algorithm for one of the identified challenges: efficient scheduling of workflows (Paper 5).

Open source projects produced by this thesis


ScSF: a scheduling simulation framework that provides the community with tools to perform scheduling research: workload analysis, workload generation, simulation, and result analysis. Download

WoAS: a workflow-aware scheduling algorithm implementation, integrated in Slurm, that provides short workflow turnaround times without over-allocating resources. Download

Peer-reviewed publications included in this thesis


Rodrigo Álvarez, G. P., Östberg, P. O., Elmroth, E., Antypas, K., Gerber, R., Ramakrishnan, L. (2016, May). Towards Understanding HPC Users and Systems: A NERSC Case Study. In Journal of Parallel and Distributed Computing (JPDC), Volume 111, 2018, Pages 206-221, ISSN 0743-7315. Full Text

Rodrigo Álvarez, G. P., Östberg, P-O., Elmroth, E. (2014). Priority Operators for Fairshare Scheduling. 18th Workshop on Job Scheduling Strategies for Parallel Processing (JSSPP 2014), co-located with the IPDPS 2014 conference. Full Text

Rodrigo Álvarez, G. P., Östberg, P. O., Elmroth, E., Ramakrishnan, L. (2015, June). A2L2: An Application Aware Flexible HPC Scheduling Model for Low-Latency Allocation. In Proceedings of the 8th International Workshop on Virtualization Technologies in Distributed Computing (VTDC 2015) (pp. 11-19). ACM. Co-located with the HPDC 2015 conference. Full Text

Rodrigo Álvarez, G. P., Elmroth, E., Östberg, P. O., Ramakrishnan, L. ScSF: A Scheduling Simulation Framework. 21st Workshop on Job Scheduling Strategies for Parallel Processing (JSSPP 2017), co-located with the IPDPS 2017 conference. Full Text

Rodrigo Álvarez, G. P., Elmroth, E., Östberg, P. O., Ramakrishnan, L. Enabling Workflow Aware Scheduling on HPC Systems. 26th International Symposium on High-Performance Parallel and Distributed Computing (HPDC 2017). Full Text