Obtaining Dynamic Scheduling Policies with Simulation and
Machine Learning
Event Type
Paper
Machine Learning
Quantum Computing
Time: Wednesday, November 15th, 2pm - 2:30pm
Location: 301-302-303
Description: Dynamic scheduling of tasks in large-scale HPC
platforms is normally accomplished using ad-hoc
heuristics, based on task characteristics, combined with
some backfilling strategy. Defining heuristics that work
efficiently in different scenarios is a difficult task,
especially when considering the large variety of task
types and platform architectures.
In this work, we present a methodology based on simulation and machine learning to obtain dynamic scheduling policies. Using simulations and a workload generation model, we determine the characteristics of tasks that lead to a reduction in the mean slowdown of tasks in an execution queue. Modeling these characteristics with a nonlinear function, and applying this function to select the next task to execute in a queue, improved the mean task slowdown in synthetic workloads. When applied to real workload traces from very different machines, these functions still resulted in performance improvements, attesting to the generalization capability of the obtained heuristics.
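To make the idea of a function-based policy concrete, the following is a minimal sketch of how a scoring function could be used to pick the next task from a queue, together with a common definition of slowdown (waiting time plus run time, divided by run time). The Task fields, the example_score function, and its coefficients are hypothetical and chosen only for illustration; they are not the functions obtained in the paper.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Task:
    name: str
    submit_time: float  # time the task entered the queue (seconds)
    est_runtime: float  # estimated processing time (seconds)
    cores: int          # number of requested cores


def example_score(task: Task, now: float) -> float:
    """Hypothetical nonlinear score; a lower score means higher priority.

    Assumption for illustration only: short, narrow tasks are favored,
    and tasks gain priority the longer they wait.
    """
    waited = now - task.submit_time
    return math.sqrt(task.est_runtime) * task.cores - waited


def select_next(queue: List[Task], now: float,
                score: Callable[[Task, float], float]) -> Task:
    """Return the queued task with the lowest score at time `now`."""
    return min(queue, key=lambda t: score(t, now))


def mean_slowdown(runs: List[Tuple[float, float]]) -> float:
    """Mean of (wait + run) / run over (wait_time, run_time) pairs."""
    return sum((wait + run) / run for wait, run in runs) / len(runs)


queue = [
    Task("a", submit_time=0.0, est_runtime=100.0, cores=4),
    Task("b", submit_time=5.0, est_runtime=9.0, cores=2),
    Task("c", submit_time=0.0, est_runtime=400.0, cores=1),
]
chosen = select_next(queue, now=10.0, score=example_score)
print(chosen.name)
```

Passing the scoring function as a parameter keeps the queue logic unchanged when one candidate function is swapped for another, which is convenient when comparing policies in a simulator.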




