P62: How To Do Machine Learning on Big Clusters
Session
Poster Reception
Event Type
ACM Student Research Competition
Poster
Reception
Time
Tuesday, November 14th, 5:15pm - 7pm
Location
Four Seasons Ballroom
Description
Scientific pipelines, such as those in chemogenomics
machine learning applications, are often composed of
multiple interdependent data processing tasks. We are
developing HyperLoom, a platform for defining and
executing workflow pipelines in large-scale distributed
environments. HyperLoom users can easily define
dependencies between computational tasks and create a
pipeline which can then be executed on HPC systems. The
high-performance core of HyperLoom dynamically
orchestrates the tasks over available resources while
respecting task requirements. The entire system was
designed to have minimal overhead and to deal
efficiently with varying computational times of the
tasks. HyperLoom can execute pipelines that contain
basic built-in tasks, user-defined Python tasks, tasks
wrapping third-party applications, or a combination of
these.
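
The idea of declaring dependencies between tasks and letting a
scheduler run them as their inputs become available can be
sketched in a few lines of plain Python. This is not HyperLoom's
API: the names run_pipeline, tasks, and deps are hypothetical,
only the Python standard library is used, and the sketch runs on
a single machine with a thread pool, ignoring task resource
requirements and distributed execution.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter  # Python 3.9+


def run_pipeline(tasks, deps, max_workers=4):
    """Run tasks in dependency order, in parallel where possible.

    tasks: dict mapping a task name to a callable that receives a
           dict of its dependencies' results (hypothetical layout)
    deps:  dict mapping a task name to the names it depends on
    """
    sorter = TopologicalSorter({name: deps.get(name, []) for name in tasks})
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    results = {}
    running = {}  # future -> task name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Submit every task whose dependencies have all finished.
            for name in sorter.get_ready():
                inputs = {d: results[d] for d in deps.get(name, [])}
                running[pool.submit(tasks[name], inputs)] = name
            # Block until at least one running task completes.
            done, _ = wait(running, return_when=FIRST_COMPLETED)
            for future in done:
                name = running.pop(future)
                results[name] = future.result()
                sorter.done(name)  # makes dependent tasks ready
    return results


# A small diamond-shaped pipeline: "sort" and "total" both depend
# on "load" and may run concurrently; "report" waits for both.
tasks = {
    "load": lambda _: [3, 1, 2],
    "sort": lambda inp: sorted(inp["load"]),
    "total": lambda inp: sum(inp["load"]),
    "report": lambda inp: (inp["sort"], inp["total"]),
}
deps = {"sort": ["load"], "total": ["load"], "report": ["sort", "total"]}
results = run_pipeline(tasks, deps)
```

Because tasks are submitted as soon as their inputs are ready
rather than in fixed stages, a slow task delays only the tasks
that depend on it, which is one simple way to cope with varying
task run times.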




