HPC Workflow @ENCCB
from Tuesday, 15 February 2022 (08:00) to Wednesday, 16 February 2022 (17:00)
Tuesday, 15 February 2022
08:00 - 08:30
Welcome & Introduction
Leplae Raphaël (ULB)
To open the workshop, this talk first presents the HPC ecosystem and environment in Belgium and the context of PRACE and EuroCC. It then gives a short introduction to workflows in HPC and an overview of the seminars that follow in this workshop.
08:30 - 10:00
Workflows with basic GNU tools (and Maestro)
François Damien (UCLouvain)
This presentation covers the two basic building blocks of workflows: job arrays and job dependencies. Job arrays create parametrised jobs that are identical except for one parameter that varies across the workflow, while job dependencies enforce a fixed ordering of jobs and ensure that each step of the workflow is carried out only when its requirements (input data, software, output directory, etc.) are available. It also discusses the concepts of micro-scheduling (running multiple small job steps inside a single job allocation) and macro-scheduling (submitting multiple jobs at the same time with a single command). The presentation then introduces basic GNU/Linux commands that make micro- and macro-scheduling easier: xargs, seq, GNU Parallel, GNU Make, envsubst. The concepts are illustrated with Slurm but should apply to any other scheduler. Finally, the session presents Maestro, a small workflow manager developed by the same lab where Slurm originated. Maestro focuses on documentation and organisation, makes it easy to build small workflows without submitting the jobs manually, and is a nice complement to the Linux tools mentioned earlier.
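As a rough illustration of the two building blocks, the sketch below builds the Slurm command lines for a job array and for a job that depends on it. It is not material from the session: the script names `simulate.sh` and `collect.sh` and the helper function names are hypothetical, and only the `sbatch` options `--array`, `--dependency=afterok` and `--parsable` are taken from Slurm itself.

```python
def array_command(script, first, last):
    """Command line that submits `script` once per index in first..last."""
    # --parsable makes sbatch print the job ID in a machine-readable form.
    return ["sbatch", "--parsable", f"--array={first}-{last}", script]


def parse_job_id(sbatch_output):
    """Extract the job ID from `sbatch --parsable` output.

    The output is either "jobid" or "jobid;cluster".
    """
    return sbatch_output.strip().split(";")[0]


def dependent_command(script, parent_job_id):
    """Command line for a job that starts only if the parent succeeded."""
    return [
        "sbatch",
        "--parsable",
        f"--dependency=afterok:{parent_job_id}",
        script,
    ]


if __name__ == "__main__":
    # Hypothetical script names; print the commands instead of running them.
    print(" ".join(array_command("simulate.sh", 1, 100)))
    print(" ".join(dependent_command("collect.sh", "<array job ID>")))
```

On a cluster, each list could be passed to `subprocess.run` and the printed job ID fed into the next submission, which is one way to chain steps with a single script.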
10:30 - 11:30
Checkpoint/Restart
Mattelaer Olivier (UCLouvain)
This session discusses one specific type of workflow, checkpoint/restart, and how Linux signals can be leveraged to build self-resubmitting jobs that can run longer than the maximum wall time of the cluster.
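The signal-driven part of this idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the method shown in the session: the computation (a running sum), the JSON checkpoint file and the function `run` are all hypothetical, and the choice of SIGUSR1 assumes the scheduler is configured to send that signal shortly before the time limit (Slurm's `sbatch --signal` option can request this). The resubmission step itself is left out.

```python
import json
import os
import signal


def run(total_steps, checkpoint_path, on_step=None):
    """Sum the integers below total_steps, resuming from a checkpoint.

    Returns (finished, state). finished is False when SIGUSR1 arrived and
    the state was saved so that a resubmitted job can continue.
    """
    stop = {"requested": False}

    def request_stop(signum, frame):
        # Only record the request; the loop saves state at a safe point.
        stop["requested"] = True

    signal.signal(signal.SIGUSR1, request_stop)

    # Restart: load the previous state if a checkpoint exists.
    state = {"step": 0, "total": 0}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as handle:
            state = json.load(handle)

    while state["step"] < total_steps:
        if stop["requested"]:
            # Checkpoint: write to a temporary file, then rename, so an
            # interrupted write never leaves a truncated checkpoint.
            tmp_path = checkpoint_path + ".tmp"
            with open(tmp_path, "w") as handle:
                json.dump(state, handle)
            os.replace(tmp_path, checkpoint_path)
            return False, state
        state["total"] += state["step"]
        state["step"] += 1
        if on_step is not None:
            on_step(state["step"])  # hook used only for demonstration
    return True, state
```

A job script wrapping this function could resubmit itself whenever `finished` is false, so that the chain of jobs as a whole outlives the wall-time limit of any single job.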
13:00 - 14:00
atools
Bex Geert Jan (Uhasselt – KULeuven)
This presentation introduces atools, a collection of tools that help build and manage large job arrays for parametrised studies. Such workflows can be referred to as "wide" workflows: many similar sibling jobs with no dependencies among them.
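The core idea of a parametrised study, one row of parameters per array task, can be sketched in a few lines. This is not the atools interface: the function `parameters_for_task`, the column names and the 1-based numbering are assumptions made for illustration. In a Slurm job array, the task index would come from the `SLURM_ARRAY_TASK_ID` environment variable.

```python
import csv
import io

# Hypothetical parameter table: one row per task of the job array.
PARAMETER_TABLE = """temperature,pressure
300,1.0
350,1.5
400,2.0
"""


def parameters_for_task(csv_text, task_id):
    """Return the row for a 1-based array task ID as a dict of strings."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not 1 <= task_id <= len(rows):
        raise IndexError(f"no parameter row for task {task_id}")
    return rows[task_id - 1]


if __name__ == "__main__":
    # Inside a job this would be int(os.environ["SLURM_ARRAY_TASK_ID"]).
    print(parameters_for_task(PARAMETER_TABLE, 2))
```

Keeping the parameters in a table rather than in the job script means the array can grow by adding rows, without touching the script.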
14:30 - 15:15
Makeflow
François Damien (UCLouvain)
This session discusses Makeflow, a tool for modelling workflows with many dependencies among jobs. Such workflows can be referred to as "deep" workflows, in contrast to the "wide" workflows described earlier.
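What "deep" means in practice can be illustrated by resolving a small dependency graph into successive waves of jobs that may run together. The sketch uses Python's standard `graphlib` module as a stand-in and does not use Makeflow syntax; the step names and the function `execution_waves` are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each step maps to the steps it depends on.
WORKFLOW = {
    "simulate_a": {"prepare"},
    "simulate_b": {"prepare"},
    "merge": {"simulate_a", "simulate_b"},
    "plot": {"merge"},
}


def execution_waves(dependencies):
    """Group steps into waves; a wave runs once all earlier waves are done."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # steps whose inputs are complete
        waves.append(ready)
        sorter.done(*ready)  # mark the wave finished to unlock the next one
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(execution_waves(WORKFLOW), start=1):
        print(number, wave)
```

Here the two simulation steps land in the same wave, so they could be submitted side by side, while `merge` and `plot` must wait for their inputs.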
Wednesday, 16 February 2022
08:00 - 09:00
CI/CD implementation with GitHub
Mattelaer Olivier (UCLouvain)
This session is about GitHub and its continuous integration/continuous deployment (CI/CD) features, and how they can be used on clusters from a regular user account to automatically compile software and even submit benchmark jobs whenever new features or improvements are added to the software you are writing.
09:30 - 10:30
Singularity
Mattelaer Olivier (UCLouvain)
This presentation is about Singularity: how to build containers and deploy them on clusters so that software can be installed in a uniform way, regardless of the Linux flavour or the available software modules.
13:00 - 14:00
Snakemake
Louant Orian (ULiège)
We return to scientific workflows: the seventh presentation is a tutorial on Snakemake, a tool that is a bit more complex to use than the other two but that can handle both wide and deep workflows and offers more features, such as templating and containers.
14:30 - 16:00
User testimonials
15:30 - 15:50: Nextflow for bioinformatics, by Luc Cornet
15:50 - 16:10: FireWorks for material science, by Guillaume Brunin
16:10 - 16:30: Coral, a home-made workflow system tool to manage numerical climate simulations, by François Klein
16:30 - 16:50: Examples of CI/CD in research codebases using git-based websites and HPC, by Denis-Gabriel Caprace