Queue Manager Example YAML Files

The primary way to set up a Manager is to write a YAML config file. This page provides example config files that can mostly be copied and used as-is (filling in fields such as **username** and **password** as needed).

The full documentation of every option and how it can be used can be found in the Queue Manager’s API.

For these examples, the username will always be “Foo” and the password will always be “b4R” (these are placeholders and not valid credentials). The manager_name variable can be any string, and these examples provide some descriptive samples. The more distinct the name, the easier it is to identify the Manager's status on the Server.
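
Once written, a config file (e.g. saved as manager.yaml, a placeholder name) is typically passed to the Manager's command-line entry point, for example qcfractal-manager --config-file=manager.yaml; consult the Queue Manager documentation for the full set of CLI options.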

SLURM Cluster, Dask Adapter with additional options

This example is similar to the one on the Managers start page, but with some additional options, such as connecting back to a central Fractal instance and setting more cluster-specific options. As before, this starts a Manager with a Dask Adapter on a SLURM cluster, consuming 1 CPU core and 8 GB of RAM per Worker, targeting a Fractal Server running on that cluster, and using the SLURM partition named default. Save the following YAML config file:

common:
    adapter: dask
    tasks_per_worker: 1
    cores_per_worker: 1
    memory_per_worker: 8

server:
    fractal_uri: "localhost:7777"
    username: Foo
    password: b4R

manager:
    manager_name: "SlurmCluster_OneDaskTask"

cluster:
    scheduler: slurm
    walltime: "72:00:00"

dask:
    queue: default

Multiple Tasks, 1 Cluster Job

This example starts a maximum of 1 cluster Job, but multiple Tasks. The hardware will be consumed uniformly by the Worker: with 8 cores, 20 GB of memory, and 4 Tasks, the Worker will provide 2 cores and 5 GB of memory to each Task. We set common.max_workers to 1 to limit the number of Workers (and therefore Jobs) which can be started. Since this is SLURM, squeue will show that this user has submitted 1 sbatch Job which requested 8 cores and 20 GB of memory.

common:
    adapter: dask
    tasks_per_worker: 4
    cores_per_worker: 8
    memory_per_worker: 20
    max_workers: 1

server:
    fractal_uri: "localhost:7777"
    username: Foo
    password: b4R

manager:
    manager_name: "SlurmCluster_MultiDask"

cluster:
    scheduler: slurm
    walltime: "72:00:00"

dask:
    queue: default

Testing the Manager Setup

This config tests that the Manager is set up correctly. It does not need to connect to the Server, and therefore does not need a server block; it will, however, still submit Jobs to the cluster's Scheduler.

common:
    adapter: dask
    tasks_per_worker: 2
    cores_per_worker: 4
    memory_per_worker: 10

manager:
    manager_name: "TestBox_NeverSeen_OnServer"
    test: True
    ntests: 5

cluster:
    scheduler: slurm
    walltime: "01:00:00"

dask:
    queue: default
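
A test run is launched the same way as any other config (see the CLI note above); the Manager should run ntests test Tasks per quantum chemistry program it finds through the Adapter, wait for them to complete, and report the results without ever contacting a Server.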

Running commands before work

Suppose there are commands you want to run before starting the Worker, such as activating a Conda environment or setting environment variables; the cluster.task_startup_commands option lets you specify them. For this example, we will run on a Sun Grid Engine (SGE) cluster, activate a Conda environment, and load a module.

An important note about this one: we have now set max_workers to something larger than 1. Each Job still requests 16 cores and 256 GB of memory, evenly distributed between its 4 Tasks; however, the Adapter will attempt to start 5 independent Jobs, for a total of 80 cores and 1.28 TB of memory distributed over 5 Workers collectively running 20 concurrent Tasks. If the Scheduler does not allow all of those Jobs to start, whether due to lack of resources or user limits, the Adapter can still start fewer Jobs, each with 16 cores and 256 GB of memory, but Task concurrency will change in blocks of 4 since the Worker in each Job is configured to handle 4 Tasks.

common:
    adapter: dask
    tasks_per_worker: 4
    cores_per_worker: 16
    memory_per_worker: 256
    max_workers: 5

server:
    fractal_uri: localhost:7777
    username: Foo
    password: b4R

manager:
    manager_name: "GridEngine_OpenMPI_DaskWorker"
    test: False

cluster:
    scheduler: sge
    task_startup_commands:
        - module load mpi/gcc/openmpi-1.6.4
        - conda activate qcfmanager
    walltime: "71:00:00"

dask:
    queue: free64

Additional Scheduler Flags

A Scheduler may require you to set additional flags when submitting a Job (or you may want to). Maybe it is a rule enforced by your system administrators, maybe you want to charge a specific account, or maybe you need to set something the Manager or Adapter does not interpret for you (do tell us if this is the case). This example sets additional flags on a PBS cluster through cluster.scheduler_options, so that the final Job launch file contains a #PBS header line for each entry (here, #PBS -A MyPIsGroupAccount).

This example also uses Parsl and sets a scratch directory.

common:
    adapter: parsl
    tasks_per_worker: 1
    cores_per_worker: 6
    memory_per_worker: 64
    max_workers: 5
    scratch_directory: "$TMPDIR"

server:
    fractal_uri: localhost:7777
    username: Foo
    password: b4R
    verify: False

manager:
    manager_name: "PBS_Parsl_MyPIGroupAccount_Manager"

cluster:
    node_exclusivity: True
    scheduler: pbs
    scheduler_options:
        - "-A MyPIsGroupAccount"
    task_startup_commands:
        - conda activate qca
        - cd $WORK
    walltime: "06:00:00"

parsl:
    provider:
        partition: normal_q
        cmd_timeout: 30

Single Job with Multiple Nodes and Single-Node Tasks with Parsl Adapter

Leadership-class platforms often prefer or require more than one node per Job request. The following configuration will request a single Job with 256 nodes and place one Worker on each node.

common:
    adapter: parsl
    tasks_per_worker: 1
    cores_per_worker: 64  # Number of cores per compute node
    max_workers: 256  # Maximum number of workers deployed to compute nodes
    nodes_per_job: 256

cluster:
    node_exclusivity: true
    task_startup_commands:
        - module load miniconda-3/latest  # You will need to load the Python environment on startup
        - source activate qcfractal
        - export KMP_AFFINITY=disable  # KNL-related issue. Needed for multithreaded apps
        - export PATH=~/software/psi4/bin:$PATH  # Points to psi4 compiled for compute nodes
    scheduler: cobalt  # Varies depending on supercomputing center

parsl:
    provider:
        queue: default
        launcher:  # Defines the MPI launching function
            launcher_class: AprunLauncher
            overrides: -d 64  # Option for XC40 machines, allows workers to access 64 threads
        init_blocks: 0
        min_blocks: 0
        account: CSC249ADCD08
        cmd_timeout: 60
        walltime: "3:00:00"

Consult the Parsl configuration docs for information on how to configure the Launcher and Provider classes for your cluster.

Single Job with Multiple, Node-Parallel Tasks with Parsl Adapter

Running MPI-parallel Tasks requires a configuration similar to the multiple-nodes-per-Job example above, plus some extra work in defining the QCEngine environment. The key difference that sets apart Managers for node-parallel applications is that nodes_per_task is set to more than one, and Parsl uses the SimpleLauncher to deploy a Parsl executor onto the batch/login node once a Job is allocated.

common:
    adapter: parsl
    tasks_per_worker: 1
    cores_per_worker: 16  # Number of cores used on each compute node
    max_workers: 128
    memory_per_worker: 180  # Amount of memory (GB) available per compute node
    nodes_per_job: 128
    nodes_per_task: 2  # Number of nodes to use for each task
    cores_per_rank: 1  # Number of cores assigned to each MPI rank

cluster:
    node_exclusivity: true
    task_startup_commands:
        - module load miniconda-3/latest
        - source activate qcfractal
        - export PATH="/soft/applications/nwchem/6.8/bin/:$PATH"
        - which nwchem
    scheduler: cobalt

parsl:
    provider:
        queue: default
        launcher:
            launcher_class: SimpleLauncher
        init_blocks: 0
        min_blocks: 0
        account: CSC249ADCD08
        cmd_timeout: 60
        walltime: "0:30:00"

The configuration that describes how to launch the Tasks must be written in a qcengine.yaml file. See the QCEngine docs for possible locations to place the qcengine.yaml file and full descriptions of the configuration options. One key option for the qcengine.yaml file is mpiexec_command, which describes how to launch MPI tasks. Many systems use mpirun (e.g., OpenMPI); a sketch of an mpirun-based entry is shown after the Cray example below. An example configuration for a Cray supercomputer is:

all:
  hostname_pattern: "*"
  scratch_directory: ./scratch  # Must be on the global filesystem
  is_batch_node: True  # Indicates that `aprun` must be used for all QC code invocations
  mpiexec_command: "aprun -n {total_ranks} -N {ranks_per_node} -C -cc depth --env CRAY_OMP_CHECK_AFFINITY=TRUE --env OMP_NUM_THREADS={cores_per_rank} --env MKL_NUM_THREADS={cores_per_rank}
  -d {cores_per_rank} -j 1"
  jobs_per_node: 1
  ncores: 64
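
For comparison, here is a minimal sketch of what an mpirun-based (OpenMPI) entry might look like; the hostname_pattern, scratch_directory, ncores value, and the specific mpirun flags are assumptions to adapt for your own system:

all:
  hostname_pattern: "*"
  scratch_directory: ./scratch  # Must be accessible from all compute nodes
  mpiexec_command: "mpirun -n {total_ranks} --map-by ppr:{ranks_per_node}:node -x OMP_NUM_THREADS={cores_per_rank}"
  jobs_per_node: 1
  ncores: 16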

Note that there are several variables in the mpiexec_command that describe how to insert parallel configurations into the command: total_ranks, ranks_per_node, and cores_per_rank. Each of these values is computed based on the number of cores per node, the number of nodes per application, and the number of cores per MPI rank, which are all defined in the Manager settings file.
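
For example, with the Manager settings above (cores_per_worker: 16, nodes_per_task: 2, cores_per_rank: 1), one would expect ranks_per_node to resolve to 16 and total_ranks to 32; consult the QCEngine docs for the exact rules used to fill these placeholders.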