Job scratch area

ARC supports several approaches to managing the job scratch area during the job life cycle. In most cases this is achieved by generating a wrapper submission script for the batch system that carries out the tasks related to directory creation and data movement.

The key elements are:
job session directory
directory on the ARC CE where the job files are located.
job scratch directory
directory on LRMS-managed worker nodes (WNs) where all I/O during computation is performed.
<arc-job-id>
a unique identifier for the job, assigned by ARC.
Example: 10YLDmDRgrynVALY5mGJwcyoABFKDmABFKDmUvKKDmABFKDmG588Rm
job files
  • input files: those coming from the client and the data-staging framework that are needed for the actual job processing. They are copied into a folder named after <arc-job-id>, which also contains the job stdout and stderr files.
  • some metadata files used by ARC, mainly <arc-job-id>.comment and <arc-job-id>.diag

The job session directory is configured with the sessiondir configuration option. Several session directory root paths can be configured. A-REX then selects one of the available directories and appends the <arc-job-id> to the path.

Example:

/nfs/sessiondir5/
├── 10YLDmDRgrynVALY5mGJwcyoABFKDmABFKDmUvKKDmABFKDmG588Rm
│   ├── script.sh
│   ├── outfile
│   ├── stderr
│   └── stdout
├── 10YLDmDRgrynVALY5mGJwcyoABFKDmABFKDmUvKKDmABFKDmG588Rm.comment
├── 10YLDmDRgrynVALY5mGJwcyoABFKDmABFKDmUvKKDmABFKDmG588Rm.diag
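
The layout shown in the example can be sketched in Python as follows. This is only an illustration, not A-REX code: the function names are hypothetical, the second root path `/nfs/sessiondir6` is made up, and the random choice is an assumption, since the text does not say how A-REX picks one of the configured roots.

```python
import random
from pathlib import PurePosixPath


def job_session_dir(session_roots, arc_job_id, rng=random):
    """Pick one configured sessiondir root and append the ARC job id.

    The random choice is an assumption made for this sketch.
    """
    root = rng.choice(list(session_roots))
    return PurePosixPath(root) / arc_job_id


def metadata_files(session_path):
    """Return the .comment and .diag paths that sit next to the job folder."""
    return [
        session_path.with_name(session_path.name + ext)
        for ext in (".comment", ".diag")
    ]


if __name__ == "__main__":
    job_id = "10YLDmDRgrynVALY5mGJwcyoABFKDmABFKDmUvKKDmABFKDmG588Rm"
    path = job_session_dir(["/nfs/sessiondir5", "/nfs/sessiondir6"], job_id)
    print(path)
    for meta in metadata_files(path):
        print(meta)
```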
There are several configuration options that affect the selection of the job scratch directory:
shared_filesystem
defines whether the job session directory is shared between the ARC CE and the WNs (e.g. by means of NFS). Sets the environment variable RUNTIME_NODE_SEES_FRONTEND.
scratchdir
defines the path to the job scratch directory on the WN. Sets the environment variable RUNTIME_LOCAL_SCRATCH_DIR.
movetool
defines which tool the job wrapper uses to move data from the session directory to the local WN scratch directory. Sets the environment variable RUNTIME_LOCAL_SCRATCH_MOVE_TOOL.
shared_scratch
defines that the WN scratch directory is accessible from the ARC CE (e.g. by means of NFS) using the configured path. Sets the environment variable RUNTIME_FRONTEND_SEES_NODE.
This legacy option is currently not used much, as it is uncommon for nodes to share a single scratch directory. If you require such functionality, please contact the ARC team.
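
The relation between these options and the wrapper variables can be summarised with a small Python sketch. It is not ARC code: the function name `wrapper_variables` and the dictionary-based representation of the [arex] and [lrms] blocks are hypothetical, the value passed through for shared_scratch follows the description above, and treating a missing shared_filesystem as `yes` is an assumption.

```python
def wrapper_variables(arex, lrms=None):
    """Map arc.conf options (given as plain dicts) to wrapper variables."""
    lrms = lrms or {}
    variables = {}

    # 'yes' when the session directory is shared, empty otherwise.
    # Assumption: an unset shared_filesystem behaves like 'yes'.
    shared = arex.get("shared_filesystem", "yes") == "yes"
    variables["RUNTIME_NODE_SEES_FRONTEND"] = "yes" if shared else ""

    # Path to the scratch directory on the WN, empty if not configured.
    variables["RUNTIME_LOCAL_SCRATCH_DIR"] = arex.get("scratchdir", "")

    # Only exported when set; the text says the move falls back to mv.
    if "movetool" in lrms:
        variables["RUNTIME_LOCAL_SCRATCH_MOVE_TOOL"] = lrms["movetool"]

    # Legacy option: path under which the ARC CE sees the WN scratch.
    if "shared_scratch" in arex:
        variables["RUNTIME_FRONTEND_SEES_NODE"] = arex["shared_scratch"]

    return variables


if __name__ == "__main__":
    example = wrapper_variables(
        {"shared_filesystem": "yes", "scratchdir": "/mnt/scratch/arc"},
        {"movetool": "rsync -av"},
    )
    for name, value in example.items():
        print(f"{name}='{value}'")
```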

Note

The environment variables described above can be redefined dynamically by RunTime Environments (RTEs). For example, ENV/LRMS-SCRATCH can be used to make use of a local scratch directory that is created dynamically by the LRMS.

Compute inside shared session directory

In the simplest case, where the job session directory is shared between the ARC CE and the WNs (e.g. via NFS) and accessible under the same path, the job scratch directory is the job session directory.

Fig. 5 Sessiondir is shared between ARC CE and WNs. No local scratchdir defined.

Configuration:

[arex]
shared_filesystem = yes

Variables set in the wrapper script:

RUNTIME_NODE_SEES_FRONTEND='yes'

Compute inside a local WN scratch directory

Session directory is shared, WN scratch directory is not shared

For I/O performance reasons it is possible to perform computations inside a local directory on the WN.

In this case the job scratch directory is created in the configured local directory, and the files from the job session directory are moved to the job scratch directory before execution starts (a).

Only the files representing the job’s stdout and stderr remain in the original job session directory and are soft-linked in scratch (b).

After the execution has completed, all output files are moved to the job session directory (b) and are then available for clients to download (c).

The move is performed by default using the mv command. The admin can change this behaviour using the movetool option.
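
The sequence above can be sketched in Python as follows. This only illustrates the described behaviour and is not the wrapper script that ARC generates: the function names `stage_in` and `stage_out` are hypothetical, `shutil.move` stands in for the configured move tool, the file names `stdout` and `stderr` are taken from the example tree, and a POSIX file system with symlink support is assumed.

```python
import shutil
import tempfile
from pathlib import Path

# Files that stay in the session directory and are only linked in scratch.
KEEP_IN_SESSION = ("stdout", "stderr")


def stage_in(session_dir: Path, scratch_root: Path) -> Path:
    """Create <scratch_root>/<job-id> and move the job files into it."""
    job_scratch = scratch_root / session_dir.name
    job_scratch.mkdir(parents=True)
    for name in KEEP_IN_SESSION:
        (session_dir / name).touch()  # make sure the link targets exist
    for entry in list(session_dir.iterdir()):
        target = job_scratch / entry.name
        if entry.name in KEEP_IN_SESSION:
            target.symlink_to(entry)  # soft link back to the session dir
        else:
            shutil.move(str(entry), str(target))
    return job_scratch


def stage_out(job_scratch: Path, session_dir: Path) -> None:
    """Move everything except the soft links back, then remove scratch."""
    for entry in list(job_scratch.iterdir()):
        if entry.is_symlink():
            entry.unlink()
        else:
            shutil.move(str(entry), str(session_dir / entry.name))
    job_scratch.rmdir()


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())
    session = root / "sessiondir" / "job-id"
    session.mkdir(parents=True)
    (session / "script.sh").write_text("echo hello\n")
    scratch = stage_in(session, root / "scratch")
    (scratch / "outfile").write_text("result\n")  # stands in for the job
    stage_out(scratch, session)
    print(sorted(p.name for p in session.iterdir()))
```

Because stdout and stderr are written through the soft links, their content lands in the session directory while the job is still running, which is why only the remaining files need to be moved back afterwards.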

Fig. 6 Sessiondir is shared between ARC CE and WNs, scratchdir is defined and available only on nodes.

Configuration:

[lrms]
movetool = rsync -av

[arex]
shared_filesystem = yes
scratchdir = /mnt/scratch/arc

Variables set in the wrapper script:

RUNTIME_NODE_SEES_FRONTEND='yes'
RUNTIME_LOCAL_SCRATCH_DIR='/mnt/scratch/arc'
RUNTIME_LOCAL_SCRATCH_MOVE_TOOL='rsync -av'

Session directory is NOT shared, WN scratch directory is NOT shared

If the session directory is not shared, the data movement between the ARC CE and the WN can be done by one or a combination of these two:
  • The LRMS backend
    How to implement the data movement between the ARC CE and the WN depends on the particular batch system backend used.
    This is not described here; please refer to your batch system manuals.
    For example, in PBS this corresponds to the #PBS -W stagein options. In SLURM it could be custom prolog and epilog scripts.
  • A custom-made RTE
    This topic is currently not part of this guide, but it is implemented in some ARC setups, so please ask the community for support.

Two known ways of setting up ARC to work in such a configuration are described below.

Static path to WN scratch directory

In this use case the path of the scratch directory on every WN is static and does not depend on the job.

ARC will create a wrapper script that expects all the job files to be on the WN in the path $RUNTIME_LOCAL_SCRATCH_DIR/<arc-job-id>.

Warning

In this use case, when the job session directory is not shared (shared_filesystem = no), the job scratch directory MUST BE defined (scratchdir = path) to tell the LRMS where to find the files. Otherwise job submission to the LRMS will fail.
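
A minimal Python sketch of this requirement is given below. It is not how ARC performs the check: the function name `static_job_scratch` is hypothetical, and the sketch only combines the path rule `$RUNTIME_LOCAL_SCRATCH_DIR/<arc-job-id>` with the rule that scratchdir must not be empty in this use case.

```python
import posixpath


def static_job_scratch(scratchdir, arc_job_id):
    """Return the path where the wrapper expects the job files on the WN.

    Raises ValueError when scratchdir is empty, mirroring the rule that
    scratchdir must be set when the session directory is not shared.
    """
    if not scratchdir:
        raise ValueError(
            "scratchdir must be defined when shared_filesystem = no"
        )
    return posixpath.join(scratchdir, arc_job_id)


if __name__ == "__main__":
    print(static_job_scratch("/mnt/scratch/arc", "job-id"))
```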

After all input files have been gathered in the job session directory on the ARC CE, the LRMS or the custom RTE copies the files to the job scratch directory on the WN (Figure a).

The job performs all I/O in the local job scratch directory. After execution, all declared output files (including stdout and stderr) must be staged out to the job session directory on the ARC CE by the LRMS or the custom RTE (Figure b).

Once the output files are available in the job session directory on the ARC CE, they are ready to be uploaded to external storage elements or downloaded by the user (Figure c).

Fig. 7 NEITHER sessiondir NOR WN scratchdir are shared between ARC CE and WNs.

Configuration:

[arex]
shared_filesystem = no
scratchdir = /mnt/scratch/arc

Variables set in the wrapper script:

RUNTIME_NODE_SEES_FRONTEND=''
RUNTIME_LOCAL_SCRATCH_DIR='/mnt/scratch/arc'

Dynamic job path to the WN scratch directory

In this use case the LRMS generates a dynamic path for each submitted job. This path is not known to ARC when the wrapper script is created, so it must be obtained at runtime, after the job has been submitted.

A sysadmin may already have defined a variable holding this path in their prolog/epilog scripts.

The workflow is very similar to the one shown in the figure for the static path use case. The difference is that the name of the LRMS-specific environment variable containing the dynamic path is configured in a special RTE called ENV/LRMS-SCRATCH. At runtime this variable is used to generate a path of the form <lrms-job-id>/<arc-job-id>.

The sysadmin can now create a custom RTE with LRMS specific commands and make use of the variable above when referencing the dynamic dir.

In this scenario the arc.conf option scratchdir can be defined, but it is not mandatory, as the LRMS may use a completely different base path for each node or job.

However, if it is configured, the scratchdir path is prepended to the dynamic path, as in: <arc-conf-scratchdir-var>/<lrms-job-id>/<arc-job-id>
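
The path composition described above can be sketched in Python as follows. This is an illustration only and does not reproduce the ENV/LRMS-SCRATCH RTE: the function name `dynamic_job_scratch` and the variable name `LRMS_SCRATCH` are hypothetical, and the environment is passed in as a plain mapping to keep the example self-contained.

```python
import posixpath


def dynamic_job_scratch(env, var_name, arc_job_id, scratchdir=""):
    """Build the job scratch path from an LRMS-provided variable.

    env        -- mapping of environment variables (e.g. os.environ)
    var_name   -- name of the LRMS variable holding the dynamic path
                  (hypothetical here; configured in the RTE in practice)
    scratchdir -- optional arc.conf scratchdir, prepended when set
    """
    dynamic = env[var_name]
    # Skip the prefix when scratchdir is not configured.
    parts = [part for part in (scratchdir, dynamic) if part]
    return posixpath.join(*parts, arc_job_id)


if __name__ == "__main__":
    env = {"LRMS_SCRATCH": "4242"}  # value the LRMS would set at runtime
    print(dynamic_job_scratch(env, "LRMS_SCRATCH", "job-id", "/mnt/scratch/arc"))
```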

Please refer to the documentation of the bundled ENV/LRMS-SCRATCH runtime environment for more details.

Configuration:

[arex]
shared_filesystem = no

Variables set in the wrapper script:

RUNTIME_NODE_SEES_FRONTEND=''
RUNTIME_LOCAL_SCRATCH_DIR=''