Job scratch area
ARC supports several approaches to managing the job scratch area during the job life cycle. In most cases this is achieved by generating a wrapper submission script for the batch system that carries out the tasks related to directory creation and data movement.
The key elements are:
- job session directory
directory on the ARC CE where the job files are located.
- job scratch directory
directory on LRMS-managed worker nodes (WNs) where all I/O during computation is performed.
- <arc-job-id>
a unique identifier for the job, assigned by ARC. Example:
869c20be3993
- job files
input files, coming from the client and the data-staging framework, that are needed for the actual job processing. They are copied into a directory named after <arc-job-id>, which also contains the job stdout and stderr files.
some metadata files used by ARC, mainly <arc-job-id>.comment and <arc-job-id>.diag.
The job session directory is configured with the sessiondir configuration option. Several session directory root paths can be configured; A-REX then selects one of the available directories and appends the <arc-job-id> to the path.
Example:
/nfs/sessiondir5/
├── 869c20be3993
│ ├── script.sh
│ ├── outfile
│ ├── stderr
│ └── stdout
├── 869c20be3993.comment
├── 869c20be3993.diag
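The layout above can be reproduced with a few lines of Python. This is only a minimal sketch of the naming scheme, not A-REX code: the function name job_paths is hypothetical, and picking the root at random is an assumption, since the text only states that A-REX selects one of the configured directories.

```python
import random
from pathlib import PurePosixPath


def job_paths(session_roots, job_id, rng=random):
    """Return the session directory and metadata file paths for one job.

    Picking the root at random is an assumption made for this sketch;
    the documentation only says that one of the configured roots is selected.
    """
    root = PurePosixPath(rng.choice(list(session_roots)))
    return {
        "session_dir": root / job_id,           # holds job files, stdout, stderr
        "comment": root / f"{job_id}.comment",  # ARC metadata file
        "diag": root / f"{job_id}.diag",        # ARC metadata file
    }


paths = job_paths(["/nfs/sessiondir5"], "869c20be3993")
print(paths["session_dir"])  # /nfs/sessiondir5/869c20be3993
```

With a single configured root the result is deterministic; with several roots the job directory and its metadata files always end up under the same root.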
Several configuration options affect the selection of the job scratch directory:
- shared_filesystem
defines whether the job session directory is shared between the ARC CE and the WNs (e.g. by means of NFS). Sets the environment variable RUNTIME_NODE_SEES_FRONTEND.
- scratchdir
defines the path to the job scratch directory on the WN. Sets the environment variable RUNTIME_LOCAL_SCRATCH_DIR.
- movetool
defines which tool the job wrapper uses to move data from the session directory to the local scratch directory on the WN. Sets the environment variable RUNTIME_LOCAL_SCRATCH_MOVE_TOOL.
- shared_scratch
defines the path under which the WN scratch directory is accessible from the ARC CE (e.g. by means of NFS). Sets the environment variable RUNTIME_FRONTEND_SEES_NODE.
Currently this legacy option is rarely used, as it is uncommon for nodes to share a single scratch directory. If you require such functionality, please contact the ARC team.
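The relation between the four options and the environment variables they set can be summarized as a lookup table. The sketch below is illustrative only: the function scratch_environment is hypothetical, the sample values "/scratch" and "cp" are made up, and copying values through unchanged is an assumption, as the real job wrapper may encode them differently.

```python
# Option name -> environment variable, as listed in the text above.
OPTION_TO_ENV = {
    "shared_filesystem": "RUNTIME_NODE_SEES_FRONTEND",
    "scratchdir": "RUNTIME_LOCAL_SCRATCH_DIR",
    "movetool": "RUNTIME_LOCAL_SCRATCH_MOVE_TOOL",
    "shared_scratch": "RUNTIME_FRONTEND_SEES_NODE",
}


def scratch_environment(options):
    """Translate configured options into wrapper environment variables.

    Assumption: values are copied through unchanged and unset options
    produce no variable; unrelated options are ignored.
    """
    return {
        OPTION_TO_ENV[name]: str(value)
        for name, value in options.items()
        if name in OPTION_TO_ENV and value is not None
    }


# Illustrative values only; they are not defaults taken from ARC.
env = scratch_environment({"scratchdir": "/scratch", "movetool": "cp"})
print(env)
```

Keeping the mapping in one table makes it easy to see which variable a RunTime Environment would have to redefine in order to change the behaviour of a given option.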
Note
The environment variables described above can be redefined dynamically by RunTime Environments. For example, ENV/LRMS-SCRATCH can be used to utilize a local scratch directory that is created dynamically by the LRMS.