NORDUGRID-MANUAL-2

ARC 0.8.x Server Installation Instructions: Setting Up A Grid Resource


Pre-installation steps:

General requirements for equipment

Hardware, operating system etc

The NorduGrid middleware, also known as the Advanced Resource Connector (ARC), does not impose heavy requirements on hardware. Any 32-bit architecture will do, as well as many 64-bit ones; some success has been reported for PPC, too. CPU frequencies from 333 MHz and RAM from 128 MB have been tested. Disk space required for the ARC installation, including the development interface, is about 160 MB, while external software (most notably, a minimal setup of Globus Toolkit 5) requires another 10 MB. Servers (front-ends, gatekeepers, database servers, storage arrays etc.) require both outbound and inbound network connectivity. If you are behind a firewall, a range of ports will have to be completely opened. For clusters, the worker nodes can be on either a private or a public network.

A shared file system, such as NFS, is desirable (due to simplicity) but not required, if the Local Resource Management System provides means for file staging between the computing nodes and the front-end, or if execution happens on the same machine (as it does with fork). Local authentication of Grid users is supported through embedded authentication algorithms and callouts to external executables or functions in dynamically loadable libraries. Actual implementations (e.g., for AFS) require site-specific modules to be provided.

The NorduGrid ARC middleware is expected to run on any system supported by Globus. At the moment, only GNU/Linux is supported, in the following distributions: Fedora, Red Hat Enterprise Linux, Debian, Ubuntu and (partially) OpenSuSE.

DNS Requirements for GSI (Grid Security Infrastructure)

In order for the authentication of a server's host certificate to be successful, the reverse DNS lookup of the IP address of the server must result in the hostname given in the host certificate.

This means that the reverse DNS lookup for a host running a GSI enabled service must be configured properly - a "host not found" result is not acceptable. When a server has several hostnames/aliases the host certificate should be requested with the hostname that is used in the reverse lookup table in the DNS.

This reverse lookup must work for all clients trying to connect to the server, including clients running on the machine itself. Even if the host is a dedicated server and no user interface commands are run on it, other clients such as the uploader and downloader processes run by the Grid Manager require GSI authentication to work.

Since the hostname in the host certificate is fully qualified the reverse lookup must yield the fully qualified hostname. If the /etc/hosts file is used for local lookups instead of DNS make sure that the fully qualified hostname is listed before any shortnames or aliases for the server host.

If e.g. the /etc/hosts file of the server looks like this

1.2.3.4    somename    somename.domain.com

any clients running on that machine can NOT contact servers on the machine itself since the result of a reverse lookup will be the unqualified hostname "somename" which will not match the fully qualified hostname in the host certificate. Such an /etc/hosts file should be modified to read

1.2.3.4    somename.domain.com    somename
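A quick way to verify the local resolution order is to check what the system itself reports as its hostname; this sketch only warns, it changes nothing:

```shell
# Sanity check: the locally resolved hostname must be fully qualified.
fqdn=$(hostname -f)
case "$fqdn" in
  *.*) echo "hostname check: OK ($fqdn)" ;;
  *)   echo "hostname check: WARNING - '$fqdn' is not fully qualified; check /etc/hosts" ;;
esac
```

If the check prints a warning, reorder the entries in /etc/hosts as shown above so that the fully qualified name comes first.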

Time synchronization

Since authorization on the Grid relies on temporary proxies, it is very important to synchronize the clocks on your machines with a reliable time server. If the clock on a cluster is off by 3 hours, the cluster will either reject a newly created user proxy for the first 3 hours of its lifetime and then accept the proxy for 3 hours longer than it is supposed to, or start rejecting the proxy three hours too early, depending on the direction in which the clock is off.
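The shifted acceptance window can be illustrated with pure shell arithmetic (times in hours; the 12-hour proxy lifetime is an arbitrary example):

```shell
# A proxy valid from t=0 to t=12 (hours), seen by a server whose clock
# is 3 hours behind the true time (skew = -3).
valid_from=0; valid_to=12; skew=-3
# The server's clock reads (true time + skew), so it accepts the proxy
# while the true time is within [valid_from - skew, valid_to - skew].
accept_from=$((valid_from - skew)); accept_to=$((valid_to - skew))
echo "proxy accepted during true time [$accept_from, $accept_to] hours"
# -> the proxy is rejected for its first 3 hours and accepted 3 hours too long
```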

Clusters:

  1. First, you have to create (some) UNIX accounts on your cluster dedicated to the Grid. These local UNIX accounts will be used to map Grid users locally, and every Grid job or Grid activity will take place via these accounts. In the simplest scenario, it is enough to create a single account, e.g. a user called grid, but you can also have separate accounts for the different Grid user groups. In addition to the authorization rules provided by the middleware, you may group the created Grid accounts into UNIX groups and use the local UNIX authorization methods to restrict the Grid accounts.
  2. Create disk areas on the front-end which will be used by the Grid services. A typical setup is given in the table below with example locations indicated. NFS means that the directory has to be available on the nodes. It is recommended to put the grid area and the cache directory onto separate volumes (partitions, disks). The cache can be split into 2 subdirectories:
    1. a "control" subdirectory for control files and
    2. a "data" subdirectory for the cached data itself.
    For security reasons, the control directory may be made accessible only from the front-end.


  3. Grid directories:
    grid area (required)
      location: NFS
      description: the directory which accommodates the session directories of the Grid jobs
      example: /scratch/grid
    cache directory (optional)
      location: NFS/local
      description: the place where the input files of the Grid jobs are cached
      example: /scratch/cache
    runtime environment scripts (optional)
      location: NFS
      description: the place for the initialization scripts of the pre-installed software environments
      example: /SOFTWARE/runtime
    control directory (required)
      location: local to the front-end
      description: the directory for the internal control files of the Grid Manager
      example: /var/spool/nordugrid/jobstatus


    Further notes on the Grid directories: some of the NFS requirements can be relaxed with a special cluster setup and configuration. For the possible special setups please consult the Grid Manager documentation.
  4. Check the network connectivity of the computing nodes. For the NorduGrid middleware, internal cluster nodes are NOT required to be fully available on the public internet (however, user applications may eventually require it). Nodes can have inbound, outbound, both or no network connectivity. This nodeaccess property should be set in the configuration (see below).
  5. Make your firewall Grid-friendly: there are certain incoming and outgoing ports and port ranges which need to be opened in case your Grid resource is behind a firewall. All of the requirements come from the Globus internals (you can read more on Globus and firewalls). Globus-based Grid services, including those currently implemented in ARC, are not supported to work behind NAT firewalls. NorduGrid ARC needs the following incoming and outgoing ports to be opened:
    • For MDS, default 2135
    • For GridFTP:
      • default 2811
      • a range of ports for GridFTP data channels, typically 9000-9300
    • For HTTP(s), default 80 and 443
    • For HTTPg, default 8443 (outgoing only)
    • For SMTP, default 25 (outgoing only)
    • For NTP, default 123 (outgoing only, in case NTP is used for time synchronisation)
    Most ports, including 2135 and 2811, are registered with IANA and should normally not be changed. The ports for GridFTP data channels can be chosen arbitrarily, based on the following considerations: gridftpd by default handles 100 connections simultaneously, and each connection should not use more than 1 additional TCP port. Taking into account that Linux tends to keep ports allocated for some time even after the handle is closed, it is a good idea to triple that amount. Hence about 300 data transfer ports should be enough for the default configuration. Typically, the range of ports from 9000 to 9300 is opened. Remember to specify this range in the ARC configuration file ([common] section, globus_tcp_port_range attribute) later on.
  6. Configure the Local Resource Management/Batch System to suit the Grid. In a typical scenario, a queue (or queues) dedicated to Grid jobs has to be created, all or some of the cluster nodes assigned to the Grid queues, and queue and user limits set for the Grid queue and the Grid accounts. Brief instructions are available for PBS and Condor setups. DO NOT use PBS routing queues as grid queues – they are not supported.
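The port list above can be turned into firewall rules. The sketch below only prints iptables commands rather than applying them, so you can review and adapt them to your site (the chain and policy are assumptions, and you would run the printed commands as root):

```shell
# Print (do not apply) ACCEPT rules for the ARC service ports listed above.
for port in 2135 2811; do
  echo "iptables -A INPUT -p tcp --dport $port -j ACCEPT"
done
# GridFTP data channel range; keep it in sync with globus_tcp_port_range
# in the [common] section of arc.conf.
echo "iptables -A INPUT -p tcp --dport 9000:9300 -j ACCEPT"
```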

Storage Element:

  1. Install a standard Linux box with a dedicated disk storage area. In case the SE wants to serve several Grid user groups (or Virtual Organizations) it is preferable to dedicate separate disks (volumes, partitions, etc.) for the different Grid user groups.
  2. Creating Grid accounts: you do not have to create UNIX accounts dedicated to the Grid, but you may find it useful to do so for local accounting. These local UNIX accounts will be used to map Grid users locally, and the data stored on the storage element will be owned by these accounts. In the simplest scenario, it is enough to create a single account, e.g. a user called grid, but you can also have separate accounts for the different Grid user groups. You may find it useful to put all the Grid accounts into the same UNIX group.
  3. Make your firewall Grid-friendly: there are certain incoming and outgoing ports and port ranges which need to be opened in case your Grid resource is behind a firewall. All of the requirements come from the Globus internals (you can read more on Globus and firewalls). Globus-based Grid services, including those currently implemented in ARC, are not supported to work behind NAT firewalls. NorduGrid ARC Storage Element needs the following ports to be opened:
    • For MDS, default 2135
    • For GridFTP:
      • default 2811
      • a range of ports for GridFTP data channels, typically 9000-9300
    • For HTTP(s), default 80 and 443
    • For HTTPg, default 8443 (outgoing only)
    • If SRM-enabled Smart Storage Element is used instead of the GridFTP one, then default ports are 8000 for HTTPg and 8001 for HTTPs connections
    • For SMTP, default 25 (outgoing only)
    • For NTP, default 123 (outgoing only, in case NTP is used for time synchronisation)
    Most ports, including 2135 and 2811, are registered with IANA and should normally not be changed. The ports for GridFTP data channels can be chosen arbitrarily, based on the following considerations: gridftpd by default handles 100 connections simultaneously, and each connection should not use more than 1 additional TCP port. Taking into account that Linux tends to keep ports allocated for some time even after the handle is closed, it is a good idea to triple that amount. Hence about 300 data transfer ports should be enough for the default configuration. Typically, the range of ports from 9000 to 9300 is opened. Remember to specify this range in the ARC configuration file ([common] section, globus_tcp_port_range attribute) later on.

Collecting and Installing the Grid software (middleware):

The same basic server software is needed both for cluster and storage resources. The NorduGrid download area contains all the required software including the necessary external packages as well as pre-compiled binaries for many Linux distributions, and also source code distributions. Binaries are available as either RPMs, debs or tarballs.

For RedHat-based systems (RHEL, Fedora, SL, CentOS), the recommended way is to use the Yum repositories and the groupinstall option.

For Debian-based systems (Debian, Ubuntu), the recommended way is to use the APT repositories.

If you do not use Yum or APT, please follow the instructions below.

NorduGrid ARC middleware depends on several external packages, most notably, Globus Toolkit 5 (GT5). Since 2009, GT5 and most other external dependencies (VOMS, LFC) are available from standard repositories of Fedora, RedHat (via EPEL), Debian and Ubuntu. For other operating systems, you can check NorduGrid's Globus Toolkit packages, or install Globus from source. After installing GT5, please check that the variables GLOBUS_LOCATION (and GPT_LOCATION, for older systems) are set according to your Globus (and GPT) installations.

  1. Check that all the necessary external dependencies such as Globus, VOMS, BDII etc are satisfied, or download and install missing packages, if any.
  2. Install the ARC middleware packages from the NorduGrid yum/apt repositories or download and install them from the NorduGrid Downloads area. You definitely need the nordugrid-arc-gridftpd, nordugrid-arc-grid-manager, nordugrid-arc-infosys-ldap, nordugrid-arc-ca-utils, nordugrid-arc-gridmap-utils and nordugrid-arc-libs packages, while the nordugrid-arc-client, nordugrid-arc-libs-devel and the nordugrid-arc-doc packages are optional but recommended. It is useful to have the client installed on a server for testing purposes. You may need to install some of the external packages in order to satisfy dependencies.
  3. If you plan to obtain host and user certificates from the NorduGrid Certification Authority (that is, if your site resides in Denmark, Finland, Iceland, Norway or Sweden), make sure to install the package globus-gsi-cert-utils-progs.

Optional: Re-building ARC middleware

This step is only needed if you are using an unsupported operating system or different versions of external dependencies, and experience problems with ARC.

See also detailed instructions on how to build the ARC middleware.

  1. Cross-check that the variables GLOBUS_LOCATION (and GPT_LOCATION, when relevant) are pointing to your Globus (and GPT) installation.
  2. Check that all the necessary external dependencies are satisfied, or download and install missing packages, if any.
  3. Get the source RPM of the latest ARC release, nordugrid-arc-<x.y.z-1>.src.rpm, from the NorduGrid Downloads area and rebuild it: rpm --rebuild nordugrid-arc-<x.y.z-1>.src.rpm
  4. Alternatively, you can get a tarball nordugrid-arc-<x.y.z>.tar.gz, and follow the usual procedure: tar xvzf nordugrid-arc-<x.y.z>.tar.gz
    cd nordugrid-arc-<x.y.z>
    ./configure
    make
    make install

Setting up the Grid Security Infrastructure: Certificates, Authentication and Authorization

Read the following section carefully, as your resource will not be able to function if it has improper or outdated credentials.

The following considerations apply to both clusters and storage elements. You may find our certificate mini How-to useful.

  1. Your site needs certificates for the Grid services, issued by your national Certificate Authority (CA). The minimum is a host certificate, but we recommend having a certificate for each service (e.g. LDAP) as well. Each country has its own certification policies and procedures; please consult your local Certificate Authority.
    In order to be able to work on the Grid, you will need to install Certificate Authority credentials, from either the CA itself, the IGTF repository, the NorduGrid yum/apt repositories, or the NorduGrid Downloads area. You need the credentials of the CA which will certify you and your site, as well as the credentials of all the CAs which certified the services you plan to use and the users you plan to accept. For example, if your host certificate is issued by the NorduGrid CA, your user has a certificate issued by the Estonian CA, and she is going to transfer files between your site and Slovakia, you need the NorduGrid, Estonian and Slovak CA credentials.
    In case your resource is in a Nordic country (Denmark, Finland, Iceland, Norway or Sweden), install the certrequest-config package from the NorduGrid Downloads area. This contains the default configuration for generating certificate requests for Nordic-based services and users. If you are located elsewhere, contact your local CA for details. For example, in Nordic countries, generate a host certificate request with grid-cert-request -host <my.host.fqdn> and an LDAP certificate request with grid-cert-request -service ldap -host <my.host.fqdn>, and send the request(s) to the NorduGrid CA for signing.
    Upon receipt of the signed certificates, place them into the proper location (by default, /etc/grid-security/). Check that the certificate and key files are owned by root, that the private keys are readable only by root, and that none of the files has executable permissions. Also make sure the private keys are not password-protected. This is especially important if you used a tool other than grid-cert-request or ran it in interactive mode.
  2. Set up your authentication policy: decide which certificates your site will accept. You are strongly advised to obtain credentials from each CA by contacting them. To simplify this task, the NorduGrid Downloads area has a non-authoritative collection of CA credentials approved by the EUGridPMA. As soon as you decide on the list of trusted certificate authorities, simply download and install the packages containing their public keys etc. Before installing any CA package, you are advised to check the credibility of the CA and verify its policy!
  3. The Certificate Authorities are responsible for maintaining lists of revoked personal and service certificates, known as CRLs (Certificate Revocation Lists). It is the site's (that is, your) responsibility to check the CRLs regularly and deny access to Grid users presenting a revoked certificate. An outdated CRL will render your site unusable. NorduGrid provides an automatic tool for regular CRL check-ups: we recommend installing nordugrid-arc-ca-utils from the NorduGrid Downloads area. The utility periodically keeps track of the CA revocation lists.
  4. Set up your authorization policy: decide which Grid users or groups of Grid users (Virtual Organizations) are allowed to use your resource, and define the Grid mappings (Grid users to local UNIX users). The Grid mappings are listed in the so-called grid-mapfile. Within NorduGrid, there is an automatic tool which keeps the local grid-mapfiles synchronized with a central user database. If your site joins NorduGrid, you are advised to install nordugrid-arc-gridmap-utils from the NorduGrid Downloads area. Follow the configuration instructions to configure your system properly authorization-wise: it involves editing the [vo] blocks in the configuration file. For further information on authorization, read the NorduGrid VO documentation.
    IMPORTANT: you either maintain the Grid mappings by hand, editing /etc/grid-security/grid-mapfile directly, or use nordugrid-arc-gridmap-utils (the nordugridmap script run through cron) to create and maintain the mappings file for your site. In the latter case, the utility keeps the grid-mapfile synchronized with the central authorization service of your choice, for instance the NorduGrid user list. If you install nordugrid-arc-gridmap-utils, you ONLY have to edit the [vo] blocks in the configuration file and, optionally, the file representing the local list of mappings (usually /etc/grid-security/local-grid-mapfile).
    ADVANCED: you may use more flexible methods of authorizing and mapping Grid users to UNIX accounts, including dynamic allocation and third-party algorithms. For more information, please refer to "Configuration and Authorisation of ARC Services" and "The NorduGrid Grid Manager and GridFTP Server" in the Documents section. But you still need to maintain /etc/grid-security/grid-mapfile with at least a superset of the authorized users, because the information system still relies on it.
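The most common credential mistake is a world-readable or group-readable private key. The sketch below demonstrates the permission test on a scratch file; on a real host you would point it at /etc/grid-security/hostkey.pem instead:

```shell
# Create a scratch "key" file to demonstrate the permission test.
key=$(mktemp)
chmod 400 "$key"
perms=$(stat -c %a "$key")      # octal permission bits (GNU coreutils stat)
case "$perms" in
  400|600) echo "key permissions OK ($perms)" ;;
  *)       echo "WARNING: key permissions too open ($perms)" ;;
esac
rm -f "$key"
```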

Configuring the Grid resource:

The next step is the configuration of your resource. ARC uses a single configuration file per host node, independently of the number and nature of the services it hosts. The default location of this file is /etc/arc.conf; a different location can be specified via the environment variable ARC_CONFIG. The unified configuration template is installed with the nordugrid-arc-server package and is located in $ARC_LOCATION/share/doc/arc.conf.template. This template currently also serves as the basic configuration document, and man arc.conf is available as well. The configuration file consists of dedicated blocks for different services. If your host node runs only some of the services, the unrelated blocks should be removed.

Not having a service block means not running the corresponding service on the resource.

For more details, see the Configuration and Authorisation document.
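As an illustration of the block structure, a minimal skeleton might look like the fragment below. Only globus_tcp_port_range and port are discussed in this document; the remaining attribute names (controldir, sessiondir) are assumptions, the paths are the examples from the directory table above, and everything should be cross-checked against arc.conf.template before use.

```
[common]
globus_tcp_port_range="9000,9300"

[grid-manager]
# assumed attribute names; paths from the example directory table
controldir="/var/spool/nordugrid/jobstatus"
sessiondir="/scratch/grid"

[gridftpd]
port="2811"

[infosys]
port="2135"

[vo]
# user mapping rules; see the NorduGrid VO documentation
```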

  1. Create your /etc/arc.conf by using the configuration template arc.conf.template from the $ARC_LOCATION/share/doc directory (provided you installed the NorduGrid ARC middleware under $ARC_LOCATION). With the arc.conf you configure all services and processes:
    • GridFTP server
    • Grid Manager
    • Smart Storage Element
    • job submission interface
    • Grid storage areas
    • Information providers
    • Authorization

    Make sure you configure your services to use the ports that are opened in the firewall. In particular, define globus_tcp_port_range="9000,9300" in the [common] section of arc.conf, or whatever range is opened in the firewall for GridFTP data connections. The ports 2135 (MDS) and 2811 (GridFTP) can be changed with the port="<port number>" option in the [infosys] and [gridftpd] sections of arc.conf, respectively. However, changing the MDS port number is strongly discouraged, because the current ARC CLI can use only the standard port 2135.

    Make sure you have one or more [vo] sections in arc.conf. These blocks should be configured to create user mappings in /etc/grid-security/grid-mapfile (the latter file name is configurable in arc.conf). Follow the configuration template and consult NorduGrid VO lists for detailed information.

  2. If your site is going to provide resources via the NorduGrid production grid, you will need to check the latest NorduGrid GIIS Information for the list of country-level and core NorduGrid Grid Information Index Services to which your host will have to register.
  3. Check that /etc/sysconfig/nordugrid and /etc/sysconfig/globus files have the environment variables defined properly. For details, consult the relocation instructions.
  4. Optionally, you can set up Runtime Environments on your computing cluster. Setting up a Runtime Environment means installing a specific application software package on the cluster in a centralized and shared manner (the software package is made available to the worker nodes as well!), and placing a Runtime Environment initialization script (named after the Runtime Environment) into the dedicated directory. You may want to consult a Runtime Environment Registry for a list of official Runtime Environments.
  5. If you configured your storage element to be GACL-enabled, consult the GACL Howto for explanations and examples of .gacl files.
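For reference, each line of the grid-mapfile mentioned above maps one certificate subject (DN) to a local UNIX account; the DN below is a made-up example, and "grid" is the example account created earlier:

```
"/O=Grid/O=NorduGrid/OU=example.org/CN=Jane Doe" grid
```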

Startup scripts, services, logfiles, debug mode, test-suite:

  1. After a successful installation and configuration of a NorduGrid resource, the following services must be started:
    • The GridFTP server (gridftpd daemon): /etc/init.d/gridftpd start
    • The Information System (LDAP server) and the registration processes: /etc/init.d/grid-infosys start
    • The Grid Manager daemon (not needed for a Storage Element): /etc/init.d/grid-manager start
    • The Smart Storage Element server (may be used on the Storage Element): /etc/init.d/httpsd start
    All services can be run under a non-root account (configurable in arc.conf). While on a Storage Element this only affects the ownership of stored data, for a Computing Element the impact is more significant and part of the functionality is lost. Make sure that the host and service certificates are owned by the corresponding users (those in whose name the services are started).
  2. The log files can be used to check the services:
    • the Information System uses /var/log/infoprovider.log and the files under /var/log/bdii/ or /var/log/bdii4. You can also find some logs in the /var/run/bdii/ or /var/run/bdii4 directory, respectively.
    • gridftpd writes its log to /var/log/gridftpd.log by default; the debug level can be set in arc.conf.
    • the Grid Manager uses /var/log/grid-manager.log for general information and /var/log/gm-jobs.log for logging job information; the debug level is set in arc.conf.
    The location of the log files may be altered in arc.conf. Log rotation is performed by the services themselves and is configured in arc.conf too; only the Information System needs an external utility to rotate its logs. The startup scripts log failures to the syslog. Once a server is up and running, you should consult the corresponding server's log file. If a service fails even to start up, check the syslog file of your system, normally /var/log/messages or /var/log/syslog.
  3. Debug information: in arc.conf, different debug levels can be set for all services. Please note that enabling debugging may cause serious performance losses (especially in the case of the MDS LDAP server); therefore use the default level of debugging on a production system.
  4. The NorduGrid client ships with the ngtest utility. Use it to test the basic functionality of the computing resource. The utility includes several tests which can be of interest for testing your cluster, e.g. simple up- and download tests. A complete list of test cases is obtained by issuing
    ngtest --help
    Prior to submitting test jobs, make sure you possess a valid user certificate, have generated a valid Grid proxy and have the credentials of all the necessary CAs installed. Consult the User Guide for detailed information on certificates, proxies and CA credentials. For a quick installation validation, run test number 1 against your resource:
    ngtest -c <my.host.fqdn> -d 1 -job 1
    This will execute a Grid job, including staging of files to the computing resource (downloading input files from several locations and caching) and running a test calculation on the resource. We recommend running at least this test against a newly installed resource, and fetching the job output with:
    ngget -a -d 1
    See the ngtest man-page for more details on the test suite. The ngls client (comes with the nordugrid-arc-client package) can be used for testing the Storage Element and Computing Element interface setup:
    ngls -d 3 -l gsiftp://<my.host.fqdn>
    This opens a GridFTP connection to the site; you should be able to see the top level of the virtual directory tree configured on the server side.