Coordinators of the ATLAS Rome Production in the NorduGrid space are
Farid Ould-Saada and Mattias Ellert.
For all the sites participating in the ATLAS Rome Production via the NorduGrid/ARC infrastructure, the following steps have to be performed:
Consider upgrading to
NorduGrid ARC 0.4.4. This is the best release so far.
It is strongly recommended to upgrade your Globus installation to version 2.4.3-16ng (released on December 6, 2004), as it includes very important bug fixes.
Register your site with the ATLAS GIIS. To do
this, add the following block to globus.conf:
[mds/gris/registration/GrisToAtlas]
regname="Atlas"
reghn=atlasgiis.nbi.dk
regperiod=30
servicename=nordugrid-cluster-name
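For the new registration block to take effect, the local information system has to be restarted. A minimal sketch, assuming the standard NorduGrid init scripts are installed (the service name may differ on your installation):
service grid-infosys restart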
When done, send an e-mail containing your site's host name to the coordinators.
Join the dedicated mailing list by sending a message with the body text "subscribe atlas-ng-arc" to the list server. NB! Make sure your "From" and
"Return-Path" fields match the e-mail address you subscribe with,
otherwise you will not be able to post to the list (this is CERN's new policy).
Instructions for ATLAS release installation
Software installation. Install ATLAS software releases
9.0.3 and 9.0.4. The official CERN release of the Pacman
kit is available for RedHat7.3, but it is reported to work
on other systems as well (provided the necessary libraries and
compilers are in place).
ATLAS provides instructions on how to get
and install a release of the Atlas Software on RedHat7.3 via the
"Pacman kit"; please also consult the digested
instructions by Thomas Kittelmann and Jørgen Beck Hansen,
which are useful for non-RedHat7.3 systems as well.
For RHEL and FC1, RPMs are available, as well as tarballs
for Debian. They are located, for 9.0.3 and 9.0.4
respectively, at
ftp://ftp.nordugrid.org/applications/hep/atlas/9.0.3/
ftp://ftp.nordugrid.org/applications/hep/atlas/9.0.4
Source RPMs (SRPMs), including the latest patches for Fedora, are
available from the same location. If you have a different system, you
can do one of the following: try the RH7.3 one (might work), rebuild
the release from SRPMs, or request to get your system supported by
writing to the coordinators.
Information on package names, versions and the approximate order
of installation when building from SRPMs is in the file
dep_list.txt
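As an illustration, the dependency list can be fetched from the same FTP area; the exact path below is an assumption, so adjust it to wherever you found the SRPMs for your release:
wget ftp://ftp.nordugrid.org/applications/hep/atlas/<release>/dep_list.txt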
For all the packages, there is an installation script named
installatlasrpms-<dist>.sh
(for Debian, use "sarge" for <dist>). To install the
packages for your Linux distribution, fetch the corresponding
script, define the necessary environment variables and
execute it:
wget ftp://ftp.nordugrid.org/applications/hep/atlas/<release>/installatlasrpms-<dist>.sh
export ATLAS_ROOT=<where_you_want_to_have_atlas>
export G4INSTALL=<where_you_want_to_have_geant4>
export ROOTSYS=<where_you_want_to_have_root>
export CERN=<where_you_want_to_have_cernlib>
chmod u+x installatlasrpms-<dist>.sh
./installatlasrpms-<dist>.sh
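For example, to install release 9.0.4 on a Debian sarge system (the installation paths below are purely illustrative and should be adjusted to your site):
wget ftp://ftp.nordugrid.org/applications/hep/atlas/9.0.4/installatlasrpms-sarge.sh
export ATLAS_ROOT=/opt/atlas
export G4INSTALL=/opt/geant4
export ROOTSYS=/opt/root
export CERN=/opt/cernlib
chmod u+x installatlasrpms-sarge.sh
./installatlasrpms-sarge.sh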
Check outbound connectivity.
ATLAS jobs have to contact external databases, thus outbound
connectivity from the worker nodes must be
enabled. For firewalled environments, several ports have
to be opened, e.g., port 3306 for MySQL. Port
10521 is reported to be needed for another service,
and further ports may turn out to be needed over time.
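A quick way to check outbound connectivity is to attempt a TCP connection from a worker node to an external service port. The sketch below assumes nc is installed on the node and uses a placeholder host name; substitute a real database host used by the production:
nc -z -w 5 some.external.host 3306 && echo "port 3306 reachable" || echo "port 3306 blocked"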
Local validation. If you installed the "Pacman
kit", you can validate the ATLAS software installation locally
by using the standard ATLAS KitValidation
tool. Please note that this will not check the entire
functionality, but will only test whether the installation was
successful. For details, follow instructions in the Kit description.
If you installed a NorduGrid distribution of the ATLAS s/w, a script
TEST-ATLAS-<release> was produced automatically in the
location where you executed
installatlasrpms-<dist>.sh. It is needed to set up the
ATLAS runtime environment correctly. The installation can then be
validated locally by running the corresponding validation script
kitval9.sh, available at:
http://grid.uio.no/atlas/validation/kitval9.sh
To validate the installation locally, do:
cd <some_place_with_some_disk_space>
wget http://grid.uio.no/atlas/validation/kitval9.sh
chmod u+x kitval9.sh
./kitval9.sh <path_to_the_script>TEST-ATLAS-<release>
or
cd <some_place_with_some_disk_space>
wget http://grid.uio.no/atlas/validation/kitval9.sh
chmod u+x kitval9.sh
source <path_to_the_script>TEST-ATLAS-<release>
./kitval9.sh -r <release>
The validation runs KitValidation with some default arguments and
leaves the outputs in place; the location is reported when the
script exits. Each run of KitValidation creates a new
sub-directory with a partially random name.
Publish the release tag. Following a successful local validation,
copy the script TEST-ATLAS-<release> into the subdirectory APPS/HEP
of your runtime environment directory:
cp <path_to_the_script>TEST-ATLAS-<release> <path_to_the_rte_dir>APPS/HEP/TEST-ATLAS-<release>
If you installed the release from the "Pacman kit", you will have to create such a script by hand. An example is described in Step
6 of the instructions (rename setup-9.0.3.sh to
TEST-ATLAS-9.0.3).
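For illustration only, such a hand-made script could be a thin wrapper that sources the Pacman-kit setup script; the installation prefix /opt/atlas/9.0.3 below is an assumption and must be adjusted to your site:
#!/bin/sh
# TEST-ATLAS-9.0.3: set up the ATLAS 9.0.3 runtime environment
# by sourcing the setup script produced by the Pacman installation
# (assumed to live under /opt/atlas/9.0.3).
source /opt/atlas/9.0.3/setup-9.0.3.sh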
Validate the release Grid-wise.
Fetch the validation job definition and submit the validation job to your cluster:
wget http://grid.uio.no/atlas/validation/kitval<release>_rpm_TEST.xrsl
ngsub -f kitval<release>_rpm_TEST.xrsl -c <your_cluster>
Once the job has finished, please retrieve the results with
ngget and check that there are no errors in the
logfile KitValidation.log. If the validation passed, rename the
runtime environment script from TEST-ATLAS-<release> to
ATLAS-<release>, and your cluster is ready.
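For example, for release 9.0.3 and a hypothetical runtime environment directory /SOFTWARE/runtime:
mv /SOFTWARE/runtime/APPS/HEP/TEST-ATLAS-9.0.3 /SOFTWARE/runtime/APPS/HEP/ATLAS-9.0.3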
Note on cluster settings. Most jobs require large amounts
of memory, thus sites advertising less than 800 MB of RAM are
unlikely to get jobs. Site admins are encouraged to check/update the
site node-memory specifications in Condor (if any) and in nordugrid.conf.
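As an illustration, the advertised per-node memory is set with the nodememory attribute in the [cluster] block of nordugrid.conf; the value below is only an example and should reflect the real memory (in MB) of your nodes:
[cluster]
# ... other cluster attributes ...
nodememory="1024"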
Authorize the NorduGrid production managers (Mattias Ellert, Alex Read, Katarina Pajchel, Samir Ferrag and Rasmus Mackeprang):
/O=Grid/O=NorduGrid/OU=tsl.uu.se/CN=Mattias Ellert
/O=Grid/O=NorduGrid/OU=fys.uio.no/CN=Alex Read
/O=Grid/O=NorduGrid/OU=fys.uio.no/CN=Katarina Pajchel
/O=Grid/O=NorduGrid/OU=uio.no/CN=Samir Ferrag
who are also members of SWEGRID's ATLAS VO:
https://www.pdc.kth.se/grid/swegrid-vo/vo.atlas-testusers-vo
Make sure you have the public keys of the NorduGrid
CA installed.
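For illustration, the authorization can be done by adding the managers' DNs to /etc/grid-security/grid-mapfile; the local account atlasprod below is just an example, and sites may instead rely on nordugridmap (see the SE section below):
"/O=Grid/O=NorduGrid/OU=tsl.uu.se/CN=Mattias Ellert" atlasprod
"/O=Grid/O=NorduGrid/OU=fys.uio.no/CN=Alex Read" atlasprod
"/O=Grid/O=NorduGrid/OU=fys.uio.no/CN=Katarina Pajchel" atlasprod
"/O=Grid/O=NorduGrid/OU=uio.no/CN=Samir Ferrag" atlasprod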
Setting up a Storage Element
The important thing when committing a Storage Element (SE) for ATLAS usage is to
authorise the ATLAS Virtual Organisation members at the SE level, which implies accepting
their respective Certificate Authorities (CAs). Below are instructions on how to achieve this,
and other related information.
Instructions for authorizing ATLAS physicists to SE's
Add the following line to /etc/grid-security/nordugridmap.conf (this line is
the ATLAS VO server contact string):
group "ldap://grid-vo.nikhef.nl/ou=lcg1,o=atlas,dc=eu-datagrid,dc=org"
The next time the nordugridmap utility is run, the grid-mapfile
/etc/grid-security/grid-mapfile is filled with the DN's of the members
of the ATLAS VO. By default, nordugridmap uses the
/etc/grid-security/nordugridmap.conf file, which can be overridden on the
command line with the -c switch. The name and location of the generated
mapfile (default is /etc/grid-security/grid-mapfile) can be modified in the
configuration file, which might be useful for generating different ATLAS
grid-mapfiles (see below).
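For example, after adding the group line you can regenerate the mapfile by hand and check that ATLAS DN's are present (the grep simply counts the mapped entries, which start with a quoted DN):
nordugridmap
grep -c '^"' /etc/grid-security/grid-mapfile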
Make sure that write-access is provided for the members of SWEGRID's dedicated ATLAS VO:
https://www.pdc.kth.se/grid/swegrid-vo/vo.atlas-testusers-vo
Configure the fileplugin SE to contain a read-only location through which data can be downloaded.
Note: for the stable release series 0.4.x, it is not possible to configure
the SE so that some people have read-access and others write-access to the
SE, unless one uses the low-level configuration file gridftpd.conf. In fact,
the people that have read-access through the read-only location defined
below will also have write-access through the ordinary write-location that is
used by the NorduGrid DC2 production managers. It is nevertheless
recommended to create a read-only location to prevent accidents.
The above restriction is removed in the development series 0.5.
Below two examples are given: configuring the SE using nordugrid.conf, and
configuring the SE using gridftpd.conf.
To configure a read-only location in the SE using nordugrid.conf, add the block
[gridftpd/dc2_read]
to nordugrid.conf with the following content:
plugin=fileplugin.so
path=/dc2_read
mount="<your physical filedir with dc2 files>"
dir="/ nouser read cd dirlist"
This gives read-access to people through the path:
gsiftp://<clustername>/dc2_read
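A quick way to verify the read-only location, assuming the ARC client tools and a valid grid proxy on some other machine:
ngls gsiftp://<clustername>/dc2_read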
Using the low-level gridftpd.conf configuration file (usually placed in
/opt/nordugrid/etc) for defining a read-only path in the SE is also easy
and gives a real opportunity to distinguish between people having read-access and
people having write-access. There is a small problem though: the gridftpd.conf
configuration file is overwritten with the information
from nordugrid.conf if one uses the standard method of starting the
gridftp server, "service gridftpd start". Instead, one should start the gridftpd using the command:
/opt/nordugrid/sbin/gridftpd -c /opt/nordugrid/etc/gridftpd.conf
This may require adding /opt/voms/lib to LD_LIBRARY_PATH first.
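Put together, a minimal sketch of the start-up sequence described above:
export LD_LIBRARY_PATH=/opt/voms/lib:$LD_LIBRARY_PATH
/opt/nordugrid/sbin/gridftpd -c /opt/nordugrid/etc/gridftpd.conf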
With this in mind, the following is a standard gridftpd.conf configuration
file:
pidfile /var/run/gridftpd.pid
logfile /var/log/gridftpd.log
port 2811
pluginpath /opt/nordugrid/lib
encryption yes
allowunknown no
group atlas
file /etc/grid-security/atlas-mapfile
end
group atlas_read
file /etc/grid-security/atlasreaders-mapfile
end
groupcfg atlas
plugin /dc2 fileplugin.so
mount /
dir / nouser read cd dirlist delete create *:* 664:664 mkdir *:* 775:775
end
groupcfg atlas_read
plugin /dc2_read fileplugin.so
mount /
dir / nouser read cd dirlist
end
In this case, the people in /etc/grid-security/atlasreaders-mapfile
will have read-access to the files and the people in
/etc/grid-security/atlas-mapfile will have write-access. The file
/etc/grid-security/atlas-mapfile should be filled with (at least) the
DN's of the ATLAS DC production managers while the
file /etc/grid-security/atlasreaders-mapfile should be filled with
the DN's of the ATLAS VO people.
SE service requirements
Storage Elements are expected to serve data on request over an extended period of several months, typically
around one year. There are known cases when users requested data stored two years earlier.
If you plan to permanently shut down an SE, please notify the coordinators and take the necessary steps to
rescue the stored data: replicate them to another SE, erase the old records from the indexing database,
and, if needed, create backups.
It is generally a good practice to have regular backups of the stored data, whenever possible.
The list of SE's that can accept production data at the moment can always be
obtained by the query:
globus-rls-cli query lrc lfn __storage_service__ rls://gridsrv3.nbi.dk
If an SE is not in this list, it means it will not accept new data, but it can still serve data that are already stored.
If your SE goes down and/or is taken down for maintenance for a short while, please let the coordinator know beforehand,
so that this list can be adjusted. This also helps to make sure that enough space is available at all times.