If you have problems or questions regarding ATLAS Production in the NorduGrid space, write to , or contact personally
Farid Ould-Saada, ,
Alex Read, or Oxana Smirnova, .
For all the sites participating in ATLAS Production via the NorduGrid/ARC infrastructure, the following steps have to be performed:
Consider deploying
NorduGrid ARC 0.4.5. This is the latest stable release so far. Don't use development tags 0.5.x unless you know what you are doing.
It is strongly recommended to upgrade your Globus installation to version
2.4.3-16ng (released on December 6, 2004), as it includes very important bug fixes.
Register your site to the ATLAS GIIS. To do
this, add the following block to globus.conf if you use nordugrid v.0.4.5:
[mds/gris/registration/GrisToAtlas]
regname="Atlas"
reghn=atlasgiis.nbi.dk
regperiod=30
servicename=nordugrid-cluster-name
or this to arc.conf if you use nordugrid v.0.5.3x:
[infosys/cluster/registration/GrisToAtlas]
targethostname="atlasgiis.nbi.dk"
targetport="2135"
targetsuffix="mds-vo-name=Atlas,o=grid"
regperiod="23"
Then restart information system services:
if you use nordugrid v.0.4.5:
/etc/init.d/globus-mds restart
if you use nordugrid v.0.5.x:
/etc/init.d/grid-infosys restart
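Once the services are restarted and your authorisation request has been processed, you can check whether your cluster shows up in the ATLAS GIIS with an anonymous LDAP search. The sketch below only assembles the command; the host, port and base suffix are taken from the registration blocks above, while the exact ldapsearch options may vary with your OpenLDAP version:

```shell
#!/bin/sh
# Query the ATLAS GIIS for registered cluster names.
# Host, port and base suffix come from the registration blocks above.
GIIS_HOST=atlasgiis.nbi.dk
GIIS_PORT=2135
GIIS_BASE="mds-vo-name=Atlas,o=grid"

CMD="ldapsearch -x -h $GIIS_HOST -p $GIIS_PORT -b $GIIS_BASE '(objectClass=nordugrid-cluster)' nordugrid-cluster-name"
echo "To verify the registration, run: $CMD"
# Your front-end's nordugrid-cluster-name should appear in the output
# once your site is authorised at the GIIS.
```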
When done, send an e-mail containing your front-end host
name and request to be authorised to the ATLAS GIIS to
.
Join the dedicated mailing list "atlas-ng-arc" by using CERN's mailing list interface. NB! Make sure your "From" and
"Return-Path" fields are the same as the e-mail address you use,
otherwise you will not be able to post to the list.
Instructions for ATLAS release installation
Software installation. Always install the latest
ATLAS software release available, currently 11.0.x series.
There are two ways (at least) to install ATLAS
software:
The official ATLAS distribution kit is available
from Pacman
repository as binary tarballs for Scientific Linux CERN
v3 (SLC3). It is reported to work on other systems
as well, provided 32-bit mode and the necessary
libraries and compilers are in place. ATLAS provides
instructions on how to get and install such a
release.
One can also use the nice interactive installation script
prepared for NorduGrid, which automatically produces the
runtime environment setup script (see below).
The script is available for releases 11.0.0 and up from
http://grid.uio.no/atlas
Some people prefer RPM distributions; these are
prepared by a group of Nordic ATLAS researchers, with
ATLAS approval. Below are the RPM installation
instructions for ATLAS s/w release 11.0.0 on RHEL3;
for availability of other releases and OS versions, check the following repositories:
http://www.grid.tsl.uu.se/RTEs/ATLAS/ http://grid.uio.no/atlas
There are installation scripts, called either
ATLAS-x.y.z-install-<opsys>.sh or
installatlasrpms-<opsys>.sh. In case of
doubts or problems, please contact the
atlas-ng-arc mailing list. The repositories also contain source RPMs for a possible rebuild on a different platform; www.grid.tsl.uu.se/RTEs/ATLAS has source RPMs for external packages, too. The installation
procedure is rather simple:
export SITEROOT=<path_to_top_atlas_location>
cd $SITEROOT
wget http://grid.uio.no/atlas/11.0.0/installatlasrpms-RHEL3.sh
chmod u+x installatlasrpms-RHEL3.sh
./installatlasrpms-RHEL3.sh -a -c
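Before launching the installer, a quick sanity check of the environment can save time. This is a sketch only: the check_siteroot helper is not part of the installer, and the 10 GB free-space threshold is an illustrative guess, not an official requirement.

```shell
#!/bin/sh
# Sanity-check the environment before running the ATLAS RPM installer.
# check_siteroot is a hypothetical helper; the 10 GB threshold is an
# illustrative assumption, not an official requirement.
check_siteroot() {
    if [ -z "$SITEROOT" ] || [ ! -d "$SITEROOT" ]; then
        echo "SITEROOT is unset or not a directory" >&2
        return 1
    fi
    # Free space (in MB) on the filesystem holding SITEROOT:
    free_mb=$(df -Pm "$SITEROOT" | awk 'NR==2 {print $4}')
    if [ "$free_mb" -lt 10240 ]; then
        echo "Warning: less than 10 GB free under $SITEROOT" >&2
    fi
    return 0
}
```

Typical use would be `check_siteroot && ./installatlasrpms-RHEL3.sh -a -c` after exporting SITEROOT as above.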
Check outbound connectivity.
ATLAS jobs have to contact external databases, thus an
outbound connectivity from the worker nodes must be
enabled. For firewalled environments, several ports have
to be opened, e.g., port 3306 for MySQL. Port
10521 is also reported to be needed, and further
ports may turn out to be required over time.
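A simple way to probe outbound connectivity from a worker node is bash's /dev/tcp pseudo-device. This is a sketch: the helper name is ours, the approach is bash-specific, and any hosts you test it against must be substituted with the actual database servers your jobs contact.

```shell
#!/bin/sh
# Probe outbound TCP connectivity from a worker node using bash's
# /dev/tcp pseudo-device (check_port is a hypothetical helper).
check_port() {
    host=$1; port=$2
    # timeout(1) guards against long TCP timeouts on filtered ports
    if timeout 5 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
        echo "$host:$port reachable"
    else
        echo "$host:$port NOT reachable" >&2
        return 1
    fi
}
```

For example, `check_port <your_mysql_server> 3306` from a worker node should succeed once the firewall is opened.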
Local validation. If you installed the "Pacman
kit", you can validate the ATLAS software installation locally
by using the standard ATLAS KitValidation
tool. Please note that this will not check the entire
functionality, but will only test whether the installation was
successful. Instructions to be provided.
If you installed a NorduGrid distribution of ATLAS s/w, a script
TEST-ATLAS-<release_nr> was produced automatically, in the
location where you executed the installation script.
It is needed to set up the
ATLAS runtime environment correctly. The installation can then be
validated locally by running the corresponding validation script
kitval<release_nr>.sh, available from:
http://grid.uio.no/atlas/validation
To validate the installation locally, do:
cd <some_place_with_some_disk_space>
wget http://grid.uio.no/atlas/validation/kitval<release_nr>.sh
chmod u+x kitval<release_nr>.sh
./kitval<release_nr>.sh <path_to_the_script>/TEST-ATLAS-<release_nr>
or
cd <some_place_with_some_disk_space>
wget http://grid.uio.no/atlas/validation/kitval<release_nr>.sh
chmod u+x kitval<release_nr>.sh
source <path_to_the_script>/TEST-ATLAS-<release_nr>
./kitval<release_nr>.sh -r <release_nr>
The validation runs with some default arguments for KitValidation
which leave the outputs in place; the location is printed when
the script exits. Each run of KitValidation opens a new
sub-directory with a partially random name. Check that there
are no errors or "FAILED" outcomes. If such occur, seek help
from the experts.
Publish the release tag. Following a successful local validation,
copy the script TEST-ATLAS-<release_nr> into your runtime environment
directory, into the subdirectory APPS/HEP:
cp <path_to_the_script>/TEST-ATLAS-<release_nr> <rte_dir>/APPS/HEP/TEST-ATLAS-<release_nr>
If you installed the release from the "Pacman kit", you will
have to create such a script by hand. An example of such a
script for release 11.0.0 is here.
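As a rough illustration only: a hand-written TEST-ATLAS script mainly has to export the installation paths and source the release setup. The variable values and the setup script location below are assumptions for the sketch (consult the example referenced above for the real content); the snippet writes such a skeleton and verifies its shell syntax.

```shell
#!/bin/sh
# Sketch of a hand-written runtime environment script for a Pacman-kit
# installation. SITEROOT value and the setup.sh path are illustrative
# assumptions; adapt them to your actual installation.
cat > /tmp/TEST-ATLAS-11.0.0 <<'EOF'
# Runtime environment script for ATLAS release 11.0.0 (sketch)
export SITEROOT=/opt/atlas/11.0.0        # path to your installation
# Source the release setup script shipped with the kit, if present:
if [ -r "$SITEROOT/setup.sh" ]; then
    . "$SITEROOT/setup.sh"
fi
EOF
sh -n /tmp/TEST-ATLAS-11.0.0 && echo "syntax OK"
```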
Validate the release Grid-wise.
Fetch the validation job definition and submit the
validation job to your cluster, e.g.:
wget http://grid.uio.no/atlas/validation/kitval11.0.0_TEST.xrsl
ngsub -f kitval11.0.0_TEST.xrsl -c <your_cluster>
NB! If you use client version 0.5.30 or higher, remove
the -f option from the ngsub command above.
Once the job has finished, please retrieve the results with
ngget and check that there are no errors in the
logfile KitValidation.log. If the validation passed, rename the
runtime environment script TEST-ATLAS-<release_nr> to
ATLAS-<release_nr>, and your cluster is ready.
Note on cluster settings. Most jobs require large amounts
of memory, thus sites advertising less than 800 MB of RAM are
unlikely to get jobs. Site admins are encouraged to check and update the
site node-memory specifications in Condor (if any) and in nordugrid.conf.
Authorize the NorduGrid production managers (Mattias Ellert, Alex Read, Katarina Pajchel, Samir Ferrag):
/O=Grid/O=NorduGrid/OU=tsl.uu.se/CN=Mattias Ellert
/O=Grid/O=NorduGrid/OU=fys.uio.no/CN=Alex Read
/O=Grid/O=NorduGrid/OU=fys.uio.no/CN=Katarina Pajchel
/O=Grid/O=NorduGrid/OU=uio.no/CN=Samir Ferrag
who are also members of SWEGRID's ATLAS VO:
https://www.pdc.kth.se/grid/swegrid-vo/vo.atlas-testusers-vo
Make sure you have the public keys of the NorduGrid
CA installed.
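CA certificates are conventionally installed under /etc/grid-security/certificates. A hedged way to check for the NorduGrid CA, assuming that conventional directory and that openssl is available:

```shell
#!/bin/sh
# Check whether CA certificates (including the NorduGrid CA) are
# installed in the conventional Grid certificates directory.
CERTDIR=/etc/grid-security/certificates
if [ -d "$CERTDIR" ]; then
    # Print the subject of every installed CA certificate and look
    # for the NorduGrid one:
    for f in "$CERTDIR"/*.0; do
        [ -r "$f" ] && openssl x509 -in "$f" -noout -subject
    done | grep -i nordugrid \
        || echo "NorduGrid CA certificate not found in $CERTDIR" >&2
else
    echo "$CERTDIR does not exist; install the CA certificates first" >&2
fi
```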
Setting up a Storage Element
Registering Storage Element to ATLAS GIIS
Add the following block to globus.conf if you use nordugrid v.0.4.5:
[mds/gris/registration/SEtoAtlas]
regname="Atlas"
reghn=atlasgiis.nbi.dk
regperiod=30
rootdn="nordugrid-se-name=<mySE>:my.host.name,Mds-Vo-name=local,o=grid"
Here <mySE> should be substituted with the same string as in the header of the
storage element block: [se/<mySE>].
If you use nordugrid v.0.5.3x, add instead this to arc.conf:
[infosys/se/<mySE>/registration/toATLAS]
targethostname="atlasgiis.nbi.dk"
targetport="2135"
targetsuffix="mds-vo-name=Atlas,o=grid"
regperiod="44"
Here <mySE> should be substituted with the same string as in the header of the
information system storage element block: [infosys/se/<mySE>].
Restart information system services:
if you use nordugrid v.0.4.5:
/etc/init.d/globus-mds restart
if you use nordugrid v.0.5.x:
/etc/init.d/grid-infosys restart
Send an e-mail containing your SE host name and request
to be authorised to the ATLAS GIIS to
.
Instructions for authorizing ATLAS physicists to SE's
When committing a Storage Element (SE) for ATLAS usage, the important thing is to
authorise the ATLAS Virtual Organisation members at the SE level, which implies accepting
their respective Certificate Authorities (CA). Below are instructions on how to achieve this,
together with other related information.
Add the following line to /etc/grid-security/nordugridmap.conf (this line is
the ATLAS VO server contact string):
group "ldap://grid-vo.nikhef.nl/ou=lcg1,o=atlas,dc=eu-datagrid,dc=org"
The next time the nordugridmap utility is run, the grid-mapfile
/etc/grid-security/grid-mapfile is filled with the DN's of the members
of the ATLAS VO. nordugridmap by default makes use of the
/etc/grid-security/nordugridmap.conf file, which can be overridden on the
command line with the -c switch. The name and location of the generated
mapfile (default is /etc/grid-security/grid-mapfile) can be modified in the
configuration file, which might be useful for generating different ATLAS
grid-mapfiles (see below).
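For example, a dedicated ATLAS mapfile can be generated from its own configuration file via the -c switch. This is a sketch: the file name nordugridmap-atlas.conf is a hypothetical example, and the output mapfile location is whatever that configuration file specifies, as described above.

```shell
#!/bin/sh
# Sketch: generate a separate ATLAS grid-mapfile from a dedicated
# configuration file. nordugridmap-atlas.conf is a hypothetical name;
# the output mapfile location is set inside that configuration file.
ATLAS_CONF=/etc/grid-security/nordugridmap-atlas.conf
if command -v nordugridmap >/dev/null 2>&1; then
    nordugridmap -c "$ATLAS_CONF" \
        || echo "nordugridmap run failed; check $ATLAS_CONF" >&2
else
    echo "nordugridmap not found; run this on the SE front-end" >&2
fi
```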
Make sure that write-access is provided for the members
of the SWEGRID's dedicated ATLAS VO (Nordic production managers):
https://www.pdc.kth.se/grid/swegrid-vo/vo.atlas-testusers-vo
Configure the fileplugin SE to contain a read-only location through which data can be downloaded.
Note: for the stable release-series 0.4.x, it is not possible to configure
the SE so that some people have read- and other people write-access to the
SE unless one uses the low-level configuration-file gridftpd.conf. In fact,
the people that have read-access through the read-only location defined
below will also have write-access through the ordinary write-location that is
used by the NorduGrid ATLAS production managers. It is nevertheless
recommended to make a read-only location to prevent accidents.
The above restriction is removed in the development series 0.5.
Below two examples are given: configuring the SE using nordugrid.conf (nordugrid v.0.4.5) or arc.conf (nordugrid v.0.5.x), and
configuring the SE using low-level gridftpd.conf.
To configure a read-only location in the SE using
nordugrid.conf (nordugrid v.0.4.5) or
arc.conf (nordugrid v.0.5.x), add the block
[gridftpd/atlasprod_read]
to nordugrid.conf (nordugrid v.0.4.5) or
arc.conf (nordugrid v.0.5.x) with the following content:
plugin=fileplugin.so
path=/atlasprod_read
mount="<your physical filedir with ATLAS files>"
dir="/ nouser read cd dirlist"
This gives read-access to people through the path:
gsiftp://<clustername:port>/atlasprod_read
Using the low-level gridftpd.conf configuration file (usually placed in
/opt/nordugrid/etc) for defining a read-only path in the SE is also easy
and gives a real opportunity to distinguish between people having read- and
people having write-access. There is a small problem though: the gridftpd.conf
configuration file is overwritten with the information
from nordugrid.conf (or arc.conf) if one uses the standard method of starting the
gridftp-server "service gridftpd start". Instead one should start the gridftpd using the command:
/opt/nordugrid/sbin/gridftpd -c /opt/nordugrid/etc/gridftpd.conf
This may require adding /opt/voms/lib to LD_LIBRARY_PATH first.
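Put together, a small start-up wrapper might look as follows; this is a sketch using the default paths mentioned in the text, with a guard so it degrades gracefully where gridftpd is not installed:

```shell
#!/bin/sh
# Sketch: start gridftpd with the low-level configuration file.
# Paths are the defaults mentioned in the text above.
LD_LIBRARY_PATH=/opt/voms/lib:${LD_LIBRARY_PATH}   # VOMS libraries
export LD_LIBRARY_PATH
if [ -x /opt/nordugrid/sbin/gridftpd ]; then
    /opt/nordugrid/sbin/gridftpd -c /opt/nordugrid/etc/gridftpd.conf
else
    echo "gridftpd not found under /opt/nordugrid/sbin" >&2
fi
```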
With this in mind, the following is a standard gridftpd.conf configuration
file:
pidfile /var/run/gridftpd.pid
logfile /var/log/gridftpd.log
port 2811
pluginpath /opt/nordugrid/lib
encryption yes
allowunknown no
group atlas
file /etc/grid-security/atlas-mapfile
end
group atlas_read
file /etc/grid-security/atlasreaders-mapfile
end
groupcfg atlas
plugin /atlasprod fileplugin.so
mount /
dir / nouser read cd dirlist delete create *:* 664:664 mkdir *:* 775:775
end
groupcfg atlas_read
plugin /atlasprod_read fileplugin.so
mount /
dir / nouser read cd dirlist
end
In this case, the people in /etc/grid-security/atlasreaders-mapfile
will have read-access to the files and the people in
/etc/grid-security/atlas-mapfile will have write-access. The file
/etc/grid-security/atlas-mapfile should be filled with (at least) the
DN's of the ATLAS DC production managers while the
file /etc/grid-security/atlasreaders-mapfile should be filled with
the DN's of the ATLAS VO people.
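For illustration, filling atlas-mapfile with the production managers' DNs listed earlier might look like this. It is a sketch: one quoted DN per line is assumed, and whether a local account must follow each DN depends on your gridftpd configuration, so check the gridftpd documentation for the exact format expected. The file is written locally first and should then be copied to /etc/grid-security/.

```shell
#!/bin/sh
# Sketch: write the NorduGrid production managers' DNs (listed earlier)
# into the write-access mapfile. One quoted DN per line is assumed here;
# check your gridftpd documentation for the exact format it expects.
MAPFILE=./atlas-mapfile   # copy to /etc/grid-security/ when happy
cat > "$MAPFILE" <<'EOF'
"/O=Grid/O=NorduGrid/OU=tsl.uu.se/CN=Mattias Ellert"
"/O=Grid/O=NorduGrid/OU=fys.uio.no/CN=Alex Read"
"/O=Grid/O=NorduGrid/OU=fys.uio.no/CN=Katarina Pajchel"
"/O=Grid/O=NorduGrid/OU=uio.no/CN=Samir Ferrag"
EOF
echo "wrote $(wc -l < "$MAPFILE") DNs to $MAPFILE"
```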
SE service requirements
Storage Elements are expected to serve data on request over an extended period of several months, typically
around one year. There are known cases where users requested data stored two years ago.
If you plan to permanently shut down a SE, please notify the coordinators and take the necessary steps to
rescue the stored data by replicating them to another SE, erasing old records from the indexing database,
and possibly creating backups.
It is generally a good practice to have regular backups of the stored data, whenever possible.
The list of SE's that can accept production data at the moment can always be
obtained by the query:
globus-rls-cli query lrc lfn __storage_service__ rls://atlasrls.nordugrid.org:39281
If a SE is not in this list, it will not accept new data, but it can still serve data already stored on it.
If your SE goes down and/or is down for maintenance for a short while, please let the coordinator know beforehand,
so that this list can be adjusted. This is also to make sure that enough space is available at all times.
Old production data (DC2) are stored in rls://atlasrls.nordugrid.org:39282