Theoretical High Energy Physics |
Researchers |
Peter Skands (Lund University) |
Research area |
Theoretical High Energy Physics |
Task specifics |
CPU-intensive batch jobs (standard Monte Carlo
generation without detector simulation). |
Software requirements |
Pre-compiled, self-contained executables built from F77 (pythia source +
main program + cernlibs), so only the run-time libraries need to be
the same as on the build machine. |
System requirements |
Any environment where the compiled executables will run (see
above). OS is a Linux 2.4.4-4 system. Usually, executable + data
files are less than about 20 MB combined. |
Grid middleware requests |
- Standalone software for SuSE Linux.
- Possibility for the output of a finished job to be downloaded
automatically to a specified computer, for instance the user's own
machine (not remote GRID storage). The user often runs into problems
with jobs being deleted before they are retrieved, and even wrote a
small cron job that checks for finished jobs once a day.
|
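The daily check described above could look roughly like the sketch below. It is only an illustration: the state name "FINISHED", the `statuses` mapping and the `download` callable are hypothetical stand-ins for whatever the middleware's status and retrieval commands provide, not part of any real NorduGrid interface.

```python
from typing import Callable, Dict, List, Set


def retrieve_finished(statuses: Dict[str, str],
                      download: Callable[[str], bool],
                      fetched: Set[str]) -> List[str]:
    """Download every finished job that has not been retrieved yet.

    statuses: job ID -> state string (hypothetical; "FINISHED" means done).
    download: hypothetical callable that fetches one job's output and
              returns True on success.
    fetched:  IDs retrieved on earlier runs; updated in place.
    Returns the IDs that were retrieved during this call.
    """
    newly_fetched = []
    for job_id, state in sorted(statuses.items()):
        if state == "FINISHED" and job_id not in fetched:
            if download(job_id):
                fetched.add(job_id)
                newly_fetched.append(job_id)
    return newly_fetched
```

Run once a day from cron, such a function retries failed downloads on the next pass, because a job is only recorded in `fetched` after a successful download.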
Experimental High Energy Physics |
Researchers |
Nils Gollub (Uppsala University) |
Research area |
High Energy Physics |
Task specifics |
CPU-intensive, storage intensive |
Software requirements |
ATLAS runtime environment, own code (C++) |
System requirements |
Red Hat Linux, CERN certified release |
Remote resources usage |
Whatever the ATLAS software needs |
Grid middleware requests | Improved job scheduling. Jobs are sometimes
assigned to clusters where they sit in a queue while other clusters
have plenty of free CPU time. This can also happen when CPU time is
freed on some cluster after the job was submitted. It would be nice
if queued jobs were automatically rescheduled at regular intervals
(for example every hour or every day). |
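The rescheduling request above amounts to a periodic check along the lines of the following sketch. The function name, the `free_cpus` mapping and the "most free CPUs wins" rule are assumptions made for illustration; they do not describe how the middleware's broker actually works.

```python
from typing import Dict, Optional


def better_cluster(current: str, free_cpus: Dict[str, int]) -> Optional[str]:
    """Suggest where to move a queued job, or None if it should stay put.

    current:   name of the cluster where the job is queued.
    free_cpus: hypothetical snapshot, cluster name -> number of idle CPUs.
    """
    if free_cpus.get(current, 0) > 0:
        return None  # the job can start where it already is
    candidates = {name: n for name, n in free_cpus.items()
                  if n > 0 and name != current}
    if not candidates:
        return None  # nothing is free anywhere, so moving gains nothing
    # Most idle CPUs wins; ties go to the alphabetically first name.
    return max(sorted(candidates), key=lambda name: candidates[name])
```

Calling this every hour for each queued job would cover both cases mentioned in the request: a poor initial assignment, and CPU time that becomes free after submission.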
Grid middleware development |
Researchers |
Henrik Jensen, Jesper Leth (Aalborg University) |
Research area |
A daemon which supervises jobs and resubmits them if they fail.
|
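As a rough illustration of what one pass of such a supervising daemon might do, consider the sketch below. The state name "FAILED", the callables `get_state` and `resubmit`, and the retry limit are hypothetical; the researchers' actual design is not described in the source.

```python
from typing import Callable, Dict, List


def supervise_once(attempts: Dict[str, int],
                   get_state: Callable[[str], str],
                   resubmit: Callable[[str], None],
                   max_attempts: int = 3) -> List[str]:
    """One supervision pass: resubmit failed jobs that have attempts left.

    attempts:  job name -> number of submissions so far; updated in place.
    get_state: hypothetical status lookup returning e.g. "FAILED".
    resubmit:  hypothetical callable that submits the job again.
    Returns the names of the jobs resubmitted during this pass.
    """
    resubmitted = []
    for job in sorted(attempts):
        if get_state(job) == "FAILED" and attempts[job] < max_attempts:
            resubmit(job)
            attempts[job] += 1
            resubmitted.append(job)
    return resubmitted
```

A daemon would call this in a loop with a sleep between passes; capping the number of attempts keeps a job that always fails from being resubmitted forever.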
Software requirements | It is necessary to set up one's own grid server
that rejects all job submissions. For this, Globus and NorduGrid
have to be rebuilt, which is currently somewhat hard and
undocumented. |
Grid middleware requests |
- The ftp.nordugrid.org site is messy. Cleaning it up would be
nice: pub and test are redundant and contain software that
should be elsewhere.
- Distribute the source as tar.gz
or tar.bz2 instead of rpm (or at least as both). It is annoying to
convert it (especially when Debian does not ship rpm2targz and
one has to copy the rpm to one's laptop, convert it, and copy it
back).
|
Comments | Getting into the VO is hard, to say the least. People are
very paranoid (or lazy) about granting access. It took a month to
be allowed onto a single machine, even while bugging the sysadmin
several times a week. This makes it hard to get started. |
Gene Regulation Bioinformatics |
Researchers |
Johan Geijer (Karolinska Institute, CGB) |
Research area |
Genomics and bioinformatics. The goal
of the Grid project at Karolinska Institute is to provide a Grid
platform for Gene Regulation Bioinformatics that allows predictions of
the involvement of genes in the pathogenesis of human diseases. |
Task specifics |
High throughput. Large number (hundreds or thousands) of
small, independent jobs running the same application but with
different input data. |
Software requirements |
Perl version 5.8.0 and PDL; own C code. |
System requirements |
Any system equipped with Perl 5.8.0 |
Remote resources usage |
At the moment, no external resources are
required: a small flat file is uploaded for every job. Further
development of the application will, however, require access to a
remote database. |
Grid middleware requests |
Since job binding is used, i.e. many
small jobs are combined into larger ones, it would be great to
have some kind of rating system that describes the workload on
the Grid. That would make it easier to estimate how many jobs
should be bound into one job at a specific time. |
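To make the request above concrete, the sketch below shows how a workload rating could feed into job binding. Everything here is an assumption for illustration: the rating is taken to be a number between 0 (idle) and 1 (saturated), and the linear rule "busier Grid, larger bundles" is just one plausible policy, not the one used at Karolinska Institute.

```python
from typing import List, Sequence


def bundle_size(load: float, smallest: int = 1, largest: int = 50) -> int:
    """Map a hypothetical load rating in [0, 1] to jobs per bundle.

    An idle Grid gets small bundles (more parallelism); a busy Grid gets
    large bundles (fewer submissions waiting in queues).
    """
    load = min(max(load, 0.0), 1.0)  # clamp out-of-range ratings
    return smallest + int(load * (largest - smallest))


def bind(inputs: Sequence[str], size: int) -> List[List[str]]:
    """Split the list of small-job inputs into bundles of at most `size`."""
    return [list(inputs[i:i + size]) for i in range(0, len(inputs), size)]
```

For example, `bind(input_files, bundle_size(rating))` would yield one list of input files per Grid job, where `input_files` and `rating` are placeholders for the application's own data.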
Comments |
Validating and estimating performance
on the Grid is quite difficult because of the fluctuating
workload. However, it can be shown that Grid
technology is efficient for this application. There is enough
data to demonstrate that using the Grid
speeds up the process by a factor of 4 to 25 on average compared to
running the jobs locally. |