NorduGrid technical meeting

16-17 January 2006, Uppsala

Minutes

Participants: Mattias, Aleksei, Michael, Anders, Sigve, Olli, Arto, Balazs, Oxana, Niels, Ilja, Johan, Mike, Henrik (at the minutes)

1 Monday 16th January

Meeting started at 13.30 hours.

1.1 Agenda

1.2 Conferences and Meetings

1.3 Release Policies and Tags

People are installing tags rather than stable releases. This is bad, but the stable release is very old and some things do not work in it, e.g., 64-bit support, which forces people to use tags. The current tag has a dysfunctional GridFTP server.

Everyone agrees that there should be more stable releases, but the current head is neither stable nor properly tested. It would be good to have an overview of which tags are good and which are bad. The problem is to define what is working and what is not.

Automated testing was suggested. Anders has an almost working test system, but it is not currently in use because of issues with proxies and because the tests are not aware of "distributedness".

Furthermore, it was agreed that bad tags should not be on the download page; however, it was discussed whether the download page would then still show any progress. It was agreed that there should be some way of identifying a broken release on the download page and the FTP server. Additionally, a test system should be set up and run automatically, ideally every night. Until an automated test system is running, manual testing will have to do.

1.4 Runtime Environments

Juha is no longer employed by NDGF and can no longer maintain the runtime environments registry. CSC volunteered to continue the effort. It should be made clearer how to get RE name spaces and how to get sites to install an RE. RE pages should describe how the software should be installed.

Currently it is very unclear to users how to install or deploy software on the grid. There are some explanations of how to do this, but they have to be easier to find. It also needs to be explained how to use software without REs. There is also a need for clear policies, e.g., what to do with inactive RE maintainers.

It was discussed how to make REs linkable, and possibly offer the registry as a service, in order to make it easier to obtain information about them. Namespace issues were also mentioned. Anders thought that the present scheme was not suited to handling two VOs that use different editions of the same software (of the same revision). He suggested separate VO name spaces; however, the idea met a lot of resistance, and there was doubt whether it would solve the problems. No real decision was taken.

1.5 Service Responsibility

2 Tuesday 17th January

The day started with setting agenda, and time for each topic.

2.1 External Software

Anders gave a list of the external software dependencies for both the stable and the development versions of ARC. The list was shorter for the development version, as some modules were taken off it although they were still needed. There was some concern over this. There should be a list for client and server, stating what is optional. Anders promised to provide such a list for 0.6. Runtime and compile requirements should also be separated and listed. A Bugzilla entry requesting clarification on this also exists.

There was some discussion on having our own Globus, which was both good and bad, but no results came out of this.

2.2 ARC Status

The stable version is working okay. The GridFTP server in the development version is currently not functioning, which is a showstopper. Additionally, the broker in ARCLib needs work, as it distributes jobs very unevenly. There is also an issue with the information system going into an infinite loop. Documentation also needs some work. Packaging should be mostly in shape, but there should be more distribution repositories. There was a small naming issue with the client configuration file for the new ARCLib client. The new commands should not be renamed to the ng prefix until a proper broker is in place. The Java GUI client should be ready for 0.6; Ilja will do some work on it. The status of the following software was clarified:

The real showstopper for 0.6 is that it needs lots of testing. Balazs will make a list of things that need to be tested. Among these are the Fireman interface and VOMS, which Niels has committed to test. The tests should begin as soon as there is a useful server and last for about two to three weeks. It is hoped that the server will be useful within a week. The release date is set to February 24th.

2.3 Software Distribution

Tar packages are currently built from RPM packages. Anders would like to produce Debian packages, but was not sure he could make it for 0.6. There are currently two yum repositories, an old and a new. Fedora 1 and 2 use the old format; newer versions use the new repositories. RedHat can use the Fedora repositories. It was questioned whether to have apt repositories for Fedora, but there was no real need for this. The intention is to have binary builds for all supported distributions. There was a lot of discussion of the infrastructure for automatic updates, which didn't really lead anywhere.

Additionally, VOMS client tools are also needed, and it was agreed that there should be tarballs of all source code, not just RPMs.

There was some discussion of what should be in the stand-alone of NorduGrid and that this should be documented. The stand-alone should be able to use NorduGrid as a client, and should not have any fancy external dependencies, such as Globus or gSOAP. There should be a README file with explanations of what the stand-alone is and also an interactive FTP client, which will be uberftp.

2.4 Logger

Alexei gave a presentation of the logger and his work on it. The current database schema has a lot of problems with regard to redundancy and indexing. Alexei has a new schema which removes most of the redundancy and indexes the most queried data. It also adds a view, which can provide backwards compatibility with the old logger. It speeds up queries a lot. Additionally, the service uses a web service instead of the MySQL API. It will take at least one month to get the new logger done. It was suggested to have a user interface to query the database. The new logger should also have a way of dealing with the future GGF job logging standard. There was discussion about the GGF and NG attributes. In general NG should move towards the GGF attributes, even though this will not be without problems.
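
The schema idea above (less redundancy, an index on frequently queried data, and a view for backwards compatibility) can be sketched as follows. This is only an illustration using SQLite from Python; every table, column and view name is hypothetical, since the actual logger schema is not recorded in these minutes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical normalized tables: each user and cluster is stored once.
CREATE TABLE users    (id INTEGER PRIMARY KEY, dn TEXT UNIQUE NOT NULL);
CREATE TABLE clusters (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE jobs (
    id          INTEGER PRIMARY KEY,
    user_id     INTEGER NOT NULL REFERENCES users(id),
    cluster_id  INTEGER NOT NULL REFERENCES clusters(id),
    cpu_seconds INTEGER NOT NULL
);
-- Index the column assumed (for this sketch) to be queried most often.
CREATE INDEX jobs_by_user ON jobs(user_id);
-- A view that reproduces a flat, old-style record layout.
CREATE VIEW old_usage AS
    SELECT u.dn AS user_dn, c.name AS cluster, j.cpu_seconds
    FROM jobs j
    JOIN users u ON u.id = j.user_id
    JOIN clusters c ON c.id = j.cluster_id;
""")
conn.execute("INSERT INTO users (dn) VALUES ('/O=Grid/CN=Example User')")
conn.execute("INSERT INTO clusters (name) VALUES ('cluster.example.org')")
conn.execute(
    "INSERT INTO jobs (user_id, cluster_id, cpu_seconds) VALUES (1, 1, 3600)"
)
# Old clients can keep reading flat rows through the view.
rows = conn.execute(
    "SELECT user_dn, cluster, cpu_seconds FROM old_usage"
).fetchall()
print(rows)
```

A client written against the old flat layout would query the view, while new code can use the normalized tables directly.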

2.5 Brokering and WS Submission

Johan gave a presentation of his work on standards-based job submission and brokering. The architecture introduces a job submission service, which can accept several kinds of jobs, transform them, and submit them to different kinds of middleware. Data staging is handled by having the client run a data server, and having the submission service stage the file copying from the client to the server. There was some concern about the submission method to LCG2, as that middleware uses job gateways, which usually perform the brokering and staging. It was mentioned that the submission service could also be used to manage jobs, which would provide a single entry point into the grid for the user. There was a discussion on optimization, taking staging and run-time into consideration. Currently the service is able to submit 40 jobs/minute. The resources appear to be the bottleneck. The submission service is also capable of making reservations and using them.

2.6 Brokering

Currently there are a number of brokers: one in ngsub (the best), two in ARCLib (bad and not so bad), one in the Arconaut (random), the benchmark broker, and one in the Job Submission Service (real algorithm). The one in ARCLib needs to get better if ARCLib is to be the default client library in the future. Matthias will look into porting the ngsub broker to ARCLib. Arconaut will try to re-use the code from the Job Submission Service. The benchmark broker is currently in limbo; it is unknown if it will be used.

2.7 Job Management

A new job manager was released some time ago, but it has not gained momentum. It was agreed to make a new release which will feature a database and some additional plugins, making it easier to see how plugins should be written. ngjm will not be retired until arcjm has similar functionality.

3 Wednesday 18th January

3.1 LCG-ARC interoperability

Tord welcomed all the attendees. The agenda was handed out. Everyone presented themselves. A few points were added to the agenda: the need for testing and the usage of GGF standards.

Michael gave a presentation of how to make LCG and ARC submit jobs to each other. One can submit to both NorduGrid and LCG using Condor. Basic functionality is there, but it still needs work. An option for interoperability is to set up an LCG front end which has access to a Condor pool of all the ARC resources, hence making ARC appear as an LCG CE. Worries were expressed over this solution, and it was argued that focus should be on making "real" interoperability.

Focus should instead be on integrating with gLite, which will be the future grid middleware at CERN. There should be some agreement on what job endpoints should look like. The Condor front-end to ARC should be tested with the 0.5 series of ARC. Johan's submission service was also discussed as a solution which could act as a front-end to several grid systems.

The current plan at CERN is that gLite 3 should be deployed at the end of February, but there was doubt that they would make it in this time frame.

There are currently some problems interacting with the BDII (LCG gateway to the ARC info system) and Finnish grid sites due to firewall problems. While this should be relatively easy to fix, opening the systems might be tough. It was discussed to move the BDII in order to solve the problem, but people wanted to get rid of the BDII rather than continue supporting it.

Laurence had a proposal for GLUE2 that he was ready to submit. The schema is divided into two levels: transport and abstract. Applications would work with the abstract schema, and information would be exchanged using the transport schema. The transport schema should be very stable, while the abstract schema could change in a more dynamic way. The new schema should also be more strongly defined. Laurence would send the proposal soon; it would also reflect the latest ARC comments.

Oxana reported on an effort to submit directly to LCG CEs without using the central broker. With this, it was possible to saturate all LCG clusters. The traditional LCG broker had scheduling which would not distribute jobs evenly, but the new broker could. However, some sites were missing from the list which the new broker used. This led to a discussion on the usage of the Nordic LCG clusters, which was very low.

There was a discussion on ATLAS software and its installation on different Linux flavors. This turned into a discussion on creating an LCG CE as a front end to ARC, which will be attempted before CHEP.

LCG has set up a testing system that continually tests sites and notifies the system administrators on failures. This has helped identify bad sites very quickly. It could also be used to identify sites acting as black holes.

In general there were a lot of short discussions. Topics were standards in job submission, information systems, and people dreaming that they have good solutions.

The GGF conference was also mentioned. Balazs is going, and there will be a session at GGF on interoperability for major grids. There was a concern that GGF was more about creating visions for interoperability than about making existing software work together.

CERN would also like a clear overview of the NorduGrid policy on VOs. Is NorduGrid one grid, or several grids, e.g., Finnish, Danish, and Swedish, working together, and how would they cooperate with CERN? In general NDGF is a mess, and the CERN worries were understandable. It is unclear what the future of NDGF will be. It appears that there will be some funding for NDGF continuation, but the money might be stuck. A working group of two members has been created, and it is currently being evaluated.

Besides LCG-ARC job submission, work on accounting will soon start. There is an EGEE schema which points to the important information; it should be relatively close to the GGF one.

The ARC-LCG submission working group hasn't started, and has low priority. The strategy is going to be the same as for LCG-ARC, i.e., set up an ARC CE which acts as a front end to the LCG Grid.

Johan gave a demonstration of his Job Submission Service, which can submit xRSL and JSDL job specifications to ARC and LCG sites.

There was a discussion on the NDGF. NDGF will have additional funding for at least two years, possibly five. Brian will continue as director until around March. The NDGF structure will be changed and will be under the responsibility of NorduNet. There is general consensus that the circumstances around NDGF should change soon.

The interoperability working groups would be left unchanged and continue their work, even though some are not so active. One thing that should be done relatively fast is gLite job broker submission to ARC sites (Laurence, Michael, Balazs). It is hoped that at the next interoperability meeting NDGF will be active and going again.

3.2 Nordic Region Tier-1 Meeting

Head was about to explode. Michael has minutes.

4 Thursday 19th January - NGN Meeting

First an introduction and a presentation of UppMax was given by Sverker Holmgren. Matthias then gave a short list of practical things.

4.1 NGN, NorduGrid, and NDGF

Farid gave a state of the NorduGrid Neighbourhood. We are currently in the second year, which started September 2005. It was basically a summary of the latest NGN meetings and a 10000 foot overview of the grid activities. Topics were LCG-ARC interoperability, NDGF status, and KnowARC (a project for a next generation grid based on ARC, which has applied for EU funding). Finally there was some discussion about the dates for the next NGN meetings.

4.2 ARC LCG interoperability

Michael gave a presentation on LCG-ARC interoperability. There are three proposed goals: multiple middlewares at the same sites (short term, accomplished in some places), gateways (medium term, in progress), and shared interfaces (long term). The common interface work will mostly be geared towards information systems and job submission. There is also an effort to have interoperability between LCG and OSG (American Grid). Job description languages are moving towards JSDL.

The medium term goal is composed of five work packages. First is LCG CE documentation. Second and third are submission gateways in each direction; LCG-ARC has highest priority. Fourth is service discovery and fifth is Glue2, which is a long term goal.

Currently LCG2 uses Condor-G for submission, which currently supports Condor, GT2, GT3, GT4, and ARC. The plan is to enable ARC submission in the gLite RB. Alternatively, an LCG CE could be set up to mask the entire ARC grid, but this is difficult due to the heterogeneity of NorduGrid. Condor-G submission to ARC sites works right now, though there might be some file transfer problems. There is work on getting Condor-G to do brokering between LCG and ARC resources. Currently LCG uses Condor ClassAds for submission, not Condor-G. However, the new gLite uses Condor-G and supports Condor CEs, LCG CEs, and gLite CEs. It should be relatively easy to add new CE types. It is hoped that ARC CEs will be supported.

ARC-LCG submission is slightly more tricky, as the client does the brokering, hence submission must be done directly to CEs. There are several options to solve this including GRAM2 (scaling problems) and Condor-G (needs FQDN on client, and adds dependencies).

Resource schema mapping is currently being done at CERN by using the BDII (arc-bdii.cern.ch), which presents a Glue schema of ARC resources. It works, but needs more testing. It is meant as input for the modified gLite RB. It was noted that the mapping isn't completely done, but only needs a little work.

The Glue2 effort is at least a year away, but both LCG and ARC expect to switch to it. It will support transport (info system) and abstract (client) level schemas. The next step is to modify the gLite RB to support ARC. Organizational integration of ARC into EGEE is also needed.

There was some concern over the man hours this effort required, but it was hoped that interoperability would become a more mainstream activity and receive more attention (and funding). Everyone agreed that this effort is important, and the fact that people are talking to each other is major progress.

4.3 Tier-1 Challenge and Summary

Leif gave a presentation of the Tier-1 challenge. The Tier-1 level consists of 10 centers around the world which will receive the data flood from the LHC. The Nordic region has been lagging behind so far. One of the reasons is that there is no single site which can handle all the traffic. Therefore a "distributed centre" has been set up. Currently the center is composed of NSC, PDC and NBI. As long as the center still appears as one, this arrangement is allowed.

Lately the Tier-1 center has participated in Service Challenge 3. A task force of six people has been appointed. They are currently lacking a boss; Brian Vinter might step in. Last summer the first throughput test was run, and it is currently being re-run. Service Challenge 4 will begin in spring 2006.

A Tier-1 center presents an SRM interface, which is used by FTS (File Transfer Service) at CERN. FTS manages the data transfers; currently it tries to have 25 transfers in parallel to NDGF. The figure of 25 connections was reached through "negotiations". The Nordic Tier-1 center uses DPM behind SRM, a simple disk pool manager which uses round-robin to distribute the data. DPM is disk only (no tape).
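
As a small illustration of the round-robin distribution mentioned above, the following sketch assigns incoming files to disk pools in turn. The function and the pool and file names are hypothetical and do not reflect DPM's actual interface.

```python
from itertools import cycle


def assign_round_robin(files, pools):
    """Assign each file to the next pool in turn, wrapping around."""
    next_pool = cycle(pools)
    return {name: next(next_pool) for name in files}


# Hypothetical file and pool names, for illustration only.
placement = assign_round_robin(
    ["f1", "f2", "f3", "f4", "f5"], ["pool-a", "pool-b", "pool-c"]
)
print(placement)
```

Round-robin ignores file size and pool load, which keeps it simple but means the pools fill evenly only when files are of similar size.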

Yesterday NDGF peaked at 953 Mbps. Currently the bottleneck is at CERN, which needs a bigger pipe. Next week a maximum throughput measurement of each Tier-1 will be attempted. Currently each Tier-1 site has one, some two, SRM interfaces. The plan is that SRM acts as an entry point for finding files. This architecture is somewhat limiting and centralized, but seems to satisfy most requirements. There was a general consensus that the solution was far from optimal, but we entered the project too late to have any say.

The next big thing is the disk-to-tape test in April. For this dCache will likely be needed. SC4 will merge with other challenges into the final analysis infrastructure.

4.4 Baltic Grid

Per Oster gave a presentation of the Baltic Grid, which is an organization for introducing Grid in the Baltic Countries. There are 10 institutions from Sweden, Estonia, Latvia, Lithuania, Poland and Switzerland. It has a 3M Euro budget, which is divided into service, network, and research activities. The Baltic Grid started November 1st 2005.

The objective is to extend the European effort on grid, with respect to infrastructure and research. The project has chosen to build on the EGEE project, and is an extension of it. It wants to co-exist with other grids, such as DEISA, CrossGrid and NorduGrid. Furthermore it wants to engage the Baltic countries in policy and standardization activities.

There is no middleware development, so there is no focus on interoperability work. Focus is instead on education, running services and bringing applications to the grid. They have three pilot applications which they intend to bring to the grid. Additionally it will run a CA as well as information, monitoring and logging services. The research activities are within user management and accounting. The project lasts until April 2008. A time line was presented which included several meetings. There are currently a lot of related EGEE projects starting.

4.5 NextGRID

Olle gave an overview of the NextGRID project, which is a research project that started September 2004. There are 22 partners, which have 11M Euro to share between them. It is a research and exploration project with a 5-10 year outlook. It has three objectives: meeting business objectives, participation by the public, and consolidation/standardisation.

They try to create an architecture, given business requirements, by doing experiments, but try to keep the process lightweight. They have produced an architecture white paper, which presents and explains the NextGRID vision. The current white paper is somewhat old though. Big themes are work flows, service level agreements, and security (intra domain).

4.6 Grid Tools for Resource and Project Management

Erik Elmroth gave a presentation on a research project which aims to grid-enable toolkits and scientific computing. The project is composed of several sub-projects, including job submission/brokering (Johan's work), accounting (SGAS, www.sgas.se), decentralized grid-wide scheduling (better VO/user fairness on grid), and a resource/project portal. The project has produced a good number of papers. Umeå is hosting the PARA06 conference in June (18-21).

The fair-share scheduling is based on a hierarchical model, where policy is expressed as local rules and global group rules, which are then combined. It makes it possible to give time to groups and individuals. It does assume access to global run statistics, but doing grid-wide fair share without this would be surprisingly hard. This is currently at the conceptual/algorithmic level.
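
One common way to realize a hierarchical fair-share policy is sketched below. It is an illustration under stated assumptions, not the algorithm presented: shares are assumed to be weights normalized among siblings and multiplied down the tree, and priority is assumed to be target share minus consumed share. The policy tree, user names and usage figures are all made up.

```python
def effective_shares(tree, parent_share=1.0, prefix=""):
    """Flatten a share tree into absolute target shares per leaf.

    `tree` maps a name to (weight, subtree_or_None). Weights are
    normalized among siblings and multiplied down the hierarchy.
    """
    total = sum(weight for weight, _ in tree.values())
    result = {}
    for name, (weight, subtree) in tree.items():
        share = parent_share * weight / total
        path = prefix + name
        if subtree:
            result.update(effective_shares(subtree, share, path + "/"))
        else:
            result[path] = share
    return result


def priorities(targets, usage):
    """Priority = target share minus the share of usage consumed so far."""
    total_used = sum(usage.values()) or 1.0
    return {k: targets[k] - usage.get(k, 0.0) / total_used for k in targets}


# Hypothetical policy: two VOs at 60/40, users weighted within each VO.
policy = {
    "vo_a": (60, {"alice": (1, None), "bob": (3, None)}),
    "vo_b": (40, {"carol": (1, None)}),
}
targets = effective_shares(policy)
# Hypothetical global run statistics (e.g. CPU hours used so far).
usage = {"vo_a/alice": 50.0, "vo_a/bob": 10.0, "vo_b/carol": 40.0}
prio = priorities(targets, usage)
next_user = max(prio, key=prio.get)
print(targets)
print(next_user)
```

The `usage` dictionary stands in for the global run statistics that the model assumes are available; without them the consumed share cannot be computed.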

The portal work could keep track of the users' jobs, and perform various actions on them. The main benefit appears to be usability and a nice way to keep track of jobs. File staging is problematic in the portal, due to the current libraries.

4.7 GGF Status and Plans

Olle gave a presentation about what is happening at GGF these days. The focus of GGF is threefold: standards, community and operations (mostly a support role). There was a broad overview of where grid is today, and that it needs standards and common practices to move ahead. A presentation of the organization was given, and of what they will do to overcome the growth problems. There was a lot of controversy over the expense of going to GGF meetings ($800 price tag).

Last year, their mission statement was made clear, the organization was realigned, and the web site improved. Last year the number of documents was doubled. Currently a lot of work is going into OGSA, e.g., OGSA on top of WSRF.

The number of groups in GGF is now about 50, and there are a lot of players with overlapping interests. An attempt is being made to divide the work in a better way, so that less double work is done. This does limit the creation of several standards for the same area, though.

There are a lot of new or almost published documents, including: JSDL, Usage Record, Resource Usage Service, and OGSA v1.5 (architecture and basic profile).

GGF 16 will be held in Athens in February, with the theme: "Production Grids: The Path to Global Interoperability". GGF 17 will be in Tokyo, May. GGF 18 in Washington DC, along with GridWorld. GGF 19 will be in Europe.

4.8 VOMS Administration Interface

Niels gave a presentation of a lightweight administration interface for VOMS. The main problem with the existing one was that it was very heavy, using LCG and Java. There is both a web interface and a command line interface. There were some problems with not using Java, as the existing VOMS made some assumptions (e.g., Java serialization). Niels gave a brief example of the user interface.

4.9 SAT-solving in Grids

Antti gave a presentation of his work on solving SAT on grids. He has worked on automatically parallelizing and solving, using the grid as computing fabric. SAT solving is an NP-complete problem, so several algorithms have been invented to find solutions faster. Antti described the most successful algorithm, which is based on deducing which candidate solutions do not need to be tested.

Solving this in parallel might make it slower, because sequential solvers are highly optimized. By making some "assumptions" the problem can be divided into sub-problems. By running several of these, the assumptions can be strengthened. It is not possible to just split the problem into a number of parts, as it is NP-complete and would take too long to solve this way. A heuristic is used for scattering the problems, which can lead to much faster solving. He has created a piece of software which first tries to solve the problem locally and, if that takes longer than ten seconds, dispatches it to the grid. He uses ngjm to keep track of grid jobs.
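
The basic idea of dividing a SAT instance by assumptions can be sketched as follows. This is only an illustration: the brute-force `satisfiable` function is a stand-in for an off-the-shelf sequential solver, the split variables are chosen by hand, and neither the scattering heuristic nor the strengthening of assumptions described in the talk is modelled.

```python
from itertools import product


def satisfiable(clauses, assumptions=()):
    """Brute-force check of a CNF formula under fixed literal assumptions.

    Clauses are lists of non-zero ints (DIMACS style: -3 means NOT x3).
    """
    fixed = {abs(lit): lit > 0 for lit in assumptions}
    variables = {abs(lit) for clause in clauses for lit in clause}
    free = sorted(variables - set(fixed))
    for values in product([False, True], repeat=len(free)):
        model = {**fixed, **dict(zip(free, values))}
        if all(any(model[abs(l)] == (l > 0) for l in c) for c in clauses):
            return True
    return False


def split(variables):
    """Yield every sign combination of the chosen split variables."""
    for signs in product([1, -1], repeat=len(variables)):
        yield tuple(s * v for s, v in zip(signs, variables))


# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
cnf = [[1, 2], [-1, 3], [-2, -3]]
# Each sub-problem is independent and could run as a separate grid job.
results = {a: satisfiable(cnf, a) for a in split([1, 2])}
print(results)
print(any(results.values()))
```

The formula is satisfiable if any sub-problem is, so the jobs need no communication with each other; only the final results are combined.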

He gets speedups using the grid, but sees diminishing returns from having many simultaneous jobs. He needs large problems in order to use many CPUs. He concludes that SAT can be solved in parallel without node communication, using off-the-shelf solvers. His current grid problems are job submission time, and retrieval and failure problems.

4.10 M-Grid experiences using ARC

Arto gave a presentation on M-grid's experiences using ARC. M-grid is the Finnish material science grid, which aims to provide throughput computing to physics and chemistry researchers. It is a joint project between seven Finnish universities. M-grid has been running for a year and consists of 778 CPU cores, which equals about 5 TFlop/s. They use Rocks, N1 Grid Engine and ARC.

M-grid has central administration for the clusters and a small cluster for testing new software. Additionally there are regular meetings between the administrators. M-grid resources are allocated 80% to local jobs and 20% to grid jobs, but unused nodes are always available.

They have had a few problems installing ARC. One is that their environment is 64-bit, which GT2.4 doesn't support. They tried GT4, but ran into some problems with NorduGrid's usage of GT. Secondly, the interface to the N1 Grid Engine was buggy.

Currently there are only a few grid users, as most users use local access. This means that the grid must be easier to use than the LRMS, otherwise no one will use it. Another experience is that grid projects always have aspects other than technology.

They collected a list of user complaints: the need to request a certificate, a different job description compared to the LRMS, the need to list input files (they need recursive directory support), getting software available as runtime environments (they will try to share packages between sites), a higher failure rate, and less predictable execution times.

They will try to concentrate on a few selected applications, and try to make them popular so that word spreads. They will also continue making more tutorials and documents. Additionally they will investigate job management tools and educate users if the tools are found satisfactory. They will also increase the PR level for M-grid, and monitor the system better in order to fix failures quickly.

Leif commented on the presentation, saying that they have had a lot of the same problems. Certificates especially seem to be a step which users are afraid to take. The step from hello world jobs to putting their own jobs on the grid is also a large one.

5 Friday 20th January - NGN Meeting

Matthias decided to disappear, so the meeting started a bit late.

5.1 Harvesting Free Windows CPU Cycles

Rasmus gave a presentation of how to harvest CPU cycles on Windows machines by using sandboxing. Basically he runs Linux in a sandbox in Windows. The work has been integrated into MiG. The resource requirement is an ssh account for a local user, hence they need a sandbox that can run as a user, so that the local system will not suffer from bad security.

The project had considered both emulation and virtual machines, which each have their trade-offs. Hybrids of the two can provide the best of both, but all of them are proprietary products. Their choice fell on Qemu, as security was the most important issue. They run ttyLinux inside Qemu and equip the distribution with an ssh and an https server. This takes up less than three megabytes.

All this is started by a screen saver that boots the Linux image after a certain amount of idle time. A problem is to figure out when the screen saver will stop. MiG has heuristics to figure this out, which work reasonably well. Unfortunately it is not possible to suspend Qemu, but mobile Occam-pi processes can migrate.

5.2 ATLAS Production using ARC

Katarina gave a presentation of ATLAS production on ARC. The effort started in November and will continue for a while. Recently CERN has started a lot of challenges, including Rome production (DC3) (currently running) and the service challenges.

The challenge also includes software installation and other administrative tasks. The effort has been very streamlined recently, e.g. it takes less than 48 hours to install new software. The production chain is: event generation, event simulation (detector response, digitization, reconstruction), output of data, and pile-up.

An overview of the ATLAS production system was given. It is composed of a production database, four supervisors, and a central data management system. NorduGrid uses RLS; other grids use other systems. The NorduGrid supervisor, called Dulcinea, uses ARCLib with Python bindings, and they are quite happy with it.

The production process has become more and more streamlined. ARC does not contribute much, but given the number of CPUs it does reasonably well. Also, some problems have low success rates. The broker algorithm in ARCLib needs some work, as it does not distribute the jobs properly. Inconsistent information announced by clusters also causes some problems. The RLS server seems to block at times, which is a problem. Job reruns also sometimes fail because the output file is already registered in RLS. Also, GACL goes missing sometimes.

LHC will start producing data soon, so CERN is ramping up the number of jobs: 10K jobs per day in April, 50K jobs per day in summer, 100K jobs per day in December 2006, and 1M in fall 2007. There was some consensus that this is very ambitious.

5.3 ATLAS Distributed Analysis

Sigve gave a presentation of the ATLAS analysis. They expect to run 1M jobs per day. There was a brief introduction to the ATLAS experiment. The ATLAS detector generates about 200 events per second, each about 1.6 MB in size. That is a lot of data, so they need the grid. Currently there are 13 sites with 772 CPUs. We need more if we want to contribute 10%. In general there will be a lot to do.
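
To put the quoted event figures into perspective, the short calculation below works out the implied data rate. The event rate and size are taken from the presentation; continuous running and decimal units are assumptions made only for this estimate.

```python
EVENTS_PER_SECOND = 200   # from the presentation
EVENT_SIZE_MB = 1.6       # from the presentation
SECONDS_PER_DAY = 24 * 60 * 60

rate_mb_s = EVENTS_PER_SECOND * EVENT_SIZE_MB
# Assumes continuous running and decimal units (1 TB = 10**6 MB).
per_day_tb = rate_mb_s * SECONDS_PER_DAY / 1e6

print(f"{rate_mb_s:.0f} MB/s, about {per_day_tb:.1f} TB per day")
```

Under these assumptions the detector output is roughly 320 MB/s, or a little under 28 TB per day.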

In Scandinavia there is a problem with the sites not giving the ATLAS physicists access to job submission, which they need. Sigve needs a job management tool.

5.4 Managing Scientific Queries over Distributed Data

Ruslan gave a talk on using databases in the grid. These days most databases are distributed as files. The aim is to provide a high-level database for specifying analyses. The tool makes it easy to specify queries at a high level, reducing the chance of error. This has been combined with ARC, which they are using to test the tool. The tool has job babysitting functionality to manage jobs.

5.5 Round-table discussion

CSC would like to host the next NGN meeting. Weeks 20, 21 and 22 were discussed. It will be at CSC, Helsinki: Tutorial 7 June, Technical meeting 7 June, Conference 8-9 June. These days might be moved around. Baltic Grid will have a meeting on 26th of June. There might be some NorduGrid activity there. See balticgrid.org