Leif opened the meeting and everyone introduced themselves
* Grid Security presentation by Olle
Olle Mulmo gave a presentation of Grid Security over the phone
The presentation covered:
Requirements for grid security
How it worked (CA, PKI, proxy)
EGEE/gLite security
How it looks today - good overview of what is wrong:
Rights in proxies,
software running as root
Only luck that massive grid hacking has not yet happened
* NSC intrusion
Leif talked about the intrusion at NSC
Was compromised three times in a row
Local user account was compromised and rootkit installed
Installed ssh client with password sniffer
Other incidents were also covered
It was discussed how to protect sites from this
* M-grid / grid security policies
Urpo Kalia gave a presentation of security for M-grid
(www.csc.fi/proj/mgrid)
Presented a list of issues that must be managed for grid to be
'secure'
M-grid will have a security policy - there is a draft
Presents how various things should be done, and various procedures
Also has checklists (easy to use)
There was an open discussion about security in grid.
There is a need for people knowing "real world" security
People need to work together on this
There is a need for policies and HOWTOs
Local site security is difficult, jobs accessing other jobs
Lots of security issues were discussed
Security can be hard to scale up sometimes
How should policies/rules be decided, by whom and how
International issues, some jobs might be legal in some countries,
but not everywhere
Storing of data about persons is problematic - privacy laws
How should certificates be revoked
What should be done if a resource owner discovers a 'nasty' user.
Present: Balázs, Anders, Jakob, Ilja, Andi, Aleksandr, Mattias, Leif, Åke, Oxana, Niels, Juha, Mike, Arto, Henrik (at the minutes).
Balazs suggested an agenda based on the minutes from the last meeting
Agenda:
* Catch-up with the last meeting
* Runtime presentation by Juha
* ARCLib status
* LCG2 software presentation
* Data Management
Something had to be presented at the European Grid Conference about ARC, but it was not quite clear who could/should do it.
* Data Management
Biggest thing missing
Not clear what the requirements are
Use cases and requirements should be found
The only one present who was interested in doing some Data Management was Aleksander
Matthias Wadenstein (wasn't present) is still interested, but from an admin point of view
It is not clear who should manage storage elements
Aleksander suggested that some people come together with an architecture and functionality of a data management system

== Items from the last meeting ==
* Logger
Has still not been tested.
The student in Lund who should have looked at it had not.
The task is open for everyone.
* Logger cleanup
Not complete yet, works on the client side.
Installing the logger is too complex.
Aleksander has made installation instructions.
MySQL is not set up automatically.
* Logging web interface
Has been cleaned up, and should be ready soon.
It requires a graph drawing library, Anders will make an RPM for it.
Currently the logger needs to read the file $NORDUGRID_LOCATION/etc/nglogger-conf to get login information.
The naming of the various logging 'stuff' is confusing.
* Logging API/RPC interface
Jakob had promised to post something about the logging interface, but had been too busy to get anything done.
Jakob promised to post schema and comments soon.
The interface will be GGF compliant.
The schema is finished, but an implementation doesn't exist yet.
SWEGRID might be interested in the logging framework.
SWEGRID logging might use SGAS, not really clear how it will collect information though.
Anders suggested that the two efforts should work together.
The GGF schema is not very good; it is unclear how it should work and what semantics the values should have.
* Log rotation
Aleksander has made an example in nordugrid.conf which illustrates how to use log rotation.
Anders has not made log rotation work with the system.
GM should not do its own rotation, but use the one in the system.
Currently the GM just opens the log file and leaves it open, which makes it hard to move the file.
Reopening the file is hard in a multi-threaded system.
It can be done with copy-truncate, but that might cause data loss.
Leif uses it, but it is not ideal.
Henrik said GM crashed when the log file reached 2 GB.
GM doesn't do anything special about logging, it just opens the file and appends. Perhaps it should be opened in 64-bit mode, but it should work without.
Threads can write to the log file simultaneously, which causes log messages to get mixed (stream multiplexing).
A lock should be held when writing to the log file.
* Default GM log file
The question of whether a default log file should be created was raised.
Having admins set up their software is usually good, but software should also be easy to set up.
A default should be made, but it should still be configurable.
* Broken files in session catalog
What should be done about these?
Had not been investigated/done.
* GM scan period
Not quite done yet.
Something new in GM which I did not hear; Aleks was not sure whether it worked fully, and documentation was not there yet.
* Resource backends
Fork backend has been done and is documented - should be good.
Condor works, but configuration is messy.
SGE is nasty and very site specific; some updates have been made.
Integration with the information system still needs some work.
The Finns promised to do something.
Backends should be stable and finished at 0.6
* More verbose gm-jobs
Implementation should be done, but isn't used yet.
Does not display the new jobs - should it?
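The two points from the log rotation item above (hold a lock while writing, and reopen the file so that an external rotation tool can move it) can be sketched as follows. This is only a minimal Python illustration of the idea, not GM code: the class name LockedLog and its methods are hypothetical, and it assumes a POSIX system where an open file can be renamed.

```python
import os
import threading


class LockedLog:
    """Append-only log file shared by several threads (hypothetical helper)."""

    def __init__(self, path):
        self._path = path
        self._lock = threading.Lock()
        self._file = open(path, "a", encoding="utf-8")

    def write(self, message):
        # The lock is held for one complete line, so messages from
        # different threads cannot be mixed inside the file.
        with self._lock:
            self._file.write(message.rstrip("\n") + "\n")
            self._file.flush()

    def reopen(self):
        # To be called after the log file has been renamed by a rotation
        # tool: close the old handle and open a fresh file at the original
        # path. Taking the same lock means no writer is in mid-line.
        with self._lock:
            self._file.close()
            self._file = open(self._path, "a", encoding="utf-8")

    def close(self):
        with self._lock:
            self._file.close()
```

With this shape a rotation tool renames the file and then asks the process to call reopen(). Unlike copy-truncate, there is no window between the copy and the truncation in which freshly written lines can be lost.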
# I lost it here
Will be possible to specify who can submit jobs and who can retrieve files from session catalog - to avoid zombie jobs.
* Benchmarking integration
Infosystem is ready.
Matthias has checked in the code for it, but it needs testing.
Implementation can use frequency if they don't provide a benchmark.
Also needs documentation, deployment is also missing.
Balazs will solve some 10% extra problem.
* Separating user authorization from user mapping
Allows for dynamic user mapping.
Leif should provide something, but hasn't; not sure what is expected from him.
DNs should only be used for authorization and logging (i.e., not mapping)
* Running ARC as non-root
Not clear what is needed.
TCP wrappers for daemons are hard to get right.
A list of what is needed should be made.
Info system needs to read files from the job control dir.
Host keys need to be owned by the user running the daemons.
Henrik will try to set up francis by running as a non-grid user, and make a checklist.
Breaks changing user for jobs.
SWEGRID does some mapping, but not in any consistent way.
Lots of problems come up, e.g., reading the session directory
* Plug-in templates
Downloader check plug-in (Aleks) - should be working.
Test site plug-in, checking that TESTSITE is specified (Leif)
Has been sent to Anders, not checked in yet.
Not clear how they should be configured, should be set in the configuration file.
* Testsites
None have been set up yet.
Should they be added to the monitor? It makes it slower.
Usually testsites aren't the slow ones.
Only allow jobs which have the TESTSITE runtime environment.
Will do this with the previously mentioned plug-in
Should somehow specify what it is testing.
Need the plug-in before testsites can go online.
Will be in 0.5.15
* Specifying URL options in xRSL
E.g., number of threads, hashes/sums, read/write.
A proper way of specifying this is needed.
Currently it works by having a separator in the URL.
Matthias suggested a third inputfile parameter called option.
Won't solve all problems, some are specific to index servers.
Not all options have names - need a naming scheme.
We can have multiple destinations and sources.
We need a proper proposal for how it should be done.
# Head started to hurt here
* Client library API specification
Henrik had promised to do this, but had forgotten it.
Will probably be in ARCLib.
* Globus 3.2
Appears to be working.
Some have it working, but it needs more testing.
There were issues with GridFTP - should have been solved now.
* Growing GRIS server
Grows with both authenticated and anonymous queries.
Grows much more than the number of jobs.
Hard to debug, Ake has some ideas.
Globus uses some of our patches, perhaps Anders has taken it out
Unlikely though.
It is likely that there is a connection between the number of jobs and the speed of the memory leak.
Balazs will investigate it.
Still grows even if it is not queried, Ake made one on a special port, without the Globus stuff -> It is a slapd bug!
%Post meeting notes: Did not grow if not queried, Ake made a slight mistake
Problem is in the ldif backend, Ake has a patch for it.
# Went away for five minutes
* Globus
New Globus will be modular, ARC will depend on 25 Globus packages
Would be nice to drop some Globus dependencies, e.g., openssl/ldap
Problem with old distros using old versions of openssl (Redhat 7.3)
Globus OpenSSL can be removed when 7.3 is no longer supported.
OpenLDAP in Globus is bad, since it is very old.
Not trivial to replace.
* CVS -> Subversion
No one really knew whether it is worth moving
Balazs suggested keeping CVS since we know what it is.
* CA directory cleanup
Anders has started this, moved CAs to a separate package
Some certificates can be removed from cvs.
We might have a policy directory as well.
* NorduGrid -> ARC transition
There are lots of places in the code where this should be changed.
ARC LDAP namespace is not taken, attributes will be changed.
We will have backwards compatibility, but old CLI tools will not be able to use the new clusters -> the change should be fast.
* Info system startup script
Needs to be improved
Balazs will make this soon - before 0.6
* Single client configuration file
Matthias hasn't started.
Anders has made a configuration class, which can be a basis.
* Webpage
Everyone should send links, projects, etc. to the webmaster (Oxana)
# LUNCH
* Runtime Environment Registry
Juha presented the concept and its webpage.
There are some questions regarding:
Hierarchical namespaces.
Naming and versioning scheme.
Maintaining lists, how to do it.
How to enforce it.
Should there be a database or service for it.
How to allocate and use namespaces.
People generally want this to happen now.
Delegating namespaces
Hierarchical structure is not clear (/ORG/CERN/ATLAS >< /APPL/HEP/ATLAS)
How to make resource owners change REs?
Spam 'em and have the old for a month
Perhaps only add the new one, under the new namespaces, and keep the old
* LCG Runtime Environment
Oxana presented an LCG runtime environment document.
There was a list of requirements for REs made by the LHC (perhaps LCG).
ARC was pretty close to them, except that REs should be over services.
Unknown if LCG will implement the requirements (it is just a list).
They had installed LCG2 software on Ingvar in Linkoping.
There were quite a few problems.
Still in testing, not really stable or anything yet.
Installation was messy.
There is a lack of clear/concise documentation.
ARC is _much_ simpler to install (and other things) than LCG.
# Coffee
Henrik gave a summary of the status of ARCLib
There was a discussion about using cURL together with Globus
This would probably require compiling cURL against Globus
Anders presented the build system.
There was a lot of discussion about #ifdefs and limitation to distros and software versions.
Some tests were presented.
Aleksander wanted timestamps and rotation in notify
Can be done by altering outstream.
He also wanted parallel LDAP queries, ARCLib will have it.
Anders presented config, which can read from different backends
Aleksander wanted section support and perhaps support for order.
* CVS Structure:
People discussed this and things were moved around.
Result was quite good.
* Data Management:
# Was away for IBM Bluegene presentation.
# Heard this though: We need proxies with access rights.