Accounting Subsystem

New in version 6.4.

Changed in version 6.12.

Warning

Information in this chapter is relevant only for 6.4+ ARC releases.

Moreover, ARC 6.12 introduced accounting changes to address the APEL move to the ARGO Messaging Service (AMS) protocol. If you are publishing to APEL, you must update to an ARC 6.12+ release.

Note

If you are looking for information about the legacy accounting subsystem in ARC 6.0-6.3 releases, please read Accounting with legacy JURA. It is, however, highly recommended to update to a recent release.

The next generation ARC Accounting Subsystem aims to improve scalability, eliminate the bottlenecks of the legacy implementation, and provide a local on-site ARC CE accounting database that CE admins can query and analyze.

Overview

Fig. 4 shows an overview of the next generation accounting subsystem in ARC. More details can be found in ARC Accounting Technical Details.

Fig. 4 Next generation ARC accounting subsystem overview: creating AAR records, getting access to the data and publishing to external SGAS/APEL services

The central point of the next generation ARC Accounting Subsystem is a local SQLite accounting database that stores all the A-REX Accounting Record (AAR) information. An AAR defines all accounting information stored about a single ARC CE job.

The database is populated directly by A-REX based on per-job files inside the control directory. In particular, the .diag file is the main source of resource usage information, .statistics holds data transfer measurements, and .local is a general datastore for job properties such as ID, owner, etc.

A record for the job is created in the accounting database as soon as the job enters the ARC CE. Then each state change of the job is recorded. Once the job reaches a terminal state (either successful completion or failure), all resource usage data is written and becomes available for local queries (arcctl accounting) and for reporting to SGAS/APEL (jura-ng).

The accounting subsystem is an integral part of A-REX and is always enabled.

Configuration

In the typical use case the accounting subsystem just works and does not require additional configuration.

However, in some rare cases there are a few things you may want to tune.

WLCG VOs

The recommended and typically used way to authorize WLCG VOs is to use the voms configuration option of the [authgroup] block, which exactly matches attributes in the VOMS extension of the proxy certificate.

If your authorization rules are configured this way, NOTHING else needs to be configured.

If you do not have at least one voms option in any of the defined [authgroup] blocks, A-REX will not trigger VOMS attribute parsing, which consequently leads to no VO information in the accounting!

To have VO information in the accounting in this particular case, you can add a standalone authgroup with a voms option that has no further use in arc.conf:

[authgroup: parsevoms]
voms = * * * *
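
Whether a given arc.conf would trigger VOMS parsing at all can be checked with a short script. The following is a minimal Python sketch and not part of ARC: the function name has_voms_rule and the simplified line-based parsing are assumptions made for illustration. It only looks for a voms option inside [authgroup] blocks and does not validate the rule itself.

```python
import re

# Hypothetical helper, not part of ARC: a simplified line-based scan of arc.conf text.
BLOCK_RE = re.compile(r"^\[\s*([^\]]+?)\s*\]$")


def has_voms_rule(conf_text):
    """Return True if any [authgroup] block contains a voms option."""
    in_authgroup = False
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        block = BLOCK_RE.match(line)
        if block:
            # Block headers look like "authgroup: name"; compare the part before ":".
            in_authgroup = block.group(1).split(":", 1)[0].strip() == "authgroup"
            continue
        option = line.split("=", 1)[0].strip()
        if in_authgroup and option == "voms":
            return True
    return False


if __name__ == "__main__":
    sample = "[authgroup: parsevoms]\nvoms = * * * *\n"
    print(has_voms_rule(sample))
```

If the function returns False for your configuration, the standalone authgroup shown above is one way to make A-REX parse the VOMS attributes.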

There is also the forcedefaultvoms configuration option (which can be defined on a per-queue basis) that sets the accounted VO for jobs that have no VOMS extension in the owner’s certificate.

Enabling accounting records reporting

It is possible to send resource usage reports to the centralized SGAS and/or APEL accounting databases.

Based on the data in the local accounting database, the Accounting Publishing Module can generate:

  • OGF Usage Record 1.0 (UR) XML format to be sent to the SGAS LUTS (Logging and Usage Tracking Service)
  • EMI Compute Accounting Record (CAR) v1.2 XML format for individual job records to be sent to APEL
  • APEL Summaries (to reduce traffic) and APEL Sync messages for full-featured integration with APEL services

The regular publishing sequence is handled by the jura-ng helper tool and can be enabled with the [arex/jura] block.

A-REX runs jura-ng periodically; the default and minimum period is one hour. The period can be increased with the urdelivery_frequency option. Furthermore, it can be increased per target by using the same option inside a particular target block.
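
The interplay between the global and the per-target period can be sketched as follows. This is an illustrative Python model and not ARC code: it assumes that the option value is given in seconds, that the hourly minimum corresponds to 3600, and that each level can only lengthen the period. The function name effective_period is hypothetical.

```python
MIN_PERIOD = 3600  # assumed: the hourly default and minimum, expressed in seconds


def effective_period(global_value=None, target_value=None):
    """Return the reporting period for one target in this simplified model."""
    period = MIN_PERIOD
    # Both the [arex/jura] value and the per-target value can only increase the period.
    for value in (global_value, target_value):
        if value is not None:
            period = max(period, int(value))
    return period


if __name__ == "__main__":
    print(effective_period())             # nothing configured
    print(effective_period(7200, 86400))  # global 2 hours, target daily
```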

JURA has a dedicated log file defined by the logfile option. Log rotation is set up for the default /var/log/arc/jura.log location.

Accounting services for sending the records are configured with dedicated sub-blocks. You need to define a separate block with a unique targetname for every target server used.

Note

The target block name is used by jura-ng to track the latest records sent to this target. Be aware that if you rename the block, the target will be handled as a new one. However, a targeturl change will not trigger new target handling, and records will continue to be published starting from the latest recorded timestamp.

The AAR data will be reported to all of the defined destinations, unless the vofilter option is configured for some of them to filter records by VO name.
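
The effect of per-target VO filtering can be pictured with a small Python sketch. It is a toy model, not how jura-ng is implemented: records are represented as plain dictionaries with a hypothetical "vo" key, and the function name records_for_target is made up for illustration.

```python
def records_for_target(records, vofilter=None):
    """Select the records one target receives; no filter means all of them."""
    if not vofilter:
        return list(records)
    allowed = set(vofilter)
    # Keep only records whose VO name is listed in the target's filter.
    return [record for record in records if record.get("vo") in allowed]


if __name__ == "__main__":
    jobs = [{"jobid": "a", "vo": "atlas"}, {"jobid": "b", "vo": "ops"}]
    print(records_for_target(jobs))             # unfiltered target
    print(records_for_target(jobs, ["atlas"]))  # target restricted to one VO
```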

Warning

There were several issues in the codebase, as well as some misunderstanding from the operational point of view, regarding how benchmarks are propagated to the central accounting services with ARC accounting. Please read About benchmarks and accounting publishing for more details.

Configuring reporting to SGAS

The SGAS sub-block enables and configures an SGAS accounting server as a target destination to which ARC CE will send properly formatted OGF.98 Usage Record 1.0 (UR) XML records.

The targeturl option is the only mandatory parameter for configuring an SGAS target. In specific setups you can also apply VO filtering and set a prefix for local job IDs.

Example:

[arex/jura/sgas: NeIC]
targeturl = https://grid.uio.no:8001/logger
urbatchsize = 80

Configuring reporting to APEL

The APEL sub-block enables and configures an APEL accounting server as a target destination to which ARC will send the data.

The targeturl option defines the APEL broker URL to send records to. The currently known APEL AMS endpoint is provided in the targeturl example, but you should refer to APEL for up-to-date information.

The apel_messages option allows you to choose between publishing per-job EMI Compute Accounting Record (CAR) XML records and publishing APEL Summaries. Sending summaries is the default and the behaviour recommended by APEL, as it saves resources and traffic.

APEL Sync records, which synchronize the total per-month job counters, are always sent.
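
To illustrate why summaries reduce traffic, the Python sketch below aggregates per-job records into one entry per VO and month. It is only a simplified model: the dictionary keys ("vo", "endtime", "walltime", "cputime") are hypothetical, and real APEL summary records are grouped by and contain more fields than shown here.

```python
from collections import defaultdict
from datetime import datetime


def summarize(jobs):
    """Aggregate per-job records into per-VO, per-month totals."""
    totals = defaultdict(lambda: {"jobs": 0, "walltime": 0, "cputime": 0})
    for job in jobs:
        # "endtime" is a datetime; "walltime" and "cputime" are seconds (assumed layout).
        key = (job["vo"], job["endtime"].year, job["endtime"].month)
        totals[key]["jobs"] += 1
        totals[key]["walltime"] += job["walltime"]
        totals[key]["cputime"] += job["cputime"]
    return dict(totals)


example_jobs = [
    {"vo": "ops", "endtime": datetime(2019, 7, 10), "walltime": 20, "cputime": 2},
    {"vo": "ops", "endtime": datetime(2019, 7, 11), "walltime": 30, "cputime": 3},
    {"vo": "atlas", "endtime": datetime(2019, 8, 1), "walltime": 100, "cputime": 90},
]
summary = summarize(example_jobs)
```

In this model three job records collapse into two summary entries; on a busy CE the reduction is correspondingly larger.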

You also need to specify the GOCDB name of the resource. Since the move to AMS publishing, the APEL topic is always gLite-APEL, which is the default value.

For a correct production accounting setup, it is recommended to specify resource benchmarking results in the [queue:name] block. ARC assumes that the nodes in the same queue are homogeneous with respect to benchmark performance, so benchmark values are specified per queue.

Example:

[arex/jura/apel: EGI]
targeturl = https://msg.argo.grnet.gr
gocdb_name = RAL-LCG2

[queue: grid]
benchmark = HEPSPEC 8.73

Lookup local accounting data

Data in the ARC CE accounting database can be viewed with the ARC Control Tool. The timeframe of interest and many other filters can be specified, e.g.:

[root ~]# arcctl accounting stats --filter-vo ops --start-from 2019-07-10
A-REX Accounting Statistics:
  Number of Jobs: 1317
  Execution timeframe: 2019-07-10 00:01:45 - 2019-07-26 12:48:13
  Total WallTime: 4:40:28
  Total CPUTime: 0:22:32 (including 0:00:00 of kernel time)
  Data staged in: 2.3M
  Data staged out: 9.7K

[root ~]# arcctl accounting job events UufLDmmmS5unf5481mks8bjnABFKDmABFKDmmSMKDmNBFKDm9nRnVo
2019-07-10 17:30:00   ACCEPTED
2019-07-10 17:30:00   PREPARING
2019-07-10 17:30:00   DTRDOWNLOADSTART
2019-07-10 17:30:08   SUBMIT
2019-07-10 17:30:08   DTRDOWNLOADEND
2019-07-10 17:30:11   INLRMS
2019-07-10 17:30:16   LRMSSTART
2019-07-10 17:30:26   LRMSEND
2019-07-10 17:30:47   FINISHING
2019-07-10 17:30:47   FINISHED
2019-07-14 15:01:16   DELETED

[root ~]# arcctl accounting job transfers UufLDmmmS5unf5481mks8bjnABFKDmABFKDmmSMKDmNBFKDm9nRnVo
Data transfers (downloads) performed during A-REX stage-in:
  http://download.nordugrid.org:80/packages/nordugrid-arc/releases/6.1.0/src/nordugrid-arc-6.1.0.tar.gz:
    Size: 5.2M
    Download timeframe: 2019-07-10 17:30:00 - 2019-07-10 17:30:04
No stage-out data transfers (uploads) performed by A-REX.

More query examples can be found in this document.

Republishing records

When something goes wrong with the accounting services, the network, etc., it may be necessary to republish local records.

In the current implementation of the accounting subsystem, there is no difference between publishing and republishing. The same Accounting Publishing Module is used to generate the records to be sent to the target service for the defined timeframe.

Republishing is triggered by the ARC CE administrator using the ARC Control Tool.

The most streamlined way is to republish data to the target that is already configured in arc.conf for regular publishing:

[root ~]#  arcctl accounting republish -b 2019-06-01 -e 2019-07-01 -t EGI

However it is also possible to define all target options from the command line, without the defined target in arc.conf:

[root ~]#  arcctl accounting republish --end-from 2019-06-01 --end-till 2019-07-01 \
> --apel-url https://msg.argo.grnet.gr --gocdb-name "UA-KNU" \
> --apel-messages summaries --apel-topic gLite-APEL
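
If republishing needs to be scripted, the command line can be assembled programmatically. The Python sketch below only builds the argument list for the first form shown above, using the same options (-b, -e, -t); the helper name republish_command is hypothetical, and actually running the command is left to the caller.

```python
import shlex


def republish_command(end_from, end_till, target):
    """Build the argument list for republishing to a target defined in arc.conf."""
    return ["arcctl", "accounting", "republish",
            "-b", end_from, "-e", end_till, "-t", target]


cmd = republish_command("2019-06-01", "2019-07-01", "EGI")
# Print the command for review before running it.
print(shlex.join(cmd))
# To execute it (as root on the CE), one could pass the list to
# subprocess.run(cmd, check=True).
```

Keeping the arguments as a list rather than a single string avoids shell quoting problems when target names contain unusual characters.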

Clean up of the <controldir>/logs folder

With the accounting subsystem change in ARC 6.4, legacy archive files are no longer used or written to the <controldir>/logs directory. It is recommended to wipe this directory manually.