Monitoring and Accounting Workpackage
A grid infrastructure must be actively monitored to reduce the job failure rate. Jobs are assigned to a site on the basis of the site’s profile (static information, e.g. support for a certain VO) and its status (dynamic information, e.g. free CPU slots). The dynamic information, however, may be out of date and may not reflect the site’s real status, in which case jobs can be sent to sites that are unable to run them. It is the task of monitoring to detect these situations. In addition, the monitoring system must provide job-specific monitoring information to the user.
The monitoring activity thus requires:
- A testing engine, which automatically and regularly probes the grid services and their functionalities at sites (probes/alarms and their severities will be agreed upon across sites)
- Software to display the results obtained by the testing engine
- Job monitoring and tracking
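
As an illustration of the testing engine listed above, the following is a minimal Python sketch, not a proposed implementation. All names in it (Severity, ProbeResult, run_probes, alarms, make_free_slots_probe) are hypothetical, and the three severity levels are placeholders, since the actual probes, alarms and severities remain to be agreed upon across sites. The example probe compares the free CPU slots a site publishes with a value measured independently, which is one way the stale dynamic information described earlier could be detected.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Callable, Dict, List


class Severity(IntEnum):
    # Placeholder levels; the real ones are to be agreed upon across sites.
    OK = 0
    WARNING = 1
    CRITICAL = 2


@dataclass
class ProbeResult:
    site: str
    probe: str
    severity: Severity
    detail: str = ""


# A probe takes a site name and returns a ProbeResult.
Probe = Callable[[str], ProbeResult]


def make_free_slots_probe(published: Dict[str, int],
                          measured: Dict[str, int]) -> Probe:
    """Build a probe that flags sites whose published free-slot count
    differs from an independently measured one (hypothetical inputs)."""
    def probe(site: str) -> ProbeResult:
        if published[site] == measured[site]:
            return ProbeResult(site, "free_slots", Severity.OK)
        detail = f"published {published[site]}, measured {measured[site]}"
        return ProbeResult(site, "free_slots", Severity.WARNING, detail)
    return probe


def run_probes(sites: List[str], probes: Dict[str, Probe]) -> List[ProbeResult]:
    """Run every probe against every site once. A probe that raises is
    recorded as CRITICAL instead of aborting the whole run."""
    results = []
    for site in sites:
        for name, probe in probes.items():
            try:
                results.append(probe(site))
            except Exception as exc:
                results.append(
                    ProbeResult(site, name, Severity.CRITICAL, str(exc)))
    return results


def alarms(results: List[ProbeResult],
           threshold: Severity = Severity.WARNING) -> List[ProbeResult]:
    """Return the results at or above the given severity."""
    return [r for r in results if r.severity >= threshold]
```

In such a design, a scheduler would call run_probes at regular intervals, and the display software would read the stored results and the output of alarms.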
For accounting, we rely only on the information logged by the grid services. We envision consolidating this information in an accounting infrastructure.
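
As a rough illustration of what consolidation could mean, the sketch below sums usage per VO and site from records extracted from service logs. The record fields (vo, site, cpu_seconds) and the function name consolidate are assumptions made for this example only; the actual content of the logs depends on the grid services in use.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def consolidate(records: Iterable[dict]) -> Dict[Tuple[str, str], float]:
    """Sum CPU seconds per (VO, site) pair.

    Each record is assumed to be a dict with the hypothetical keys
    'vo', 'site' and 'cpu_seconds'.
    """
    totals = defaultdict(float)
    for rec in records:
        totals[(rec["vo"], rec["site"])] += rec["cpu_seconds"]
    return dict(totals)
```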
