Fifemon monitoring of data transfers

Back in about 2012, when we were designing the IFDH layer to insulate experimenters' code from the gory details of data handling and operating on the grid, I drew a diagram in which the ifdh copy utility logged every copy to a central logging facility, and an agent of the monitoring system scraped those logs to provide counts, transfer rates, etc. This never really got off the ground in the early versions of Fifemon, but the current implementation, which uses Elasticsearch tools to collect statistics from logging data, has finally made it a reality. There are now two dashboards in Fifemon that provide a view of those logged copies. These dashboards are currently in pre-production; when they reach production, this post will be updated with the links.

IFDH Summary

The first of these, the IFDH Summary dashboard, shows the volume, rate, and counts of failed and successful “ifdh cp” transfers, broken down by experiment and by interactive vs. grid usage. For a more detailed per-user breakdown within a given experiment, there is a second page.
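To make the breakdown concrete, here is a minimal Python sketch of the kind of roll-up such a dashboard displays. It is not how Fifemon computes its numbers (that is done with Elasticsearch tools); the `Transfer` record, its field names, and the experiment names are all hypothetical and do not reflect the real ifdh log schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Transfer:
    # Hypothetical, already-parsed record; NOT the real ifdh log schema.
    experiment: str
    source: str      # "grid" or "interactive"
    nbytes: int
    seconds: float
    ok: bool


def summarize(transfers):
    """Total up transfers per (experiment, source) pair."""
    totals = defaultdict(
        lambda: {"ok": 0, "failed": 0, "bytes": 0, "seconds": 0.0}
    )
    for t in transfers:
        bucket = totals[(t.experiment, t.source)]
        if t.ok:
            # Only successful copies count toward volume and rate.
            bucket["ok"] += 1
            bucket["bytes"] += t.nbytes
            bucket["seconds"] += t.seconds
        else:
            bucket["failed"] += 1

    summary = {}
    for key, b in totals.items():
        rate = b["bytes"] / b["seconds"] if b["seconds"] > 0 else 0.0
        summary[key] = {
            "ok": b["ok"],
            "failed": b["failed"],
            "bytes": b["bytes"],
            "rate_bytes_per_s": rate,
        }
    return summary


# Made-up sample data, purely for illustration.
sample = [
    Transfer("exp_a", "grid", 1000, 2.0, True),
    Transfer("exp_a", "grid", 3000, 2.0, True),
    Transfer("exp_a", "grid", 0, 1.0, False),
    Transfer("exp_b", "interactive", 500, 5.0, True),
]
summary = summarize(sample)
```

Grouping on a different key, such as the user, would give the per-user view in the same way.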


Experiment IFDH Summary

The Experiment IFDH Summary page shows the usage for a given experiment (or group of experiments), broken down by user. Once again, there are graphs for volume, rate, and counts of failed/successful transfers, split by interactive vs. grid use.


Detailed logs in Kibana

If you are on-site (or connected via VPN), you can also find ifdh log information in Kibana, in the fife-dh-* collection. There you can choose the fife-dh-* collection in the upper left, set your time window in the upper right-hand corner, and search for, say, a particular job ID or filename.
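For anyone who would rather script that search than click through Kibana, the same kind of lookup can be written as an Elasticsearch query body. The sketch below only builds the request body in Python and does not contact any server. The `@timestamp` field name is an assumption (it is the common Logstash convention, but the fife-dh-* documents may use something else), and the filename is made up.

```python
import json


def build_search(text, start="now-24h", end="now", time_field="@timestamp"):
    """Build an Elasticsearch query body: free-text search in a time window.

    time_field is an assumption; check the actual field name in the
    fife-dh-* documents before relying on it.
    """
    return {
        "query": {
            "bool": {
                # Free-text search, like typing into the Kibana search bar.
                "must": [{"query_string": {"query": text}}],
                # Time window, like the picker in the upper right.
                "filter": [
                    {"range": {time_field: {"gte": start, "lte": end}}}
                ],
            }
        },
        "sort": [{time_field: {"order": "desc"}}],
        "size": 100,
    }


# Hypothetical filename; quoting it makes query_string treat it as a phrase.
body = build_search('"myfile.root"')
print(json.dumps(body, indent=2))
```

A body like this would be posted to the `_search` endpoint for the fife-dh-* index pattern; where that endpoint lives and whether you are allowed to query it directly are site details not covered here.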


–Marc Mengel