In response to increasing demand from production groups, the Production Operations Management System (POMS) is being extended to meet the Intensity Frontier (IF) experiments’ requirements for large-scale production and distributed analysis processing.
Several experiments are using or have expressed interest in using POMS. NOvA tracks its entire production with POMS, and LArIAT and MicroBooNE have adopted it for part of their data processing. Muon g-2 has just started Monte Carlo tests; Adam Lyon, quadrant head of the Scientific Computing Division and senior scientist of the g-2 collaboration, says: “Muon g-2 is excited to be about to launch a major simulation generation effort with POMS. Its tracking, bookkeeping and interface are very appealing to us. We look forward to gaining more experience with it and enjoying its benefits for production campaigns.”
POMS assists the production processing of experiments, from grid job submission through monitoring, automatic resubmission, failure triage and bookkeeping, thanks to interoperation with other systems such as SAM and Fifemon.
Among many other features, POMS guides users through grid job submission via a web interface. It allows users to run arbitrary executables, scripts and workflows while keeping track of the configuration used; this information can be used, for example, to recover from grid failures. (Fig. 1)
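The recovery idea described above, reusing a recorded submission configuration to resubmit after grid failures, can be sketched as follows. This is a hypothetical illustration, not the POMS implementation; the `Submission` class, the field names, and the `make_recovery` helper are all invented for the example.

```python
# Hypothetical sketch of configuration tracking and failure recovery.
# None of these names come from POMS itself; they only illustrate the idea
# of keeping the exact submission settings so a retry uses the same ones.
from dataclasses import dataclass, field

@dataclass
class Submission:
    executable: str                 # script or executable to run on the grid
    dataset: str                    # name of the input dataset
    params: dict                    # recorded configuration parameters
    failed_jobs: list = field(default_factory=list)

def make_recovery(sub: Submission) -> Submission:
    """Build a recovery submission that reuses the recorded configuration,
    targeting a (hypothetical) dataset of only the unprocessed files."""
    return Submission(sub.executable, sub.dataset + "_recovery",
                      dict(sub.params), [])

sub = Submission("reco.sh", "raw_2018", {"release": "v1.2"}, ["job_07"])
retry = make_recovery(sub)
print(retry.dataset)   # raw_2018_recovery
```

Because the original configuration is stored rather than retyped, the recovery submission cannot silently drift from the settings of the run it is repairing.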
Furthermore, the system is designed so that certain parameters can be overridden without changing the configuration scripts. POMS also allows job submissions to be scheduled at specific dates and times through a crontab. This feature is particularly important for the daily processing of the data collected by the detector during data-taking, reducing the scheduling overhead for experiment production groups. (Fig. 2)
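The parameter-override behavior described above can be sketched as a layered configuration merge: the stored configuration stays untouched while a per-submission override produces the effective settings. This is an assumption about the general technique, not POMS's actual code; all names here are illustrative.

```python
# Hypothetical sketch of overriding parameters without editing the
# stored configuration script. The keys and values are invented.
base_config = {"dataset": "detector_daily", "release": "v2.0", "memory_mb": 2000}

def with_overrides(config: dict, overrides: dict) -> dict:
    """Return a new effective configuration; the base config is not modified."""
    merged = dict(config)       # copy, so the stored config survives intact
    merged.update(overrides)    # per-submission values win
    return merged

effective = with_overrides(base_config, {"memory_mb": 4000})
print(effective["memory_mb"])    # 4000 -- overridden for this submission
print(base_config["memory_mb"])  # 2000 -- stored configuration unchanged
```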
Monitoring is another important aspect, as it provides users with information on the progress of grid jobs and the status of data files. The display of output logs helps users understand failures that can occur during grid submission and data processing; diagnosing these failures can otherwise be very time-consuming and become a critical bottleneck for small collaborations with limited manpower. (Fig. 3)
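At its simplest, the progress information described above amounts to aggregating job states into a campaign summary. The sketch below is a toy illustration of that aggregation, not what POMS or Fifemon actually do; the state names and the completion metric are assumptions.

```python
# Hypothetical sketch: summarizing grid job states for a progress display.
# Real monitoring (Fifemon/POMS) draws on much richer HTCondor data.
from collections import Counter

job_states = ["Running", "Completed", "Held", "Completed", "Running", "Completed"]
summary = Counter(job_states)
done_fraction = summary["Completed"] / len(job_states)

print(summary)                       # counts per state
print(f"{done_fraction:.0%} complete")  # 50% complete
```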
We are working to develop new features required by future experiments such as the Fermilab flagship experiment DUNE. Monte Carlo production tests for ProtoDUNE have already started and will continue with the processing of data for analysis and presentation at conferences and collaboration meetings. ProtoDUNE, a crucial milestone for DUNE as it will test and validate the design and technologies of the far detector, has a very demanding schedule, and its production processing must be efficient.
New features include improvements to the web interface and to monitoring, through integration with systems such as Fifemon, which monitors the HTCondor pools, data handling and storage systems, and other related systems.
An ongoing effort involves data management: POMS will support the model of sending data to jobs and pre-staging input datasets at different local caches in order to make efficient use of the grid. Furthermore, large datasets will be automatically split into subsets to avoid overloading grid resources.
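The automatic dataset splitting mentioned above can be sketched as partitioning a file list into bounded subsets, so that no single submission claims more grid resources than intended. This is a minimal illustration of the technique; the function name and chunking policy are assumptions, not the POMS splitter.

```python
# Hypothetical sketch of splitting a large dataset into fixed-size subsets,
# one subset per grid submission. Names are illustrative only.
def split_dataset(files: list, chunk_size: int) -> list:
    """Partition a file list into consecutive subsets of at most chunk_size."""
    return [files[i:i + chunk_size] for i in range(0, len(files), chunk_size)]

files = [f"run_{n:04d}.root" for n in range(10)]
subsets = split_dataset(files, 4)
print(len(subsets))   # 3 subsets: sizes 4, 4, 2
print(subsets[-1])    # the final subset holds the remaining 2 files
```

A real splitter would also track which subsets have been processed, so a campaign can resume from the last completed subset.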
The first major POMS release for the ProtoDUNE test-beam experiment was deployed in January this year, with another major release, focused on data management, planned for this summer. Major features of this next release are included in the roadmap, which can be found here.