Throughout any given year, the HEP community's demand for computing resources is not constant. It follows cycles of peaks and valleys driven by holiday schedules, conference dates and other factors. Because of this, the classical method of statically provisioning these resources at host facilities has drawbacks, such as over-provisioning for peak demand. Grid federations like the Open Science Grid offer opportunistic access to the excess capacity so that no cycle goes unused. However, as the appetite for computing increases, so does the need to maximize cost efficiency by developing a model for dynamically provisioning resources only when they are needed.
To address this issue, the HEP Cloud project was launched by the Scientific Computing Division in June 2015. Its goal is to develop a virtual facility that provides a common interface to access a variety of physical computing resources, including local clusters, grids, high-performance computers, and community and commercial clouds. Now in its first phase, the project is evaluating the use of the “elastic” provisioning model offered by commercial clouds such as Amazon Web Services. In this model, resources are rented and provisioned dynamically over the Internet as needed.
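As a rough illustration of the elastic model, the decision of how much cloud capacity to rent can be driven by demand relative to local capacity. The sketch below is hypothetical (the function and parameter names are our own and not part of HEP Cloud); it only captures the core idea of renting slots during a peak and releasing them in a valley.

```python
# Hypothetical sketch of an elastic provisioning decision:
# rent cloud slots only when demand exceeds local capacity.
def cloud_slots_needed(queued_jobs, local_slots, max_cloud_slots):
    """Return how many cloud slots to provision right now."""
    shortfall = queued_jobs - local_slots
    if shortfall <= 0:
        return 0  # local capacity suffices; rent nothing
    return min(shortfall, max_cloud_slots)

# During a peak, burst into the cloud...
print(cloud_slots_needed(9000, 1900, 10000))  # -> 7100
# ...and release everything when the valley returns.
print(cloud_slots_needed(500, 1900, 10000))   # -> 0
```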
The HEP Cloud project team successfully demonstrated this elastic model for CMS in January and February using Amazon Web Services.
In March, the project team demonstrated that HEP Cloud is also a viable solution for the Intensity Frontier community. During the month, OPOS ran three production activities for NOvA on HEP Cloud, consisting of Monte Carlo and data event processing. The campaign contributed to the large computing “crunch” necessary to produce results for the Neutrino 2016 conference. It processed more than 90 TB of input data with 550,000 hours of computation, producing more than 150 TB of output. Data I/O was handled efficiently using S3, Amazon's highly scalable storage service. The team demonstrated that NOvA could sustain 7,300 concurrent cores, a burst of almost four times the number of slots allocated at Fermilab for NOvA. And thanks to the project's integration activities, NOvA used the same familiar services it uses for local computations, such as data handling and job submission.
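A quick back-of-the-envelope check (our own arithmetic, assuming the campaign spanned all of March) relates the figures above: a 7,300-core burst at almost four times the local allocation implies roughly 1,800 slots allocated at Fermilab, and 550,000 CPU-hours over the month averages out to well under the peak.

```python
# Back-of-the-envelope check of the NOvA campaign figures reported above.
peak_cores = 7300          # sustained concurrent cores on HEP Cloud
burst_factor = 4           # burst relative to NOvA's Fermilab allocation
cpu_hours = 550_000        # total computation for the campaign
hours_in_march = 31 * 24   # assume the campaign spanned all of March

local_slots = peak_cores / burst_factor  # implied Fermilab allocation
avg_cores = cpu_hours / hours_in_march   # average concurrent usage

print(round(local_slots))  # -> 1825 slots allocated locally
print(round(avg_cores))    # -> 739 cores used on average
```

The gap between the ~740-core average and the 7,300-core peak is exactly the kind of bursty demand the elastic model is designed to absorb.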
A version of this article originally appeared on news.fnal.gov on Feb. 15, 2016.
— Gabriele Garzoglio and Burt Holzman