There was an interesting trend in the US nuclear power industry through the 1990s and 2000s: despite no new plants being built, net electricity generation increased almost continuously, averaging around 2% gained per year even as some plants were being shut down. Instead of building new plants, utilities were finding ways to generate more from… More »
Because of the planned removal of Network Attached Storage (Bluearc) mounts from worker nodes, all experiments and projects will be expected to distribute their software to worker nodes with CVMFS. Many already do, but now the remaining ones will need to transition to CVMFS. This article is for them.
The HEPCloud program had a very productive 2017, successfully delivering several milestones targeted for this year! Since early this year, the team has been working on designing a new Decision Engine (DE) based on a framework architecture that can be extended to support future needs. The DE is an intelligent decision support system and is… More »
Efficiency Threshold Reference Table

Role        Memory  CPU  Success rate
Analysis    15%     35%  50%
Production  15%     35%  50%
POMS        15%     35%  50%

Job clusters with efficiency below these thresholds will be tagged as inefficient and the submitter will be contacted through email to diagnose and potentially modify their workflow. Total wall time for all jobs in… More »
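As a rough illustration of how the thresholds in the reference table could be applied, here is a minimal Python sketch. The threshold values come from the table; everything else is an assumption made for illustration: the names `ClusterStats`, `below_thresholds`, and `is_inefficient` are hypothetical, efficiencies are expressed as fractions between 0 and 1, and a cluster is treated as inefficient if any one metric falls below its threshold. The excerpt is truncated before it finishes describing the wall-time condition, so that condition is not modeled here.

```python
from dataclasses import dataclass

# Thresholds from the reference table, expressed as fractions.
THRESHOLDS = {
    "Analysis":   {"memory": 0.15, "cpu": 0.35, "success_rate": 0.50},
    "Production": {"memory": 0.15, "cpu": 0.35, "success_rate": 0.50},
    "POMS":       {"memory": 0.15, "cpu": 0.35, "success_rate": 0.50},
}


@dataclass
class ClusterStats:
    """Hypothetical summary of one job cluster (not an actual FIFE data type)."""
    role: str
    memory: float        # memory efficiency, 0.0 to 1.0
    cpu: float           # CPU efficiency, 0.0 to 1.0
    success_rate: float  # fraction of jobs that succeeded, 0.0 to 1.0


def below_thresholds(stats: ClusterStats) -> list[str]:
    """Return the names of the metrics that fall below the role's thresholds."""
    limits = THRESHOLDS[stats.role]
    return [name for name, limit in limits.items() if getattr(stats, name) < limit]


def is_inefficient(stats: ClusterStats) -> bool:
    """Assumption: one metric below its threshold is enough to tag the cluster."""
    return bool(below_thresholds(stats))


if __name__ == "__main__":
    cluster = ClusterStats(role="Analysis", memory=0.10, cpu=0.60, success_rate=0.90)
    print(is_inefficient(cluster), below_thresholds(cluster))
```

Returning the list of failing metrics, rather than only a boolean, makes it easy to tell a submitter which aspect of the workflow needs attention.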
Nearly all forms of scientific computing at Fermilab require some form of non-volatile storage. While the primary storage format for scientific data at Fermilab is tape-backed mass storage systems (MSS, consisting of Enstore and dCache), there are a variety of other storage solutions available, depending on the type of scientific computing that needs to be… More »
As membership of the DUNE collaboration approaches a thousand scientists from around the world, one of the challenges that the experiment faces is how to simulate the DUNE and ProtoDUNE detectors and analyze the data that these simulations will produce. But if you are a new student or postdoc who has just joined DUNE,… More »
Have you heard about Singularity? You should probably wait until 2040 to see it, but meanwhile, OSG and FIFE teams are working hard to introduce Singularity to improve users’ experience on the grid. When running jobs on the grid, one issue that users encounter is that their test environment, for example on an interactive node,… More »
Have you ever wondered what happens when a new postdoc joins an experiment, or when someone you’re collaborating with wants to run a production workflow? By now, you’re probably used to accessing ServiceNow, navigating through pretty complicated choices, selecting an appropriate form, and submitting the request. Do you want to know what happens next? Probably… More »
The annual Scientific Computing Portfolio Management Team (SCPMT) review is scheduled for Feb. 23 and 24. This review helps develop the M&S purchases and SCD service directives for the following fiscal year(s), e.g., SCPMT17 helps to develop the FY18 and FY19 plans. In preparation, all experiments that use computing resources fill out a resource request spreadsheet… More »
The Jobsub high-availability servers have recently completed their third year as FIFE’s preferred batch submission systems to the OSG. We have worked to achieve a good balance between user convenience, security, service availability and resource utilization. Experience and high user load have taught us where improvements are needed.
Last December, Kevin Retzke traveled to New York to speak about Fermilab and Fifemon at GrafanaCon, the annual gathering of Grafana users and developers. Everyone was excited to learn more about Fermilab’s scientific mission, and how Grafana is being used to monitor scientific computing.
The experiment with the most opportunistic hours on OSG between Dec. 1, 2016 and Feb. 1, 2017 was Mu2e with 2,066,794 hours.
The most efficient big non-production users on GPGrid who used more than 100,000 hours since Dec. 1, 2016 are listed in the included table.
The most efficient experiments on GPGrid that used more than 100,000 hours since Dec. 1, 2016 were LArIAT (92.73%) and Mu2e (90.03%).