Gábor Samu
Creator of this blog.
Jan 4, 2022 3 min read

New Year's Resolution for HPC: Using Resources More Efficiently

A hearty happy new year to everyone. It’s that time of the year when we hear from folks about their New Year’s resolutions. But rather than talk about me purchasing a gym membership, I’d like to share my thoughts on a New Year’s resolution for HPC.

With the topsy-turvy weather that we’re seeing all over the planet, we’re all acutely aware of the changes that are happening to our climate and what they represent for humankind. HPC is a key engine for science, including efforts that are crucial in the battle against climate change. Climate and ocean modelling are examples of the use of HPC that immediately come to mind in this respect. Modelling the environment is important for us to understand what is occurring around us and what is projected to occur. Materials science is also important, as it helps develop the technologies needed to generate, transmit and more effectively store energy from renewable sources. But HPC is itself a consumer of energy, which brings me to the HPC resolution for this year: using computing resources more efficiently.

We’ve seen great strides in the efficiency of processors and systems. But at scale, large HPC centers consume large amounts of energy, both to power the servers and storage systems and to cool them. And if you’re using cloud for HPC, then of course your concern is not energy and cooling, but rather the cost to you. In either case, making the most efficient use of your infrastructure should be a key consideration. Workload schedulers are the interface between users and compute resources in any HPC environment. Users submit work and it’s the task of the workload scheduler to find suitable compute resources to dispatch the work to. On the surface, this may seem like a trivial task. But with potentially large numbers of jobs, users, servers and priorities, workload and resource management is anything but trivial. The good news is that there are workload management solutions which bring decades of experience to the table.
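
To make the dispatch problem a little more concrete, here is a toy sketch in Python. It is only an illustration and not how LSF or any real scheduler works: the Job and Host classes, the dispatch function and the job and node names in the usage note are all made up for this example, and the policy (highest priority first, first host with enough free cores) is about the simplest one imaginable.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Job:
    # Hypothetical job record; only `priority` takes part in ordering.
    priority: int                       # lower number = dispatched first
    name: str = field(compare=False)
    cores: int = field(compare=False)


@dataclass
class Host:
    # Hypothetical compute host with a simple free-core counter.
    name: str
    free_cores: int


def dispatch(jobs, hosts):
    """Greedy toy policy: take jobs in priority order and place each
    on the first host that has enough free cores."""
    queue = list(jobs)
    heapq.heapify(queue)                # min-heap keyed on priority
    placements = {}                     # job name -> host name
    pending = []                        # jobs that did not fit anywhere
    while queue:
        job = heapq.heappop(queue)
        host = next((h for h in hosts if h.free_cores >= job.cores), None)
        if host is None:
            pending.append(job.name)
            continue
        host.free_cores -= job.cores    # reserve the cores on that host
        placements[job.name] = host.name
    return placements, pending
```

For example, calling dispatch with the jobs Job(2, "postprocess", 4), Job(1, "climate_model", 16) and Job(3, "big_sweep", 32) on the hosts Host("node01", 16) and Host("node02", 8) places the first two jobs and leaves the 32-core job pending. Even this tiny case shows where waste creeps in: four cores on node02 sit idle while a job waits. Real workload managers layer fairshare, preemption, backfill, reservations and much more on top of this basic idea, which is where those decades of experience come in.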

IBM Spectrum LSF Suites provide a fully integrated workload management solution for HPC environments. LSF builds on almost 30 years of experience in workload and resource management and is used on some of the world’s largest supercomputers, including Summit at the Oak Ridge Leadership Computing Facility. At a high level, here are some critical areas where LSF can help to drive better efficiency in your HPC infrastructure:

  • Dynamic hybrid cloud – automatically flex cloud resources up and down according to policies, with support for all major cloud providers. Learn more here
  • Dynamic multi-instance GPU support – right-size NVIDIA A100 multi-instance GPU slices according to incoming workload demands. Learn more here
  • User productivity – single unified UI for job submission and management which captures repeatable best practices. Learn more here

Start the year off right, with a focus on efficiency in your HPC environment with IBM Spectrum LSF. Learn more here.