The technology assessment initiative at SURFsara, within the SURF Open Innovation Lab, has a clear goal: understanding the performance of numerous HPC and AI workloads on upcoming computing technologies, with a focus on performance analysis, collaboration, and open sharing of results.
As an HPC center, we help researchers solve challenges associated with running their applications on our infrastructure. In this particular initiative, we focus on evaluating, assessing, and benchmarking modern and emerging computing architectures for several application domains, such as high performance computing, machine learning, and data visualisation. We have identified four major dimensions of effort:
- Consistent, reproducible and open benchmarking.
- Access to the latest compute resources for external and internal users.
- External collaborations and the development of innovative HPC services.
- Knowledge dissemination through workshops and seminars.
On the benchmarking side, the focus is on aggregating several relevant benchmarks and standardising compilation, test execution, and results extraction in a common framework, together with the hardware and software ecosystem supporting it. The idea is to spend less time organising benchmarks and more time designing new tests, running them, and qualitatively analysing new computing systems, making the benchmarking process itself more reproducible, open, and engaging for the community.
We are using ReFrame, an HPC regression testing framework from CSCS, to automate the process of benchmarking new systems and compute architectures. We are developing a set of core tests in ReFrame and plan to make them open source soon.
The test pipeline involves minimal software installation on the remote system and allows flexible integration of new tests and benchmarks, in line with SURF's HPC infrastructure. We have been using ReFrame to test experimental configurations located at SURFsara and the University of Amsterdam, and will also use it to test the DAS-6 systems.
Our team at SURFsara would be glad to include benchmarks covering different scientific disciplines and domains.
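As a rough illustration of the compile, execute, and extract pattern described above, the following is a minimal Python sketch using only the standard library. It is not ReFrame's API and not the actual SURFsara test suite; the names `Benchmark` and `run_benchmark`, as well as the toy "Triad" output line, are hypothetical and chosen purely for illustration.

```python
import re
import subprocess
import sys
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Benchmark:
    """Hypothetical description of one benchmark in three standard stages."""
    name: str
    build_cmd: Optional[List[str]]  # compile command, or None if nothing to build
    run_cmd: List[str]              # command that executes the benchmark
    pattern: str                    # regex with one capture group for the metric
    unit: str                       # unit of the extracted metric


def run_benchmark(bench: Benchmark) -> dict:
    """Build (if needed), run, and extract a single numeric result."""
    if bench.build_cmd is not None:
        subprocess.run(bench.build_cmd, check=True)

    completed = subprocess.run(
        bench.run_cmd, check=True, capture_output=True, text=True
    )

    match = re.search(bench.pattern, completed.stdout)
    if match is None:
        raise ValueError(f"{bench.name}: metric not found in output")

    return {
        "benchmark": bench.name,
        "value": float(match.group(1)),
        "unit": bench.unit,
    }


# Toy stand-in for a real benchmark binary: it only prints a made-up result line.
toy = Benchmark(
    name="toy_stream",
    build_cmd=None,
    run_cmd=[sys.executable, "-c", "print('Triad: 123.4 GB/s')"],
    pattern=r"Triad:\s+([\d.]+)\s+GB/s",
    unit="GB/s",
)

result = run_benchmark(toy)
print(result)
```

Describing every benchmark with the same three fields keeps the per-benchmark effort small: adding a new test means supplying a build command, a run command, and an extraction pattern, while the surrounding logic stays unchanged. A dedicated framework such as ReFrame additionally handles system configuration, job scheduling, and reporting, which this sketch deliberately leaves out.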
With so many computing architectures available, it becomes important to make scientific applications as portable as possible. We would like to help researchers across the spectrum of science and engineering access experimental compute resources and perform comparative performance analysis, so that they can better understand how their computational workloads behave on unfamiliar computing systems.
At the moment we host the following experimental systems at SURFsara:
- 2x ARM 64-core from Huawei.
- 1x Intel Gold 6128 12c + 1x Nvidia GPU RTX 2080 + 1x U250 Xilinx FPGA.
- 2x AMD EPYC Naples 32c.
- 1x AMD EPYC Naples 32c + 4x AMD GPU MI50.
We are collaborating with Numerical Analysis at TU Delft and the Performance Analysis research group at the University of Amsterdam to make FPGA usage simple for scientific use cases. Here we are working on using heterogeneous compute devices (CPU + GPU + FPGA) to solve advanced finite element analysis problems.
We have organised several workshops, seminars, and external talks. Some of them are listed below:
- Building Affordable and Programmable Exascale Capable Supercomputers
- Technology Assessment overview
- "Exploring the Potential of the ROCm Software Stack for High Performance Computing and Deep Learning on AMD GPUs" at GPU conference NLeSC
- "Micro-benchmarking and Performance analysis of skylake silver nodes" deployed in the National LISA cluster
- Guest lunch lecture on "Hardware based numerics on distributed hardware" from TU Delft at University of Amsterdam (slides available on demand)
We are planning to organise more interactive hands-on workshops to share our experiences with the wider research community.
Sagar Dolas (sagar.dolas@surfsara.nl)
Project Lead
Dr. Axel Berg (axel.berg@surfsara.nl)
Manager, SURF Open Innovation Lab