By SSI Fellow Luke Abraham, National Centre for Atmospheric Science and University of Cambridge.
I have been working with the United Kingdom Chemistry and Aerosols model (UKCA) since 2007, and have been organising training courses on how to use it since 2013. It is designed to simulate atmospheric aerosols and chemistry as part of the Met Office Unified Model (UM), and is used to perform simulations as part of international climate assessments as well as to provide air quality forecasts for the UK. It is a big model that is complex to use, especially for new users. I refresh the training materials every year or so, and I quickly realised that one of the biggest issues students had while learning how to use UKCA was actually with the supercomputer that we were using for our training courses.
Why use a supercomputer for training?
UKCA is designed to be run on large supercomputers such as ARCHER2. Researchers typically use hundreds to thousands of cores and run for many weeks to months of real time. A research project will often add a new process to the model and investigate its impact on past, present, or future climates.
However, when learning how to use UKCA, simulations of this type are not suitable due to the time and resources they require. Running for less time (e.g. a single model day) still requires the use of a supercomputer due to the amount of memory these jobs use. In addition, complex machines like ARCHER2 have a steep learning curve themselves.
When organising training courses, the first few hours are often spent ensuring that all the students can connect to the supercomputer before they even begin working through the training materials. While it is possible to reserve part of the machine for the students to use, some steps may still take a long time. This reduces the amount of time left to actually work on UKCA.
Sometimes, learning how to use the supercomputer is just as important as learning how to use the model. However, this is covered by other training courses offered by the National Centre for Atmospheric Science, so for UKCA training I wanted to focus on helping students learn how to use the model.
Is there another way?
A solution to some of these issues is to use virtual machines (VMs) for training. While this does not give the students experience in using the supercomputer they may be using for their research, it allows for better control over a number of other factors.
Rather than having to use large jobs that have been designed for supercomputers, it is instead possible to use smaller “toy” configurations that run very quickly and have simpler settings. These jobs still use the same code as those used for research purposes, and their simplicity means that the students can focus on the parts related to UKCA.
These VMs are single-user, so each student has their own machine and can run everything straight away. They are simpler to connect to and require fewer usernames and passwords. It is also possible to provide a full desktop environment, making the system more approachable and user-friendly.
Using a cloud-based system, such as Amazon Web Services (AWS) Elastic Compute Cloud (EC2), means that pre-built VM images can be created, holding example jobs, sample output, worked solutions, documentation, Python scripts, and anything else required for the training course. New VMs can then be created from these images in minutes and the connection information distributed to the students as needed.
This approach has been especially helpful during the pandemic, when it was necessary to move courses online. At in-person events it is easy to sit with someone and help with any issues they may have. Over videoconferencing it is often much harder to diagnose and fix problems. When students had issues I was able to connect to their VM using EC2 Instance Connect, which allowed me to see their code changes and output in more detail and helped me understand what was going on.
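As a rough illustration of creating one VM per student from a pre-built image, the sketch below builds the arguments for a single EC2 launch request. It is not the set-up used for the UKCA courses: the function name, the image ID, the instance type, and the "course" tag are all hypothetical placeholders. The keys in the returned dictionary are standard parameters of the EC2 RunInstances call, and the sketch itself does not contact AWS.

```python
from typing import Any


def build_launch_request(
    image_id: str,
    n_students: int,
    instance_type: str = "t3.large",   # assumed size, not from the course
    course_tag: str = "ukca-training",  # hypothetical tag value
) -> dict[str, Any]:
    """Return keyword arguments for an EC2 run_instances call.

    One identical VM is requested per student, all from the same image.
    """
    if n_students < 1:
        raise ValueError("n_students must be at least 1")
    return {
        "ImageId": image_id,
        "InstanceType": instance_type,
        # MinCount == MaxCount: launch exactly n_students VMs, or none.
        "MinCount": n_students,
        "MaxCount": n_students,
        # Tag the instances so they are easy to find and delete afterwards.
        "TagSpecifications": [
            {
                "ResourceType": "instance",
                "Tags": [{"Key": "course", "Value": course_tag}],
            }
        ],
    }


# Placeholder image ID; a real one would come from the pre-built VM image.
request = build_launch_request("ami-0123456789abcdef0", n_students=12)
print(request["MinCount"], request["InstanceType"])

# To actually launch the VMs (needs boto3 and AWS credentials):
#   import boto3
#   boto3.client("ec2").run_instances(**request)
```

Keeping the request as plain data means it can be checked before any machines are started, which matters when each launch costs money.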
My experience from the past two years is that using virtual machines gives a better experience for the students. I am currently planning the next UKCA course, and while feedback from the students has indicated that going back to in-person training would be preferable, I will still be using VMs rather than a supercomputer for the training.
Use at any time
The Met Office provides Vagrant configuration files to provision a VM that can be used with AWS, VirtualBox, or VMware. An Ansible version of the VM is also available, which includes information on how to provision multiple EC2 instances on AWS. This means that students don’t even need to attend the training course to work through the UKCA training material. As the configuration files and tutorials are all online, anyone can make their own VM with the required software packages and work through the UKCA tutorials at any time.
As well as allowing new users to become familiar with UKCA, this set-up can also be used beyond the training materials. The VM can be used for general model development and for testing new schemes before running them on a supercomputer. This can reduce the development time significantly and allow new users to try out UKCA before they spend any supercomputer time. Further discussion of this system can be found in Abraham et al. 2018.