Temporary page not for general user community; documentation update pending.
The Thunder cluster is a test system that features Marvell's ThunderX2 Arm processors. These processors utilize the aarch64 instruction set, rather than the x86-64 instruction set used by Intel and AMD processors.
To request access to the system, email email@example.com.
The cluster, procured from Aeon Computing, features one login node and four batch/compute nodes.
1 login node and 4 compute nodes
CentOS 7.8 operating system
Log in the same way you do on Cheyenne, with that username and your two-factor authentication method. Thunder must be accessed from a machine on the NCAR network; if you are working remotely, you can sign in to Thunder from a Cheyenne session. Use the following command:
ssh -l username thunder.ucar.edu
Once you sign in, you will have access to a home directory and a scratch space. For convenience, these spaces have been given the same paths as those on Cheyenne, but they do not use the same GLADE environment.
/glade/u/home/$USER – Home directory for scripts, code, and built executables
/glade/scratch/$USER – Scratch space for model input and output
Quotas are not enforced on these storage spaces and files are not purged. Please delete data that are no longer needed and be mindful of your storage footprint.
Software is accessed via environment modules. When you first log in, you will have access to a default set of modules, but you can modify your environment using the module command. The following sub-commands are particularly useful:
module load <name> - Add any binaries, libraries, and compile headers from a particular software installation to your computing environment.
module unload <name> - Remove a software installation from your computing environment.
module purge - Remove all software installations from your computing environment.
module avail - See all software installations that are installed on the system and available to load.
module reset - Return your computing environment to the default collection of modules.
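As a sketch, a typical module session might look like the following (the netcdf module name is an assumption; run module avail to see what is actually installed):

```shell
# List the software installed on the system
module avail

# Add NetCDF headers and libraries to the environment (module name assumed)
module load netcdf

# Remove it again, then return to the default module collection
module unload netcdf
module reset
```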
By default, you will have access to the standard NCAR environment along with the gnu/9.1.0 compiler and openmpi/4.0.3. Many programs and libraries from Cheyenne and Casper are also available to load on Thunder. These include (but are not limited to) the following:
A small subset of software is not currently available for Arm processors and has not been installed on Thunder. Examples include MATLAB and IDL.
Use the Thunder login node to compile your programs. ThunderX2 processors have more physical cores (32) than the Intel processors on Cheyenne (18), though each core is slower. Therefore, we suggest using more compile threads than you would typically use on Cheyenne and Casper.
In general, the process of compiling software for Arm processors is the same as for Intel chips. However, the Intel compilers are not installed on Thunder, so you may need to adjust your build options to use GCC or Clang/Flang flags instead. Additionally, many build systems specify the flag "-march=x86-64". Override this setting with "-march=native", which works on both Intel (x86-64) and Arm (aarch64) processors because the compiler targets whatever architecture it is running on.
Loading the ncarcompilers module simplifies the process of including headers and linking libraries (netcdf, for example) at compile time.
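For example, with the default compiler modules and ncarcompilers loaded, a NetCDF build might look like this sketch (the module versions match the defaults named above; the source file and target names are hypothetical):

```shell
# Load the default compiler, MPI, and NetCDF modules (netcdf name assumed)
module load gnu/9.1.0 openmpi/4.0.3 netcdf

# With ncarcompilers loaded, NetCDF include and link paths are added for you;
# -march=native targets the ThunderX2 (aarch64) cores of the build host
gcc -O2 -march=native -o read_data read_data.c -lnetcdf

# For larger builds, use the login node's 32 physical cores for parallel make
make -j 32
```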
The Thunder cluster uses Slurm for submitting jobs to the compute nodes. Both interactive and batch jobs are supported. For interactive sessions, Slurm has a two-stage process in which you first request resources, and then run programs on the resources you have been allocated. Here is an example in which we allocate a single task and then run Python on that compute node:
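A minimal sketch of that two-stage workflow follows (the partition name, Python module, and script name are assumptions):

```shell
# Stage 1: request an allocation of a single task for one hour
salloc -n 1 -p regular -t 1:00:00

# Stage 2: run a program on the allocated compute node with srun
module load python
srun -n 1 python my_script.py   # my_script.py is a placeholder
```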
Any commands run without srun will be executed on the login node. To run a shell on the compute node, begin a shell session as follows:
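For instance, srun's pseudo-terminal option attaches an interactive shell to the allocated task (a sketch; adjust the resource and time requests as needed):

```shell
# Request one task and start an interactive bash session on the compute node
srun -n 1 -t 1:00:00 --pty bash
```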
Batch scripts are submitted with the sbatch command. Resources for the batch job are requested via header directives. In this example, we request two nodes for 6 hours of execution time on the regular partition:
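A batch script matching that request might look like the following sketch (the job name, task count, and executable are placeholders):

```shell
#!/bin/bash
#SBATCH --job-name=test_job        # placeholder job name
#SBATCH --partition=regular        # partition named in the example
#SBATCH --nodes=2                  # two nodes, as in the example
#SBATCH --time=6:00:00             # 6 hours of wallclock time
#SBATCH --ntasks-per-node=64       # assumes SMT-2: 2 threads x 32 cores

module load gnu/9.1.0 openmpi/4.0.3
srun ./my_program                  # placeholder executable
```

Submit the script with, e.g., sbatch job_script.sh.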
If no wallclock time is specified, a default of 12 hours is assigned. If no memory request is provided, the job will be allocated 2 GB per task. Please be mindful of other users and request only the amount of time you expect your job will need. Accounting is not active on Thunder, so jobs generally dispatch in a first-in-first-out manner.
If you need exclusive access to a Thunder node, specify the "--exclusive" flag to sbatch either at the command line or in an #SBATCH directive. To allow other users quick access to the compute resources, use this option only when necessary.
Better application performance is generally expected with two computational threads per physical core (SMT-2), so this mode is enabled on Thunder. SMT-2 is analogous to Intel's hyper-threading, which can be used on Cheyenne. If you prefer to use only one thread per physical core in your job, it is important to tell Slurm to disable multiple threads and thereby ensure that the MPI library places your tasks properly. Note the difference in batch directives for each approach:
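The contrasting directives can be sketched as follows (task counts are illustrative for a 32-core ThunderX2 node):

```shell
# SMT-2 (default): two hardware threads per physical core
#SBATCH --ntasks-per-node=64

# One task per physical core: tell Slurm not to use the second hardware
# thread so the MPI library places one task on each core
#SBATCH --ntasks-per-node=32
#SBATCH --hint=nomultithread
```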
Since Thunder is a test system, send any support inquiries, software and hardware concerns, or requests for access to firstname.lastname@example.org instead of the CISL Help Desk.
We also welcome any feedback and performance reports that you can share as you test your workflows on Thunder.