Throughout the summer, CISL has been augmenting Casper hardware resources and building a new software stack to better match the Derecho environment. We are pleased to announce that seven new GPGPU nodes are available on Casper, featuring NVIDIA A100 GPUs (with 80 GB of VRAM each) and AMD Milan CPUs. These nodes are very similar to Derecho GPU nodes, except that each GPU has twice the memory.
To make it easier to create workflows that span Casper and Derecho, we are also migrating the Casper cluster to an OpenSUSE-based compute environment, with a new software stack similar to Derecho's (though without Cray modules). The new A100 GPGPU nodes already run this software stack, which is incompatible with software compiled for the CentOS-based Casper nodes currently in production. To use the A100 nodes, you will therefore need to recompile your applications in the new environment. We have set up a temporary OpenSUSE login node for this purpose, which you may also use to examine the new software stack. To begin, ssh to the new login node.
To request the new GPGPU nodes, you will need to instruct PBS to select nodes with the opensuse15 operating system resource; this requirement ensures that jobs are not assigned to these nodes unintentionally. To request a single GPGPU node, include this resource in your qsub/PBS select arguments.
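As a rough sketch of what such a request might look like, the shell snippet below assembles a PBS select statement and prints the resulting qsub command instead of submitting it. Only the opensuse15 value comes from this announcement; the resource key `os`, the CPU and GPU counts, the helper name `build_select`, and the script name `job.pbs` are assumptions for illustration and should be checked against the Casper documentation.

```shell
#!/bin/bash
# Sketch: build a PBS select statement that targets the OpenSUSE GPGPU nodes.
# ASSUMPTIONS (not from the announcement): the resource key "os" and the
# ncpus/ngpus values used below. Verify the real names and limits before use.

build_select() {
    local nodes="$1" ncpus="$2" ngpus="$3"
    # "opensuse15" is the value named in the announcement; "os=" is an assumed key.
    echo "select=${nodes}:ncpus=${ncpus}:ngpus=${ngpus}:os=opensuse15"
}

# One node, with hypothetical CPU and GPU counts.
SELECT="$(build_select 1 8 1)"

# Print the command rather than submitting it; job.pbs is a placeholder script.
echo "qsub -l ${SELECT} job.pbs"
```

Keeping the operating system resource inside the select statement means a job script written for the CentOS nodes cannot land on the new nodes unless it is edited to ask for them.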
In October, CISL is planning to migrate all current Casper nodes to the new OpenSUSE environment.
Even if you do not intend to use these GPGPU nodes, you are encouraged to examine the new compute environment and reach out to us if software you need is missing, either via a support ticket or by submitting an issue to our Casper software stack GitHub repository.