To make it easier to create workflows that span Casper and Derecho, CISL has been migrating the Casper cluster to an OpenSUSE-based compute environment with an entirely new software stack similar to Derecho's (though without Cray modules). The recently deployed A100 GPGPU nodes already run this software stack, which is incompatible with software compiled for the CentOS-based Casper nodes currently in production, so user applications will need to be recompiled in the new environment. We plan to transition ALL of Casper to the new software stack during the October 23–27 outage window.
We have set up a temporary OpenSUSE login node for you to explore the new stack. To begin, simply ssh to this node:
ssh casper01.hpc.ucar.edu
If you wish to use the new A100 GPGPU nodes before the outage, you will need to instruct PBS to select nodes with the opensuse15 operating system resource. To request a single GPGPU node, use the following qsub/PBS arguments:
-l select=1:ncpus=128:mpiprocs=4:ngpus=4:mem=991GB:os=opensuse15
-l gpu_type=a100
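As a minimal sketch, the two resource requests above could be placed in a batch script as #PBS directives. The file name, job name, project code placeholder (PROJECT0001), and walltime are assumptions for illustration and do not come from this announcement; nvidia-smi is only a stand-in workload. Substitute your own values before submitting.

```shell
# Write a small PBS batch script that requests one OpenSUSE A100 node.
# Assumptions (illustrative only): file name, job name, project code
# placeholder, and walltime. Only the two "-l" lines come from the text above.
cat > a100_test.pbs <<'EOF'
#!/bin/bash
#PBS -N a100_test
#PBS -A PROJECT0001
#PBS -l walltime=00:10:00
#PBS -l select=1:ncpus=128:mpiprocs=4:ngpus=4:mem=991GB:os=opensuse15
#PBS -l gpu_type=a100

# Stand-in workload: list the GPUs visible to the job.
nvidia-smi
EOF

# Submit only where PBS is available (for example, on casper01).
if command -v qsub >/dev/null 2>&1; then
    qsub a100_test.pbs
else
    echo "qsub not found; wrote a100_test.pbs without submitting"
fi
```

Keeping the resource requests in the script rather than on the command line makes it easier to rerun the same job after recompiling in the new environment.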
We have also converted a “gp100” visualization node to the new OpenSUSE operating system. To access that node, either run vncmgr from the casper01 login node to begin a VNC remote desktop, or use the following arguments to qsub:
-l select=1:ncpus=1:ngpus=1:os=opensuse15
-l gpu_type=gp100
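For an interactive session on the converted visualization node, the arguments above might be combined with qsub's -I (interactive) flag, as in the sketch below. The project code placeholder (PROJECT0001) and the walltime are assumptions for illustration, not values from this announcement. The snippet only assembles and prints the command so it can be reviewed before it is run.

```shell
# Assemble an interactive qsub command for the OpenSUSE gp100 node.
# Assumptions (illustrative only): the project code placeholder and walltime.
# The select and gpu_type values are the ones given in the text above.
QSUB_ARGS="-I -A PROJECT0001 -l walltime=01:00:00 -l select=1:ncpus=1:ngpus=1:os=opensuse15 -l gpu_type=gp100"

# Print the full command for review; copy it to the casper01 prompt to run it.
echo "qsub $QSUB_ARGS"
```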