Daily Bulletin

Reminder: Full Casper Software Refresh Coming During October Outage

October 11, 2023

To make it easier to create workflows that span Casper and Derecho, CISL has been working on migrating the Casper cluster to an OpenSUSE-based compute environment, with a whole new software stack similar to Derecho (though without Cray modules). Recently deployed A100 GPGPU nodes already feature this software stack, which is incompatible with any software compiled for the CentOS-based Casper nodes in production now. User applications will need to be recompiled in the new environment. We are planning to transition ALL of Casper to the new software stack during the October 23–27 outage window.

We have set up a temporary OpenSUSE login node for you to explore the new stack. To begin, simply ssh to this node:

ssh casper01.hpc.ucar.edu

If you wish to use the new A100 GPGPU nodes before the outage, you will need to instruct PBS to select nodes with the opensuse15 operating system resource. To request a single GPGPU node, use the following qsub/PBS arguments:

-l select=1:ncpus=128:mpiprocs=4:ngpus=4:mem=991GB:os=opensuse15
-l gpu_type=a100
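
For illustration, the arguments above could be placed in a batch script as #PBS directives. This is a minimal sketch rather than an official template: the job name, the PROJECT_CODE placeholder, the casper queue name, and the walltime are assumptions that you should replace with values valid for your allocation. Only the two resource lines come from this bulletin.

```shell
#!/bin/bash
# Sketch of a PBS batch script for one A100 node in the OpenSUSE environment.
# ASSUMPTIONS: job name, PROJECT_CODE, queue name, and walltime are placeholders.
#PBS -N a100_test
#PBS -A PROJECT_CODE
#PBS -q casper
#PBS -l walltime=00:10:00
# Resource requests taken from the bulletin:
#PBS -l select=1:ncpus=128:mpiprocs=4:ngpus=4:mem=991GB:os=opensuse15
#PBS -l gpu_type=a100

# Report which node the job landed on
node_name=$(uname -n)
echo "Job running on ${node_name}"

# List the visible GPUs if the NVIDIA tools are on the PATH
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi -L
else
    echo "nvidia-smi not found on this node"
fi
```

Saved under a hypothetical name such as a100_test.pbs, the script would be submitted with qsub a100_test.pbs. Remember that any application launched from it must be rebuilt in the new environment first.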

We have also converted a “gp100” visualization node to the new OpenSUSE operating system. To access that node, either run vncmgr from the casper01 login node to begin a VNC remote desktop, or use the following arguments to qsub:

-l select=1:ncpus=1:ngpus=1:os=opensuse15
-l gpu_type=gp100
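
As a sketch of how these arguments might be combined into a complete command, the snippet below assembles a qsub call for an interactive session and prints it for review. The -I (interactive) flag, the PROJECT_CODE placeholder, the casper queue name, and the walltime are assumptions added for illustration; only the two resource arguments come from this bulletin.

```shell
#!/bin/sh
# Resource arguments taken from the bulletin
resources="-l select=1:ncpus=1:ngpus=1:os=opensuse15 -l gpu_type=gp100"

# ASSUMPTIONS: interactive flag, project code placeholder, queue, and walltime
extras="-I -A PROJECT_CODE -q casper -l walltime=01:00:00"

# Assemble the full command and print it so it can be checked before use
cmd="qsub ${extras} ${resources}"
echo "${cmd}"
```

After replacing the placeholders, the printed command can be pasted into a shell on the casper01 login node.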

We encourage you to examine the new compute environment and reach out to us if there is missing software you need, either via a support ticket or by submitting an issue to our Casper software stack GitHub repository.