Daily Bulletin

HPC Systems Downtime October 23-27 to Refresh Casper and Perform Filesystem Migrations

September 25, 2023

All CISL HPC systems will be down for scheduled maintenance the week of October 23-27 to refresh operating systems and perform filesystem migrations in preparation for the retirement of Cheyenne at the end of this year. During this outage, CISL engineers will relocate a number of filesystem datasets, including /glade/work, to new hardware.

We will update the operating system on Casper nodes to an OpenSUSE installation for better compatibility with Derecho. Users are encouraged to evaluate the new operating system environment on the test Casper deployment by logging in directly via ssh to a demonstration node, casper01.hpc.ucar.edu. For additional information, see the #casper-users channel in our NCAR HPC Users Group Slack workspace.

During this time, all Cheyenne, Casper, and Derecho compute nodes will be unavailable. Login nodes will also be unavailable at the beginning of the outage window to allow for filesystem migrations. Scheduler reservations will be put in place to ensure that all user jobs have completed before the downtime begins on October 23. Any jobs that are still queued when the downtime begins will be retained and will run when the systems return to service. Progress will be communicated through the Notifier system throughout the outage.