The NWSC facility experienced power quality issues twice last week, causing power safety equipment to trip as designed and Derecho compute racks to power off. These incidents were the direct result of a newly built neighboring facility attempting to energize for the first time. Once Derecho was brought back online, the Slingshot interconnect fabric stopped working, and the project team spent most of the week resolving networking problems until the root cause was found and fixed. The troubleshooting exposed a systematic configuration error, which has been addressed.
Our plan as of last Friday was to stress-test Derecho's hardware, infrastructure, and software by running the system under heavy load over the weekend. Today, the Consulting Services team plans to start building the user software stack required to run applications during acceptance testing, which is now expected to begin in the middle of this week.