Dear members of the Puget Sound community,

Technology Services (TS) has resolved the following incident. Details on the issue are available below:

Puget Sound Network: Campus Network Outage
Friday, January 24 | 3:13 p.m. – 10:30 p.m.

Outage Summary:
After reviewing event logs and identifying a viable remediation method, our network engineers, with the support of Juniper, implemented a configuration change to the core and distribution layers of the network. This action stabilized the network and restored connectivity at approximately 9:30 p.m. We apologize for the disruption caused by this outage and appreciate your understanding as we worked to restore services.

Technical Details:
The team reviewed all changes made in the two weeks prior to the incident to determine whether any recent network modifications had caused the outage. We are confident that no changes performed by our team were responsible. We also found no evidence that a cyber event contributed to the outage.

Unfortunately, a series of factors initially led the team to a false positive regarding the root cause of the outage. During the event, our logging system was overwhelmed with excessive and unnecessary detail, which obscured the underlying issue and made the cause difficult to diagnose.

The campus network's routing fabric uses Border Gateway Protocol (BGP) to determine the best paths for data to travel. BGP relies on a supplementary protocol, Bidirectional Forwarding Detection (BFD), to detect routing failures. Upon further analysis of the logs, we discovered that a not-yet-identified issue was causing BFD to continuously reset routing across the campus network, making it virtually impossible for devices to communicate. After consulting with Juniper support, our team confirmed it was safe to disable BFD temporarily to restore network connectivity. Once this configuration change was made, network traffic stabilized immediately and services were restored.
Additionally, packet captures of the network have been obtained to further analyze the outage and identify the root cause of the BFD resets. Once a fix has been identified, we will implement the necessary adjustments and re-enable BFD during the next available routine change window.

Our teams continued to work through the weekend to address residual disruptions caused by the outage and ensure the stability of our data center operations.

For more information on Technology Services' Incident Communication Plan or to view a list of our Critical Systems and definitions, please visit our Service Announcements page on the Puget Sound website. If you have any questions or concerns, please don't hesitate to contact the TS Service Desk.

TS Service Desk
Walk-In Support: Tech Center in Collins Library
Phone Support: (253) 879-8585
Online Help: support.pugetsound.edu