Campus Network Outage Summary – Friday, January 24

Dear members of the Puget Sound community,

Technology Services (TS) has resolved the following incident. Details on the issue are available below:

Puget Sound Network: Campus Network Outage

Friday, January 24 | 3:13 p.m. – 10:30 p.m.

Outage Summary:
On Friday, January 24th, at 3:13 p.m., Technology Services observed a campus-wide network outage. Due to the scale of the outage and the campus community's inability to communicate via Zoom or email, our Emergency Response Group elected to issue an Everbridge alert to notify the campus. The team undertook several remedial actions to restore network connectivity, and a follow-up Everbridge alert was issued at approximately 5:15 p.m., indicating the outage had been resolved. Unfortunately, at around 5:50 p.m., the team observed a recurrence of the same network behavior seen during the initial outage. At this point, the campus network became unavailable again.

After thoroughly reviewing event logs and identifying a viable remediation method, our network engineers, with the support of Juniper, implemented a configuration change at the core and distribution layers of the network. This change stabilized the network and restored connectivity at approximately 9:30 p.m. We apologize for the disruption caused by this outage and appreciate your understanding while we worked to restore services.

Technical Details:
During the initial onset of the outage, our engineers reviewed network traffic patterns and identified what appeared to be a network loop causing the outage. A network loop, in its simplest form, occurs when data travels in a continuous circle without any mechanism to stop it. This can happen if a data port is connected to another port within the same network, creating an infinite data loop, or if a configuration change inadvertently causes traffic to route in a circular path. While protocols like Spanning Tree Protocol (STP) exist to prevent loops from disrupting an entire network, loops can sometimes originate in areas where STP is not applicable.
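As a simplified illustration of the loop concept described above, the following Python sketch checks whether a set of links between devices contains a loop. It is not a representation of our network or of how STP is implemented; the function name `has_loop` and the device names are hypothetical and chosen only for the example.

```python
def has_loop(links):
    """Return True if the undirected link list contains a loop.

    links: iterable of (device_a, device_b) pairs. Device names are
    hypothetical labels, not real equipment.
    """
    parent = {}

    def find(node):
        # Find the representative of the group this device belongs to.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # shorten the chain as we go
            node = parent[node]
        return node

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            # Both ends were already reachable from each other,
            # so this link closes a loop.
            return True
        parent[root_a] = root_b
    return False


# A simple tree: no loop.
tree = [("core", "dist-1"), ("core", "dist-2")]
# Adding a link between the two distribution devices closes a loop.
looped = tree + [("dist-1", "dist-2")]
```

In this sketch, a cable that connects a device back to itself, such as `("dist-1", "dist-1")`, is also reported as a loop, which mirrors the port-to-port case described above.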

The team conducted a comprehensive review of all changes made in the two weeks prior to the incident to determine if any recent network modifications had caused the outage. We are confident that no changes performed by our team were responsible. Additionally, we found no evidence of a cyber event contributing to the outage.

Unfortunately, a series of factors initially led the team to misidentify the root cause of the outage as a network loop. During the event, our logging system was overwhelmed with excessive and unnecessary detail, which obscured the underlying issue and made the cause difficult to diagnose. The campus network's routing fabric uses Border Gateway Protocol (BGP) to determine the best paths for data to travel. BGP relies on a supplementary protocol, Bidirectional Forwarding Detection (BFD), to detect routing failures. Upon further analysis of the logs, we discovered an unknown issue causing BFD to continuously reset routing across the campus network, making it virtually impossible for devices to communicate.
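To illustrate how repeated BFD resets might be picked out of noisy logs, the following Python sketch counts how often each session goes from "Up" to "Down". It is an assumption-laden example only: the event format, the function name `count_flaps`, and the peer names are hypothetical, and it does not reflect the actual log format or the tooling our engineers used.

```python
from collections import defaultdict


def count_flaps(events):
    """Count Up -> Down transitions per peer.

    events: iterable of (peer, state) tuples in time order.
    This record format is hypothetical, chosen for illustration.
    """
    last_state = {}
    flaps = defaultdict(int)
    for peer, state in events:
        # A flap is counted only when a session that was Up goes Down.
        if last_state.get(peer) == "Up" and state == "Down":
            flaps[peer] += 1
        last_state[peer] = state
    return dict(flaps)


# Hypothetical sample: one session resets twice, the other stays up.
sample_events = [
    ("peer-a", "Up"),
    ("peer-a", "Down"),
    ("peer-a", "Up"),
    ("peer-b", "Up"),
    ("peer-a", "Down"),
]
flap_counts = count_flaps(sample_events)
```

A session with a high count over a short period would be a candidate for closer inspection; what counts as "high" depends on the environment and is not specified here.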

After consulting with Juniper support, our team confirmed it was safe to disable BFD temporarily to restore network connectivity. Once this configuration change was made, network traffic stabilized immediately, and services were restored. Additionally, packet captures of the network have been obtained to further analyze the outage and identify the root cause of the BFD resets. Once a fix has been identified, we will implement the necessary adjustments and re-enable BFD during the next available routine change window.

Our teams continued to work through the weekend to address residual disruptions caused by the outage and ensure the stability of our data center operations.

For more information on Technology Services' Incident Communication Plan or to view a list of our Critical Systems and definitions, please visit our Service Announcements page on the Puget Sound website.

If you have any questions or concerns, please don't hesitate to contact the TS Service Desk. 

TS Service Desk

Walk-In Support: Tech Center in Collins Library

Phone Support: (253) 879-8585

Online Help: support.pugetsound.edu 

Email Support: servicedesk@pugetsound.edu