Internet Connection Down April 27 – entry II – Service Restored

Electric Lightwave, the University’s Internet Service Provider, states that the problem that began at around 5 PM was caused by a lightning strike in Tukwila. The strike caused a section of fiber optic cable to fail, resulting in internet outages between Seattle and Portland, OR. Internet service was restored to the University at 11:08 PM. Some customers in the area did not have service restored until 3:00 AM April 28.

Internet Connection Down April 27 – entry I

Most of the campus is experiencing an internet connection failure. This is due to a major outage in Seattle (information from Electric Lightwave, the University’s Internet Service Provider). Some areas of campus apparently still have an internet connection, so the problem appears to be due to improper routing information across our section of the internet.

No ETA as to the return of service as yet.

Possible WebMail problem identified

In our efforts to identify the cause of recent problems with the WebMail server, we have been at a loss for information. We have tried to discover what has been causing the recent delays and unresponsiveness in WebMail. We have looked at possible memory leaks in daemons, possible attacks, and possible misconfigurations. None of these has led to a clear answer.

At this point we believe that, if Ockham’s Razor holds true, we may have found the source of the problem. It was discovered late yesterday that the available disk space on the WebMail server was extremely low. Since WebMail serves as an imap gateway, temporarily caching and displaying mail messages via an http server, disk space for temporary files is necessary. This is the best explanation so far for the problems we have seen.

We have increased available disk space on the server. We have also contacted several individuals who reported problems to determine if the issue still persists.
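A check along the following lines could flag low disk space before it starts to affect WebMail. This is only a minimal sketch: the 10% threshold, the function names, and the path being checked are all assumptions for illustration, not a description of how the WebMail server is actually monitored.

```python
import shutil


def is_disk_low(total_bytes, free_bytes, min_free_fraction=0.10):
    """Return True when the free share of the disk is below the threshold.

    The 10% default is an assumed value, not a measured requirement.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return (free_bytes / total_bytes) < min_free_fraction


def check_path(path, min_free_fraction=0.10):
    """Look up usage for the filesystem holding `path` and test it."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return is_disk_low(usage.total, usage.free, min_free_fraction)


if __name__ == "__main__":
    # "." is a placeholder; on a real server this would be the directory
    # where temporary files are written.
    if check_path("."):
        print("WARNING: free disk space is below the threshold")
    else:
        print("Disk space looks fine")
```

Run from cron, a script like this could send a warning well before temporary files run out of room.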

Whidbey reboot

Whidbey was experiencing problems with the secure shell daemon. During the troubleshooting process we were unable to get the daemon to restart correctly, and the system rebooted at 12:15 pm.

Upon further investigation of the last log, it became clear that the shutdown command issued on April 14 during preventative maintenance froze. In working with the sshd2 daemon, the shutdown completed.

The secure shell daemon is now running correctly.

Imap issue

The University’s mail server had trouble handling imap requests this afternoon between 3:30 pm and 4:00 pm. Testing was being done with the secure imap daemon, which confused the server. All imap connections were closed and reopened.

Mail server slowness I

The University mail server began to time out clients and slow down at approximately noon today. The cause of the problem is unclear.

The number of processes running on the server appeared to be normal, as did the load average. The server’s response over secure shell was slow. It appeared as though the system was having a problem allocating resources for the running processes, but memory and disk space were both available.

We shut down processes to try to identify the cause of the slowness. Imapd and ipop3d were disabled, with no luck. The web server was shut down, again with no luck. Then we shut down mailman and sendmail. At this point things improved.

I disabled the webmail interface and brought sendmail and mailman back up. I then brought the web server back up. The server still appeared to be responding slower than normal, but in a somewhat timely manner. I restarted the ipop3d daemon and, about five minutes later, the imapd daemon. The server still looked good: slower than normal, but somewhat timely. After the webmail interface was restarted, things went into the tank.

We tried restarting the webmail server between 3:00 and 3:30 pm, but the unbearable slowness remained.

We then rebooted the mail server at about 3:40 pm.

The server was back to normal after the reboot.

Web server restarted

The disk on the University’s web server filled up sometime last week, causing several cron processes to hang and “zombify”. In order to clear up these problems, the web server was rebooted.
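Zombie processes like these can be spotted by looking for a process state beginning with "Z" in the output of ps. The sketch below assumes a Linux procps-style ps that accepts "-eo pid,stat,comm"; the function names are made up for illustration, and this is not the procedure that was actually used on the web server.

```python
import subprocess


def find_zombies(ps_output):
    """Parse `ps -eo pid,stat,comm` style text and return zombie entries.

    Returns a list of (pid, command) tuples for every process whose
    state column starts with "Z".
    """
    zombies = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)  # pid, state, rest of line
        if len(fields) < 3:
            continue
        pid_text, state, command = fields
        try:
            pid = int(pid_text)
        except ValueError:
            continue  # header line or other non-process text
        if state.startswith("Z"):
            zombies.append((pid, command.strip()))
    return zombies


def list_zombies():
    """Run ps (assumed to be a procps-style ps) and parse its output."""
    result = subprocess.run(
        ["ps", "-eo", "pid,stat,comm"],
        capture_output=True, text=True, check=True,
    )
    return find_zombies(result.stdout)


if __name__ == "__main__":
    for pid, command in list_zombies():
        print(f"zombie: pid={pid} command={command}")
```

A zombie has already exited and cannot be killed directly; it disappears only when its parent reaps it or the parent itself goes away, which is why a reboot clears them.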

SurgeFTP server license key problems I

The license key for the SurgeFTP server was apparently incorrectly generated. As a consequence, the server was running on the demo key, which expired today. SurgeFTP was stopped, and WU-ftpd was restarted so that people can upload their files. We are writing