PureMessage Edge Server Maintenance 9/26

On Friday 9/26 from 5am to 9:30am, a redundant PureMessage server will be down for upgrades.  Because this server is redundant, end users should not notice any downtime.  However, users may be unable to recover certain messages from the quarantine during this time.

Details:

Due to high iowait times, support suggested we switch the disks in the hades.ups.edu server from RAID5 to RAID10.  To achieve this, we will rebuild the server via our automated installation method and then restore the PureMessage applications and data.  Expected downtime for hades is 4.5 hours (1.5 hr backup, 1 hr disk rebuild, 1 hr OS load, 1 hr restore).
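
As a quick check of the estimate above, the step durations can be summed and added to the 5am start time. This is only a minimal sketch: the variable names are illustrative, and the year 2008 is assumed from the dates in the other notices on this page.

```python
from datetime import datetime, timedelta

# Step durations in hours, taken from the downtime estimate above.
steps = {
    "backup": 1.5,
    "disk rebuild": 1.0,
    "OS load": 1.0,
    "restore": 1.0,
}

total_hours = sum(steps.values())

# Maintenance starts Friday 9/26 at 5am (year 2008 is an assumption).
start = datetime(2008, 9, 26, 5, 0)
end = start + timedelta(hours=total_hours)

print(f"Total downtime: {total_hours} hours")
print(f"Expected end of window: {end:%H:%M}")
```

The total matches the announced 5am to 9:30am window, which leaves no slack if any single step runs long.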

Problems with www2.ups.edu

Updated 9/25 11:51 AM

We are currently experiencing problems with www2.ups.edu. FTP service, which was not working earlier, has been restored.

We plan to reboot this system tomorrow morning (9/26/08) before 9:00 AM and hope that this will resolve the issues we’re having. More will be posted here as we continue to investigate the problems.

All Database Systems Unavailable – 09.09.08

All central database systems are unavailable.  This service disruption began at 11AM today, and the services impacted include Cascade, Cascade Web, Banner, Famis, Basis, CRM and Millennium. Technology Services has determined the source of the disruption to be a power outage and is working with Facilities Services to resolve it as quickly as possible.

UPDATE AS OF 2PM:
Stable power has been restored and Technology Services is now focused on restarting and verifying the database systems.  Every effort is being made to restore services as quickly as possible.  We estimate that service will resume at 3pm today at the earliest.

Cascade Unscheduled Downtime

At 6 AM this morning, a reload of the core switch caused a failure of the network interfaces on the Cascade Web server. This in turn caused the Cascade web interface to be unavailable from 6 to 8:15 this morning. The server was rebooted and recovered, restoring service at approximately 8:15.

Other affected servers were grace, camano, and crystal, all of which were rebooted to restore service at 7:30 AM. Batch job processing was affected between 6 and 7:30 AM.