Spam Quarantine Outage – Friday, October 10th 6:30-8:30am

Starting at 6:30am on Friday, October 10th, the Sophos Spam Quarantine will be down for maintenance.  During this time users will not be able to access their spam quarantines, but mail will continue to be delivered as normal.  The Quarantine is expected to be available by 8:30am.

Update — 10/10/2008 @ 8:30am

Migration is taking longer than expected because we are waiting on the replication of small files.  We currently estimate that the Spam Quarantine will be up at approximately 9:30am.

Update — 10/10/2008 @ 9:45am

Migration was completed at approximately 9:30am.

CRM application tier bounced at 10:20 am on 10/1/2008

CRM fulfillment experienced an error while processing an email (rollback segment problem). Restarting the email caused a runaway CPU-intensive session, even though the email was subsequently cancelled.

Cause of runaway session: user created content using the wrong content type (admission content type instead of campus email), which relied on admission data being in the system.

Resolution: We bounced the application server and killed the runaway session. The user recreated the email with the correct content type.

PureMessage Edge Server Maintenance 9/26

On Friday 9/26 from 5am – 9:30am a redundant PureMessage server will be down for upgrades.  Because this server is redundant, end users should not notice any downtime.  However, users may be unable to recover certain messages from the quarantine during this time.

Details:

Due to high iowait times, support suggested we switch the disks in the hades.ups.edu server from RAID5 to RAID10.  To achieve this we will rebuild the server via our automated installation method and then restore the PureMessage applications and data.  Expected downtime for hades is 4.5 hours (1.5 hr backup, 1 hr disk rebuild, 1 hr OS load, 1 hr restore).
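
As a rough check of the estimate above, the following sketch adds up the four phases and lays them out back to back from the 5am start. The phase durations and the start time come from this notice; the year 2008, the assumption that the phases run strictly in sequence with no gaps, and the names PHASES and schedule are illustrative only.

```python
from datetime import datetime, timedelta

# Phase durations in hours, taken from the downtime estimate in this notice.
PHASES = {
    "backup": 1.5,
    "disk rebuild": 1.0,
    "OS load": 1.0,
    "restore": 1.0,
}


def schedule(start, phases):
    """Return (name, begin, end) for each phase, run back to back (assumed)."""
    plan = []
    cursor = start
    for name, hours in phases.items():
        end = cursor + timedelta(hours=hours)
        plan.append((name, cursor, end))
        cursor = end
    return plan


# Friday 9/26 at 5am; the year 2008 is assumed from the neighbouring entries.
start = datetime(2008, 9, 26, 5, 0)
plan = schedule(start, PHASES)
total_hours = sum(PHASES.values())
finish = plan[-1][2]

for name, begin, end in plan:
    print(f"{name:12s} {begin:%H:%M} - {end:%H:%M}")
print(f"total {total_hours} h, finish {finish:%H:%M}")
```

Under these assumptions the phases fill the announced 5am to 9:30am window exactly, so any phase that overruns would push the finish past 9:30am.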

Problems with www2.ups.edu

Updated 9/25 11:51 AM

We currently are experiencing problems with www2.ups.edu. FTP service, which was not working earlier, has been restored.

We plan to reboot this system tomorrow morning (9/26/08) before 9:00 AM and hope that this will resolve the issues we’re having. More will be posted here as we continue to investigate the problems.

All Database Systems Unavailable – 09.09.08

All central database systems are unavailable.  This service disruption began at 11AM today, and the services impacted include Cascade, Cascade Web, Banner, Famis, Basis, CRM and Millennium. Technology Services has determined the source of the disruption to be a power outage and is working with Facilities Services to resolve it as quickly as possible.

UPDATE AS OF 2PM:
Stable power has been restored and Technology Services is now focused on restarting and verifying the database systems.  Every effort is being made to restore services as quickly as possible.  We estimate that service will resume at 3pm today at the earliest.

Cascade Unscheduled Downtime

At 6 AM this morning, a reload of the core switch caused a failure of the network interfaces on the Cascade Web server. This in turn caused the Cascade web interface to be unavailable from 6 to 8:15 this morning. The server was rebooted and recovered, restoring service at approximately 8:15.

Other affected servers were grace, camano, and crystal, all of which were rebooted to restore service at 7:30 AM. Batch job processing was affected between 6 and 7:30 AM.