The temporary CA “UPS CA No.2” has been decommissioned, as has the CA on dm-2. These are superseded by University of Puget Sound Certification No. 3.
Email Problems
There were reports last night and this morning that users were having problems connecting to webmail. The problem appears to be linked to the Sophos PureMessage software, the timing of the quarantine digest messages, and the slow disk array.
We updated our PureMessage configuration to send one digest message instead of having each server send individual messages. As a result, all the messages hit the mail server at the same time and overwhelmed it with e-mail and client access at the start of the business day.
We are changing the interval for digest messages from twice a day to once per day and moving the time to 1:30 am each morning.
MERLIN2 Problems
MERLIN2, the administrative file server, became unresponsive for unknown reasons, causing workstations connected to it to also become unresponsive.
The server was rebooted, which cleared the problem. Workstations may need to be rebooted as well.
IMAP, POP3, and Webmail Slowness IV
The firmware upgrades did not resolve the problem with disk performance, even though we had a good couple of days. Sun’s final analysis led them to the conclusion that the disk array on which mailboxes reside has simply reached a saturation level in terms of I/O rate. This will mean that we will have to get a faster disk array.
IMAP and POP3 and Webmail Slowness III
Firmware upgrades to the mail server disks have been applied. This had little effect on performance. We’re following up with Sun to determine next steps.
Mail Server Maintenance
All email service will be unavailable from 5PM to 6PM today. Firmware upgrades will be applied to address Webmail, IMAP, and POP3 slowness first noted in this entry: http://www2.ups.edu/ois/nssg/blog/maintenance/archives/000449.html
IMAP and POP3 and Webmail Slowness II
Sun has identified some issues with our disk array configuration. They have provided some new settings for us to apply. One of the changes has been made, with little effect on disk performance. The other two changes will require that the email server be shut down. We are scheduling this right now. Please refer to the “Scheduled Outages” page (http://www2.ups.edu/ois/nssg/network/alerts.shtml) for the latest information.
IMAP and POP3 and Webmail Slowness I
We are currently experiencing disk performance problems on the mail server. This is causing slowness and problems connecting with Webmail and POP3 and IMAP clients. We are working with the hardware vendor to correct the problem.
WebMail Problems I
Today at about 10:00 AM, the HelpDesk reported a major failure in WebMail.
We noticed a large number of processes (about 1000 and growing) on the mail server, and a larger than normal amount of mail in the queue. WebMail was stopped, server processes on the mail server were stopped, and the mail queues were processed by hand.
SAN Maintenance and New Installations
On Saturday we will restructure the fibre channel fabric by replacing the existing 8-port switch with two 24-port switches (32 total ports enabled). As part of the restructuring, the database systems (Lenel, Rainier, Crystal, and Grace) will be down. The SAN and tape library will also be down during the restructuring.
We will be attempting a variety of hardware related tasks during the down time:
1. Turning the two Dell cabinets 180 degrees to provide better air flow.
2. Reworking the network, fibre, and power cables in the Dell cabinets for better access.
3. Migrating and rezoning all fibre channel connections to the new switches.
4. Installing a new HBA in Rainier and upgrading the PowerPath license.
5. Adding additional systems to the fibre channel fabric (Exchange Servers, AX100 appliance, Veronica, Merlin2 and Alexandria).
We estimate that this will take approximately eight hours. We will be starting at 8:00 am on Saturday, June 11.
Change to PureMessage
Over the last few weeks a pattern has appeared in the cpu load on the sophos server: over the course of time the cpu load increases and stays high (
PureMessage not accepting messages
PureMessage stopped accepting messages this morning when the disk volume was filled by log files. Corrections have been made to the logrotate.conf file in an attempt to prevent this from occurring in the future.
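A volume filled by log files can be caught earlier with a periodic free-space check alongside log rotation. The following is a minimal sketch only, not the check used on the PureMessage server: the 90% threshold and the function names `is_nearly_full` and `check_volume` are assumptions made for illustration.

```python
import shutil


def is_nearly_full(used_bytes: int, total_bytes: int, threshold: float = 0.90) -> bool:
    """Return True when the used fraction of a volume meets or exceeds threshold."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return used_bytes / total_bytes >= threshold


def check_volume(path: str, threshold: float = 0.90) -> bool:
    """Check the volume that contains `path` (hypothetical helper).

    shutil.disk_usage reports total, used, and free bytes for that volume.
    """
    usage = shutil.disk_usage(path)
    return is_nearly_full(usage.used, usage.total, threshold)
```

Run from cron against the log volume, a check like this could raise an alert before the disk fills and mail is refused.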
modification to mx records
MX record added for second interface on gehenna to reduce mail load to sophos. Entry created in firewall and external dns for anneheg.
Stray Entry Hit with Comment SPAM
One of our Maintenance Blog entries was accidentally left open to comments, and was filled with “Comment SPAM”. The offending entry was reconfigured, and the SPAM was deleted.
E-mail problem: 550 5.3.0 Can’t create output
Some users reported this morning that they were unable to send messages to other users. They received a common error:
Final-Recipient: RFC822; username@ups.edu
X-Actual-Recipient: RFC822; username@ups.edu
Action: failed
Status: 5.3.0
(reason: Can’t create output)
This error was the result of an incomplete deactivation of the quota system. The quota system had been turned off, but its entry had not been removed from the fstab file. When the system was rebooted yesterday, the quota system was re-enabled and locked accounts that were in excess of their time limit. This issue has been corrected.
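Leftover quota entries of this kind can be spotted by scanning the fstab file for quota mount options before a reboot. The following is a rough sketch under stated assumptions: it assumes the Linux-style option names `quota`, `usrquota`, and `grpquota` in the fourth field (other systems use different files and option names), and `find_quota_mounts` is a hypothetical helper, not a tool used on the mail server.

```python
# Assumed Linux-style option names; adjust for the platform in use.
QUOTA_OPTIONS = {"quota", "usrquota", "grpquota"}


def find_quota_mounts(fstab_text: str) -> list[tuple[str, list[str]]]:
    """Return (mount_point, quota_options) for entries that still enable quotas."""
    hits = []
    for line in fstab_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) < 4:
            continue  # not a complete fstab entry
        mount_point = fields[1]
        options = fields[3].split(",")
        enabled = sorted(QUOTA_OPTIONS.intersection(options))
        if enabled:
            hits.append((mount_point, enabled))
    return hits
```

Any entry it returns is a mount where quotas would come back on at the next reboot.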