FYI – K20 Announcement of fiber damage from the storm.  No issues have been reported yet as a result.

INVOLVING: K20

CURRENT EVENT START  : 12/13/10 22:22 PST

CURRENT EVENT STOP   : ?

 SUMMARY:

  This notice is being sent to all K20 customers.  We probably should have sent this notice sooner.

  WSIPC is part of a 14-site outage caused by a fiber break that occurred last night at 22:22 PST.  That specific outage is being tracked in our ticket 940112.

  As many customers have noticed, WSIPC services such as hosted DNS, email, and some applications such as Skyward and Citrix are affected.

  These WSIPC services will be down until the outage is resolved.  The fiber provider is working on an alternate path for the affected transport circuits, but there is still no ETR.

 =================================================================

K20 Network Operations                  Network Operations Center

noc@wa-k20.net                                     888-934-5551

=================================================================

                     DETAILED OUTAGE HISTORY

 K20 Area Wide

  FIRST DOWN: 12/13/10 22:22 PST

  LAST UP: ?

  TOTAL DOWN TIME: 15 hours 10 minutes

  TOGGLES:

    12/13/10 22:22 PST – ?
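The first-down time and total down time above imply when this report was generated. A minimal sketch of that arithmetic, assuming both values are exact and no clock change occurred in between (the variable names are illustrative only):

```python
from datetime import datetime, timedelta

# Values copied from the outage history above.
first_down = datetime.strptime("12/13/10 22:22", "%m/%d/%y %H:%M")
total_down = timedelta(hours=15, minutes=10)

# The report time is the start of the outage plus the elapsed down time.
report_time = first_down + total_down
print(report_time.strftime("%m/%d/%y %H:%M PST"))  # 12/14/10 13:32 PST
```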

Boss Reports Server has been switched

In production, Boss had been using the rep_bossweb reports server, which is unstable.

It was down this morning, so we switched Boss to the UPS_REPORTS_NONSSO_PROD reports server in production. That worked fine.

Report output security is still an issue, because anyone using that reports server can access other people’s report file output.

We plan to test Boss with the SSO reports server to see how it functions and whether it would work for us.

Active Directory password policy was temporarily too restrictive

The Active Directory password policy was inadvertently set to reject any password that did not contain at least one special (non-alphanumeric) character, such as *, #, $, or %.

The problem began around 3/21/2009 and was corrected at 3:15 PM on 3/26/2009. During this period, anyone changing a password using Windows was instructed to include a special character.

Passwords changed using Cascade Web during this period were not synchronized to Active Directory, so the new password did not work for Webmail, Windows, etc. This can now be corrected by changing either the AD or OID password.

The problem was corrected by deselecting the special character requirement in the AD password policy.
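As an illustration of the rule that was in effect, here is a minimal sketch of a check for at least one non-alphanumeric character. It is not Active Directory's actual complexity algorithm, and the function name is hypothetical:

```python
def has_special_character(password: str) -> bool:
    """Return True if the password contains a non-alphanumeric character.

    Simplified illustration only: any character for which str.isalnum()
    is False (punctuation, symbols, spaces) counts as "special" here.
    """
    return any(not ch.isalnum() for ch in password)


print(has_special_character("Spring2009"))   # False
print(has_special_character("Spring#2009"))  # True
```

A password like the first one would have been accepted by Cascade Web but rejected when synchronized to Active Directory during this period.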

Here is an example of the error in the ActiveExportUsers_Groups.trc log:

Error in executing mapping DIP_LDAPWRITER_ERROR_MODIFY
javax.naming.OperationNotSupportedException: [LDAP: error code 53 - 0000052D: SvcErr: DSID-031A0FC0, problem 5003 (WILL_NOT_PERFORM), data 0
]
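The hexadecimal value in that message (0000052D) is a Win32 error code: 0x52D is 1325, ERROR_PASSWORD_RESTRICTION, meaning the new password did not meet the domain's password requirements. The sketch below shows one way to pull both codes out of such a log line. The function name and the small lookup table are assumptions for illustration, not part of any Oracle or Microsoft tool:

```python
import re

# A few Win32 error codes that can appear in AD password-change failures.
WIN32_CODES = {
    0x5: "ERROR_ACCESS_DENIED",
    0x56: "ERROR_INVALID_PASSWORD",
    0x52D: "ERROR_PASSWORD_RESTRICTION",
}


def decode_ad_error(message):
    """Extract (ldap_code, win32_code, win32_name) from an AD LDAP error string."""
    match = re.search(r"error code (\d+) - ([0-9A-Fa-f]{8}):", message)
    if match is None:
        return None
    ldap_code = int(match.group(1))        # decimal LDAP result code
    win32_code = int(match.group(2), 16)   # hexadecimal Win32 error code
    return ldap_code, win32_code, WIN32_CODES.get(win32_code, "unknown")


sample = (
    "javax.naming.OperationNotSupportedException: [LDAP: error code 53 - "
    "0000052D: SvcErr: DSID-031A0FC0, problem 5003 (WILL_NOT_PERFORM), data 0"
)
print(decode_ad_error(sample))  # (53, 1325, 'ERROR_PASSWORD_RESTRICTION')
```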

[Resolved] CRM database error; a couple of mass emails adversely affected

A database error occurred in CRM at about 11:30 AM on 4/15/2009. DST was alerted to the problem (a table that was unable to extend) and fixed it at about noon. A couple of email campaigns were in progress and were adversely affected:
1. A message from the President’s office going to faculty, staff and students was sent out twice, although the records indicate it went out only once.
2. A message being created by Admission was in the middle of generating its target group and got stuck there. Every attempt to resolve it failed, so the solution was to copy the schedule without the target group and re-create the target group. The email was then sent out successfully.

[Resolved] Cascade Web application server is unavailable

[Update 3/19/09 4:20 PM] Services have been restored. The application services were stopped and restarted. Root cause analysis is underway.

The application server that hosts Cascade is currently unavailable. Impacted services are Cascade, Portal, Discoverer, CRM, and Views Flash Survey. TS is aware of the problem and is investigating.

Internet Service Resumes

At approximately 10:30 AM, Internet service resumed when Integra Telecom moved our routes to a temporary router. Service continued until about 12:30 PM, when Integra moved our routes back to our normal router, causing a brief interruption of 5 to 10 minutes.

April 1 – April 7 listserv Delivery Problems

Beginning April 1st, the Mailman listservs started failing to deliver messages to their membership. This happened because one of the real-time blackhole lists that Mailman was subscribed to went out of service permanently, causing Mailman’s mail server (sendmail) to reject all messages outgoing to the list memberships. The messages were received by Mailman and archived properly, but they were not delivered. This was found and resolved on April 7th.

This coincided with another email problem on April 2-4, which masked the listserv issue and delayed its discovery.
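For background, a mail server consults a real-time blackhole list by reversing the octets of the sender's IPv4 address, appending the list's DNS zone, and looking up the resulting name. A retired list that answers every such query positively will cause every message to be rejected. The sketch below only builds the query name. The zone `dnsbl.example.org` and the function name are placeholders, not the list involved in this incident:

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name a mail server would look up for an IPv4 address."""
    # Validate the address first; raises ValueError if it is not valid IPv4.
    address = ipaddress.IPv4Address(ip)
    # Reverse the octets and append the blackhole list's zone.
    reversed_octets = ".".join(reversed(str(address).split(".")))
    return f"{reversed_octets}.{zone}"


print(dnsbl_query_name("192.0.2.10", "dnsbl.example.org"))
# 10.2.0.192.dnsbl.example.org
```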

Internet Outage

Our ISP, Integra, had issues last night between 8:10 PM and 9:46 PM. I called ELI to have them look into the problem, and Mark and I came onsite to assist. Integra resolved the problem by resetting the connection interface on their end. I have not seen any further issues.