Service Event - 6:30am - 9:37pm Saturday 22 November
A major outage upstream occurred that affected our services and those of other providers. The matter has been resolved and we continue to monitor services and connections. This page provides initial details surrounding the event. We are awaiting detailed information from other providers and will update this page as it arrives.
Early Saturday morning a core piece of upstream equipment that serves Earthlight and others failed. Staff operated from IDC (Invermay Data Centre) throughout the event. The nature of the fault limited Earthlight staff communications, but IDC was still the best place to operate from. This is noted as an aspect to be improved.
Upstream staff worked to isolate the fault. This activity included a TAC (Technical Assistance Case) with the overseas manufacturer, under whose support the equipment is covered. It was determined that hardware was at fault. Arrangements were made to obtain a replacement, which was brought onsite. Once it was onsite, work commenced to physically change over to the replacement hardware. Next, the configuration was shifted across. As part of this, a full checkout has to be performed before the equipment can be released back into service.
Services began to return shortly after 9:30pm. In most cases clients came back online without intervention. A few had to restart their connection to resume. Earthlight attended some sites as well to restart services.
We apologise for the inconvenience caused by this event.
The Event in Detail so far
We are awaiting detailed information from upstream providers, which will be added later. In the meantime, this is the timeline of events as we have it so far.
At 6:30am a large number of alerts signalled loss of services. Staff attended IDC to check infrastructure and local services. With this confirmed, we contacted upstream and were advised a fault was being investigated there.
7:00am The nature of the fault impacted a number of services. About the only small drawback of IDC's location is poor mobile coverage, which limited our ability to redirect calls. Instead, text messaging was the primary means of communication for this event. This gave us a means to respond and update.
8:30am Upstream fault isolation processes expanded to include overseas technical support. The failed equipment is covered under maintenance support.
9:55am Under the maintenance support arrangement, equipment to replace the failed item was sought. Logistics processes proceeded to gain access to maintenance spares held under support contracts.
2:00pm Replacement equipment received and taken to the Christchurch data centre. Staff commenced the equipment changeover, which is the first step.
4:30pm With the spanner work completed, the effort moved on to restoring the setup and configuration. This is enormously complex and takes time to work through.
5:50pm Overseas technical support continue to be involved in recovery.
7:30pm Progress made, but the network core is not yet ready for service.
9:00pm Recovery process nears completion with check-out. Network core begins to re-enter service in stages.
9:30pm Services return in full. Earthlight staff have been keeping updates flowing throughout this time. We now move to verification of services. Most return, but some need assistance.
11:30pm Last site visit completed to confirm service recovery. Network and service monitoring shows everything operating as far as we can see.
Sunday 23 November
Monitoring continues and upstream indicate everything has returned to normal. We carried out a few more site visits to move backup connections back to main, along with some end device restarts.
Next Steps
This is an initial report as we await details from others. Outages can occur, and this one is rare given the equipment involved. We use events like this to review response, handling, recovery and resilience for the services that were affected. We will be posting further updates along with advisory pages.
Again, our apologies for the inconvenience. Please get in touch if you have any questions or need further information.


