[MONITORING] Internet & Inter-DC Connectivity Loss

We’ve just experienced a few minutes of connectivity loss between data centres and to the Internet.

Connectivity has been restored and engineers are currently investigating.

UPDATE 09:13: Connectivity continues to be intermittent.  We are also experiencing problems with our office Internet connectivity, which in turn is affecting our office phone system.  The only common factor in these problems is the underlying fibre network used by both our data centre provider and our office Internet connection: Virgin Media.

UPDATE 09:45: Connectivity has been stable for approx 20 minutes now.  We are waiting for an update regarding the outage and will post it here as soon as we have more information.

Scheduled Emergency Maintenance: 17/03/17

Our upstream network provider has notified us of an upcoming emergency maintenance window.  This work is required to mitigate a potential issue that has been identified and to reduce the risk of an unplanned service interruption.

Please be advised that a maintenance window from 00:00 until 02:00 GMT on 17th March 2017 has been allocated to this work.

Whilst every effort will be made to minimise impact, services we provide such as Internet connectivity, Website Hosting, Email and DNS should be considered at risk during this period.  We will be sending updates before and after the work.

If you are still encountering issues after the maintenance period has closed, please contact us via your usual support channels.

[RESOLVED] Upstream network instability

We’ve just received a notice from our upstream network provider about occasional instability in their network:

We are currently investigating a network issue which may have impacted customers during the last hour or more. Whilst most services are stable we are still investigating more isolated pockets of instability and we aim to provide a further update shortly.

We will update this post as soon as we receive further information…

[RESOLVED] Internet connectivity issue

We are currently experiencing issues with our external Internet connectivity.  Engineers are investigating and we will update this post as soon as we have further information.


Update 23:00: It appears one of our VPS customers was the target of a DDoS attack.  This caused problems with our connectivity as it struggled to cope with the massive flood of incoming traffic.  Traffic levels have returned to normal and we are consulting with our upstream providers to mitigate the problem.

[RESOLVED] Network Connectivity Issue

Our upstream network provider is currently suffering from a core network issue.  This is causing intermittent packet loss on all connectivity into the Zebra network.

Engineers are currently working on the issue and aim to have the problem resolved as soon as possible.

We will update this post as soon as we have further information.


Update: We have been informed the outage was due to a DDoS (Distributed Denial of Service) attack.  The attack was of such a size that all core network connectivity was affected.  Engineers are analysing the data collected to identify any recommendations for further improving network performance and security.

Update 07:30: Apologies for the late update.  As of approx 04:40 this morning engineers believed they had the problem under control.  We are currently monitoring.

Update 04:25: Engineers are continuing to work towards a resolution.

Update 03:00: The root cause of the issue has been identified and engineers are now working on a resolution.

Update 23:50: Engineers are continuing to work to stabilise the network.  We will update this post as soon as we have further information.

Update 22:00: Engineers at our provider are continuing to work on the problem.  We will update this post as soon as we have further information.

[RESOLVED] Routing problems within BT connected networks

We are currently seeing routing issues to locations connected via BT Internet and to third parties that rely on BT-related services.

Upstream engineers are monitoring the major service outage on the BT network.  At this time only external/routed connections to the BT networks are affected.

This post will be updated as and when more information is received.


Update @ 11:20 – The problem with BT’s network is starting to hit the major technical news sites.  More useful information can be found at ArsTechnica: http://arstechnica.co.uk/business/2016/07/bt-isps-telehouse-north-major-outage/
and The Register: http://www.theregister.co.uk/2016/07/20/telecity_power_outage_bt_offline/

[RESOLVED] Network Connectivity

Earlier this morning a routing issue was identified at one of our upstream network providers, which caused connectivity problems when attempting to access our network.

A workaround has been put in place which has restored access and we are continuing to monitor the situation.

We will update this post when we have more details.


Update 10:05: The earlier partial connectivity problems were due to our upstream provider not providing us with a full BGP routing table.  This resulted in only part of the Internet being fully reachable from our network.  A workaround was put in place which restored connectivity whilst engineers investigated further.  As of 10:00 a full BGP routing table is now being received and the earlier workaround has been removed.  We will continue to monitor connectivity and will update this post if required.
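
For readers wondering why an incomplete routing table leaves only part of the Internet reachable, the following is a simplified sketch of a longest-prefix route lookup.  It is an illustration only and not our router configuration: the `lookup` function is hypothetical, the prefixes are taken from the reserved documentation address ranges, and it assumes no default route is present.

```python
import ipaddress


def lookup(table, destination):
    """Return the most specific prefix in `table` that covers `destination`, or None."""
    addr = ipaddress.ip_address(destination)
    matches = [prefix for prefix in table if addr in prefix]
    # Longest-prefix match: prefer the route with the largest prefix length.
    return max(matches, key=lambda prefix: prefix.prefixlen, default=None)


# Hypothetical prefixes (reserved documentation ranges), not real routes.
full_table = [
    ipaddress.ip_network(p)
    for p in ("192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24")
]
# Simulate the upstream provider omitting one prefix from the table it sends.
partial_table = full_table[:2]

for dest in ("192.0.2.10", "203.0.113.5"):
    full_route = lookup(full_table, dest)
    partial_route = lookup(partial_table, dest)
    status = "reachable" if partial_route else "unreachable"
    print(f"{dest}: full table -> {full_route}, partial table -> {partial_route} ({status})")
```

With the partial table, any destination whose prefix was never received has no matching route, so traffic to it cannot be forwarded even though the rest of the network is healthy.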

[DELAYED] Planned network maintenance

On Tuesday 9th February, starting at approximately 21:00, we will be performing essential maintenance on our core network routers.

The maintenance will be performed in such a way as to minimise any possibility of interruption to your services.  However, due to the nature of the work, network services will be classed “at risk” as resilience will be reduced.

It is estimated that the work will take less than 1 hour to complete.

Additional updates will be posted to our service status blog at http://www.zebrastatus.net


Update 18/02/2016 – 09:00

Due to the hardware problems we encountered during the original maintenance procedure, we’re waiting for additional spare parts to arrive before performing any further maintenance.

As soon as these parts arrive a new maintenance period will be arranged.

Update 12/02/2016 – 16:30

“Anubis”, the affected core router, has now been replaced and full network resiliency has been restored.

The maintenance due to be performed on the second core router is now expected to be rescheduled for the week commencing 15/02/2016.  A further update will be made once a firm plan is in place.

Update 12/02/2016 – 12:30

The replacement hardware has arrived on site and is being prepared for installation.

Update 10/02/2016 – 10:00

Replacement hardware has been ordered and we are now awaiting delivery.  Whilst our network resilience is reduced until the replacement can be installed, customers should not encounter any problems.  Engineers will update this post as and when further information is available.

Update 09/02/2016 – 22:20

An issue was encountered during the maintenance of one of our core edge routers. As a result, this router is currently out of service. All further maintenance tasks have been suspended.

[RESOLVED] Shared Hosting Server Degraded Performance

One of our shared hosting servers, Linsvr2, is currently experiencing higher than usual load.

This is causing websites & email accounts for customers hosted on that server to either run slower than usual or time out with an error.

Engineers are currently investigating and will update this post as soon as possible.

It appears a customer’s WordPress website has been compromised and, as a result, a rogue web script has been installed.  This script contained programming errors that caused it to run in an “infinite loop”, using all available processing resources.

The compromised website has been shut down until our customer can confirm they have resolved the problem.