Luna Node Status

Information about upcoming maintenance, current downtime, and RFO for past downtime will be posted here. Updates more than two weeks old may be purged.

Please contact support@lunanode.com for technical support inquiries.

Planned maintenance / current issues

There are no planned maintenance events or ongoing issues at this time.

Resolved issues

Roubaix network downtime (5 Jun 2017 03:40 EDT)

Packet loss on the internal network, due to an upstream provider issue, caused downtime from 03:40 EDT to 05:40 EDT this morning. We are still waiting for a detailed report from the provider.

Toronto network downtime (18 May 2017 09:04 EDT)

Update 1: network is back online after upstream issue was resolved.

We are investigating network downtime in Toronto.

Network degradation (11 May 2017 15:23 EDT)

We are continuing to investigate network degradation that began at 13:43 EDT.

Toronto hypervisor failure (1 May 2017 21:12 EDT)

The hypervisor toronto14 in Toronto experienced a kernel panic. We have brought the hypervisor back online and we are investigating.

Montreal network outage (26 April 2017 10:50 EDT)

We are investigating a network outage in the Montreal region.

Update 3: volume storage performance has now recovered.

Update 2: the network has recovered; volume storage is recovering but performance is heavily degraded.

Update 1: this is an upstream provider issue and is currently being worked on. http://status.ovh.net/?do=details&id=14548

Toronto hypervisor emergency maintenance (24 March 2017 20:30 EDT)

We are performing emergency maintenance on toronto1 due to a detected disk issue.

Montreal hypervisor disk failure (26 February 2017 16:35 EST)

The hypervisor montreal1's disk became inaccessible at 16:35 EST, resulting in the eventual hypervisor failure at 16:45 EST. The hypervisor was power-cycled and came back online with no disk errors. We will investigate the root cause of the issue. Service was restored at approximately 17:00 EST.

Toronto network issues (26 February 2017 04:20 EST)

There was intermittent packet loss for traffic in and out of the Toronto region from 4:20 am EST to 6:30 am EST, which appears to have originated upstream of our router in Toronto. Our network monitoring system was not able to detect this problem effectively because the interruptions were intermittent and lasted no more than 30 seconds at any given time. We will see if a better detection method can be devised to capture events like this; one possible approach is sketched below.

We have not observed any issues since 6:30am EST, and will continue to monitor our network in Toronto.

We apologize for any inconvenience this may have caused.
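
One candidate approach, sketched below in Python, is a high-frequency reachability probe that pings a reference host every second and records any loss burst, even one lasting only a few tens of seconds, that a coarse check interval would miss. This is only an illustrative sketch; the target IP, probe interval, and output format are placeholders rather than our actual monitoring configuration.

    #!/usr/bin/env python3
    # Minimal sketch of a high-frequency reachability probe that records short
    # loss bursts which a coarse-grained monitoring check would miss.
    # The target IP is a placeholder for a stable host upstream of our router.
    import subprocess
    import time

    TARGET = "198.51.100.1"   # placeholder reference IP
    INTERVAL = 1.0            # probe once per second
    loss_start = None

    while True:
        ok = subprocess.call(
            ["ping", "-c", "1", "-W", "1", TARGET],
            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
        ) == 0
        now = time.time()
        if not ok and loss_start is None:
            loss_start = now
        elif ok and loss_start is not None:
            # Record every burst, even ones lasting well under a minute.
            print("loss burst of %.0f seconds ending at %s"
                  % (now - loss_start, time.strftime("%H:%M:%S")))
            loss_start = None
        time.sleep(INTERVAL)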

Montreal / Roubaix hypervisor failures and network interruptions (23 January 2017 09:15 EST)

One hypervisor in each region became unresponsive at around 09:15 EST, and required a hardware power cycle to bring them back to service. As a result, approximately 30 minutes of network interruption in Roubaix and 10 minutes of network interruption in Montreal were observed.
We are still examining available logs to attempt to identify the root cause of this event.

Toronto planned network maintenance (20 January 2017 00:00 EST)

Update 2 @ 20 January 2017 05:30 EST: the maintenance is complete. There was approximately 15-20 minutes of external network downtime.

Update 1 @ 20 January 2017 00:12 EST: we are seeing intermittent external network connectivity drops in relation to this maintenance.

Our upstream provider in Toronto will be performing maintenance on 20 January 2017 beginning at 00:00 EST, involving activation of software module upgrades on code running on their edge routers. They expect 45-60 minutes of external network connectivity downtime during the maintenance window. Internal connectivity will remain available throughout the maintenance, and VMs will not be rebooted.

Start: 20 January 2017 00:00 EST (20 January 2017 05:00 UTC)
End: 20 January 2017 06:00 EST (20 January 2017 11:00 UTC)

Please contact us at support@lunanode.com if you have any questions regarding this planned maintenance.

Montreal hypervisor failure (7 January 2017 03:30 EST)

montreal1 failed, causing ten minutes of downtime.

Montreal hypervisor failure (20 December 2016 15:22 EST)

A hypervisor in Montreal failed; we are still investigating.

Toronto network interruption (12 December 2016 11:20 EST)

An isolated network interruption to our Toronto region was observed. It has been resolved as of 11:32 EST.

Roubaix hypervisor failure (10 December 2016 11:59 EST)

The SSD hypervisor rbxssd1 failed due to kernel panic. Services are back online at this time. We are investigating.

Roubaix hypervisor failure (12 November 2016 17:09 EST)

We are investigating a storage array issue on rbxssd1.

Toronto hypervisor failure (14 October 2016 2:28 pm EDT)

Update: virtual machines are being booted at 2:30 pm EDT after five minutes of downtime. We do not anticipate any further issues on the upgraded kernel.

The SSD-cached hypervisor toronto1 has failed due to a kernel panic. We are rebooting to a newer kernel at this time.

Toronto hypervisor failure (6 October 2016 9:45 pm EDT)

One of the SSD hypervisors in Toronto has failed; we are investigating the issue. We expect to have VMs back online in three to six hours.

Toronto network downtime (6 October 2016 9:45 pm EDT)

We are investigating another router failure.

Toronto network downtime (6 September 2016 7:30 pm EDT)

An upstream network interruption resulted in 15 minutes of downtime for our services in the Toronto region.

Roubaix network downtime (2 August 2016 3:00 pm EDT)

A router in Roubaix has crashed again. The node has been replaced to prevent future occurrences of this issue.

Roubaix volume detach issues (7 July 2016 1:00 pm EDT)

Volumes detached over the last week may not have detached correctly from their VMs. This is due to infrastructure maintenance in which the controller was migrated but volume metadata was not correctly updated to point to the new controller. Any volumes still attached to deleted VMs have been forcefully detached. Please contact support if you have any issues with your volume in Roubaix.

Image storage backend migration (6 July 2016 6:00 pm EDT)

Update 1: image replication is available again at 1:00 am EDT.

We are migrating the image storage backend from LizardFS to Ceph RBD in Montreal and Roubaix. Image replication may be unavailable during this migration. However, images and image downloads from the panel will remain available.

Toronto packet loss (6 July 2016 2:40 am EDT)

Update 1: we have received more details on the maintenance. It is performed on a cable between London and Montreal by Hibernia Networks for vendor fiber network upgrades.

We are seeing heavy packet loss to destinations in Europe. Cogent (the affected upstream provider) says that this is due to maintenance on the Liverpool-Montreal link. It appears to be resolved as of 3:00 am EDT. We have requested more details on the maintenance from Cogent.

Roubaix network downtime (3 July 2016 4:55 pm EDT)

The Roubaix router crashed and we have switched to the backup; there were two minutes of downtime before the switch completed. We are investigating. We will likely decommission this router since this is its second crash in a week.

Infrastructure maintenance (1 July 2016 9:00 pm EDT)

We will be performing infrastructure maintenance in all three regions to upgrade OpenStack nova to the latest version. This will fix broken resize functionality. No downtime is expected.

Roubaix infrastructure maintenance (1 July 2016 12:00 pm EDT)

We will be migrating the OpenStack controller node in Roubaix in relation to infrastructure maintenance and upgrades. No VM downtime is expected, but panel control actions will be unavailable during the maintenance. The maintenance is expected to conclude within one hour, but may take up to two hours.

Roubaix hypervisor retirement (1 July 2016 12:01 am EDT)

Virtual machines will be migrated from hypervisors that we are retiring due to unidentifiable hardware issues. During the migration, affected virtual machines will be offline. Affected clients have been notified by e-mail (only a small number of VMs in Roubaix are affected).

Roubaix IP shortage (29 June 2016)

We are still waiting for the allocation of an IP block in Roubaix. In the meantime there is an IP shortage and some virtual machines may not be assigned floating IP addresses. We have freed as many unneeded IP addresses as possible from system resources.

Montreal hypervisor retirement (30 June 2016 12:00 am EDT)

Virtual machines will be migrated from hypervisors that we are retiring due to unidentifiable hardware issues. During the migration, affected virtual machines will be offline. Affected clients have been notified by e-mail (only a small number of VMs in Montreal are affected). Volume-backed VMs have already been migrated without downtime.

Roubaix network downtime (29 June 2016 9:44 am EDT)

A router went offline, network connectivity was down for six minutes.

Roubaix hypervisor retirement (28 June 2016 12:01 am EDT)

Virtual machines will be migrated from hypervisors that we are retiring due to unidentifiable hardware issues. During the migration, affected virtual machines will be offline. Affected clients have been notified by e-mail (only a small number of VMs in Roubaix are affected).

Toronto packet loss (19 June 2016 7:30 am EDT)

The network was degraded, with minor packet loss, from 7:30 am EDT to 8:30 am EDT. The cause was an outgoing SYN flood from a VM on the network. The automatic detection and mitigation systems failed to trigger on the SYN flood. The SYN flood appears to have stopped by itself at 8:30 am EDT. We are making adjustments to the infrastructure to prevent a recurrence of this issue.

Montreal hypervisor retirement (16 June 2016 12:00 am EDT)

Update 1: this migration has been completed and the hypervisors (montreal3 and montreal4) have been retired. We will perform similar hardware updates in Roubaix once we confirm the stability of the new equipment.

Virtual machines will be migrated from hypervisors that we are retiring due to unidentifiable hardware issues. During the migration, affected virtual machines will be offline. Affected clients have been notified by e-mail (only a small number of VMs in Montreal are affected). Volume-backed VMs have already been migrated without downtime.

Roubaix hypervisor failure (15 June 2016 3:39 am EDT)

A hypervisor in Roubaix (rbx1) failed. Services are back online after ten minutes.

Montreal hypervisor failure (11 June 2016 7:56 am EDT)

A hypervisor in Montreal (montreal3) failed. Services are back online after ten minutes.

Roubaix hypervisor failure (11 June 2016 4:31 am EDT)

A hypervisor in Roubaix (rbx-ctrl) failed. Services are back online after ten minutes.

Roubaix network downtime (31 May 2016 9:27 pm EDT)

Update 1: the primary network node is back online after a datacenter technician resolved the problem; the technician did not indicate what the problem was.

The internal network interface on the network node went offline. We are investigating the issue. In the meantime, services are back online as of 9:38 pm EDT after we switched to the backup network node.

Roubaix hypervisor failure (31 May 2016 12:21 pm EDT)

A hypervisor in Roubaix (rbx1) failed. Services are back online after ten minutes.

Montreal hypervisor failure (31 May 2016 3:10 am EDT)

A hypervisor in Montreal (montreal3) failed. Services are back online after ten minutes. We have upgraded the kernel to a newer version.

Roubaix hypervisor failure (30 May 2016 11:05 pm EDT)

A hypervisor in Roubaix (rbx2) failed. Services are back online after ten minutes. We have upgraded the kernel to a newer version.

Roubaix hypervisor failure (30 May 2016 4:42 pm EDT)

A hypervisor in Roubaix (rbx1) failed. Services are back online after ten minutes.

Montreal hypervisor failure (29 May 2016 1:22 am EDT)

A hypervisor in Montreal (montreal3) failed. Services are back online after ten minutes. The hardware change appears not to have improved the server's stability. We are still investigating.

Roubaix hypervisor failure (27 May 2016 8:42 pm EDT)

A hypervisor in Roubaix (rbx2) failed. Services are back online after ten minutes. We are still waiting to verify whether the hardware changes in Montreal will repair the problems.

Bitcoin payment processor adjustments (25 May 2016 11:00 pm EDT)

Bitcoin payments will now be processed via Stripe. The payment process is essentially the same: simply select the "Credit Card or Bitcoin" option from the Billing tab. If your currency is set to CAD, then you will need to select the separate "Bitcoin" option so that the Bitcoin charge is calculated in terms of USD. The payment processor has been changed from Bitpay because we were not satisfied with Bitpay's service.

We will also be requiring phone number verification for all payment methods due to increased fraud, especially with terrorist websites and botnet controllers. Previously this verification was only needed for non-Paypal credit card payments. Sorry for the inconvenience.

Roubaix hypervisor failure (23 May 2016 10:00 am EDT)

A hypervisor in Roubaix (rbx-ctrl) failed. Services are back online after fifteen minutes. We are still waiting to verify whether the hardware changes in Montreal will repair the problems.

Toronto denial of service attack (21 May 2016 7:57 pm EDT)

We are currently mitigating a denial of service attack in Toronto. There is recurring packet loss due to the attack since 7:36 pm EDT.

Roubaix hypervisor failure (21 May 2016 11:14 am EDT)

A hypervisor in Roubaix (rbx2) failed. Services are back online after ten minutes. We are still waiting to verify whether the hardware changes in Montreal will repair the problems.

Montreal hypervisor maintenance (19 May 2016 6:00 pm EDT)

Update 3: we continue to monitor the situation in case the symptoms persist. Other hypervisors in Montreal are not affected by the defect, but maintenance on rbx-ctrl, rbx1, rbx2, and rbx3 may be needed.

Update 2: services are back online at 6:06 pm EDT.

Update 1: the maintenance has started, at 6:00 pm EDT.

OVH will be performing maintenance on two hypervisors in Montreal (montreal3 and montreal4) to correct a hardware defect that may be responsible for the repeated hypervisor failures in the last three months. We expect ten to twenty minutes of downtime. Specifically, there is a four-pin connector that OVH says has caused symptoms including reboots, freezes, and other general motherboard issues on other dedicated servers of the same type; the maintenance involves disconnecting this connector.

Montreal hypervisor failure (18 May 2016 6:44 pm EDT)

A hypervisor in Montreal (montreal3) failed. Services are back online after ten minutes. We are investigating.

Roubaix hypervisor failure (15 May 2016 2:24 am EDT)

A hypervisor in Roubaix (rbx2) failed. Services are back online after ten minutes. We are still investigating; the kernel and ethernet driver updates do not appear to have resolved the stability issues.

Montreal hypervisor failure (14 May 2016 9:08 pm EDT)

A hypervisor in Montreal (montreal3) failed. Services are back online after ten minutes. We are investigating.

Montreal hypervisor failure (13 May 2016 11:34 pm EDT)

A hypervisor in Montreal (montreal4) failed. Services are back online after ten minutes. We are investigating.

Roubaix hypervisor failure (5 May 2016)

A hypervisor in Roubaix (rbx3) failed. Services are back online at this time. The kernel and driver installation has been adjusted.

Montreal hypervisor failure (28 April 2016)

A hypervisor in Montreal (montreal4) failed. Services are back online after ten minutes. Kernel and ethernet driver have been updated.

Roubaix hypervisor failure (21 April 2016 2:50 pm EDT)

A hypervisor in Roubaix (rbx1) failed. Services are back online at this time.

Montreal hypervisor failure (12 April 2016 2:40 pm EDT)

Update: we have adjusted the kernel and driver installation. Services are back online as of 2:50 pm EDT.

For reasons yet unknown, the hypervisor (montreal3) was rebooted.

Montreal hypervisor failure (7 April 2016 9:00 pm EDT)

Update: the motherboard has been replaced.

A hypervisor in Montreal (montreal4) failed; we are investigating the issue.

Montreal hypervisor failure (2 April 2016 1:30 am EDT)

Update: the RAM, RAID controller, and power supply have been replaced.

A hypervisor in Montreal (montreal4) failed. We are migrating clients' VMs off of this node so we can perform tests and replace hardware if necessary. Clients affected by this operation have been notified over email.

Roubaix hypervisor failure (31 March 2016 3:00 am EDT)

Update: the power supply has been replaced.

A hypervisor in Roubaix (rbx3) failed. Services are back online and SLA credit was issued for this outage.

Montreal hypervisor failure (27 March 2016 5:50 am EDT)

A hypervisor in Montreal (montreal4) failed. Services are back online at this time. We have reinstalled stable kernel and continue monitoring the status.

Toronto network issue (27 March 2016 2:05 am EDT)

There was packet loss for a few minutes due to an incoming denial of service attack. The attack was eventually automatically mitigated and the network returned to normal at 2:12 am EDT.

Roubaix hypervisor failure (24 March 2016 11:48 am EDT)

Update: the hypervisor crashed again at 12:34 pm EDT. We are downgrading the kernel and investigating the crash dump.

A hypervisor in Roubaix (rbx3) failed. Services are back online at this time. We have enabled additional monitoring to identify the cause of the repeated failures. The ixgbe driver is also downgraded to 4.2.3.

Montreal hypervisor failure (23 March 2016 11:30 pm EDT)

A hypervisor in Montreal (montreal4) failed. Services are back online at this time. We have enabled additional monitoring to identify the cause of the repeated failures. The ixgbe driver is also downgraded to 4.2.3.

Roubaix hypervisor failure (23 March 2016 5:00 am EDT)

Update: the hypervisor failed again and we are performing more tests. We will not post any more updates since there are no customer VMs on this hypervisor.

Update: the RAID controller has been replaced and the host node has been added back to the aggregate.

Update: All virtual machines on the failed hostnode have been migrated.

The same hypervisor (rbx2) has failed again due to a RAID array issue. Services are back online at this time, but we will be migrating VMs off of this hypervisor so that we can investigate the hardware failure more closely without affecting customer uptime.

Roubaix hypervisor failure (22 March 2016 2:48 pm EDT)

Update: no disk errors were found. The host node was rebooted and brought back online at 2:54 pm EDT.

A hypervisor (rbx2) is showing disk I/O errors; we are currently investigating.

Toronto network issue (18 March 2016 10:20 am EDT)

Update: the upstream provider has resolved a flapping route issue which caused outgoing packets to some destinations to be dropped.

We are investigating intermittent network issues in Toronto.

Montreal hypervisor issue (17 March 2016 2:00am EDT)

Update: the hypervisor was hung; a hard reboot was issued and VMs were back online at 2:21 am EDT.

One of our hypervisors in Montreal (montreal4) is currently down; we are investigating.

Montreal hypervisor downtime (6 March 2016 4:00 am EST)

At this time, the virtual machines should be online and reachable.

Our Montreal cluster controller was rebooted into rescue mode for unknown reasons. Once in rescue mode, it continued to respond to our monitoring system, but all essential services were down. This also caused our dashboard to be non-functional.
We contacted the datacenter to ask why our server was booted into rescue mode, but they have not provided a reason.

Montreal hypervisor downtime (5 March 2016 2:00 am EST)

Update 3: the intervention by the datacenter technician is complete; we are no longer seeing temperature issues with this node.

Update 2: Intervention in progress, node offline.

Update 1: we are still seeing high CPU temperatures; an intervention request has been filed for a datacenter technician to investigate.

montreal3 locked up after an overheating issue and was rebooted.

Roubaix hypervisor downtime (26 February 2016 1:37 pm EST)

rbx1 rebooted, we are investigating.

Toronto network downtime (22 February 2016 11:44 pm EST)

Update 3: we have a temporary mitigation in place and services are back online as of 2:20 am EST; we are working on a permanent solution.

Update 2: the attack has resumed at 1:11 am EST.

Update 1: the attack stopped at 12:47 am EST on 23 February. We are working with the upstream provider to set up a permanent mitigation system for attacks targeting the router.

Network connectivity in Toronto is offline due to a distributed denial of service attack targeting the router.

Roubaix hypervisor downtime (18 February 2016 9:35 am EST)

Update: rbx2 is back online at 10:24 am EST.

rbx2 network is offline. We are communicating with the datacenter to investigate the issue.

Toronto network downtime (11 February 2016 11:00 pm EST)

Update 3: the blackhole session has been repaired at 3:12 am EST on 12 February 2016.

Update 2: the static route has been added and services are back online. Network connectivity was down from 11pm EST to 11:18pm EST on February 11th. We are working with the upstream provider to restore blackhole session.

Update 1: the upstream provider performed a migration of the blackhole service last night which evidently caused the blackhole entry on our side to be ignored. They are implementing a static route at this time.

A distributed denial of service attack has caused loss of network connectivity in Toronto. The targeted IP has been blackholed but an upstream provider is still passing the attack traffic. We are investigating and communicating with the upstream provider.

Roubaix hypervisor failure (7 February 2016 7:11 pm EST)

Update: we believe this is due to an overheating issue.

A hypervisor in the Roubaix location (rbx1) has failed; we are investigating.

Toronto infrastructure upgrades (27 December 2015 11:00 pm EST)

We will be making several upgrades to the Toronto infrastructure on 27 December 2015, from 11:00 pm EST to 11:59 pm EST. This maintenance will improve platform stability and, in particular, resolve rare guest disk freeze and live snapshot issues. We expect up to ten minutes of network downtime during the maintenance window. Virtual machines will not need to be rebooted. Some control panel functions may be inaccessible for some time within the maintenance window.

Roubaix region availability (19 December 2015)

We have launched a Roubaix region in OVH RBX in France!

Toronto volume upgrades (19 December 2015)

Volumes on the legacy LizardFS backend were migrated to the Ceph RADOS backend. This was done to allow an easier migration to OpenStack Liberty in Toronto in the near future.

Montreal network connectivity issues (17 December 2015)

Update: we have not seen loss of connectivity since the infrastructure updates. We continue to monitor the network.

There are repeated network connectivity issues due to large denial of service attacks that are unmitigated by OVH's Anti-DDoS system. We have performed modifications to our network infrastructure in Montreal to handle these attacks. First, we have installed an additional network node responsible for managing traffic. Second, we have extended the maximum duration of null routes from the null route mitigation system, which is separate from Anti-DDoS system. Third, we have rewritten the null route mitigation system to detect attacks more quickly by continuously capturing packets via pcap, and to maintain null routes when the attack is still ongoing instead of expiring the null route after a fixed duration. We are still monitoring the situation and improving the infrastructure in response to additional network downtime.
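
For illustration, the Python sketch below shows the general shape of such a null route mitigation loop: capture packets continuously, count inbound traffic per destination IP over short windows, blackhole any destination that exceeds a threshold, and remove the null route only once traffic to that destination has subsided rather than after a fixed timeout. This is not our production code; the interface name and thresholds are placeholders, and the sketch assumes the scapy library and root privileges on a Linux router.

    #!/usr/bin/env python3
    # Illustrative sketch of a pcap-based null route mitigation loop.
    # The interface name and thresholds are placeholders.
    import subprocess
    from collections import Counter
    from scapy.all import sniff, IP   # assumes scapy is installed; requires root

    IFACE = "eth0"          # placeholder external interface
    WINDOW = 5              # seconds per capture window
    THRESHOLD = 50000       # packets per window treated as an attack
    CLEAR = 1000            # below this, the attack is considered over
    blackholed = set()

    def null_route(action, ip):
        subprocess.call(["ip", "route", action, "blackhole", ip + "/32"],
                        stderr=subprocess.DEVNULL)

    while True:
        counts = Counter()
        sniff(iface=IFACE, timeout=WINDOW, store=False,
              prn=lambda p: counts.update([p[IP].dst]) if IP in p else None)
        for ip, n in counts.items():
            if n >= THRESHOLD and ip not in blackholed:
                null_route("add", ip)
                blackholed.add(ip)
        # Remove null routes only once traffic to the target has subsided,
        # rather than after a fixed duration.
        for ip in list(blackholed):
            if counts.get(ip, 0) < CLEAR:
                null_route("del", ip)
                blackholed.discard(ip)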

Montreal network downtime (17 December 2015 3:00 pm EST)

A denial of service attack took the network offline for twenty minutes. Both the DDoS filtering from OVH and our null route mitigation system failed to block the attack. We are re-implementing the null route system.

Toronto network downtime (16 December 2015 11:40 am EST)

Some IP addresses were unreachable for ten minutes due to a router misconfiguration.

Toronto packet loss (12 December 2015 8:26 am EST)

There was heavy packet loss for two to four minutes from 8:26 am to 8:30 am and from 8:56 am to 8:59 am EST. Services are online at this time. The downtime was due to a distributed denial of service attack. We are still investigating to prevent recurrence of this problem.

Montreal network downtime (2 November 2015 12:00 pm EST)

Update 2: network connectivity is back to normal at 6:00 pm EST. Affected customers have automatically received six days of SLA credit.

Update: we are seeing reduced packet loss at 5:00 pm EST.

Network in the OVH BHS datacenter is unstable. See http://status.ovh.net/?do=details&id=11305.

Toronto network downtime (25 October 2015 10:34 am EDT)

A large DDoS attack brought Toronto network down for three minutes. Our automatic DDoS mitigation system eventually blocked the attack, but did not trigger immediately. We have adjusted the mitigation system and will continue investigating.

Note that customers have received four hours of SLA credit for affected virtual machines.

Toronto Network Maintenance (22 October 2015)

This maintenance has been completed. There was no network downtime observed.

Network maintenance will be performed on the morning of 22 October 2015 in relation to a network architecture upgrade, where network circuits will be migrated to a more robust device. We anticipate up to fifteen minutes network downtime during the six-hour maintenance window from 12:00 am to 6:00 am EDT.

We apologize for the inconvenience caused by this planned maintenance event. Please contact support@lunanode.com if you have any questions regarding the maintenance.

Toronto network downtime (21 September 2015)

Network was offline for approximately ten minutes due to router failure. We switched to backup system.

Montreal packet loss / network downtime (15 September 2015)

There were a few short periods of packet loss today due to distributed denial of service attacks that bypassed OVH's filtering system, even on permanent mitigation. We have added additional protection to null route destination IP addresses that receive attacks that bypass the OVH filtering.

torontossd1 maintenance (15 September 2015)

We conducted emergency maintenance on torontossd1 to upgrade kernel after detecting potential filesystem corruption issues. Virtual machines were offline for approximately five minutes.

toronto5/toronto9 maintenance (9 September 2015 11:59 pm EDT)

Update: this maintenance completed without incident.

We will be performing maintenance to resolve filesystem errors on these two host nodes at 11:59 pm EDT. Downtime is expected to be less than ten minutes. Virtual machines will be shut off at approximately 11:59 pm EDT and rebooted around 12:10 am EDT. Affected customers have been e-mailed.

toronto5 kernel panic (8 September 2015 6:00 am EDT)

toronto5 experienced kernel panic and has been rebooted.

toronto2 hard drive failure (8 September 2015 4:00 am EDT)

There was approximately three hours of downtime for instances on toronto2 following a hard drive failure. At this time the hard drive has been replaced and RAID array is rebuilding.

toronto14 internal network issue (4 September 2015 9:52 pm EDT)

Update: the network issue has been resolved. Four virtual machines were affected, and the associated clients have been credited six hours for the downtime. Twenty minutes of network downtime were recorded in total. The issue arose due to a loose network cable coming unplugged during the installation of a new server.

We have detected a loss of internal network connectivity on this host node and are currently resolving the issue.

Toronto packet loss (22 August 2015 1:32 pm EDT)

We are getting DDoS attacks targeting several IPs. They are being automatically mitigated, but there are a couple of seconds of packet loss. We are working on this.

Toronto packet loss (19 August 2015 10:31 am EDT)

An incoming DDoS attack that bypassed the automatic null route has been mitigated.

Toronto downtime (15 August 2015 1:00 pm EDT)

We performed a software update on the router at approximately 1:00 pm EDT on 2015-08-15 in order to prepare for the IPv6 implementation. A complication arose during the software update that caused all routing to the virtual machines to be dropped, and as a result services were interrupted.

24 hours of SLA credit has been applied to those affected. Please don't hesitate to contact us with any questions you may have.

toronto8 kernel panic (9 August 2015 7:00 am EDT)

toronto8 experienced a crash at 4:00 am EDT and was rebooted at 7:00 am EDT. Instances are now online. We are still investigating the cause.

Montreal packet loss (29 July 2015 10:30 pm EDT)

We experienced packet loss over approximately ten minutes in the Montreal location as a result of a SYN flood that was not mitigated by OVH's DDoS filtering system. The targeted IP address has now been null routed and we are investigating with OVH why a large volume of traffic went unfiltered.

toronto9 kernel panic (22 July 2015 4:00 am EDT)

toronto9 had a kernel panic at approximately 3:41 am EDT. The node has been rebooted with updated kernel.

montreal2 filesystem corruption (20 July 2015 5:34 pm EDT)

Update 3 (21 July 3:53 pm EDT): We have updated our monitoring system to catch various additional items in HDD S.M.A.R.T. event log, RAID controller MegaRaid event log, and kernel logs. We have been monitoring for some events but those did not get triggered in this incident (e.g. RAID controller did not detect any consistency issues). We will continue to test the degraded host node to determine whether it needs hardware replacement.

Update 2 (21 July 1:00 pm EDT): A SMART extended test of the hard drives did not reveal any issues. We will continue to test the degraded host node.

Update 1 (20 July 2015 10:00 pm EDT): We were able to successfully migrate only thirty percent of the virtual machines. We were able to fully recover files for another forty percent. For the last thirty percent, partial recovery of files was possible in most cases (with most directories intact); however, as a result of the filesystem corruption a few disk images were unfortunately irreparable. Clients who experienced downtime or data loss as a result of this incident have been credited and contacted with available options.

We have detected filesystem corruption on montreal2. Virtual machines currently on this host node will be migrated to another node ASAP.

toronto3 failure (20 June 2015 2:18 am EDT)

Update 1 (21 July 3:57 pm EDT): There have been no further issues since the kernel upgrade.

toronto3 had a kernel panic at 1:57 am EDT. The node was rebooted shortly afterwards and services came online by 2:06 am EDT. We are upgrading the kernel across the host nodes to improve the reliability of services.

toronto1/toronto3

This issue has been resolved; the kernel module does not crash with the writearound cache mode that it is now configured with.

These host nodes were rebooted following a crash in the SSD caching kernel module. We are still investigating, although virtual machines are online at this time.

Scheduled maintenance in Toronto: 31 May 2015 12:00 am

Cogent will be performing network maintenance in the Toronto datacenter next week on Saturday night / Sunday morning, from 12:00 am 31 May 2015 to 3:00 am 31 May 2015. This involves application of software upgrades to core and edge routers.

Two network interruptions are expected, the first lasting 30 minutes and the second lasting 15 minutes.

This maintenance affects Toronto services only.

Please contact support@lunanode.com if you have any questions regarding this planned maintenance event. We apologize for any inconvenience this causes.

Packet loss along some routes in Montreal 27 May 2015 7:00 pm EDT

We are again seeing packet loss to Montreal along some routes. This is a datacenter-wide incident. See http://status.ovh.net/?do=details&id=9603 for details.

Packet loss in Montreal 27 May 2015 4:30 pm EDT

We are seeing packet loss to Montreal. It looks like a datacenter-wide incident.

Packet loss: incoming denial of service attack

We are mitigating an attack that bypassed our automatic null route system.

Security vulnerability: "VENOM" floppy disk controller buffer overflow (13 May 2015 2:26 pm EDT)

This maintenance event has completed. It did not go according to plan, and three host nodes needed to be completely rebooted. Approximately 30% of instances were able to be suspended and resumed, but others needed a full reboot. We apologize for the inconvenience.

Details of a buffer overflow vulnerability in the qemu hypervisor floppy disk controller were released this morning. This vulnerability affects KVM, Xen, and qemu virtualization technologies, and is severe, potentially allowing guest virtual machine instances to execute arbitrary code on the host node. We have patched our KVM installation.

Due to the severity of the problem and the security risk it poses, we will be performing suspend/resume operations tonight (13 May 2015) at 9:00 pm EDT. If you stop and then start your VM from the control panel between 13 May 2015 2:26 pm EDT and 13 May 2015 9:00 pm EDT, then the operation will not be required. Note that we will be live-migrating volume-backed virtual machines, so VMs whose root partition is on a volume will not experience downtime.

Nodes crashing (Toronto)

We are investigating.

API/panel downtime in Montreal region (9 May 2015 12:20 am EDT)

API and panel operations were inaccessible for two to three hours due to backend communication issues, which have now been resolved and made more robust.

Downtime: network downtime (4 May 2015 10:20 pm EDT)

There was 20 minutes of network downtime due to a router failure. We have swapped the router.

Minor/feature: Ubuntu 15.04 / Debian 8.0 templates (1 May 2015)

Templates and ISO images for the new Ubuntu 15.04 and Debian 8.0 releases are now available for provisioning.

Montreal location announcement! (30 April 2015)

We are excited to announce our new Montreal location in the OVH BHS datacenter! We offer the same array of features as in Toronto, and you can immediately begin provisioning virtual machines from the lndynamic panel. Images can be copied between regions by selecting the image from Images sidebar tab, and then scrolling down to the image replication tool.

Feature: image properties (24 April 2015)

The following are now configurable per-image: network driver, disk driver, video driver, and CPU mode. These can be modified after selecting an image from the Images sidebar tab. All virtual machines booted from the image will acquire the configuration according to the image properties.

Feature: SSL certificates (23 April 2015)

SSL certificates can now be purchased from the control panel. Both single-domain and wildcard certificates are available.

Minor/downtime: network downtime (21 April 2015 6:11 pm EDT)

Loss of network connectivity for a couple of minutes due to a firewall misconfiguration. We are investigating to prevent a recurrence of the issue.

Downtime: network downtime (19 April 2015 7:59 am EDT)

We received an RFO from Cogent regarding this network downtime incident, which was emailed to lndynamic-updates. Cogent indicates that there was a network routing error caused by a software bug in the routing software, which will be upgraded to avoid recurrences of the problem.

Loss of network connectivity from 7:59 am to 8:06 am EDT on the morning of Sunday 19 April 2015. Traceroute data points to a datacenter-wide outage; we are waiting for an RFO.

Minor/maintenance: emergency electric-system-related maintenance (14 April 2015 3:30 pm EDT)

Cogent performed maintenance on the electricity system to remove a raccoon. No downtime was expected, and the maintenance completed without incident. The maintenance window began at 3:30 pm EDT, without a specified end time; the maintenance completed before 7:20 pm EDT.

Minor/downtime: panel downtime (5 April 2015 12:25 am EST)

The issue below has been resolved.

The panel is currently offline due to a database issue. It will be fixed within ten minutes.

Minor/maintenance: datacenter power test (27 February 2015 12:30 am EST)

Cogent performed power switch test. This maintenance completed without incident.

Minor/downtime: emergency maintenance (30 January 2015 6:30 pm EST)

We are performing emergency maintenance relating to a memory unit on toronto10.

Minor/downtime: denial of service attack (25 January 2015, 9:27 am EST)

Automatic null route system failed to block denial of service attack, leading to brief downtime (two minutes) from UDP attack. The target IP has been manually null routed. We are investigating.

Update: the system has been fixed.

Scheduled maintenance: 16 January 2015

Update 2: The network swap has been completed. This resolves some TCP throughput issues we've been seeing over the last week.

...

Update: This maintenance has been completed. A complication during the maintenance, in which the incorrect servers were powered off due to human error at approximately 4:12 am, resulted in a failure of the volume storage system; this meant that all volume-backed virtual machines needed to be rebooted. We will swap the internal network to the 10 gigabit NICs on 17 January 2015 around 11:00 pm EST (only a brief blip in network connectivity is expected).

...

We will be performing maintenance on 16 January 2015 from 11:59 pm to 6:00 am EST, which will involve a sequential reboot of each compute node, along with downtime of up to thirty minutes. This is to upgrade internal infrastructure to 10 gigabit network as some of the current links are reaching capacity.

Volume-backed instances will be live migrated to new compute nodes. Some volume-backed instances (re)booted before 10 December 2014 may require a quick reboot before migration due to updated configuration.

Please contact support@lunanode.com if you have any questions or concerns regarding this maintenance.

Downtime: toronto8 lockup (5 January 2015 2:02 am EST)

Investigating toronto8 lockup. Update: we updated the kernel and rebooted; this should resolve the issue. Virtual machines were down for ten minutes on average.

Minor/downtime: network drops (4 January 2015 7:12 am EST)

Investigating some minor drops in network connectivity. Should be resolved at this time.

Minor/feature: image details page improvements (1 January 2015 6:55 am EST)

A few small improvements to the image details page: the style looks better and an MD5 checksum field has been added.
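
As a usage note, the new MD5 checksum field can be used to verify a downloaded image before booting from it. The short Python example below shows one way to do this; the image filename and expected checksum are placeholders.

    #!/usr/bin/env python3
    # Compare a downloaded image against the MD5 checksum shown on the
    # image details page. The filename and expected value are placeholders.
    import hashlib

    def md5sum(path, chunk_size=1 << 20):
        h = hashlib.md5()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                h.update(chunk)
        return h.hexdigest()

    expected = "d41d8cd98f00b204e9800998ecf8427e"  # checksum from the panel (placeholder)
    print("match" if md5sum("my-image.qcow2") == expected else "MISMATCH")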

Downtime: toronto3 (29 December 2014 4:23 pm EST)

The toronto3 host node experienced a kernel panic at 4:23 pm EST; services were restored at 4:31 pm EST. Virtual machines on toronto3 were offline for approximately 15 minutes. We continue to investigate the kernel panic problems.

Minor/downtime: panel non-responsive (28 December 2014 11:59 pm EST)

The panel became non-responsive due to image downloads. We added session_write_close to prevent PHP from locking the session and blocking additional requests on the same session. It is unclear why the panel originally became non-responsive for non-concurrent sessions.

Minor/feature: receipts on website (28 December 2014 10:00 pm EST)

Receipts for payments can now be viewed in lndynamic panel (future payments only).

Minor/issue: lndyn.com secondary nameserver (28 December 2014 9:00 am EST)

Identified a problem with the secondary nameserver configuration. This has been resolved, and a ticket was opened in the tracker to move the secondary nameserver to a France server at a future date. It is unknown whether this is related to the reported symptom of the IP returned from DNS bouncing back and forth.

Minor/misc: toronto9 setup (27 December 2014 3:00 am EST)

toronto9 is available for provisioning.

Minor/issue: new API broken (26 December 2014 7:00 pm EST)

Web server setup changes to allow streaming of image downloads caused the new API to stop functioning (the legacy API remained operational). The new API is back online at this time.

Feature: download images (26 December 2014 8:00 am EST)

Allow downloading images from lndynamic.

Minor/misc toronto8 toronto9 setup (21 December 2014 8:00 pm EST)

toronto8 online (skip reboot), toronto9 reserved for testing.

Minor/misc toronto8 toronto9 setup (21 December 2014 5:00 pm EST)

Setup of toronto8 and toronto9 is complete, but they are not connected yet and need a reboot.

Minor/issue: detaching floating IP on first boot (20 December 2014 2:16 am EST)

A floating IP occasionally will not show up in the OpenStack instance network info on first boot. The workaround to display it in the panel was to select it from the database; however, the detach floating IP logic was not similarly updated (so the panel would show the attach floating IP button even though a floating IP was already attached). The update has been made so that the detach button will show properly.

Minor/misc: toronto8 toronto9 setup (20 December 2014 1:00 am EST)

Setup of the new toronto8 and toronto9 host nodes is in progress.

Minor/issue: disk capacity (19 December 2014 11:00 pm EST)

The OpenStack disk filter was preventing instances from launching. Host aggregate members were updated to resolve this.

Minor/issue: multiple startup scripts (15 December 2014 11:02 pm EST)

A backend bug preventing multiple startup scripts from being selected when booting a new instance has been fixed. Scripts will now be combined into a single MIME multipart message before being forwarded to the OpenStack API.
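
For reference, the Python sketch below shows one way multiple startup scripts can be combined into a single MIME multipart user-data message of the kind cloud-init accepts. It is an illustrative example rather than our exact backend code, and the script names and contents are placeholders.

    #!/usr/bin/env python3
    # Combine several startup scripts into one MIME multipart user-data message.
    # The example scripts are placeholders.
    from email.mime.multipart import MIMEMultipart
    from email.mime.text import MIMEText

    def combine_startup_scripts(scripts):
        """scripts is a list of (filename, shell script body) tuples."""
        combined = MIMEMultipart()
        for name, body in scripts:
            part = MIMEText(body, "x-shellscript")  # becomes text/x-shellscript
            part.add_header("Content-Disposition", "attachment", filename=name)
            combined.attach(part)
        return combined.as_string()

    user_data = combine_startup_scripts([
        ("install.sh", "#!/bin/sh\napt-get update\n"),
        ("configure.sh", "#!/bin/sh\necho configured > /root/status\n"),
    ])
    # user_data is then passed as the user data when the instance is created.
    print(user_data)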

Minor/security: API create/delete logging (15 December 2014 10:00 pm EST)

API creation and deletion actions will now be logged under account activity. These were previously not logged there.

Downtime: toronto3 (15 December 2014 12:13 pm EST)

The toronto3 host node experienced a kernel panic at 12:13 pm EST; services were restored at 12:38 pm EST. Virtual machines on toronto3 were offline for approximately 25 minutes.