NovaCloud-Hosting - XEON-01-VHOST HDD-Pool inaccessible – Incident details

Monitoring
Major outage
Started 13 days ago

Affected

SkyLink Data Center

Degraded performance from 11:15 AM to 1:40 PM, Operational from 11:15 AM to 11:35 PM, Degraded performance from 11:35 PM to 11:41 PM, Operational from 11:35 PM to 11:41 PM, Partial outage from 11:41 PM to 12:26 AM, Operational from 12:26 AM to 12:00 AM

Root-Server Vhosts EYG1

Degraded performance from 11:15 AM to 1:40 PM, Operational from 1:40 PM to 11:35 PM, Degraded performance from 11:35 PM to 11:41 PM, Partial outage from 11:41 PM to 12:26 AM, Operational from 12:26 AM to 12:00 AM

XEON-01-VHOST

Degraded performance from 11:15 AM to 1:40 PM, Operational from 1:40 PM to 11:35 PM, Degraded performance from 11:35 PM to 11:41 PM, Partial outage from 11:41 PM to 12:26 AM, Operational from 12:26 AM to 12:00 AM

Game-Server Nodes EYG1

Operational from 11:15 AM to 11:41 PM, Partial outage from 11:41 PM to 12:26 AM, Operational from 12:26 AM to 12:00 AM

Generic-01

Operational from 11:15 AM to 11:41 PM, Partial outage from 11:41 PM to 12:26 AM, Operational from 12:26 AM to 12:00 AM

Proxmox-Backup-Servers

Operational from 11:15 AM to 11:35 PM, Major outage from 11:35 PM to 12:26 AM, Operational from 12:26 AM to 12:00 AM

Updates
  • Monitoring
    We implemented a fix and are currently monitoring the result.
  • Identified

    We have made some changes and are currently rebooting the host system to apply them.

    Additionally, the NVMe cache will be disabled to test whether this issue is caused by the caching method we are currently using.

  • Investigating

    I/O operations on the HDD pool are once again not possible. We are investigating the issue and working to bring the HDD pool back into operation.

  • Monitoring

    A routine RAID check triggered a bug that locked up read/write operations.

    We have restored normal operation, but the HDD pool is running at reduced speed to prevent a recurrence. Full speed will be restored as soon as we have found a solution.

    MAINTENANCE: https://novacloud.instatus.com/cm9e9zh6o00nl79u2h3ibnqhn

  • Investigating

    Some servers cannot be started, and disk I/O on the HDD pool is very low. We are currently investigating the cause of this issue.