Downtime

Elevated upstream errors (us-east-1)

Oct 20 at 03:38am EDT
Affected services
Cerebrium Dashboard
Build Service
Registry US EAST 1
US EAST 1
Files US EAST 1
Metrics Virginia

Resolved
Oct 20 at 07:36pm EDT

Resolved

Updated
Oct 20 at 03:17pm EDT

We continue to observe recovery across all AWS services, and instance launches are succeeding across multiple Availability Zones in the US-EAST-1 Region.

Updated
Oct 20 at 02:24pm EDT

AWS's mitigations to resolve launch failures for new EC2 instances continue to progress, and we are seeing an increase in successful launches.

Updated
Oct 20 at 01:48pm EDT

AWS have resolved launch failures and are rolling out the changes to all AZs, at which point we expect launch errors and network connectivity issues to subside.

Updated
Oct 20 at 01:04pm EDT

AWS is in the process of validating a fix for EC2 launches and will deploy it to the first AZ as soon as they are confident they can do so safely.

Updated
Oct 20 at 11:47am EDT

AWS have narrowed down the source of the network connectivity issues that have impacted their services. They are throttling requests for new EC2 instance launches to aid recovery and actively working on mitigations.

Updated
Oct 20 at 10:01am EDT

AWS has applied fixes but is still experiencing problems launching instances in us-east-1. Builds and endpoint calls remain broken. We'll keep you posted.

Updated
Oct 20 at 09:28am EDT

The AWS outage is ongoing. Builds are currently broken due to an outage with EC2. We're waiting on AWS to resolve the issue and will keep you updated.

Updated
Oct 20 at 07:10am EDT

All services have now been restored fully. We will continue to monitor for any anomalies. Thank you for your patience and we apologise for the inconvenience.

Updated
Oct 20 at 05:43am EDT

Most services have now recovered. You may still experience issues building apps on Cerebrium while AWS continues to resolve the remaining problems. We'll update you once everything is back to normal.

Updated
Oct 20 at 05:31am EDT

AWS has applied a fix and some services are starting to recover. You may still see some errors or slower response times as things fully stabilize. If something fails, please try again. We'll keep you posted as more services are restored.
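For anyone automating the "try again" advice above, here is a minimal sketch of retrying with capped exponential backoff and jitter. It is not an official Cerebrium client: `retry_with_backoff`, `flaky_request`, and all default values are hypothetical names and numbers chosen for illustration, and `flaky_request` stands in for whatever call you make to your own endpoint.

```python
import random
import time


def retry_with_backoff(call, max_attempts=5, base_delay=0.5, max_delay=8.0,
                       sleep=time.sleep):
    """Run call() until it succeeds or max_attempts is reached.

    Between attempts, wait a random time up to a cap that doubles each
    attempt (0.5s, 1s, 2s, ...) and never exceeds max_delay.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Random jitter keeps many clients from retrying in lockstep.
            sleep(random.uniform(0, cap))


# Hypothetical stand-in for a request that fails twice, then succeeds.
attempts = {"count": 0}


def flaky_request():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("upstream error")
    return "ok"


# sleep is replaced with a no-op here so the example finishes instantly.
result = retry_with_backoff(flaky_request, sleep=lambda seconds: None)
print(result, "after", attempts["count"], "attempts")
```

In real use you would leave `sleep` at its default and likely catch only the error types your client raises for transient failures, rather than every `Exception`.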

Updated
Oct 20 at 05:01am EDT

AWS has identified the root cause as a DNS resolution issue affecting DynamoDB and other services in US-EAST-1. They're working on multiple recovery paths to accelerate the fix. Cerebrium services remain impacted during this time. If you encounter errors, please continue to retry your requests. AWS will provide their next update by 2:45 AM PDT (5:45 AM EDT).

Updated
Oct 20 at 04:29am EDT

The AWS team has narrowed down the critically affected services. However, these services are core to the Cerebrium platform, and your dashboards, builds, and endpoint calls are still affected. We are continuing to investigate and will provide another update within the next 45 minutes.

Created
Oct 20 at 03:38am EDT

We are seeing elevated error rates across the majority of our services in the us-east-1 region, caused by upstream AWS errors. We will share an update as soon as possible.