Incidents | Cerebrium Incidents reported on status page for Cerebrium https://status.cerebrium.ai/ en Build Service recovered https://status.cerebrium.ai/ Thu, 23 Apr 2026 21:04:36 +0000 https://status.cerebrium.ai/#c46b2fe66e15b89d4c0b45ab2eab8e23ac49fd6a0c43490fbcb43f35b5a8e96b Build Service recovered Build Service went down https://status.cerebrium.ai/ Thu, 23 Apr 2026 19:58:33 +0000 https://status.cerebrium.ai/#c46b2fe66e15b89d4c0b45ab2eab8e23ac49fd6a0c43490fbcb43f35b5a8e96b Build Service went down Build Service recovered https://status.cerebrium.ai/ Thu, 16 Apr 2026 01:50:32 +0000 https://status.cerebrium.ai/#9387fbf6e6eb3889d4f2d84e81a5405d6422ebf2dba08f04858f6c6784520760 Build Service recovered Build Service went down https://status.cerebrium.ai/ Thu, 16 Apr 2026 01:07:23 +0000 https://status.cerebrium.ai/#9387fbf6e6eb3889d4f2d84e81a5405d6422ebf2dba08f04858f6c6784520760 Build Service went down Registry US EAST 1 recovered https://status.cerebrium.ai/ Wed, 15 Apr 2026 22:22:51 +0000 https://status.cerebrium.ai/#64e0c33e8256b7fe3eaaf18a7531583709290cbecd56ee6762bdbfc6ce776e93 Registry US EAST 1 recovered US EAST 1 recovered https://status.cerebrium.ai/ Wed, 15 Apr 2026 22:22:45 +0000 https://status.cerebrium.ai/#2af11cc47285d06d408d77168d849fcd890cf63b8a4c2a88588ef078561f0937 US EAST 1 recovered Registry US EAST 1 went down https://status.cerebrium.ai/ Wed, 15 Apr 2026 22:06:52 +0000 https://status.cerebrium.ai/#64e0c33e8256b7fe3eaaf18a7531583709290cbecd56ee6762bdbfc6ce776e93 Registry US EAST 1 went down US EAST 1 went down https://status.cerebrium.ai/ Wed, 15 Apr 2026 22:06:41 +0000 https://status.cerebrium.ai/#2af11cc47285d06d408d77168d849fcd890cf63b8a4c2a88588ef078561f0937 US EAST 1 went down Build Service recovered https://status.cerebrium.ai/ Tue, 14 Apr 2026 23:13:31 +0000 https://status.cerebrium.ai/#8be01f67d9c9eea3018ddf299e65174283c5fb5c988401e1993ceba583f7cfc6 Build Service recovered Build Service went down https://status.cerebrium.ai/ Tue, 14 Apr 
2026 22:42:26 +0000 https://status.cerebrium.ai/#8be01f67d9c9eea3018ddf299e65174283c5fb5c988401e1993ceba583f7cfc6 Build Service went down Registry US EAST 1 recovered https://status.cerebrium.ai/ Sun, 05 Apr 2026 22:28:58 +0000 https://status.cerebrium.ai/#0e942516d029940b42333ad4c7e3a8fe840efe9f262515762d78efadaa592408 Registry US EAST 1 recovered US EAST 1 recovered https://status.cerebrium.ai/ Sun, 05 Apr 2026 22:28:44 +0000 https://status.cerebrium.ai/#a554f512d9038c3d23c6711d6a8040522aaaabff8ac3c46e6bc48ec9bbbe49c4 US EAST 1 recovered Registry US EAST 1 went down https://status.cerebrium.ai/ Sun, 05 Apr 2026 22:08:45 +0000 https://status.cerebrium.ai/#0e942516d029940b42333ad4c7e3a8fe840efe9f262515762d78efadaa592408 Registry US EAST 1 went down US EAST 1 went down https://status.cerebrium.ai/ Sun, 05 Apr 2026 22:08:35 +0000 https://status.cerebrium.ai/#a554f512d9038c3d23c6711d6a8040522aaaabff8ac3c46e6bc48ec9bbbe49c4 US EAST 1 went down EU WEST 2 recovered https://status.cerebrium.ai/ Sun, 05 Apr 2026 19:51:20 +0000 https://status.cerebrium.ai/#f7d320dbc4432f9b5993a12d76f02f9d86cf9802ac1e5c95a5e49a4353be94a9 EU WEST 2 recovered Registry EU WEST 2 recovered https://status.cerebrium.ai/ Sun, 05 Apr 2026 19:50:24 +0000 https://status.cerebrium.ai/#14832e3915d9f705d45375f2e4f7215b9bdcda6e4c4ec22b8e178ad54da006ee Registry EU WEST 2 recovered EU WEST 2 went down https://status.cerebrium.ai/ Sun, 05 Apr 2026 19:47:24 +0000 https://status.cerebrium.ai/#f7d320dbc4432f9b5993a12d76f02f9d86cf9802ac1e5c95a5e49a4353be94a9 EU WEST 2 went down Registry EU WEST 2 went down https://status.cerebrium.ai/ Sun, 05 Apr 2026 19:47:24 +0000 https://status.cerebrium.ai/#14832e3915d9f705d45375f2e4f7215b9bdcda6e4c4ec22b8e178ad54da006ee Registry EU WEST 2 went down Cerebrium Landing Page recovered https://status.cerebrium.ai/ Fri, 03 Apr 2026 10:55:17 +0000 https://status.cerebrium.ai/#f1d4200dbc2e3472adb7297163c5f93e40a6ebf4fa71242bf491e937101204d5 Cerebrium Landing Page recovered 
Cerebrium Landing Page went down https://status.cerebrium.ai/ Fri, 03 Apr 2026 10:53:15 +0000 https://status.cerebrium.ai/#f1d4200dbc2e3472adb7297163c5f93e40a6ebf4fa71242bf491e937101204d5 Cerebrium Landing Page went down Metrics London recovered https://status.cerebrium.ai/ Tue, 24 Mar 2026 00:57:07 +0000 https://status.cerebrium.ai/#674fc2e1e80c797f17500dd34ae5b6ab14bf065e18d39469589b98a830e21f8e Metrics London recovered Metrics Virginia recovered https://status.cerebrium.ai/ Tue, 24 Mar 2026 00:57:00 +0000 https://status.cerebrium.ai/#dee548de38bbe9a71dfc5bd14d49dc6ef3c11df9198c853e50b9dc3689abd639 Metrics Virginia recovered Metrics London went down https://status.cerebrium.ai/ Tue, 24 Mar 2026 00:48:07 +0000 https://status.cerebrium.ai/#674fc2e1e80c797f17500dd34ae5b6ab14bf065e18d39469589b98a830e21f8e Metrics London went down Metrics Virginia went down https://status.cerebrium.ai/ Tue, 24 Mar 2026 00:48:03 +0000 https://status.cerebrium.ai/#dee548de38bbe9a71dfc5bd14d49dc6ef3c11df9198c853e50b9dc3689abd639 Metrics Virginia went down Build Service recovered https://status.cerebrium.ai/ Mon, 23 Mar 2026 17:11:24 +0000 https://status.cerebrium.ai/#71df83942c2c99b008eadb03cb761f1244539a53a089b0f3df575d309e51e88f Build Service recovered Build Service went down https://status.cerebrium.ai/ Mon, 23 Mar 2026 16:48:27 +0000 https://status.cerebrium.ai/#71df83942c2c99b008eadb03cb761f1244539a53a089b0f3df575d309e51e88f Build Service went down Filesystem & Infrastructure Scaling Improvements https://status.cerebrium.ai/incident/851226 Sun, 22 Mar 2026 15:00:37 -0000 https://status.cerebrium.ai/incident/851226#aab4b4cb4e0860bd64fb1e2fb28d20af46dc1a28ebeb3ea99db95721a251d763 Maintenance completed Filesystem & Infrastructure Scaling Improvements https://status.cerebrium.ai/incident/851226 Sun, 22 Mar 2026 14:00:37 -0000 https://status.cerebrium.ai/incident/851226#6f28296c463fa72e42d355ff8c2e520f3c380aa4c8e13684c2766a1954a50a2d We are carrying out planned infrastructure work on 22 
March (09:00 EST, for 1 hour) to improve workload scaling and filesystem performance across our clusters. During this window you should expect intermittent downtime that may affect active runs, along with increased latency on deployments and inference requests. Not all regions will be affected simultaneously. Registry US EAST 1 recovered https://status.cerebrium.ai/ Sat, 07 Mar 2026 19:27:56 +0000 https://status.cerebrium.ai/#43cf51aaa19d1168049dbf8790c49d03ffa39fbcf667d786261bfbeb99f98825 Registry US EAST 1 recovered US EAST 1 recovered https://status.cerebrium.ai/ Sat, 07 Mar 2026 19:27:20 +0000 https://status.cerebrium.ai/#125fcc988cc278f84e5d887e77ee58358b6ccf3c847874f5b71ece2ab717b945 US EAST 1 recovered US-east-1 is down https://status.cerebrium.ai/incident/843719 Sat, 07 Mar 2026 19:20:00 -0000 https://status.cerebrium.ai/incident/843719#2e1739e0fd9e25dc8dba63407c6472b0c8ffe9acabd0df76baf3ff04b3fa9a1d Our AWS us-east-1 region is down - we have identified the issue and the team is working on resolving it. 
It has been resolved Registry US EAST 1 went down https://status.cerebrium.ai/ Sat, 07 Mar 2026 19:04:50 +0000 https://status.cerebrium.ai/#43cf51aaa19d1168049dbf8790c49d03ffa39fbcf667d786261bfbeb99f98825 Registry US EAST 1 went down US EAST 1 went down https://status.cerebrium.ai/ Sat, 07 Mar 2026 19:03:56 +0000 https://status.cerebrium.ai/#125fcc988cc278f84e5d887e77ee58358b6ccf3c847874f5b71ece2ab717b945 US EAST 1 went down Metrics Virginia recovered https://status.cerebrium.ai/ Fri, 06 Mar 2026 18:06:04 +0000 https://status.cerebrium.ai/#eec70256b26ac472b899c577c0a6de439700169629a3fed1e715a4f6a1894240 Metrics Virginia recovered Metrics Virginia went down https://status.cerebrium.ai/ Fri, 06 Mar 2026 18:03:03 +0000 https://status.cerebrium.ai/#eec70256b26ac472b899c577c0a6de439700169629a3fed1e715a4f6a1894240 Metrics Virginia went down Build Service Maintenance recovered https://status.cerebrium.ai/ Fri, 06 Mar 2026 11:11:33 +0000 https://status.cerebrium.ai/#759448aaa10fd97fb9a01a995e4b37faed040d38b46828b5181fe49974fcb673 Build Service Maintenance recovered Build Service recovered https://status.cerebrium.ai/ Fri, 06 Mar 2026 11:11:24 +0000 https://status.cerebrium.ai/#cc357e27b7f2a01df8fe2bf86aefbe09f8ee14fb1d75418ed747f87edaa3a4d5 Build Service recovered Build Service Maintenance went down https://status.cerebrium.ai/ Fri, 06 Mar 2026 11:01:26 +0000 https://status.cerebrium.ai/#759448aaa10fd97fb9a01a995e4b37faed040d38b46828b5181fe49974fcb673 Build Service Maintenance went down Build Service went down https://status.cerebrium.ai/ Fri, 06 Mar 2026 11:01:22 +0000 https://status.cerebrium.ai/#cc357e27b7f2a01df8fe2bf86aefbe09f8ee14fb1d75418ed747f87edaa3a4d5 Build Service went down Metrics Virginia recovered https://status.cerebrium.ai/ Thu, 05 Mar 2026 20:47:56 +0000 https://status.cerebrium.ai/#0180c0fe624d640415fde0de6a0bdc5b61db1af2dc4a5b1da450862c92abc159 Metrics Virginia recovered Metrics Virginia went down https://status.cerebrium.ai/ Thu, 05 Mar 2026 20:44:57 
+0000 https://status.cerebrium.ai/#0180c0fe624d640415fde0de6a0bdc5b61db1af2dc4a5b1da450862c92abc159 Metrics Virginia went down Build Service Maintenance recovered https://status.cerebrium.ai/ Thu, 05 Mar 2026 20:11:28 +0000 https://status.cerebrium.ai/#68be92b197bbfc9e626c3764f8c5d2b6b7b9fbcc68b2c62b9629f8e9edecfe6a Build Service Maintenance recovered Build Service recovered https://status.cerebrium.ai/ Thu, 05 Mar 2026 20:11:22 +0000 https://status.cerebrium.ai/#61f1d4d06569786df06372757409f4b92c0792d8980aa1edd0904d8de7f91ec3 Build Service recovered Build Service Maintenance went down https://status.cerebrium.ai/ Thu, 05 Mar 2026 20:01:37 +0000 https://status.cerebrium.ai/#68be92b197bbfc9e626c3764f8c5d2b6b7b9fbcc68b2c62b9629f8e9edecfe6a Build Service Maintenance went down Build Service went down https://status.cerebrium.ai/ Thu, 05 Mar 2026 20:01:23 +0000 https://status.cerebrium.ai/#61f1d4d06569786df06372757409f4b92c0792d8980aa1edd0904d8de7f91ec3 Build Service went down Metrics London recovered https://status.cerebrium.ai/ Wed, 04 Mar 2026 21:02:15 +0000 https://status.cerebrium.ai/#ce581de3bcd881c8cc73a58ddedd48e50d594e4937dda74034277c8b56a4926e Metrics London recovered Metrics London went down https://status.cerebrium.ai/ Wed, 04 Mar 2026 21:01:10 +0000 https://status.cerebrium.ai/#ce581de3bcd881c8cc73a58ddedd48e50d594e4937dda74034277c8b56a4926e Metrics London went down Build Service recovered https://status.cerebrium.ai/ Sun, 01 Mar 2026 20:03:28 +0000 https://status.cerebrium.ai/#d31be440f7e43e681ed951c1eed4af29b92e99fd181481424a892b9e39b750ec Build Service recovered Build Service went down https://status.cerebrium.ai/ Sun, 01 Mar 2026 19:06:30 +0000 https://status.cerebrium.ai/#d31be440f7e43e681ed951c1eed4af29b92e99fd181481424a892b9e39b750ec Build Service went down Build Service recovered https://status.cerebrium.ai/ Thu, 26 Feb 2026 20:34:29 +0000 https://status.cerebrium.ai/#efac17761c0597f6860db205f96a1bf59a3f55d8f756948d14c5c486b1210dcb Build Service 
recovered Build Service went down https://status.cerebrium.ai/ Thu, 26 Feb 2026 20:13:23 +0000 https://status.cerebrium.ai/#efac17761c0597f6860db205f96a1bf59a3f55d8f756948d14c5c486b1210dcb Build Service went down Metrics Virginia recovered https://status.cerebrium.ai/ Thu, 26 Feb 2026 19:37:04 +0000 https://status.cerebrium.ai/#75556cc5d2f2ab79e53e90d2bc79c93c603c02dc65aad716e5ad635245bbc56f Metrics Virginia recovered Metrics Virginia went down https://status.cerebrium.ai/ Thu, 26 Feb 2026 19:34:59 +0000 https://status.cerebrium.ai/#75556cc5d2f2ab79e53e90d2bc79c93c603c02dc65aad716e5ad635245bbc56f Metrics Virginia went down US EAST 1 recovered https://status.cerebrium.ai/ Thu, 26 Feb 2026 15:33:44 +0000 https://status.cerebrium.ai/#36e5474203bdfe72e20ef7cf8478e6c6ba3edfc176e667e7dad46cb27caec376 US EAST 1 recovered US EAST 1 went down https://status.cerebrium.ai/ Thu, 26 Feb 2026 15:22:42 +0000 https://status.cerebrium.ai/#36e5474203bdfe72e20ef7cf8478e6c6ba3edfc176e667e7dad46cb27caec376 US EAST 1 went down Metrics Virginia recovered https://status.cerebrium.ai/ Fri, 20 Feb 2026 17:58:06 +0000 https://status.cerebrium.ai/#966526fd729f0743bb552827025f154b72664bac6793075a8f8451dc1c4e25b6 Metrics Virginia recovered Metrics Virginia went down https://status.cerebrium.ai/ Fri, 20 Feb 2026 17:55:06 +0000 https://status.cerebrium.ai/#966526fd729f0743bb552827025f154b72664bac6793075a8f8451dc1c4e25b6 Metrics Virginia went down Metrics London recovered https://status.cerebrium.ai/ Wed, 18 Feb 2026 21:17:14 +0000 https://status.cerebrium.ai/#738f33bcdceb4e58593084e5118ca573bc23e4ad652ff3469651764bfe975500 Metrics London recovered Metrics London went down https://status.cerebrium.ai/ Wed, 18 Feb 2026 20:54:12 +0000 https://status.cerebrium.ai/#738f33bcdceb4e58593084e5118ca573bc23e4ad652ff3469651764bfe975500 Metrics London went down Metrics London recovered https://status.cerebrium.ai/ Sun, 15 Feb 2026 07:44:05 +0000 
https://status.cerebrium.ai/#462d94a0c19de89988c693a7f539df22d3b35085c098edd0000e706ba379ffd7 Metrics London recovered Metrics London went down https://status.cerebrium.ai/ Sun, 15 Feb 2026 07:42:11 +0000 https://status.cerebrium.ai/#462d94a0c19de89988c693a7f539df22d3b35085c098edd0000e706ba379ffd7 Metrics London went down Metrics Virginia recovered https://status.cerebrium.ai/ Sun, 15 Feb 2026 07:40:58 +0000 https://status.cerebrium.ai/#075ad6fb234844e1229c5e834a3f22f30da3bd782ea875996b7cc11bdab56729 Metrics Virginia recovered Metrics Virginia went down https://status.cerebrium.ai/ Sun, 15 Feb 2026 07:39:52 +0000 https://status.cerebrium.ai/#075ad6fb234844e1229c5e834a3f22f30da3bd782ea875996b7cc11bdab56729 Metrics Virginia went down Increase in 502 errors https://status.cerebrium.ai/incident/819123 Thu, 05 Feb 2026 08:48:00 -0000 https://status.cerebrium.ai/incident/819123#1a6978205cac36d53e7bc13591a56c5a4dc24233b09559ec76f84b222312d822 Some customers are experiencing an increase in 502 errors in US-EAST-1 due to a contention issue on the platform. The team is currently investigating and will report back as soon as there is more information. We sincerely apologise for this issue and are working to get it resolved as quickly as possible. CLI authentication failing https://status.cerebrium.ai/incident/818383 Wed, 04 Feb 2026 10:38:00 -0000 https://status.cerebrium.ai/incident/818383#e00a3f355dc663548b7d3e02bf6b1f26062fb86953948827592deadcb0b42a3f The issue has been resolved. 
An incorrectly configured DNS record prevented users from signing in with the CLI. Build Service recovered https://status.cerebrium.ai/ Sat, 31 Jan 2026 11:02:25 +0000 https://status.cerebrium.ai/#2f99f4bf1a6fe0c97cecf2c75e2f284b674e72da765ff265721492e98de1e772 Build Service recovered Build Service went down https://status.cerebrium.ai/ Sat, 31 Jan 2026 09:52:31 +0000 https://status.cerebrium.ai/#2f99f4bf1a6fe0c97cecf2c75e2f284b674e72da765ff265721492e98de1e772 Build Service went down Increase in request queuing on AWS workloads https://status.cerebrium.ai/incident/802912 Mon, 12 Jan 2026 09:00:00 -0000 https://status.cerebrium.ai/incident/802912#7ab384d4aba5d5dcf8222a210d787a36433e8c2ff3171bf5f85060e21b8cd863 We're currently experiencing degraded performance on workloads being scheduled to the AWS provider. This issue currently only affects GPU-based workloads. This issue is intermittent and may not be affecting all apps. The team is currently investigating the issue and we will provide an update as we uncover any new information. Problem starting new workloads. Existing apps are unaffected. https://status.cerebrium.ai/incident/783164 Tue, 09 Dec 2025 19:08:00 -0000 https://status.cerebrium.ai/incident/783164#3a148ec3d4aca0a7568662b6de72a63d3307b7004630d8db4f39a2d78be6ec4c The issue has been resolved. Problem starting new workloads. Existing apps are unaffected. https://status.cerebrium.ai/incident/783164 Tue, 09 Dec 2025 18:44:00 -0000 https://status.cerebrium.ai/incident/783164#830358f004199aa5af28e313f89f76798f7c9008f45ffd0d748217510683a6ce New apps are unable to start at present. Elevated Errors in US-East-1 https://status.cerebrium.ai/incident/778505 Tue, 02 Dec 2025 23:54:00 -0000 https://status.cerebrium.ai/incident/778505#9a6a4a594b4a98c27a6518f481d7a24a1c5d001b1b7369a32cd3ff823a3829aa Our platform is currently struggling to schedule new containers on incoming requests. 
Our team is working on identifying the error and resolving it ASAP. Resolved: The issue was caused by a failure in a managed component from one of our infrastructure providers, which temporarily prevented us from scheduling new capacity. We’ve worked with the provider to restore functionality and are now implementing additional safeguards to ensure this does not recur. Updating various cluster components https://status.cerebrium.ai/incident/765784 Sun, 16 Nov 2025 15:35:00 -0000 https://status.cerebrium.ai/incident/765784#44c162f4b64153670bac6f17c25bfa4e676dc9f436b6a01c2f3a84cc52e0defd We are performing a series of infrastructure optimizations to improve performance and reliability. While we don’t expect customer traffic to be impacted, there may be brief periods of elevated latency or volatility during the upgrade window. Our team is closely monitoring the rollout and will update this page with any relevant changes. Emergency node maintenance in US-East-1 https://status.cerebrium.ai/incident/757186 Tue, 04 Nov 2025 04:00:34 -0000 https://status.cerebrium.ai/incident/757186#d47ae91f32582e55a5a2dcc9e6bc40e24a2191052cb85532b3e4de37ecdcefe7 A critical error in the mechanism GPU devices use to attach to containers is affecting several workloads on the platform, causing NVML to show "Device not found" when calling nvidia-smi or attempting to use the GPU (see https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/troubleshooting.html#containers-losing-access-to-gpus-with-error-failed-to-initialize-nvml-unknown-error). This maintenance will update all GPU nodes to use the Container Device Interface (CDI) and apply a few container runtime upgrades. 
Elevated upstream errors (us-east-1) https://status.cerebrium.ai/incident/746816 Mon, 20 Oct 2025 23:36:00 -0000 https://status.cerebrium.ai/incident/746816#def6d05d3ec66619875cc72f480e59b5e4fc16b651f34fefe86783f281a574ad Resolved. Elevated upstream errors (us-east-1) https://status.cerebrium.ai/incident/746816 Mon, 20 Oct 2025 19:17:00 -0000 https://status.cerebrium.ai/incident/746816#c2796bb6bcf9aeb2ead994d0816196a44f82c71df6e0c874a4db8826faec0b59 We continue to observe recovery across all AWS services, and instance launches are succeeding across multiple Availability Zones in the US-EAST-1 Region. Elevated upstream errors (us-east-1) https://status.cerebrium.ai/incident/746816 Mon, 20 Oct 2025 18:24:00 -0000 https://status.cerebrium.ai/incident/746816#08caa7295d5150a3985a744fff62124415b3fe892f24c413ded26bc73486cbcb AWS's mitigations to resolve launch failures for new EC2 instances continue to progress and we are seeing increased launches of new EC2 instances. Elevated upstream errors (us-east-1) https://status.cerebrium.ai/incident/746816 Mon, 20 Oct 2025 17:48:00 -0000 https://status.cerebrium.ai/incident/746816#63d5cc546455aa540f4c11553e1ee571569501a82cc40bf8190db9cf776ad430 AWS have resolved launch failures and are rolling out the changes to all AZs, at which point we expect launch errors and network connectivity issues to subside. Elevated upstream errors (us-east-1) https://status.cerebrium.ai/incident/746816 Mon, 20 Oct 2025 17:04:00 -0000 https://status.cerebrium.ai/incident/746816#0344617f774b7af62a0b35ad079fe58cd65549059c279b494e519764a530a924 AWS is in the process of validating a fix for EC2 launches and will deploy to the first AZ as soon as they are confident they can do so safely. 
Elevated upstream errors (us-east-1) https://status.cerebrium.ai/incident/746816 Mon, 20 Oct 2025 15:47:00 -0000 https://status.cerebrium.ai/incident/746816#869f54ebbeff72d23a7f83e3ee9b40b543149323c691df5acef5f39efa5e3be7 AWS have narrowed down the source of the network connectivity issues that have impacted their services. They are throttling requests for new EC2 instance launches to aid recovery and actively working on mitigations. Elevated upstream errors (us-east-1) https://status.cerebrium.ai/incident/746816 Mon, 20 Oct 2025 14:01:00 -0000 https://status.cerebrium.ai/incident/746816#55398d34ca052b66a663b8a6fafb6229c9d65baf791efe2ae6318d5cc992ecff AWS has applied fixes but is still experiencing problems launching instances in us-east-1. Builds and endpoint calls remain broken. We'll keep you posted. Elevated upstream errors (us-east-1) https://status.cerebrium.ai/incident/746816 Mon, 20 Oct 2025 13:28:00 -0000 https://status.cerebrium.ai/incident/746816#67599f05df32c991402005470e8eaf57294cf54e0e8c0e1a09a50c5bef88da37 The AWS outage is ongoing. Builds are currently broken due to an outage with EC2. We're waiting on AWS to resolve the issue and will keep you updated. Elevated upstream errors (us-east-1) https://status.cerebrium.ai/incident/746816 Mon, 20 Oct 2025 11:10:00 -0000 https://status.cerebrium.ai/incident/746816#a49214b9a48be3601aad264a1fdf6dc91ff8867170cd7b4c97618fc61a65bc16 All services have now been restored fully. We will continue to monitor for any anomalies. Thank you for your patience and we apologise for the inconvenience. Elevated upstream errors (us-east-1) https://status.cerebrium.ai/incident/746816 Mon, 20 Oct 2025 09:43:00 -0000 https://status.cerebrium.ai/incident/746816#15bc346de987b0c270ff70ae21f1a5339045ee3e609949ccc47943dbc02a18d0 Most services have now recovered. You may still experience issues building apps on Cerebrium while AWS continues to resolve the remaining problems. We'll update you once everything is back to normal. 
Elevated upstream errors (us-east-1) https://status.cerebrium.ai/incident/746816 Mon, 20 Oct 2025 09:31:00 -0000 https://status.cerebrium.ai/incident/746816#f2755af8f9d9beb9133349e36c2cb6dd9b14b1d56cc67d2fc1b92ca5cee1077f AWS has applied a fix and some services are starting to recover. You may still see some errors or slower response times as things fully stabilize. If something fails, please try again. We'll keep you posted as more services are restored. Elevated upstream errors (us-east-1) https://status.cerebrium.ai/incident/746816 Mon, 20 Oct 2025 09:01:00 -0000 https://status.cerebrium.ai/incident/746816#7f5682cdb78d1f389b7f350a2e1e75fd236a69267522cbc2fbf643b51989e0ad AWS has identified the root cause as a DNS resolution issue affecting DynamoDB and other services in US-EAST-1. They're working on multiple recovery paths to accelerate the fix. Cerebrium services remain impacted during this time. If you encounter errors, please continue to retry your requests. AWS will provide their next update by 2:45 AM. Elevated upstream errors (us-east-1) https://status.cerebrium.ai/incident/746816 Mon, 20 Oct 2025 08:29:00 -0000 https://status.cerebrium.ai/incident/746816#871cd301c7bebebef8f179e43876babbef14a6d7fbd37f53a467067d6240c74e The AWS team has narrowed down the critically affected services; however, these services are core to the Cerebrium platform, and your dashboards, builds, and endpoint calls are still affected. We are continuing to investigate and will provide more updates within the next 45 minutes. Elevated upstream errors (us-east-1) https://status.cerebrium.ai/incident/746816 Mon, 20 Oct 2025 07:38:00 -0000 https://status.cerebrium.ai/incident/746816#5a509bc68dcfde22169faca0750514fa7e5c34b578ec1b50a44df545757ed329 We are seeing elevated error rates from upstream AWS errors across the majority of our services in the us-east-1 region. We will share an update as soon as possible. 
Degraded Inference API in US-EAST-1 https://status.cerebrium.ai/incident/740083 Wed, 08 Oct 2025 18:28:00 -0000 https://status.cerebrium.ai/incident/740083#89d8d4e8dd689746d3c782842aa817ffbafe52467e5adfe5607a9365aceac920 The Inference API is currently experiencing degraded performance in US-EAST-1. Our team is working on a fix ASAP. Inference API https://status.cerebrium.ai/incident/737024 Fri, 03 Oct 2025 13:13:00 -0000 https://status.cerebrium.ai/incident/737024#e1c9c6e4c4fdf5e170832cdabbc8311af2f5a5ebda3688b6218f48be2e12c17e The Inference API is currently experiencing a high 502 failure rate. Roughly 45% of all requests are affected. Our team is currently investigating the cause of the issue as a matter of high urgency. Container Count is down https://status.cerebrium.ai/incident/726877 Thu, 18 Sep 2025 23:05:00 -0000 https://status.cerebrium.ai/incident/726877#02b232bc3e6fa758f3a2ce6d5b1043c6c6f3e30c2573ee0fa7cd895a0e38bb5f A third-party provider is down, affecting the container count on the dashboard.