Datadog
All Systems Operational

About This Site

The status of Datadog.
If you would like to see the status of third party integrations you might have enabled with Datadog check out https://datadogintegrations.statuspage.io
If you are a customer running in our EU region check out https://status.datadoghq.eu

Alerting Engine Operational
API Operational
API Crawlers Operational
APM ? Operational
Corporate Site (www.datadoghq.com) ? Operational
Daily and Weekly Reports Operational
Event Pipeline Operational
Historical Data Operational
Logs Operational
Metrics Pipeline Operational
Package Repositories Operational
Processes Operational
Synthetics Operational
Web Application Operational
Operational
Degraded Performance
Partial Outage
Major Outage
Maintenance
Past Incidents
Jun 3, 2020

No incidents reported today.

Jun 2, 2020

No incidents reported.

Jun 1, 2020

No incidents reported.

May 31, 2020

No incidents reported.

May 30, 2020
Resolved - The incident has been resolved. For additional information on upgrading agents 3.6 to 5.32.6, see https://docs.datadoghq.com/agent/faq/certificate_verify_failed-error/.
May 30, 16:22 EDT
Update - We've released Datadog agent 5.32.7 which fixes the certificate problem and restores connectivity to Datadog.

To upgrade from a previous version of agent 5, use the install command for your platform:

Centos/Red Hat: sudo yum check-update && sudo yum install datadog-agent
Debian/Ubuntu: sudo apt-get update && sudo apt-get install datadog-agent
Windows: Download the Datadog Agent installer (https://s3.amazonaws.com/ddagent-windows-stable/ddagent-cli-latest.msi). start /wait msiexec /qn /i ddagent-cli-latest.msi
More platforms and configuration management options detailed at https://app.datadoghq.com/account/settings?agent_version=5#agent
May 30, 15:54 EDT
Update - We are continuing to monitor for any further issues.
May 30, 13:20 EDT
Update - We are continuing to monitor for any further issues.
May 30, 13:17 EDT
Update - What happened?

On Saturday May 30th, 2020, at 10:48 UTC, an SSL root certificate belonging to a certificate authority and used to sign some of the Datadog certificates expired, and caused some agents to lose connectivity with Datadog endpoints. Because this root certificate is embedded in agents from version 3.6 to version 5.32.6, some users will need to take action to restore connectivity.

What versions of the agent are affected?

Agent versions spanning 3.6.x to 5.32.6 embed the expired root certificate and are affected. Agent versions 6.x and 7.x are unaffected; if you are using these agents, no action is required on your part.

Fixing without updating the agent

We’re actively working on a new version of agent 5 but if you’d like to address this without an update, the following is the quickest path to resolution.

On Linux:

sudo rm /opt/datadog-agent/agent/datadog-cert.pem && sudo service datadog-agent restart

On Windows:

rm "C:\Program Files (x86)\Datadog\Datadog Agent\files\datadog-cert.pem"
net stop /y datadogagent ; net start /y datadogagent

May 30, 12:49 EDT
Monitoring - A fix has been implemented and we are monitoring the results. We are contacting customers directly who are running versions of the agent which may still have connectivity issues.
May 30, 10:54 EDT
Update - We've identified connectivity issues affecting certain versions of the agents - most notably agent 5, and are working to mitigate.
May 30, 10:17 EDT
Update - The API is back to normal and alerts have been re-enabled. We're still investigating connectivity issues between some agents and Datadog endpoints.
May 30, 09:26 EDT
Identified - The issue has been identified and a fix is being implemented. To prevent spurious alerts, we have temporarily disabled metric, event, service check, and composite monitors.
May 30, 08:17 EDT
Investigating - We're investigating increased errors receiving payloads from agents and delivering notifications, and snapshots are failing.
May 30, 07:50 EDT
May 29, 2020
Resolved - Alerts for metric monitors have been restored.
May 29, 10:48 EDT
Investigating - We're investigating an issue with delayed monitors notifications, for metric monitors, since 14:00 UTC.
May 29, 10:43 EDT
May 28, 2020

No incidents reported.

May 27, 2020

No incidents reported.

May 26, 2020

No incidents reported.

May 25, 2020

No incidents reported.

May 24, 2020

No incidents reported.

May 23, 2020
Resolved - API requests, snapshots and dashboards are responding to all requests normally.
May 23, 17:53 EDT
Identified - We've identified the issue causing elevated error rates and are working to resolve errors.
May 23, 17:21 EDT
Investigating - We are investigating elevated error rates for dashboards, API calls and snapshots
May 23, 15:39 EDT
May 22, 2020
Resolved - Metrics latencies have recovered and alerting is re-enabled for all customers.
May 22, 13:33 EDT
Monitoring - We have identified the cause of the increased metric latencies and are recovering. We are continuing to monitor closely, and Metric Monitor alerts remain disabled.
May 22, 13:17 EDT
Investigating - We’re investigating increased metric latencies. Graphs may be delayed. To avoid spurious alerts, we’ve temporarily disabled alerts for Metric Monitors.
May 22, 13:04 EDT
May 21, 2020
Resolved - We are processing live trace data for App Analytics for all customers.
May 21, 23:19 EDT
Update - Live App Analytics data is being processed for 80% of customers, though all customers may still have gaps in historical App Analytics data. We are continuing to work to restore live data for the remaining customers, as well as backfilling any missing historical data. No data has been lost.
May 21, 22:41 EDT
Update - Live App Analytics data is being processed for 2/3rds of customers, though all customers may still have gaps in historical App Analytics data. We are continuing to work to restore live data for the remaining customers, as well as backfilling any missing historical data. No data has been lost.
May 21, 21:33 EDT
Identified - We have begun to restore live App Analytics data for a subset of customers. We are also working to restore data that may be missing. No data has been lost, all App Analytics data will be backfilled.
May 21, 21:13 EDT
Update - We're continuing to investigate the delay in App Analytics data and working to restore live data.
May 21, 20:29 EDT
Investigating - We're actively investigating increased trace intake latencies. As an effect, data from App Analytics may be delayed. No data has been lost.
May 21, 20:00 EDT
May 20, 2020
Resolved - Event based monitors are being processed as normal.
May 20, 00:10 EDT
Identified - Event based monitors processing is currently delayed. As a result you might experience a delay in the notifications for event based monitors. Other type of monitors are unaffected.
May 20, 00:05 EDT