Elevated error rates on Logs, Trace Analytics, RUM, NPM
Incident Report for Datadog
Resolved
This incident has been resolved.
Posted Feb 23, 2021 - 11:32 EST
Update
All impacted products (Logs, Trace Analytics, RUM, NPM) are now returning results as expected. We are investigating possible degraded performance for specific Logs queries affecting a subset of customers, and we will continue monitoring until this incident is fully resolved.
Posted Feb 23, 2021 - 10:42 EST
Monitoring
At this time, error rates are down across all products, including web queries accessing historical data for Logs, Trace Analytics, RUM, or NPM. We are monitoring the status until full resolution.
Posted Feb 23, 2021 - 09:53 EST
Update
We are continuing to work on a fix for this issue.
Posted Feb 23, 2021 - 08:58 EST
Update
We are continuing to work on a fix for this issue.
Posted Feb 23, 2021 - 08:25 EST
Update
We are continuing to work on a fix for this issue.
Posted Feb 23, 2021 - 07:49 EST
Update
We are continuing to work on this issue. Customers might still experience errors on web app pages including Logs, Trace Analytics, RUM, and Network Monitoring.
Note that data is still being ingested and processed, and it will be available once the incident is resolved.
Posted Feb 23, 2021 - 07:01 EST
Update
We have deployed additional resources to mitigate the impact and the web app errors are subsiding. At this point some queries might still return errors. We are monitoring the status until full resolution.
Posted Feb 23, 2021 - 06:18 EST
Update
We are continuing to work on a fix for this issue. Error rates remain elevated on web app pages including Logs, Trace Analytics, RUM, and Network Monitoring.
Note that data is still being ingested and processed, and it will be available once the incident is resolved.
Posted Feb 23, 2021 - 05:42 EST
Update
We are continuing to work on this issue and to mitigate the web app impact. At this point, the error rate is still high on web app queries for Logs, Trace Analytics, and RUM, so queries on these pages may return empty results or error messages. Delays for monitor evaluations are almost back to normal.
Posted Feb 23, 2021 - 04:59 EST
Update
We are continuing to work on a fix for this issue.
Posted Feb 23, 2021 - 04:45 EST
Update
Error rates for affected monitors are returning to normal.

Error rates and latencies for web app queries are elevated for Logs, Trace Analytics, and RUM.
Posted Feb 23, 2021 - 04:33 EST
Identified
The issue has been identified and a fix is being implemented.
Posted Feb 23, 2021 - 04:09 EST
Investigating
We’re investigating processing delays for Logs, Trace Analytics, and RUM monitors. As a result, alerts for these monitor types might be delayed.

Dashboards, Logs Search, and other views of this data are fully operational. Other monitor types such as metrics and event alerts are unaffected.
Posted Feb 23, 2021 - 03:29 EST
This incident affected: Alerting Engine, APM, and Logs.