Elevated error rate on the web application
Incident Report for Datadog
Resolved
This incident has been resolved.
Posted Jul 28, 2020 - 07:15 EDT
Update
Error rates and latencies have recovered for most systems. Some alerts may still be delayed.
Posted Jul 28, 2020 - 07:12 EDT
Update
AWS metrics have caught up, and all error rates and latencies continue to decrease.
Posted Jul 28, 2020 - 06:58 EDT
Update
Error rates have decreased and continue to come down, although they are still higher than normal. AWS metrics are still delayed. We are continuing to work on mitigations.
Posted Jul 28, 2020 - 06:51 EDT
Identified
Several systems are experiencing increased error rates due to an outage at our DNS provider. Some notifications are delayed and some X-Ray traces may be dropped.
Posted Jul 28, 2020 - 05:44 EDT
Investigating
We are seeing an elevated error rate on the web application. We are currently investigating the issue.
Posted Jul 28, 2020 - 04:48 EDT
This incident affected: Alerting Engine, API, API Crawlers, APM, Logs, Processes, and Web Application.