Logs Delay

Incident Report for Datadog US1

Resolved

Logs are fully restored and stable for all customers.
Posted Nov 16, 2019 - 02:54 EST

Update

The majority of delayed logs should now be restored for most customers. We're working to restore the remaining delayed data as fast as possible. Live logs data continues to be stable.
Posted Nov 16, 2019 - 01:27 EST

Update

Logs data backfill is progressing and live traffic is stable. Efforts to speed up backfill are continuing. We will continue to update as we make progress.
Posted Nov 15, 2019 - 23:14 EST

Update

Logs data backfill is progressing and live traffic is stable. We are continuing to monitor the situation and are working to ensure the backfill happens as fast as possible.
Posted Nov 15, 2019 - 22:38 EST

Monitoring

Backfill of historic logs data is under way and live traffic is stable. We are actively working on steps to restore historic data faster and will provide updates as we make progress.
Posted Nov 15, 2019 - 21:28 EST

Update

Live logs traffic is now restored for all customers. Work to restore delayed historic logs data continues. We will update as that work progresses.
Posted Nov 15, 2019 - 20:47 EST

Update

Mitigation work continues. We are starting to see live logs data for a growing number of customers. Work to restore live data for all customers and backfill historic data is ongoing.
Posted Nov 15, 2019 - 20:30 EST

Identified

We have identified the cause of the logs delays and are actively working on mitigations. We will update as those mitigations progress.
Posted Nov 15, 2019 - 19:39 EST

Investigating

We're actively investigating increased log intake latencies. As a result, data in Log Explorer, Live Tail, and Analytics may be delayed.

These delays may result in "below threshold" alert conditions for log monitors; to avoid spurious alerts we've temporarily disabled these alert types.
Posted Nov 15, 2019 - 18:33 EST
This incident affected: APM and Log Management.