Summary
Cloud Dataprep service unavailable for all customers
Root Cause
During an infrastructure update to our cloud database service, one of our engineers inadvertently removed a set of core permissions. Without those permissions, our backend services could not access the database service, and their requests to it failed.
Timeline (UTC)
13:22 - Customers first report issues accessing the service
13:27 - Elevated error alerts trigger alarms to the on-call team
13:34 - Team identifies connection errors in logs and begins incident response
14:18 - Team identifies the missing permission set
14:53 - Permissions are restored and backend services are redeployed
14:55 - Service is restored
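The intervals implied by the timeline above can be worked out with a short sketch. The function name and the choice of which intervals to report are illustrative assumptions, not part of the team's tooling; the timestamps are taken from the timeline as written.

```python
from datetime import datetime

def minutes_between(start: str, end: str) -> int:
    """Return whole minutes between two same-day HH:MM timestamps (UTC)."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)

# Timestamps copied from the timeline above.
first_report = "13:22"
alert_fired = "13:27"
cause_identified = "14:18"
service_restored = "14:55"

time_to_alert = minutes_between(first_report, alert_fired)
time_to_diagnose = minutes_between(first_report, cause_identified)
total_impact = minutes_between(first_report, service_restored)

print(f"First report to alert: {time_to_alert} min")
print(f"First report to root cause: {time_to_diagnose} min")
print(f"First report to restoration: {total_impact} min")
```

Measured from the first customer report, most of the incident was spent locating the missing permission set rather than restoring it.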
Mitigation & Next Steps
The issue was resolved by restoring the core permissions that allow the backend services to access the database service. The backend services were then redeployed, and service availability was restored.
The team is implementing operational alarms for core permission sets and improving the approval and deployment processes for infrastructure changes. We take avoidable incidents very seriously and strive to continually improve our service and operational standards.
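As a rough illustration of what an operational alarm for a core permission set might check, the sketch below compares the permissions currently granted against a required baseline and reports anything missing. The permission names, the baseline, and both function names are hypothetical; how the granted permissions are fetched and how an alert is delivered depend on the platform and are not shown.

```python
from typing import Iterable, Optional

# Hypothetical baseline: these names are placeholders, not the service's real permissions.
REQUIRED_PERMISSIONS = frozenset({"db.connect", "db.read", "db.write"})

def missing_permissions(
    granted: Iterable[str],
    required: frozenset = REQUIRED_PERMISSIONS,
) -> set:
    """Return the required permissions that are absent from the granted set."""
    return set(required) - set(granted)

def permission_alert(granted: Iterable[str]) -> Optional[str]:
    """Return an alert message if any core permission is missing, else None."""
    missing = missing_permissions(granted)
    if missing:
        return "ALERT: missing core permissions: " + ", ".join(sorted(missing))
    return None

# Example: the write permission was removed during an update.
print(permission_alert(["db.connect", "db.read"]))
```

Running a check like this on a schedule, and again immediately after each infrastructure change, would surface a removed permission directly instead of through downstream connection errors.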