On May 8, 2026, Hasura experienced a service disruption affecting Hasura Cloud, DDN, and PromptQL platforms. The incident lasted approximately 50 minutes, during which customers were unable to access services (data plane APIs) hosted on the hasura.app domain.
Timeline (UTC)
14:18: First customer reports of connectivity issues
14:25: Internal alerts triggered and incident response team assembled
14:52: Root cause identified as DNS resolution failure
15:02: Issue resolved, services recovering
15:30: Full service restoration confirmed
Impact
Duration: ~50 minutes
Affected Services: All services on the hasura.app domain (Hasura Cloud, DDN, PromptQL). These are the typical project/data plane endpoints.
Customer Impact: Customers experienced "host not found" or connection errors when accessing their projects
Data planes / projects with custom domains were not impacted
Root Cause
The incident was caused by a DNS configuration issue affecting the hasura.app domain. The domain's nameserver delegation was temporarily disrupted, which prevented DNS resolvers from locating Hasura services. This resulted in DNS resolution failures for all subdomains under hasura.app.
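From a client's point of view, a failure of this kind shows up before any connection is attempted: the hostname simply cannot be resolved. The following is a minimal sketch, not Hasura tooling, of how a client-side check might tell a DNS resolution failure apart from a connection failure. The function name `classify_access` and its injectable `resolve` and `connect` parameters are assumptions made for illustration.

```python
import socket


def classify_access(hostname, port=443,
                    resolve=socket.getaddrinfo,
                    connect=socket.create_connection):
    """Return 'dns_failure', 'connect_failure', or 'ok' for a hostname.

    `resolve` and `connect` are injectable so the check can be exercised
    without touching the network (an assumption of this sketch).
    """
    try:
        # Step 1: name resolution. This is the step that fails when a
        # domain's nameserver delegation is broken.
        resolve(hostname, port)
    except socket.gaierror:
        return "dns_failure"

    try:
        # Step 2: TCP connection. Only reached if the name resolved.
        conn = connect((hostname, port), timeout=5)
    except OSError:
        return "connect_failure"

    conn.close()
    return "ok"
```

Calling `classify_access("my-project.hasura.app")` (a hypothetical project hostname) during a delegation failure would be expected to return `"dns_failure"`, which corresponds to the "host not found" errors described under Impact.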
Resolution
The DNS configuration was corrected and nameserver delegation was restored. Due to DNS caching behavior across the internet, some customers experienced a gradual recovery as DNS caches refreshed with the correct records.
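The gradual recovery can be illustrated with a simplified model. Assume each resolver holds one cached answer (possibly a cached failure) at the moment of the fix, and keeps serving it until that entry expires. A resolver then sees the corrected records at the later of the fix time and its cache expiry. The expiry values below are hypothetical and are not measurements from this incident.

```python
def recovery_times(fix_time, cache_expiries):
    """Time at which each resolver first sees the corrected records.

    fix_time and cache_expiries are in seconds on a shared clock.
    A resolver whose cached entry already expired recovers at fix_time;
    otherwise it recovers when its cached entry expires.
    """
    return [max(fix_time, expiry) for expiry in cache_expiries]


def fraction_recovered(at_time, fix_time, cache_expiries):
    """Fraction of resolvers serving correct records at `at_time`."""
    times = recovery_times(fix_time, cache_expiries)
    if not times:
        return 0.0
    return sum(t <= at_time for t in times) / len(times)


# Hypothetical example: fix applied at t=0, three resolvers whose cached
# entries expire 120 s before, 300 s after, and 1700 s after the fix.
expiries = [-120, 300, 1700]
times = recovery_times(0, expiries)
```

In this model the last resolver to recover is bounded by the longest remaining cache lifetime at the time of the fix, which is why restoration is confirmed some time after the configuration itself is corrected.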
Corrective Actions
To prevent similar incidents in the future, we are implementing the following measures:
1. Domain Monitoring: Implementing automated monitoring and alerting for all critical domain configurations
2. DNS Infrastructure Redundancy: Improving redundancy of DNS providers for critical production domains
3. Redundant Notifications: Establishing multiple notification channels for domain-related events
4. Regular Audits: Quarterly review of all production domain configurations
5. Multiple Domains: Spreading workloads for different systems across multiple domains
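As an illustration of the first and third measures, the sketch below compares the nameservers observed for a domain against an expected set and fans an alert out to several notification channels. It is a minimal sketch under stated assumptions, not a description of Hasura's monitoring: the function names, the injectable `lookup_ns` callable, and the `example.net` nameserver names are all hypothetical. In practice `lookup_ns` could wrap an NS query from a DNS library.

```python
def normalize(name):
    """Lowercase a DNS name and drop the trailing dot for comparison."""
    return name.strip().rstrip(".").lower()


def check_delegation(domain, expected_ns, lookup_ns):
    """Return a list of problems with a domain's nameserver delegation.

    lookup_ns is a callable taking a domain and returning an iterable of
    nameserver hostnames (hypothetical interface for this sketch).
    An empty list means the observed delegation matches expectations.
    """
    try:
        observed = {normalize(n) for n in lookup_ns(domain)}
    except Exception as exc:
        return [f"{domain}: NS lookup failed ({exc})"]

    if not observed:
        return [f"{domain}: no nameservers returned"]

    expected = {normalize(n) for n in expected_ns}
    problems = []
    missing = expected - observed
    unexpected = observed - expected
    if missing:
        problems.append(f"{domain}: missing nameservers {sorted(missing)}")
    if unexpected:
        problems.append(f"{domain}: unexpected nameservers {sorted(unexpected)}")
    return problems


def notify_all(channels, message):
    """Send message to every channel; return how many deliveries succeeded.

    A failure in one channel must not prevent delivery to the others.
    """
    delivered = 0
    for send in channels:
        try:
            send(message)
            delivered += 1
        except Exception:
            continue
    return delivered
```

A scheduled job could call `check_delegation` for each critical domain and pass any returned problems to `notify_all`, so that a single broken notification channel does not hide a domain-related alert.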
---
We sincerely apologize for the disruption this caused to your operations. We are committed to maintaining the reliability our customers depend on, and we are taking concrete steps to ensure this type of incident does not recur.