Today’s (11/18/2025) Cloudflare outage disrupted major platforms globally, including OpenAI, X, Canva, and countless others, adding yet another chapter to a growing narrative: our digital infrastructure is only as resilient as its weakest link. I have commented on and written about this topic before. This is the third major outage to affect the global internet in the span of a month or so.
The Pattern We Cannot Ignore
- AWS: A DNS race condition crippled DynamoDB, cascading across services.
- Azure: A faulty configuration in Azure Front Door bypassed safety checks, degrading core services.
- Cloudflare: A single provider outage impacted DNS, CDN, and security layers for a significant portion of the internet.
These events highlight two systemic risks:
- Too Many Eggs in One Basket: When a single vendor underpins critical layers (compute, networking, security), the blast radius of failure is enormous.
- DNS Dependency in Cloud Security: Modern application security often relies on DNS-based controls. Even with multi-cloud compute strategies, if DNS and security are centralized, resilience is an illusion.
Is Multi-Cloud Enough?
Multi-cloud strategies promise redundancy, but they often fail to address vendor concentration in security and DNS. If your WAF, DDoS protection, and DNS resolution all depend on one provider, you have simply shifted the single point of failure—not eliminated it.
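One rough way to make this concentration visible is to inventory which provider backs each layer and flag any provider that appears more than once. The sketch below assumes a hand-maintained mapping; the layer names and provider labels ("cloud-a", "edge-x", and so on) are hypothetical placeholders, not real products.

```python
from collections import defaultdict


def vendor_concentration(layers: dict[str, str]) -> dict[str, list[str]]:
    """Return providers that back more than one layer, with the layers they back."""
    by_vendor: dict[str, list[str]] = defaultdict(list)
    for layer, vendor in layers.items():
        by_vendor[vendor].append(layer)
    # Keep only providers shared by two or more layers.
    return {v: sorted(ls) for v, ls in by_vendor.items() if len(ls) > 1}


# Hypothetical inventory: compute is multi-cloud, but the edge layers are not.
inventory = {
    "compute-primary": "cloud-a",
    "compute-secondary": "cloud-b",
    "dns": "edge-x",
    "waf": "edge-x",
    "ddos": "edge-x",
}

shared = vendor_concentration(inventory)
print(shared)  # {'edge-x': ['ddos', 'dns', 'waf']}
```

Here compute is spread across two providers, yet DNS, WAF, and DDoS protection all sit with one, which is the shifted single point of failure described above.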
Hybrid Cloud as an Alternative
A hybrid approach—combining cloud-native services with on-prem or third-party security layers—can reduce systemic risk:
- Independent DNS resolution via enterprise-managed resolvers.
- Distributed security controls using multiple vendors for WAF, DDoS, and API protection.
- Failover paths that bypass centralized CDN/security providers during outages.
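As a minimal sketch of the failover idea in the list above, the snippet below walks an ordered list of traffic paths and picks the first one whose health probe succeeds. The path names and probe functions are hypothetical stand-ins; a real deployment would replace them with actual checks (for example, an HTTP or DNS probe) and would need to handle flapping, timeouts, and the security implications of bypassing a protection layer.

```python
from typing import Callable, Optional, Sequence

Probe = Callable[[], bool]


def pick_path(paths: Sequence[tuple[str, Probe]]) -> Optional[str]:
    """Return the name of the first path whose probe reports healthy, else None."""
    for name, is_healthy in paths:
        try:
            if is_healthy():
                return name
        except Exception:
            # A probe that raises (timeout, connection error) counts as unhealthy.
            continue
    return None


# Hypothetical probes standing in for real health checks.
def probe_down() -> bool:
    return False


def probe_error() -> bool:
    raise TimeoutError("probe timed out")


def probe_up() -> bool:
    return True


paths = [
    ("primary-cdn", probe_down),
    ("secondary-cdn", probe_error),
    ("direct-origin", probe_up),
]

chosen = pick_path(paths)
print(chosen)  # direct-origin
```

Ordering the list by preference keeps the decision simple: the direct-to-origin path is used only when every provider ahead of it is failing.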
The Repatriation Question
Should enterprises repatriate critical resources—like DNS, identity, or security policy enforcement—from the cloud back to controlled environments?
Why consider it?
- Reduced blast radius: Critical services remain operational even if a cloud provider fails.
- Policy sovereignty: Enterprises maintain control over security posture without relying on external orchestration.
Challenges
- Uniform Policy Enforcement: Maintaining consistent security across hybrid environments is complex.
- Operational Overhead: Managing on-prem and cloud simultaneously increases cost and complexity.
- Performance Trade-offs: Routing through enterprise-managed layers can introduce latency.
- Cost of Resilience: Engineering for resilience means building redundancy to enable continuity, which also increases costs.
The Way Forward
Resilience requires architectural diversity, continuous validation, and human-in-the-loop intervention. It is time to move beyond redundancy planning toward resilience engineering—where adaptability and observability are as critical as automation.
Question for the community
Are we ready to rethink the cloud-first mantra? Should critical resources like DNS and identity return to enterprise control—or is the future still fully cloud-native?
Raj Vadi
Senior Solutions Architect at Corero Network Security

