A massive Cloudflare outage disrupted Internet services on June 21. The company announced on Tuesday that it was investigating the incident and admitted that the outage affected traffic in 19 of its data centers. Hundreds of online platforms and services were affected by the network configuration error.
The company said in a statement, “Unfortunately, these 19 locations handle a significant proportion of our global traffic. This outage was caused by a change that was part of a long-running project to increase resilience in our busiest locations.” Cloudflare fixed the outage after it had impacted websites and services including Amazon, Telegram, Amazon Web Services, Twitch, Coinbase, DoorDash, and Steam, among many others. Customers took to social media to air their grievances, with one Twitter user posting that “Half of the internet is down due to a @Cloudflare outage.”
Founded in 2009, Cloudflare is headquartered in San Francisco and is a major content delivery network with clients across the globe. It acts primarily as a reverse proxy between a website visitor and the Cloudflare customer’s hosting provider.
The company started investigating the incident after reports of service disruptions started pouring in worldwide from clients and companies. Panic set in because Cloudflare works as a shield between website visitors and hosts to prevent DDoS attacks. The Cloudflare outage took about an hour and a quarter to resolve. The timeline runs from 03:56 UTC, when the change was deployed to the first location, to 06:27 UTC, about two and a half hours later, when the incident started and took 19 locations offline.
Cloudflare stated, “A change to the network configuration in those locations caused an outage which started at 06:27 UTC. At 06:58 UTC the first data center was brought back online and by 07:42 UTC all data centers were online and working correctly.” The outage occurred as the company worked on equipping Cloudflare’s busiest locations with more resilient architecture, known internally as Multi-Colo PoP (MCP).
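The durations implied by these timestamps can be checked with a few lines of Python. This is only an illustrative sketch: the four UTC times come from the statements quoted in this article, while the dictionary keys, the helper name `minutes_between`, and the calendar year are assumptions added for the example.

```python
from datetime import datetime

# UTC timestamps as quoted in the article. The year (2022) is an assumption
# for illustration; the durations do not depend on it.
FMT = "%Y-%m-%d %H:%M"
timeline = {
    "change_first_deployed": datetime.strptime("2022-06-21 03:56", FMT),
    "outage_started": datetime.strptime("2022-06-21 06:27", FMT),
    "first_dc_restored": datetime.strptime("2022-06-21 06:58", FMT),
    "all_dcs_restored": datetime.strptime("2022-06-21 07:42", FMT),
}


def minutes_between(start: str, end: str) -> int:
    """Return whole minutes elapsed between two named timeline events."""
    delta = timeline[end] - timeline[start]
    return int(delta.total_seconds() // 60)


# Time from the first deployment of the change to the start of the outage.
rollout_to_outage = minutes_between("change_first_deployed", "outage_started")
# Time from the start of the outage to the first data center coming back.
first_recovery = minutes_between("outage_started", "first_dc_restored")
# Total outage duration, until all data centers were back online.
total_outage = minutes_between("outage_started", "all_dcs_restored")

print(f"Rollout to outage: {rollout_to_outage} min")   # 151 min (2 h 31 min)
print(f"First recovery:    {first_recovery} min")      # 31 min
print(f"Total outage:      {total_outage} min")        # 75 min (1 h 15 min)
```

On these figures, the change had been rolling out for about two and a half hours before the outage began, the first data center recovered after roughly half an hour, and the full outage lasted an hour and a quarter.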
“Depending on your location in the world you may have been unable to access websites and services that rely on Cloudflare. In other locations, Cloudflare continued to operate normally.”
The Cloudflare outage was most problematic for users of Cloudflare’s DNS lookup service. “Customers attempting to reach Cloudflare sites in impacted regions will observe 500 errors. The incident impacts all data plane services in our network,” said the company. Most customers were unable to access any website. The easiest solution at the time was to change one’s DNS configuration. The Verge reported that its staff used this method and that using their ISP’s default DNS settings resolved most of their issues.
Although the Cloudflare outage affected only about 4% of its entire network, it impacted nearly 50% of all HTTP requests handled by the company globally. The list of affected data centers includes Amsterdam, Atlanta, Ashburn, Chicago, Frankfurt, London, Los Angeles, Madrid, Manchester, Miami, Milan, Mumbai, Newark, Osaka, São Paulo, San Jose, Singapore, Sydney, and Tokyo.
After fixing the outage, Cloudflare apologized for the error, which resulted from an ongoing project.
Similar outages have brought down a significant chunk of the web in the past. In 2019, a bad software deployment caused service disruptions that lasted almost 30 minutes before things were set right. Another outage occurred in 2020, when a router misconfiguration brought services down in major European cities.
The company has also come under scrutiny for its stance on free speech, as it refuses to ban websites for hate speech content. In early 2022, Cloudflare refused to exit Russia after world leaders called for sanctions over Russia’s invasion of Ukraine.
The post Network Configuration Error Causes Cloudflare Outage Across 19 Locations appeared first on Industry Leaders Magazine.