Facebook’s one-day outage was by far its longest and most severe in years. At around 9:00 a.m. PDT on the west coast of the United States – where the corporate giant is headquartered – Facebook, WhatsApp, Instagram and Facebook Messenger appeared to disappear from the internet.
The blackout continued until market close, with the company’s shares falling about 5% below their opening price on Monday. By mid-afternoon, services were starting to resume after Facebook sent a team to its Santa Clara data center to “manually reset” the company’s servers.
But what makes the outage unique is just how completely offline Facebook was.
In the morning, Facebook sent a brief tweet apologizing that “some people have difficulty accessing our applications and products.” Then reports revealed that the outage affected not only its users but the business itself. Employees were reportedly unable to enter their office buildings, and staff called it a ‘snow day’: they couldn’t do any work because the outage also took down internal collaboration applications.
Facebook has not commented on the cause of the outage, although security experts said the evidence pointed to an issue with the company’s network that cut Facebook off from the wider internet, and from its own internal systems.
The first signs of trouble came around 8:50 a.m. PDT in California, according to John Graham-Cumming, chief technology officer of network giant Cloudflare, who said Facebook “disappeared from the Internet in a flurry of BGP updates” over a two-minute window. BGP, or Border Gateway Protocol, is the system that networks use to find the fastest way to send data across the internet to another network.
The updates were specifically BGP route withdrawals. Essentially, Facebook had sent a message to the internet saying it was closed for business, like raising its castle’s drawbridge. Without any routes to its network, Facebook was effectively isolated from the rest of the internet, and because of the way Facebook’s network is structured, the route withdrawals also took down WhatsApp, Instagram, Facebook Messenger and everything else inside its digital walls.
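For readers curious how a route withdrawal makes a network vanish, here is a toy sketch in Python. It is not a real BGP implementation; the `ToyRouter` class is invented for illustration (the prefix and AS number shown are ones publicly associated with Facebook, but any values would make the same point):

```python
# Toy illustration of BGP route withdrawal: once a prefix has no route,
# traffic for it has nowhere to go. Not a real routing implementation.

class ToyRouter:
    def __init__(self):
        self.routes = {}  # prefix -> next-hop peer

    def announce(self, prefix, peer):
        # A BGP announcement tells the internet "send traffic for this
        # prefix my way."
        self.routes[prefix] = peer

    def withdraw(self, prefix):
        # A withdrawal deletes that promise.
        self.routes.pop(prefix, None)

    def next_hop(self, prefix):
        # Where should traffic for this prefix be forwarded? None = nowhere.
        return self.routes.get(prefix)

router = ToyRouter()
router.announce("157.240.0.0/16", peer="AS32934")
print(router.next_hop("157.240.0.0/16"))  # AS32934: traffic has somewhere to go

router.withdraw("157.240.0.0/16")          # the "flurry of BGP updates"
print(router.next_hop("157.240.0.0/16"))   # None: traffic goes nowhere
```

The withdrawal does not shut any servers down; it only removes the directions for reaching them, which is why Facebook’s machines could be running while the rest of the internet behaved as if they did not exist.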
Within minutes of the BGP routes being removed, users started noticing issues. Internet traffic that should have gone to Facebook simply got lost and went nowhere, said Rob Graham, founder of Errata Security, in a tweet thread.
Users noticed that their Facebook apps had stopped working and websites were not loading, and reported what appeared to be issues with DNS, or the Domain Name System, another essential part of how the internet works. DNS converts human-readable web addresses into machine-readable IP addresses to find where a web page is hosted on the internet. With Facebook’s servers unreachable, apps and browsers kept returning what looked like DNS errors.
It is not clear exactly why the BGP routes were withdrawn. BGP, which has been around since the advent of the internet, can be manipulated and maliciously exploited in ways that lead to massive outages.
What is more likely is that a Facebook configuration update went horribly wrong and its effects cascaded across the internet. A since-deleted Reddit thread from a purported Facebook engineer described a BGP configuration error before the cause was widely known.
But while the fix may be straightforward, recovery could stretch from hours to days because of how the internet works. DNS resolvers at ISPs cache records for hours at a time, and it can take days for a change to propagate fully.
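That lag comes from caching: a resolver keeps serving a stored answer until its time-to-live (TTL) expires, so a fix upstream is invisible until each cache times out. A hedged sketch, with an invented `CachingResolver` class and illustrative TTL and timestamps:

```python
# Sketch of why recovery lags the fix: DNS resolvers cache answers until
# the TTL expires, so a restored record isn't seen everywhere at once.

class CachingResolver:
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.cache = {}  # hostname -> (answer, expiry_time)

    def lookup(self, hostname, authoritative, now):
        answer, expires = self.cache.get(hostname, (None, 0))
        if now < expires:
            return answer  # still serving the cached (possibly stale) answer
        fresh = authoritative.get(hostname)  # cache expired: ask upstream again
        self.cache[hostname] = (fresh, now + self.ttl)
        return fresh

authoritative = {"facebook.com": None}  # outage: nothing usable upstream
resolver = CachingResolver(ttl_seconds=3600)

print(resolver.lookup("facebook.com", authoritative, now=0))     # None: the outage gets cached
authoritative["facebook.com"] = "157.240.22.35"                  # the fix lands upstream
print(resolver.lookup("facebook.com", authoritative, now=1800))  # still None: TTL not yet expired
print(resolver.lookup("facebook.com", authoritative, now=3700))  # fresh answer once the cache expires
```

Multiply that single cache by the millions of resolvers worldwide, each expiring on its own schedule, and a repaired service comes back in waves rather than all at once.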
“To the huge community of people and businesses around the world who depend on us: we’re sorry,” Facebook tweeted around 3:30 p.m. local time. “We have worked hard to restore access to our applications and services and are happy to announce that they are coming back online now. Thanks for being with us.”