Such problems – especially after they have been ongoing for hours – likely indicate that there is a major problem with the technology underpinning Facebook’s services.
And those issues can easily last for hours. In 2019, when Facebook suffered its biggest ever outage, it was more than 24 hours from the beginnings of the problem until the company said it was resolved.
What’s more, Facebook might never truly reveal what caused the problems. After that record outage in 2019, it said only that the problems were “a result of a server configuration change”.
This time around, at least some of the problems were related to the domain name system, or DNS, which works something like a phone book for the internet. When a user types in a web address – such as facebook.com – the computer needs to turn that into an IP address, which is a series of numbers, so that it can access the data that makes up the page the user wants to see.
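The phone book lookup described above can be sketched in a few lines of Python. This is a minimal illustration rather than a description of how any browser actually does it: the function name `lookup_addresses` and the injectable `resolver` parameter are assumptions made for the example, and only the standard library call `socket.getaddrinfo` is real.

```python
import socket


def lookup_addresses(hostname, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IP addresses that `hostname` resolves to.

    `resolver` is a hypothetical hook so the lookup can be swapped out;
    by default it is the operating system's resolver via socket.getaddrinfo.
    """
    # Each result is a tuple (family, type, proto, canonname, sockaddr);
    # the first element of sockaddr is the IP address as a string.
    results = resolver(hostname, 443, proto=socket.IPPROTO_TCP)
    return sorted({sockaddr[0] for *_, sockaddr in results})
```

On a machine with a working connection, calling `lookup_addresses("facebook.com")` would normally return one or more addresses; the exact values vary by location and over time.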
When Facebook was down, however, that system was not working: the computer searched for the numbers it needed, but the numbers were not there. Facebook’s servers should have provided them, but the phone book was in effect blank.
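From a program's point of view, a blank phone book shows up as a failed lookup. The sketch below assumes Python's standard `socket` module, which raises `socket.gaierror` when a name cannot be resolved; the function name `describe_lookup` and its `resolver` parameter are illustrative only, and the exact error text depends on the operating system.

```python
import socket


def describe_lookup(hostname, resolver=socket.getaddrinfo):
    """Try to resolve `hostname` and report the outcome as a short string."""
    try:
        results = resolver(hostname, 443)
    except socket.gaierror as error:
        # Raised when the name cannot be turned into an address, for example
        # because the servers responsible for the domain do not answer.
        return f"{hostname}: lookup failed ({error})"
    addresses = sorted({sockaddr[0] for *_, sockaddr in results})
    return f"{hostname}: {', '.join(addresses)}"
```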
For Facebook, that meant anyone attempting to access the site would see an error message that varied depending on which browser they used. Apps might have worked a little differently – they would still show existing content that had already been downloaded, such as WhatsApp messages or Instagram posts – but they were not able to ask Facebook’s servers for new ones.
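The difference between a browser and an app can be pictured with a toy model: the app keeps what it has already downloaded and falls back to it when the servers cannot be reached. Everything here is hypothetical, including the `Feed` class and the `fetch_new_posts` callable, and it is not meant to reflect how Facebook's own apps are built.

```python
import socket


class Feed:
    """Toy model of an app that keeps the posts it has already downloaded."""

    def __init__(self, fetch_new_posts):
        # fetch_new_posts is a hypothetical callable that contacts the server
        # and returns the latest list of posts, raising OSError on failure.
        self._fetch_new_posts = fetch_new_posts
        self.cached_posts = []
        self.offline = False

    def refresh(self):
        """Return the posts to display, falling back to the cache on failure."""
        try:
            self.cached_posts = list(self._fetch_new_posts())
            self.offline = False
        except OSError:
            # socket.gaierror (a failed DNS lookup) is a subclass of OSError,
            # so an unresolvable name ends up here: keep showing old content.
            self.offline = True
        return self.cached_posts
```

In this sketch a failed refresh leaves the previously downloaded posts on screen and simply flags the feed as offline, which matches the behaviour described above.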
Facebook is far from the only company to suffer such issues. In July, many major websites – including those of seemingly unconnected companies such as Home Depot and Delta Air Lines – went down because of problems at Akamai, which offers DNS to its customers.
But Facebook’s DNS problems were only a symptom, even if they were the one that left many people unable to access its sites. The system would not break spontaneously, and so it is likely that something happened to the underlying infrastructure – a stray settings change, a physical outage at a server, or something else entirely – that stopped it from working.
It appeared, at least from the outside, that Facebook had done that to itself; the company maintains its own DNS, unlike many smaller companies, and the changes were made from inside the company. At some point during Tuesday, the relevant directions to web browsers appeared to have been removed – though, at the time of publication, Facebook had yet to explain how or why.
The fact that Facebook runs so extensively on its own systems also meant that the company itself was affected by the outage, with internal communications tools going offline. It also reportedly kept engineers from fixing the problems remotely, since they were unable to access the system to do so – meaning that the company was forced to send engineers to deal with the servers in person.