Another day, another Microsoft outage. Today, the third Office 365 outage in 10 days brought Outlook down for some users around the US.
Some users also saw issues today with Microsoft Teams—the collaboration app that’s part of Office 365—as part of the outage. Those apps are considered essential by countless organisations, with many workforces continuing to work remotely amid the COVID-19 pandemic.
On Wednesday afternoon, Microsoft said in a statement that the services were recovering after the company rolled back a recent network update.
Here are five things you should know about the latest Microsoft Office 365 outage.
Office 365 status: Down
The issues began Wednesday afternoon at about 2:10 pm US Eastern Time, according to Microsoft. User reports of Office 365 problems on downdetector.com spiked at about 2:26 pm ET.
At 2:48 pm ET, the Microsoft 365 Status account on Twitter acknowledged the Office 365 outage. “We’re investigating an issue affecting access to Microsoft 365 services,” the account tweeted. “Users may see impact to Microsoft Teams, Outlook, SharePoint Online, OneDrive for Business, and Outlook.com.”
An outage map on downdetector.com showed the Office 365 outage affecting users in several regions of the US, including the Northeast, parts of the Midwest and much of California.
Cause of the outage
In an update late Wednesday afternoon ET to the Microsoft 365 Service health status page, Microsoft blamed a “recent update to network infrastructure” for today’s outage.
“Further investigation has confirmed that a recent update to network infrastructure resulted in impact to Microsoft 365 services,” Microsoft said on the page. “Our telemetry indicates continued recovery within the environment following the reversion of the update.”
Microsoft Teams was among the first services to fully recover, while Exchange Online and Outlook.com were recovering as of just before 5 pm ET.
Azure network infrastructure
On its Azure status page, Microsoft indicated that the outage was tied to the Azure cloud, which underpins Office 365. Between 2:10 pm ET and 3:42 pm ET, “a subset of customers may have experienced issues connecting to resources that leverage Azure network infrastructure across regions.”
The preliminary root cause for the issues, according to Microsoft: “A recent change was applied to WAN resources causing connectivity latency or failures between regions.”
Microsoft said it mitigated the problem by rolling back the recent change “to a healthy configuration.”
“We will continue to investigate to establish the full root cause and prevent future occurrences, and a full Post Incident Report (PIR) will be published within the next 72 hours,” Microsoft said.
Today’s Office 365 outage—the third in less than two weeks—has set off alarm bells for many users and partners, including some who have aired their concerns to CRN US.
A channel source, who has been impacted by all three Office 365 outages in the last 10 days, said today that he couldn’t even get into an administrator’s console to get an update on the outage.
“Microsoft needs to get its act together,” he said. “I’ve got customers reaching out to me telling me their email is not working. Microsoft needs to take corrective action to get this solved.”
The incident today followed a five-hour outage on 28 September and a four-hour outage on 1 October. After the 28 September outage, Microsoft blamed a software “code issue.”
“A code issue caused a portion of our infrastructure to experience delays processing authentication requests, which prevented users from being able to access multiple M365 services,” Microsoft said in an email update to Microsoft administrators impacted by the outage.
The channel source told CRN US the outages may be the result of a nagging DevOps issue, in which software code put into production is causing the outages, or could also be a problem caused by dramatic increases in the usage of Microsoft Teams.
“In 20-plus years of doing this I have never seen this many outages from Microsoft in such a short period of time,” he said. “I’m surprised. They usually have their act together. This is very frustrating.”
Following the second outage on 1 October, a senior executive for one of Microsoft’s top partners, who did not want to be identified, said he sees the recent outages as clearly DevOps-related.
“The REST functionality within Office 365 cited in the latest outage is all about DevOps and quality of code,” he said. “It totally looks like a DevOps issue. Remember DevOps is supposed to ensure good code quality and integration with existing code.”