Amazon Web Services’ Elastic Compute Cloud (EC2) wobbled in Sydney on Tuesday.
The timing was bad – the outage struck on the eve of AWS’ Sydney Summit, at which the faithful gathered to continue their embrace of all things cloudy and Amazonian.
But the wobble was – perversely – actually pretty good news for AWS, for a few reasons.
One was that the problem only affected “some non-Nitro instances in a single Availability Zone in the AP-SOUTHEAST-2 Region.” Nitro is AWS’ new hardware that separates security and compute functions. It was announced only in late 2017 and rolled out after that. So this problem only hit older, less-resilient infrastructure from an earlier cloud age.
Second, it only took out some instances, which tells us that AWS was able to isolate the problem.
Third, the cause of the problem was a “network connectivity issue”, which was probably not entirely within AWS’ control. Note, however, that a single connectivity issue taking out instances suggests that redundancy measures were not effective. Which isn’t good.
Fourth, the issue was resolved in just over 90 minutes. And even during the problem, it was possible to manually restore connectivity to affected instances.
Fifth, the problem only hit one of the multiple availability zones in the region. So those who use multiple availability zones, which is known good practice, would not have experienced issues.
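The multi-availability-zone practice mentioned above can be sketched in a few lines. This is a minimal illustration of the idea only – `spread_across_zones` is a hypothetical helper, not an AWS API, though the zone names match the Sydney region's real naming scheme.

```python
def spread_across_zones(instance_ids, zones):
    """Assign instances round-robin across availability zones, so that
    losing any single zone leaves the rest of the fleet running."""
    if not zones:
        raise ValueError("need at least one availability zone")
    return {iid: zones[i % len(zones)] for i, iid in enumerate(instance_ids)}

# Six instances spread across the three Sydney zones (illustrative IDs)
placement = spread_across_zones(
    ["i-1", "i-2", "i-3", "i-4", "i-5", "i-6"],
    ["ap-southeast-2a", "ap-southeast-2b", "ap-southeast-2c"],
)

# If one zone browns out, two thirds of the fleet survives
survivors = [i for i, z in placement.items() if z != "ap-southeast-2a"]
```

The point of the round-robin spread is exactly the scenario in this incident: a single-zone failure degrades capacity rather than taking the whole service down.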
Lastly, AWS’ RSS feed for EC2 in the AP-SOUTHEAST-2 Region records that the previous incident in the region was on 5 June 2016. Or 1059 days before this incident.
So maybe the timing wasn’t so bad after all: on the eve of the Summit, users were shown that a fraction of AWS’ old infrastructure suffered a limited brownout, for the first time in almost three years! Which could make this incident more of an advertisement for the cloud than something to worry about …