Amazon’s outage shows the vulnerability of interconnected technologies • Technology • Forbes México



Amazon.com’s cloud service returned to normal operations Monday afternoon, the company said, after an Internet outage that caused global upheaval among thousands of sites, including some of the web’s most popular apps like Snapchat and Reddit.

Still, some AWS services had a backlog of messages that would take a few hours to process.

AWS hosts applications and computing processes for companies around the world, and the outage knocked workers from London to Tokyo offline and prevented others from performing everyday tasks such as paying a hairdresser or changing plane tickets. On Monday afternoon, users still complained of persistent difficulties using services such as the Venmo digital wallet and the Zoom video calling platform.

It was the largest internet outage since last year’s CrowdStrike malfunction paralyzed technology systems at hospitals, banks and airports, highlighting the vulnerability of the world’s interconnected technologies.

It was at least the third time in five years that AWS’s northern Virginia cluster, known as US-EAST-1, contributed to a major Internet meltdown.


Amazon did not respond to a request for more clarity on why that particular data center continues to be affected. The problems stemmed from the Domain Name System, or DNS, where a failure prevented applications from finding the correct address for AWS’s DynamoDB API, a cloud database that applications rely on to store user information and other critical data.
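To illustrate the kind of failure described here, the following is a minimal Python sketch of how an application might look up a service address and notice that DNS resolution has failed. The function name `resolve_endpoint` and its `resolver` parameter are assumptions made for this example, not part of any AWS tool; the sketch only uses the standard `socket.getaddrinfo` call.

```python
import socket


def resolve_endpoint(hostname, port=443, resolver=socket.getaddrinfo):
    """Return the sorted unique IP addresses for hostname, or [] if the DNS lookup fails.

    `resolver` is a hypothetical injection point so the lookup can be
    replaced in tests; by default it is the standard library resolver.
    """
    try:
        results = resolver(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # The name could not be resolved: the application has no address to contact.
        return []
    # Each result is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in results})
```

An application could call `resolve_endpoint("dynamodb.us-east-1.amazonaws.com")` and treat an empty list as a signal to retry later or switch to another endpoint, rather than failing outright.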

Root cause is network health monitor

Earlier, AWS said the root cause of the outage was an underlying subsystem that monitors the health of its network load balancers, which distribute traffic across multiple servers.

The problem, AWS said, originated within the “EC2 internal network.” EC2, or Elastic Compute Cloud, is the Amazon service that provides on-demand computing capacity within AWS.

Shortly after 3 p.m. PT (22:00 GMT), Amazon said that “all AWS services returned to normal operations. Some services such as AWS Config, Redshift, and Connect continue to have a backlog of messages that they will finish processing in the next few hours.”

Ken Birman, a computer science professor at Cornell University, said software developers need to build better fault tolerance. He said AWS provides tools that developers can use to protect themselves if a problem arises in one of its many data centers, and developers can also create backups with other cloud providers.

“When people cut costs and cut corners to try to get an application, and then forget that they skipped that last step and didn’t really protect against an outage, those companies are the ones that should really be looked at later,” Birman told Reuters.
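The fault tolerance Birman describes can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not an AWS tool or a recommended production design: the function `call_with_failover` and the backend names used with it are hypothetical, and each backend is represented by a plain callable standing in for a region or a second cloud provider.

```python
def call_with_failover(backends, request):
    """Try each (name, handler) pair in order and return (name, result) from the first success.

    `backends` is a hypothetical ordered list, e.g. a primary region
    followed by a backup region or another provider.
    """
    errors = []
    for name, handler in backends:
        try:
            return name, handler(request)
        except Exception as exc:  # a real client would catch narrower error types
            errors.append(f"{name}: {exc}")
    # Every backend failed; report all the errors together.
    raise RuntimeError("all backends failed: " + "; ".join(errors))
```

The order of the list encodes the preference: the backup is contacted only when the primary raises an error, which is the step Birman says is often skipped.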

Problem originated at an AWS site known for previous outages

AWS provides computing power, data storage, and other digital services to businesses, governments, and individuals and is the world’s largest cloud provider, followed by Microsoft’s Azure and Alphabet’s Google Cloud.

Bar charts showing revenue and market share data for Amazon, Microsoft, Google, and other peers

Outages on its servers can disrupt the websites and platforms that rely on its cloud infrastructure, from food delivery apps to gaming platforms to airline systems.

AWS said on its status page that Monday’s outage originated at its US-EAST-1 location, its oldest and largest site for web services. The site also suffered outages in 2020 and 2021.

According to the documentation on the AWS website, the US-EAST-1 site is often the default region for many AWS services.


“Fragile infrastructures”

The issue highlights how interconnected everyday digital services have become and their dependence on a small number of global cloud providers, with one failure wreaking havoc on business and everyday life, experts and academics said.

“This outage once again highlights the dependence we have on relatively fragile infrastructures,” said Jake Moore, global cybersecurity advisor at European cybersecurity firm ESET.

In Britain, Lloyds Bank, Bank of Scotland and telecoms service provider Vodafone were all affected, according to Downdetector’s UK website, as was the website of HMRC, the UK’s tax, payments and customs authority.

“The main reason for this problem is that all these large companies have relied on a single service,” said Nishanth Sastry, research director at the University of Surrey’s Department of Computer Science.

Ookla, owner of Downdetector, said more than 4 million users reported problems due to the incident.

“For major enterprises, hours of cloud downtime translate into millions in lost productivity and revenue,” said Ryan Griffin, leader of the U.S. cyber practice at insurance broker McGill and Partners.

Wall Street was largely unfazed and Amazon shares rose 1.6% to $216.48.

From Snapchat to Venmo: an outage takes applications down

Ookla said at least a thousand businesses were affected by the outage. Apps such as Reddit, Roblox, Snapchat and Duolingo were all affected.

Artificial intelligence startup Perplexity, cryptocurrency exchange Coinbase, and trading app Robinhood all experienced outages on their platforms and attributed them to AWS. Amazon’s own services, including its shopping website, Prime Video and Alexa, were also affected.

Epic Games’ Fortnite, Clash Royale and Clash of Clans were among the gaming platforms affected. Uber and rival Lyft were also down in the United States.

In a post on X, Signal president Meredith Whittaker confirmed that the messaging app was hit by the outage, although billionaire Elon Musk, owner of

With information from Reuters.
