The Internet goes down with us-east-1

Hey folks,

I’m sure everyone’s had their fill of Log4j findings for the weekend, like a bad holiday party with too much stuffing and rancid eggnog, so I’ll spare you any more commentary. Not that I’ve been to a holiday party like that, but one can imagine. But I digress…

Fresh off an action-packed re:Invent week, AWS topped the news again, but not for the right reasons – us-east-1 was down for several hours on Tuesday, taking a notable piece of the Internet with it. You have to feel for the AWS team; I hope they catch a breather during the holidays.

As is always the case, the incident sparked every hybrid and multi-cloud take there is – some good, some bad. Preparing for these scenarios with a failover strategy is the right thing to do. However, a lot is left to interpretation in terms of what is feasible – not just for your organization, but also for the providers you rely on.

While AWS always recommends a multi-region architecture as a best practice, there are services and support functions that don’t allow for it. For one, as I learned the hard way (among many other lessons), AWS SSO only runs in the single region that you set it up in. For another, the AWS Status Page failed to deliver accurate updates, partly because a number of global services such as CloudTrail are in fact centralized within us-east-1.

I tend to avoid the nuclear option of multi-cloud for disaster recovery purposes. It’s simply far too much work for most teams. I do advocate for multi-cloud if the workload benefits from specific advantages of one provider over another. But as the major cloud providers continue to operate at record levels of scale, it does raise the question of how much faith we put in them for critical data and services. It’s always smart to have a “backup plan”, but a strategy is only as good as the ability to execute it, so always sprinkle a dose of reality into the mix.



IAM checking these out...

The Cloudless Cloud Company – Platformonomics

This article was a fun read – seems like every couple of years we rehash the platform-as-a-service concept. On the topic of abstractions, we got some mixed signals from AWS at re:Invent – on one hand there was emphasis on moving further up the stack to solve customer needs, and on the other hand there was emphasis on primitives over frameworks. Regardless, there’s a lot of room for a healthy ecosystem building on top of the major providers.

Snaring the Bad Folks. Project by Netflix’s Cloud… | by Netflix Technology Blog | Dec, 2021 | Netflix TechBlog

The team at Netflix, always generous with sharing their practices and internal tooling, released a new open source project called Snare, a cloud detection and remediation service capable of handling tens of millions of log records per minute. The results are impressive, and even more impressive is how brave they are to enable self-remediation for certain things.

re:Invent 2021 Recap - Chris Farris

Of all the re:Invent re:Cap pieces, this was my favorite. Chris Farris goes through all the major security announcements with a one- or two-line highlight for each. Helpful and humorous at the same time.

IAM listening to this...

The Blue Mitchell Quintet – Down With It (1965, Vinyl) - Discogs

Back with another Blue Note release, this time one that still flies under the radar in my mind. I always knew that I’d pull this one out when there was a major outage to write about. Blue Mitchell was a trumpet player with a long career, putting out a steady stream of records on Riverside, Blue Note, and Mainstream from the late ’50s to the mid-’70s. This is my favorite from him, a nice mix of hard bop and modal, with one notable bossa nova track for good measure. “Perception” is my favorite track on this album, a real moody modal number.

