<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" > <channel> <title>Last Week in AWS</title> <atom:link href="https://www.lastweekinaws.com/feed/" rel="self" type="application/rss+xml" /> <link>https://www.lastweekinaws.com/</link> <description>AWS News Sprinkled With Snark</description> <lastBuildDate>Wed, 13 Nov 2024 22:20:58 +0000</lastBuildDate> <language>en-US</language> <sy:updatePeriod> hourly </sy:updatePeriod> <sy:updateFrequency> 1 </sy:updateFrequency> <generator>https://wordpress.org/?v=6.6.2</generator> <image> <url>https://www.lastweekinaws.com/wp-content/uploads/2022/01/cropped-favicon-umber-512-32x32.png</url> <title>Last Week in AWS</title> <link>https://www.lastweekinaws.com/</link> <width>32</width> <height>32</height> </image> <item> <title>The Cold, Hard Truth About Your Cloud DR Strategy</title> <link>https://www.lastweekinaws.com/blog/the-cold-hard-truth-about-your-cloud-dr-strategy/</link> <dc:creator><![CDATA[Corey Quinn]]></dc:creator> <pubDate>Wed, 13 Nov 2024 22:20:55 +0000</pubDate> <category><![CDATA[Uncategorized]]></category> <guid isPermaLink="false">https://www.lastweekinaws.com/?p=14808</guid> <description><![CDATA[<p>Disaster recovery / business continuity / "backups" are always an interesting subject for very large scale cloud environments. Many of the old data-center strategies that grumpy old sysadmins (that's me!) 
relied upon don't hold water anymore.</p> <p>The post <a href="https://www.lastweekinaws.com/blog/the-cold-hard-truth-about-your-cloud-dr-strategy/">The Cold, Hard Truth About Your Cloud DR Strategy</a> appeared first on <a href="https://www.lastweekinaws.com">Last Week in AWS</a>.</p> ]]></description> <content:encoded><![CDATA[ <p>Disaster recovery / business continuity / “backups” are always an interesting subject for very large scale cloud environments. Many of the old data-center strategies that grumpy old sysadmins (that’s me!) relied upon don’t hold water anymore. I mentioned <a href="https://www.lastweekinaws.com/blog/s3-is-not-a-backup/">a couple of years ago</a> that S3 isn’t a backup, and that’s true in isolation. AWS’s vaunted “11 9’s of durability” solely apply to disk durability math; disasters, human error, and the earth crashing into the sun aren’t accounted for in that math. </p> <h2>What Cloud Providers Tell You vs. What Actually Happens</h2> <p>Cloud providers love to talk about their redundancy, their availability zones, and their durability numbers. But here’s what they don’t emphasize enough: most “disasters” aren’t actually externally triggered disasters – they’re mundane mistakes made by sleep-deprived humans who thought they were in the staging environment, or well-intentioned folks making a small configuration mistake that compounds when something else intersects with that error. </p> <h3>The Human Element: Your Biggest Threat</h3> <p>Let’s be honest: the chance that you’ll fat-finger something into oblivion is orders of magnitude more likely than the odds of a simultaneous failure across multiple AWS availability zones. You’ll delete the wrong object from a bucket, run that terrifying production script in the wrong terminal window, or (my personal favorite) discover that your production environment credentials somehow made it into your staging configuration. This is why privilege separation isn’t just a nice-to-have – it’s a must-have. 
The folks who can access your backups shouldn’t have access to production data, and vice versa. Why? Because when (not if) credentials get compromised or someone goes rogue, you don’t want them to have the keys to both your castle and your backup fortress. “Steven is trustworthy” may be well and good, but the person who exploits Steven’s laptop and steals credentials absolutely is not. </p> <h3>The Multi-Cloud Backup Conundrum</h3> <p>“But Corey,” you might say, “shouldn’t we just back up everything to another cloud provider?” Well, yes, but not for the reason you think. You should maintain a “rehydrate the business” level of backup with another provider not because it’s technically superior, but because it’s easier than explaining to your board why you didn’t when everything goes sideways. Remember: your cloud provider – and your relationship with them – remains a single point of failure. And while AWS’s durability math is impressive, it won’t help you when someone accidentally deletes that critical CloudFormation stack or when your account gets suspended due to a billing snafu. </p> <h2>The Reality of Restore Operations</h2> <p>Here’s something they don’t tell you in disaster recovery school: most restores aren’t dramatic, full-environment recoveries. They’re boring, single-object restores because someone accidentally deleted an important file or overwrote some crucial data. Embrace this reality. Design your backup strategy around it. This means: </p> <ul class="wp-block-list"> <li>Making common restore operations quick and simple </li> <li>Maintaining granular access controls </li> <li>Keeping detailed logs of what changed and when </li> <li>Testing restore procedures regularly (and not just during that annual DR test that everyone dreads) </li> </ul> <h2> The “Back Up Everything” Trap </h2> <p>Here’s a controversial opinion: backing up everything in S3 to another location is both fiendishly expensive and completely impractical. 
Instead: </p> <ul class="wp-block-list"> <li>Figure out what data actually matters to your business </li> <li>Determine different levels of backup needs for different types of data </li> <li>Don’t waste resources backing up things you can easily recreate </li> <li>Document what you’re NOT backing up (and why) so future-you doesn’t curse present-you </li> </ul> <h2>The Uncomfortable Truth About DR Planning</h2> <p>If there’s one constant about disasters, it’s that they never quite match our carefully crafted scenarios. The decision to activate your DR plan is rarely clear-cut. It’s usually made under pressure, with incomplete information, and with the knowledge that a false alarm could be just as costly as a missed crisis. </p> <p>This is why your DR strategy needs to be: </p> <ul class="wp-block-list"> <li>Flexible enough to handle partial failures </li> <li>Clear about who can make the call </li> <li>Tested regularly (and in weird ways – not just your standard scenarios) </li> <li>Documented in a way that panicked people can actually follow </li> </ul> <h2>The Bottom Line</h2> <p>Your DR strategy needs to account for both the dramatic (multi-region failures) and the mundane (someone ran `rm -rf` in the wrong directory). Build your systems assuming that mistakes will happen, credentials will be compromised, and disasters will never look quite like what you expected. 
And remember: the best DR strategy isn’t the one that looks most impressive in your architecture diagrams – it’s the one that actually works when everything else doesn’t.</p> <p>The post <a href="https://www.lastweekinaws.com/blog/the-cold-hard-truth-about-your-cloud-dr-strategy/">The Cold, Hard Truth About Your Cloud DR Strategy</a> appeared first on <a href="https://www.lastweekinaws.com">Last Week in AWS</a>.</p> ]]></content:encoded> </item> <item> <title>AWS’s Valkey Play: When a Fork Becomes a Price Cut</title> <link>https://www.lastweekinaws.com/blog/aws-valkey-play-when-a-fork-becomes-a-price-cut/</link> <dc:creator><![CDATA[Corey Quinn]]></dc:creator> <pubDate>Thu, 10 Oct 2024 15:37:38 +0000</pubDate> <category><![CDATA[Uncategorized]]></category> <guid isPermaLink="false">https://www.lastweekinaws.com/?p=14772</guid> <description><![CDATA[<p>In a move that's equal parts predictable and surprising, AWS has decided to make their Valkey-based services significantly cheaper than their Redis counterparts.</p> <p>The post <a href="https://www.lastweekinaws.com/blog/aws-valkey-play-when-a-fork-becomes-a-price-cut/">AWS’s Valkey Play: When a Fork Becomes a Price Cut</a> appeared first on <a href="https://www.lastweekinaws.com">Last Week in AWS</a>.</p> ]]></description> <content:encoded><![CDATA[ <p>In a move that’s equal parts predictable and surprising, AWS has decided to make their Valkey-based services significantly cheaper than their Redis counterparts. 
For those of you who’ve been living under a rock (or perhaps just sensibly ignoring the never-ending open-source licensing drama), Valkey is the successor fork of Redis, spearheaded by AWS in conjunction with several other players in the space and currently <a href="https://www.linuxfoundation.org/press/linux-foundation-launches-open-source-valkey-community">residing at the Linux Foundation</a>.</p> <h3 class="wp-block-heading" id="h-the-numbers-game">The Numbers Game</h3> <p>Let’s cut to the chase: AWS has <a href="https://aws.amazon.com/blogs/database/get-started-with-amazon-elasticache-for-valkey/">launched ElastiCache for Valkey</a> and <a href="https://aws.amazon.com/blogs/database/get-started-with-amazon-memorydb-for-valkey/">MemoryDB for Valkey</a> at significant discounts compared to their ElastiCache for Redis offering. These offer the same features and APIs Redis users know and… tolerate, with trivial migration – just at a lower price tag. It’s almost as if AWS discovered that Redis’ service margin was just taking up space in their massive bank vaults.</p> <h3 class="wp-block-heading" id="h-the-strategic-play">The Strategic Play</h3> <p>Here’s where it gets interesting. By slashing prices on the Valkey versions, AWS is essentially paying customers to switch – and more importantly, to start thinking of “Valkey” as something distinct from Redis. It’s a move that’s both customer-friendly and strategically brilliant, reminiscent of the Day One AWS of old. Customers get lower costs, and AWS gets to shift the ecosystem towards what’s shaping up to be the obvious Redis successor. 
It’s the cloud equivalent of having your cake and eating it too, except in this case, the cake is a third off, AWS gets to keep itself in a leadership spot with regard to a technology, and it still turns a profit on the whole thing.</p> <h3 class="wp-block-heading" id="h-the-customer-obsession-angle">The Customer Obsession Angle</h3> <p>AWS loves to tout their customer obsession, and for what feels like the first time in forever, they’ve found a way to align it perfectly with their money obsession. By offering a significant discount on feature-equivalent services, they’re giving customers a tangible benefit while simultaneously advancing their own interests. It’s a rare win-win in the typically zero-sum game of cloud economics.</p> <h3 class="wp-block-heading" id="h-comparing-apples-to-significantly-cheaper-apples">Comparing Apples to Significantly Cheaper Apples</h3> <p>This isn’t AWS’s first pricing rodeo, but it’s certainly one of their more interesting ones. Usually, AWS price cuts are about as exciting as watching paint dry – a fraction of a cent here, a microscopic percentage there, and generally in some subset of far-flung regions where relatively few customers actually run workloads. This move, however, is substantial enough to make even the most jaded Cloud Economist raise an eyebrow.</p> <h3 class="wp-block-heading" id="h-the-technical-nitty-gritty">The Technical Nitty-Gritty</h3> <p>Let’s not forget the technical side of things. Valkey isn’t just Redis with AWS’s name slapped on it. As per GitHub and as of this writing, it’s got roughly 10 times as many contributors and is hundreds of commits ahead of Redis. It’s like Redis, but with more caffeine, a bigger dev team, and a community that’s suddenly not beholden to keeping requested features stuffed behind a paywall.</p> <h3 class="wp-block-heading" id="h-the-bottom-line">The Bottom Line</h3> <p>AWS has managed to pull off something rather clever here. 
They’re providing significant cost savings to customers, pushing their strategic agenda forward, and still likely making a hefty profit. It’s a masterclass in cloud economics that doesn’t require a PhD to appreciate – just a willingness to switch to a product with a slightly sillier name.</p> <p>In the end, this move shows that even in the cut-throat world of cloud computing, there’s still room for surprises. And if those surprises come with a massive third off the price tag, well, who are we to complain?</p> <p>The post <a href="https://www.lastweekinaws.com/blog/aws-valkey-play-when-a-fork-becomes-a-price-cut/">AWS’s Valkey Play: When a Fork Becomes a Price Cut</a> appeared first on <a href="https://www.lastweekinaws.com">Last Week in AWS</a>.</p> ]]></content:encoded> </item> <item> <title>Amazon GenAI Services</title> <link>https://www.lastweekinaws.com/blog/amazon-genai-services/</link> <dc:creator><![CDATA[Corey Quinn]]></dc:creator> <pubDate>Fri, 12 Jul 2024 20:31:35 +0000</pubDate> <category><![CDATA[Uncategorized]]></category> <guid isPermaLink="false">https://www.lastweekinaws.com/?p=14668</guid> <description><![CDATA[<p>I was in New York this week for the AWS Summit, and while it’s always great to catch up with readers (thanks to those of you who came out to the drinkup!), AWS friends, and others, I found myself rather taken aback by the overwhelming strength behind the Generative AI theme of the entire event. 
[…]</p> <p>The post <a href="https://www.lastweekinaws.com/blog/amazon-genai-services/">Amazon GenAI Services</a> appeared first on <a href="https://www.lastweekinaws.com">Last Week in AWS</a>.</p> ]]></description> <content:encoded><![CDATA[ <p>I was in New York this week for the AWS Summit, and while it’s always great to catch up with readers (thanks to those of you who came out to the drinkup!), AWS friends, and others, I found myself rather taken aback by the overwhelming strength behind the Generative AI theme of the entire event.</p> <p>I know, I know–all of the industry is currently consumed by a paroxysm of GenAI hype; I’m not ignorant of this fact. However, the <a href="https://twitter.com/QuinnyPig/status/1811007182357377360">badge lanyards were GenAI branded</a>, the <a href="https://twitter.com/QuinnyPig/status/1811053439910056280">keynote opening slide simply said “Generative AI,”</a> and every single offering discussed during the ~98-minute keynote was centered around Generative AI. Nearly every expo hall booth was dripping with GenAI; <a href="https://elastic.co">Elastic</a> even went so far as to rebrand itself as “The Search AI company.”</p> <p>I am not an AI refusenik; I use it myself for a variety of tasks. That said, I’m old; “because it’s cool” (and let’s be clear, AI is very cool) fails to state a compelling business case. There has to be downstream value derived from any business initiative. Terms like “transformation” are bandied around cheaply; so far, all of the value that the businesses I’ve spoken with are deriving from GenAI initiatives represents evolutionary steps rather than revolutionary ones. </p> <p>And so, the hyperfocus on GenAI is concerning to me because of what’s being shunted aside to create room for it.</p> <p>They’re Amazon <strong>WEB</strong> Services, not Amazon GenAI Services. 
I <a href="http://www.duckbillgroup.com">fix large AWS bills</a> for large enterprises for a living; my customers have a raft of very large-scale challenges that don’t involve GenAI in the slightest. Two years ago the AWS event keynotes were full of offerings that made solid strides towards assuaging those customer concerns. This year there was none of that; the non-GenAI section of the keynote <em>simply didn’t exist</em>. </p> <p>It’s not that AWS doesn’t have things to talk about beyond GenAI; the day before the summit <a href="https://www.aboutamazon.com/news/aws/graviton4-aws-cloud-computing-chip">Graviton4 instances became generally available</a> eight months after being announced at re:Invent. This wasn’t even <em>mentioned</em> on stage, nor was anything else germane to running infrastructure.</p> <figure class="wp-block-image"><img decoding="async" src="https://lh7-us.googleusercontent.com/docsz/AD_4nXf1IZGGbdLX2CYJ-a0-_9xrNi5YGbmr4gasiUdls4YR218wreWRlBe1JRfxEvrMTwYHFfKQYDYwhN5BXijTgjb4v3YUeWeFA6-jmP7_ZUFrJcVJ780BXKyYlFhdM5_NaPXzvkMqOILkP_2gh24jEWdWRWk?key=6dezwv6WVmWlKLt99k4Bng" alt=""/></figure> <p><em>326 bites at the apple and the other two companies are still beating the pants off of you?</em></p> <p>AWS did make the time to take a cheap shot at Google and Microsoft about how AWS has had 326 AI/ML feature releases, which is “more than twice as many as the other two companies combined.” I cannot fathom who in their right mind thought that this would be a good thing to say in public. “Number of features” is a metric about which no customer cares (particularly when ‘deploying a thing to a new region’ is counted as a feature), and it’s needlessly bringing up competitors at a moment when it’s very apparent that AWS is having its corporate butt handed to it by Google and Microsoft in the GenAI space. It stank of competitor focus, insecurity, and fear. 
In short, it was emblematic of the “Day 2” thinking that many former Amazonians report has taken root inside the culture. It’s becoming increasingly apparent that when it feels threatened, Amazon will abandon the values it once held dear–and that should be terrifying enough to give anyone pause.</p> <p>The hell of it is that though they don’t seem to realize it themselves, AWS has many things to offer this space that showcase them at their best. Whether your workloads are GenAI or not, you need performant data services to feed them, and high-throughput networking to convey that data to compute services that will crunch that data to perform tasks. When it comes to infrastructure, nobody can touch Amazon’s level of operational excellence, no matter what those other companies say in their own marketing to the contrary. I don’t build workloads on AWS because I’m some kind of shill; I do it because from an infrastructure perspective they are pretty clearly the best there is.</p> <p>But as soon as you start moving up the stack into GenAI applications and GenAI assistants, Amazon’s leadership position evaporates as they begin trailing significantly behind their competition. It’s very hard to contend with a straight face that Amazon Q Developer can outcompete GitHub Copilot, and while the just-launched low-code internal app builder <a href="https://aws.amazon.com/blogs/aws/build-custom-business-applications-without-cloud-expertise-using-aws-app-studio-preview/">App Studio</a> seems promising, it remains to be seen whether it can even outcompete the now-deprecated Amazon Honeycode, let alone something like my own beloved Retool for quickly throwing together inward-facing applications. (As one example, it’s not at all clear exactly what constitutes a user-hour; does a user session time out, or does it linger forever like SageMaker Canvas’s user sessions do? 
At 25¢ per user-hour, that’ll add up once the free trial expires, and the documentation as of this writing remains utterly silent on this.)</p> <p>The longer-term problem here is that as long as AWS is going to say things that are clearly not true (“Amazon Q Developer is the most capable generative AI–powered assistant for software development”), it gives license to call into question the truth of other statements AWS makes, such as “we don’t train our AI services on customer data.” This leads to an erosion of trust, as AWS’s historical “show, don’t tell” marketing approach gives way to overhyping the bejeezus out of its offerings to the point where the truly remarkable aspects of what they’re building get lost amid a sea of noise.</p> <p>I’m worried about the future of AWS in a way I never envisioned I would be.</p> <p>The post <a href="https://www.lastweekinaws.com/blog/amazon-genai-services/">Amazon GenAI Services</a> appeared first on <a href="https://www.lastweekinaws.com">Last Week in AWS</a>.</p> ]]></content:encoded> </item> <item> <title>“Apparently I Stuttered: A Compute Optimizer Clarification”</title> <link>https://www.lastweekinaws.com/blog/apparently-i-stuttered-a-compute-optimizer-clarification/</link> <dc:creator><![CDATA[Corey Quinn]]></dc:creator> <pubDate>Wed, 26 Jun 2024 19:40:36 +0000</pubDate> <category><![CDATA[Uncategorized]]></category> <guid isPermaLink="false">https://www.lastweekinaws.com/?p=14646</guid> <description><![CDATA[<p>There have been some noises about this week's newsletter issue in which I criticized the release of AWS Compute Optimizer offering RDS recommendations...Let me clarify my position and commentary on this feature announcement.</p> <p>The post <a href="https://www.lastweekinaws.com/blog/apparently-i-stuttered-a-compute-optimizer-clarification/">“Apparently I Stuttered: A Compute Optimizer Clarification”</a> appeared first on <a href="https://www.lastweekinaws.com">Last Week in AWS</a>.</p> ]]></description> 
<content:encoded><![CDATA[ <p>There have been some noises about this week’s newsletter issue in which I criticized the release of <a href="https://aws.amazon.com/about-aws/whats-new/2024/06/aws-compute-optimizer-amazon-rds-rightsizing-recommendations-mysql-postgresql/">AWS Compute Optimizer offering RDS recommendations</a> thusly:</p> <ul> <p> Too bad it’s completely useless for most customers, because RDS only has its own bespoke Reserved Instances, which are wildly inflexible. The fact that Savings Plans don’t extend to cover RDS is one of the more customer-hostile things AWS does, and a number of large customers are annoyed by it. So yeah, use this if you want recommendations you can’t take advantage of without leaving bushels of money on the table, I guess.</p> </ul> <p>Let me clarify my position and commentary on this feature announcement.</p> <p><strong>The feature itself is fine, bordering on great.</strong> “You’re running RDS instances of type X, consider type Y instead” is a solid enhancement. For extra style points, it even supports a whole slew of customizations around the recommendations: RI awareness (which we’ll get into in a sec!), idle detection, storage, the lookback period under analysis, and integration with RDS memory metrics for deeper inspection. This is a solid feature enhancement that I’m sure will brighten the days of many customers and represents what I know to be a lot of hard work and internal negotiation to develop and launch.</p> <p>However!</p> <p>My concern with the feature is that customers are inherently limited in their ability to migrate between RDS instance types due to the inflexibility of RDS Reserved Instances and the RDS org deciding not to support Savings Plans, or even a similar structure that’s worse in every way–like SageMaker’s own imagining of Savings Plans versus supporting the existing ones. 
While this feature announcement is RI-aware and will make recommendations that take those into account, if a customer has existing high RI coverage on RDS, they may not see recommendations to downsize their over-provisioned RDS instances. </p> <p>That’s my issue: it’s not about this announcement, it’s about the capability being hamstrung by RDS RIs making this less effective than it could be–which is entirely an RDS issue, not an issue with the feature. If there were more flexibility in RDS RIs (Savings Plans!) then this feature might show substantially more optimization opportunities.</p> <p>What do I mean about RDS RI inflexibility? While the discounts can be high (up to 69% discounting off of on-demand pricing), the RIs are bound to a combination of region, database engine, instance class, and deployment type, roughly equivalent to the inflexibility we had with EC2 Standard RIs–and why Compute Savings Plans were such a massive improvement. One of the best parts about Compute Savings Plans is that it doesn’t matter whether you’re using Lambda, Fargate, EC2, what instance family you’re using, etc–as long as you’re spending at least some committed hourly spend amount, there aren’t artificial economic barriers that constrain your architectural decisions.</p> <p>This isn’t the <a href="https://www.lastweekinaws.com/blog/the-great-lie/">first time I’ve made this observation</a>, <a href="https://x.com/nocksers/status/1583597077044822017">as</a> <a href="https://x.com/nikovirtala/status/1318615778128809984">have</a> <a href="https://x.com/philsweeney/status/1285878916717010944">many</a> <a href="https://x.com/fastchicken/status/1717674710639710507">others</a>, including the vast majority of our clients privately, and I continue to give the RDS organization a remarkably low score for Customer Obsession. 
(As another example: gp3 is 20% less expensive per gigabyte than gp2 on EBS, but the per-GB cost remains the same between gp2 and gp3 on RDS.)</p> <p>In summary, if you’re running a lot of RDS on-demand (something I strongly advise folks not to do in almost every circumstance) this feature is the cat’s pajamas. If you have high RI coverage on RDS, this feature instead serves as a tantalizing glimpse of a world that the RDS org has firmly shut the door on, at least for the time being.</p> <p>The post <a href="https://www.lastweekinaws.com/blog/apparently-i-stuttered-a-compute-optimizer-clarification/">“Apparently I Stuttered: A Compute Optimizer Clarification”</a> appeared first on <a href="https://www.lastweekinaws.com">Last Week in AWS</a>.</p> ]]></content:encoded> </item> <item> <title>Changing of the Guard: “AWS Appoints Matt Garman as CEO”</title> <link>https://www.lastweekinaws.com/blog/changing-of-the-guard-aws-appoints-matt-garman-as-ceo/</link> <dc:creator><![CDATA[Corey Quinn]]></dc:creator> <pubDate>Tue, 14 May 2024 15:30:00 +0000</pubDate> <category><![CDATA[Uncategorized]]></category> <guid isPermaLink="false">https://www.lastweekinaws.com/?p=14600</guid> <description><![CDATA[<p>This morning's <a href="https://www.aboutamazon.com/news/company-news/leadership-update-aws-adam-selipsky-matt-garman">announcement</a> that Adam Selipsky would be stepping down as AWS CEO, with longtime Amazonian Matt Garman stepping into the role, feels like a natural correction. Garman has long been seen as the heir apparent to AWS's leadership. 
When Selipsky was named CEO in the last succession, my initial reaction was a baffled, "I'm sorry, who?"</p> <p>The post <a href="https://www.lastweekinaws.com/blog/changing-of-the-guard-aws-appoints-matt-garman-as-ceo/">Changing of the Guard: “AWS Appoints Matt Garman as CEO”</a> appeared first on <a href="https://www.lastweekinaws.com">Last Week in AWS</a>.</p> ]]></description> <content:encoded><![CDATA[ <p>This morning’s <a href="https://www.aboutamazon.com/news/company-news/leadership-update-aws-adam-selipsky-matt-garman">announcement</a> that Adam Selipsky would be stepping down as AWS CEO, with longtime Amazonian Matt Garman stepping into the role, feels like a natural correction. Garman has long been seen as the heir apparent to AWS’s leadership. When Selipsky was named CEO in the last succession, my initial reaction was a baffled, “I’m sorry, who?”</p> <p>To me, Selipsky has always been a bit of an enigma–extremely polished and always on-message, making it hard to see the person behind the public persona. His tenure, while marked by an impressive degree of message discipline, has been shadowed by an overemphasis on AI, obscuring other crucial areas.</p> <p>Matt Garman is a different story. Despite any faults, he embodies the spirit of AWS, bleeding Squid Ink and Amazon Orange. He has either shaped the organization or been shaped by it to the point where they are indistinguishable. For example, he once assured me that I could contact him directly if I ever felt a customer wasn’t being treated fairly, and I am perhaps recklessly confident he would still take that call today. This level of customer obsession has been notably absent at Amazon in recent years, and I am hopeful we are on the brink of its revival.</p> <p>I hope a few other things are about to change as well. 
I’ve gotten the distinct sense that whatever else it says about itself, AWS has shifted to being competitor focused at the expense of customers–and become a far less interesting company for it. Literally yesterday I had a conversation about this with a former Amazonian. We came to the reluctant conclusion that we were witnessing the long term decline of AWS into initially the number 2 cloud player in the next couple of years; neither of us had any idea how to arrest the slide.</p> <p>Today is apparently a new day–and I feel that pessimism falling away. AWS is in dire need of a shakeup–from moving past the GenAI hype and the problematic practice of launching confusing, costly secondary services rather than improving the originals, to addressing the internal power struggles reminiscent of “The Battle of Conway’s Law” playing out across their product catalog. Installing a CEO who has been with the company since his internship might just be the fresh start AWS needs.</p> <p>I remain cautiously optimistic. Managing AWS at its current $100 billion+ run rate is a colossal task, and at such scales, many corporate cultures struggle to maintain their integrity. 
Yet, with Garman at the helm, there’s a genuine chance for a return to the foundational values that made AWS the leader it’s become, hopefully steering it back to innovation and customer-centric strategies.</p> <p>The post <a href="https://www.lastweekinaws.com/blog/changing-of-the-guard-aws-appoints-matt-garman-as-ceo/">Changing of the Guard: “AWS Appoints Matt Garman as CEO”</a> appeared first on <a href="https://www.lastweekinaws.com">Last Week in AWS</a>.</p> ]]></content:encoded> </item> <item> <title>AWS’s (de)Generative AI Blunder</title> <link>https://www.lastweekinaws.com/blog/aws-degenerative-ai-blunder/</link> <dc:creator><![CDATA[Corey Quinn]]></dc:creator> <pubDate>Wed, 06 Dec 2023 19:00:00 +0000</pubDate> <category><![CDATA[Uncategorized]]></category> <guid isPermaLink="false">https://www.lastweekinaws.com/?p=14411</guid> <description><![CDATA[<p>AWS has been very publicly insecure about the perception that it’s lagging behind in the Generative AI space for the past year. Unfortunately, rather than setting those perceptions to rest, AWS’s GenAI extravaganza at re:Invent 2023 seemed to prove them true.  Of the 22 GenAI-related announcements, half of them are still in preview. Many were […]</p> <p>The post <a href="https://www.lastweekinaws.com/blog/aws-degenerative-ai-blunder/">AWS’s (de)Generative AI Blunder</a> appeared first on <a href="https://www.lastweekinaws.com">Last Week in AWS</a>.</p> ]]></description> <content:encoded><![CDATA[ <p>AWS has been <a href="https://www.theinformation.com/articles/amazon-cranks-up-its-ai-euphoria">very publicly insecure</a> about the perception that it’s lagging behind in the Generative AI space for the past year. Unfortunately, rather than setting those perceptions to rest, AWS’s GenAI extravaganza at re:Invent 2023 seemed to prove them true. </p> <p>Of the 22 GenAI-related announcements, <em>half</em> of them are still in preview. 
Many were seemingly developed in crash programs launched since ChatGPT was released a year ago. <em>All</em> distracted AWS from resolving more relevant challenges that customers are facing.</p> <h2 class="wp-block-heading" id="h-amazon-q-quartermaster-or-omnipotent-being-nbsp">Amazon Q: Quartermaster or omnipotent being? </h2> <p>Nowhere is this more evident than in <a href="https://aws.amazon.com/q/">Amazon Q</a> (currently, you guessed it, in preview).</p> <p>In case you’re blissfully unaware of the re:Invent announcement, Amazon Q is a catchall brand (similar to GitHub Copilot or AWS’s own SageMaker) that encompasses coding assistance, a chatbot in the AWS console, a chatbot on AWS’s marketing website, a natural language processing interface for QuickSight, a version for AWS Supply Chain, and several more. It’s hard to pin down just which incarnation of “Amazon Q” someone might be talking about (itself a significant problem!), but here I’m focused on the chatbot available on the website and inside the AWS Console. </p> <p>The Amazon Q chatbot is the equivalent of Microsoft Clippy without the personality, but add a traumatic brain injury. It now shows up on every AWS page load. It periodically <a href="https://x.com/QuinnyPig/status/1730378799257354653">opens a giant chat window that obscures content</a> and requires multiple mouse clicks to minimize. 
It’s an omnipresent icon in the lower right of the page that cannot be removed short of using a browser extension.</p> <p>However bad you might guess Amazon Q is, the hour I spent plying it with questions demonstrates that it’s oh-so-very much worse than that.</p> <h2 class="wp-block-heading" id="h-amazon-q-s-challenges">Amazon Q’s … challenges</h2> <p>In one short session playing with Amazon Q, it has given me false information about:</p> <p><span style="text-decoration: underline;">historical AWS price hikes:</span></p> <div class="wp-block-image"> <figure class="aligncenter size-full"><img fetchpriority="high" decoding="async" width="728" height="1004" src="https://www.lastweekinaws.com/wp-content/uploads/2023/12/image10.jpg" alt="" class="wp-image-14420" srcset="https://www.lastweekinaws.com/wp-content/uploads/2023/12/image10.jpg 728w, https://www.lastweekinaws.com/wp-content/uploads/2023/12/image10-218x300.jpg 218w" sizes="(max-width: 728px) 100vw, 728px" /><figcaption class="wp-element-caption">None of these are true.</figcaption></figure></div> <p><span style="text-decoration: underline;">how to configure AWS services:</span></p> <div class="wp-block-image"> <figure class="aligncenter size-large"><img decoding="async" width="687" height="1024" src="https://www.lastweekinaws.com/wp-content/uploads/2023/12/font-687x1024.jpg" alt="" class="wp-image-14414" srcset="https://www.lastweekinaws.com/wp-content/uploads/2023/12/font-687x1024.jpg 687w, https://www.lastweekinaws.com/wp-content/uploads/2023/12/font-201x300.jpg 201w, https://www.lastweekinaws.com/wp-content/uploads/2023/12/font.jpg 730w" sizes="(max-width: 687px) 100vw, 687px" /><figcaption class="wp-element-caption">Just like Amazon itself, DynamoDB doesn’t support unions.</figcaption></figure></div> <p><span style="text-decoration: underline;">unannounced upcoming region launches:</span></p> <div class="wp-block-image"> <figure class="aligncenter size-full"><img decoding="async" width="500" height="500" 
src="https://www.lastweekinaws.com/wp-content/uploads/2023/12/bryan-cantrill.jpg" alt="" class="wp-image-14418" srcset="https://www.lastweekinaws.com/wp-content/uploads/2023/12/bryan-cantrill.jpg 500w, https://www.lastweekinaws.com/wp-content/uploads/2023/12/bryan-cantrill-300x300.jpg 300w, https://www.lastweekinaws.com/wp-content/uploads/2023/12/bryan-cantrill-150x150.jpg 150w, https://www.lastweekinaws.com/wp-content/uploads/2023/12/bryan-cantrill-350x350.jpg 350w" sizes="(max-width: 500px) 100vw, 500px" /><figcaption class="wp-element-caption">Unlike Amazon Q, I am not a complete moron.</figcaption></figure></div> <p><span style="text-decoration: underline;">how <a href="https://www.duckbillgroup.com/services/aws-contract-negotiation/">AWS contractual discounts work</a> (which, by the way, AWS doesn’t even publicly acknowledge <strong>exist</strong>):</span></p> <div class="wp-block-image"> <figure class="aligncenter size-full"><img decoding="async" width="700" height="1004" src="https://www.lastweekinaws.com/wp-content/uploads/2023/12/image6.jpg" alt="" class="wp-image-14429" srcset="https://www.lastweekinaws.com/wp-content/uploads/2023/12/image6.jpg 700w, https://www.lastweekinaws.com/wp-content/uploads/2023/12/image6-209x300.jpg 209w" sizes="(max-width: 700px) 100vw, 700px" /><figcaption class="wp-element-caption">Every single one of these points is wrong in some way.</figcaption></figure></div> <p><span style="text-decoration: underline;">which services see dismal customer adoption:</span></p> <div class="wp-block-image"> <figure class="aligncenter size-large is-resized"><img decoding="async" width="508" height="1024" src="https://www.lastweekinaws.com/wp-content/uploads/2023/12/image9-508x1024.png" alt="" class="wp-image-14421" style="aspect-ratio:0.49609375;width:454px;height:auto" srcset="https://www.lastweekinaws.com/wp-content/uploads/2023/12/image9-508x1024.png 508w, https://www.lastweekinaws.com/wp-content/uploads/2023/12/image9-149x300.png 149w, 
https://www.lastweekinaws.com/wp-content/uploads/2023/12/image9.png 710w" sizes="(max-width: 508px) 100vw, 508px" /><figcaption class="wp-element-caption">AWS releases usage numbers as frequently<br>as it releases wolves onto the streets of Seattle</figcaption></figure></div> <p>… and so many other things that AWS employees would never ever tell me, even under the strictest of NDAs.</p> <p>In fact, if an AWS employee were to answer my questions the way Amazon Q does, I imagine Andy Jassy would parachute into the room to personally fire them on the spot. Instead, AWS has given Amazon Q a place of prominence on every webpage and promoted it heavily. </p> <p>Lest you think I’ve somehow subverted the service to grind an ax, let me give you a poignant example. This image shows a very reasonable request that a customer might ask and a very questionable response from Amazon Q:</p> <figure class="wp-block-image size-full"><img decoding="async" width="1024" height="653" src="https://www.lastweekinaws.com/wp-content/uploads/2023/12/image3.png" alt="" class="wp-image-14422" srcset="https://www.lastweekinaws.com/wp-content/uploads/2023/12/image3.png 1024w, https://www.lastweekinaws.com/wp-content/uploads/2023/12/image3-300x191.png 300w, https://www.lastweekinaws.com/wp-content/uploads/2023/12/image3-768x490.png 768w" sizes="(max-width: 1024px) 100vw, 1024px" /><figcaption class="wp-element-caption">Note the extreme level of prompt engineering knowhow <br>that went into crafting this question.</figcaption></figure> <p>A few things to note about Amazon Q’s answer:</p> <p>1. The specific instance recommendation is incorrect. If cost is your primary concern, r7iz instances are not where you should be looking.</p> <p>2. The general sweep of instance recommendations is incorrect. If you want to save money per node, that’s one of the strongest selling points for AWS’s Graviton processors, which AWS has been heavily promoting for several years.</p> <p>3. 
This image is a screenshot that was posted to the official Amazon Q <a href="https://aws.amazon.com/blogs/aws/amazon-q-brings-generative-ai-powered-assistance-to-it-pros-and-developers-preview/">launch blog post</a>.</p> <p>After I pointed these flaws out to AWS, it silently updated the blog post image to this version:</p> <div class="wp-block-image"> <figure class="aligncenter size-full is-resized"><img decoding="async" width="795" height="1024" src="https://www.lastweekinaws.com/wp-content/uploads/2023/12/image1.png" alt="" class="wp-image-14423" style="aspect-ratio:0.7763671875;width:621px;height:auto" srcset="https://www.lastweekinaws.com/wp-content/uploads/2023/12/image1.png 795w, https://www.lastweekinaws.com/wp-content/uploads/2023/12/image1-233x300.png 233w, https://www.lastweekinaws.com/wp-content/uploads/2023/12/image1-768x989.png 768w" sizes="(max-width: 795px) 100vw, 795px" /><figcaption class="wp-element-caption">Also observe that the “Max 200 words” notation has been <br>updated to “Max 1000 characters” as the team <br>frantically polishes the service post-release.</figcaption></figure></div> <p>Note that the AWS team still explicitly has to tell Amazon Q that it wants Graviton instances — something that customers are not likely to know that they have to ask for. To its credit, Amazon Q correctly notes that r6g instances are less expensive than the newer r7g instances, which amusingly is an observation that AWS has been adamant about challenging on the grounds of price for performance.</p> <p>My own business partner, thinking I was simply exaggerating for effect, made some legitimate attempts to communicate with Amazon Q himself, with now-predictable results. And he’s not the only one, either. 
Other people have gotten incorrect answers <a href="https://twitter.com/mr_woop/status/1731206036579610953">about SES</a>, <a href="https://twitter.com/matteosonoioo/status/1731283364424864165">the history of AWS services launches</a>, and Amazon Q <a href="https://twitter.com/awsexp/status/1731768666364989923">refused to compare AWS to Azure compute</a>. There’s also this gem, where Amazon Q makes up all sorts of BS about my company, <a href="https://www.duckbillgroup.com/services/">The Duckbill Group</a>!</p> <div class="wp-block-image"> <figure class="aligncenter size-large is-resized"><img decoding="async" width="408" height="1024" src="https://www.lastweekinaws.com/wp-content/uploads/2023/12/image8-408x1024.png" alt="" class="wp-image-14424" style="width:437px;height:auto" srcset="https://www.lastweekinaws.com/wp-content/uploads/2023/12/image8-408x1024.png 408w, https://www.lastweekinaws.com/wp-content/uploads/2023/12/image8-120x300.png 120w, https://www.lastweekinaws.com/wp-content/uploads/2023/12/image8.png 674w" sizes="(max-width: 408px) 100vw, 408px" /><figcaption class="wp-element-caption">Literally not a single correct <br>piece of information.</figcaption></figure></div> <p>Perhaps even more worrisome is that re-running the query yielded completely different results!</p> <div class="wp-block-image"> <figure class="aligncenter size-full is-resized"><img decoding="async" width="384" height="256" src="https://www.lastweekinaws.com/wp-content/uploads/2023/12/image4.png" alt="" class="wp-image-14425" style="aspect-ratio:1.5;width:440px;height:auto" srcset="https://www.lastweekinaws.com/wp-content/uploads/2023/12/image4.png 384w, https://www.lastweekinaws.com/wp-content/uploads/2023/12/image4-300x200.png 300w" sizes="(max-width: 384px) 100vw, 384px" /><figcaption class="wp-element-caption">I’m OK with this answer, though.</figcaption></figure></div> <p>I know there’s a small army of people who worked on these releases. 
So before you think I’m punching down: I believe AWS’s headlong rush to GenAI, dropping the ball in the process, is a violation of its own leadership principles, <strong>not</strong> the fault of individual employees on the team. The company failed them; they didn’t fail the company. In the process, AWS has failed its customers. We’re all worse off for this.</p> <p>On the plus side, given its prominent placement and positioning, it sure does seem like I should be able to take everything Amazon Q says as an official statement on behalf of AWS, which is actually an upgrade over Amazon’s own public relations. It’s truly an official AWS spokesmodel (hat tip to Forrest Brazeal for <a href="https://x.com/forrestbrazeal/status/1730270865789456730?s=20">coining</a> that amazing turn of phrase).</p> <h2 class="wp-block-heading" id="h-chatgpt-still-reigns-supreme">ChatGPT still reigns supreme</h2> <p><a href="https://twitter.com/keithmattix/status/1731137114224361731">Keith Mattix has observed</a> that customer expectations for chatbots such as these are now sky-high thanks to just how great ChatGPT is. Three years ago, we might have all dismissed Amazon Q as another silly, overly-simple chatbot attempt. But ChatGPT changed the world, and Amazon Q makes it look like Amazon didn’t notice, didn’t care, or shipped it anyway.</p> <p>In fact, that’s what makes all this so galling: For each of these examples, ChatGPT answers the same prompt <strong>correctly</strong>. It’s clearly possible to solve the types of problems that Amazon Q is struggling with — but those solutions take significant time and investment, plus an incubation period that may not align with Amazon’s annual conference. </p> <p>AWS made a choice here. 
In so doing, it has cemented its reputation for now as a company that cannot deliver on Generative AI at anything above an infrastructure level (but happily tries to mislead its customers to believe otherwise).</p> <h2 class="wp-block-heading" id="h-genai-lies-outside-aws-s-current-competency">GenAI lies outside AWS’s current competency</h2> <p>I’m somewhat sympathetic to the position that AWS found itself in that led to this point. There’s been pressure from market analysts for AWS to demonstrate that it, along with everyone else, is in fact a generative AI company.</p> <p>The problem is that from where I sit, AWS simply isn’t good at these sorts of large-scale, cross-functional, higher-order problems. The things AWS is excellent at all revolve around innovation of infrastructure. </p> <p>Run the GPUs required for other companies’ GenAI plays, improve the data storage services and the network to make that better … but as soon as AWS attempts to move up the stack into the application space, the wheels fall off in major ways. It requires a competency that AWS does not have and has not built up since its inception. </p> <p>It leaves AWS stuck in the unenviable position of being the rails atop which the <a href="https://github.com/features/copilot">actually exciting companies</a> in this space ride. Amazon seemingly either cannot or will not accept this, and I do get that; the ultimate fate of “plumbing” companies is to become like today’s Tier 1 backbone providers: incredibly important to making everything else work but largely invisible to most people. More than attention, the problem for these companies is that it’s the folks building on top of them that capture the lion’s share of the value delivered. </p> <p>As a result, Amazon apparently torpedoed what could have been a great 2023 product roadmap to focus on things that, frankly, are not good and don’t advance AWS’s reputation. 
Amazon can talk as much as it wants about how it’s been doing AI and ML for the past 20 years, but it’s obvious that everything AWS announced in the GenAI space last week was less than a year in the making.</p> <h2 class="wp-block-heading" id="h-an-aws-spokesmodel-declined-to-comment-for-this-article">An AWS spokesmodel declined to comment for this article</h2> <div class="wp-block-image"> <figure class="aligncenter size-full is-resized"><img decoding="async" width="732" height="638" src="https://www.lastweekinaws.com/wp-content/uploads/2023/12/image2.png" alt="" class="wp-image-14426" style="aspect-ratio:1.1473354231974922;width:450px;height:auto" srcset="https://www.lastweekinaws.com/wp-content/uploads/2023/12/image2.png 732w, https://www.lastweekinaws.com/wp-content/uploads/2023/12/image2-300x261.png 300w" sizes="(max-width: 732px) 100vw, 732px" /></figure></div> <h2 class="wp-block-heading" id="h-nbsp-amazon-s-prepared-statement"> Amazon’s prepared statement</h2> <p>Shortly before this article was published, presumably in response to <a href="https://www.platformer.news/p/amazons-q-has-severe-hallucinations">Platformer’s reporting of Amazon Q leaking confidential information</a>, AWS sent us this prepared statement.</p> <blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"> <p>“Some employees are sharing feedback through internal channels and ticketing systems, which is standard practice at Amazon. No security issue was identified as a result of that feedback. We appreciate all of the feedback we’ve already received and will continue to tune Q as it transitions from being a product in preview to being generally available. 
Amazon Q has not leaked confidential information.”</p> </blockquote> <p></p> <p>The post <a href="https://www.lastweekinaws.com/blog/aws-degenerative-ai-blunder/">AWS’s (de)Generative AI Blunder</a> appeared first on <a href="https://www.lastweekinaws.com">Last Week in AWS</a>.</p> ]]></content:encoded> </item> <item> <title>Generative AI Builds a re:Invent Scavenger Hunt</title> <link>https://www.lastweekinaws.com/blog/generative-ai-builds-a-reinvent-scavenger-hunt/</link> <dc:creator><![CDATA[Corey Quinn]]></dc:creator> <pubDate>Wed, 22 Nov 2023 15:30:00 +0000</pubDate> <category><![CDATA[Uncategorized]]></category> <guid isPermaLink="false">https://www.lastweekinaws.com/?p=14383</guid> <description><![CDATA[<p>Let's begin with the tl;dr: At this year's re:Invent, I'm hosting a photo scavenger hunt with significant prizes for "most items found" and "most creative entry." Sign up through my webapp at findme.lastweekinaws.com. The rest of this post details how I built this app.</p> <p>The post <a href="https://www.lastweekinaws.com/blog/generative-ai-builds-a-reinvent-scavenger-hunt/">Generative AI Builds a re:Invent Scavenger Hunt</a> appeared first on <a href="https://www.lastweekinaws.com">Last Week in AWS</a>.</p> ]]></description> <content:encoded><![CDATA[ <p>Let’s begin with the tl;dr: At this year’s re:Invent, I’m hosting a photo scavenger hunt with significant prizes for “most items found” and “most creative entry.” Sign up through my webapp at <a href="https://findme.lastweekinaws.com">findme.lastweekinaws.com</a>. The rest of this post details how I built this app.</p> <h2 class="wp-block-heading" id="h-i-am-bad-at-programming">I am bad at programming</h2> <p>I’ve often said that the two programming languages in which I’m fluent are “brute force” and “enthusiasm.” Never have I felt that to be more true than when wandering into the mists of TypeScript. Even the name is terrible! 
If it were honest about how developers write code today (copying it off of Stack Overflow), it would be called PasteScript instead. It’s a variant of JavaScript that whines at you for the most pedantic of reasons, like a <em>Star Wars</em> fan on the internet telling you that it’s actually GNU/Linux.</p> <p>All of this is to say that I am absolutely not a frontend developer, and I’m bad at it when I foolishly pretend otherwise. And yet this app exists due to a confluence of three factors: AWS Amplify, Generative AI, and Danny Banks. I will get to each of those three in turn.</p> <h2 class="wp-block-heading" id="h-aws-amplify">AWS Amplify</h2> <p>Amplify seems designed for the opposite of me. I’m a backend infrastructure guy, clueless about the frontend universe. Amplify allows you to define backend infrastructure on AWS (including through their no-code <a href="https://aws.amazon.com/amplify/studio/">Amplify Studio</a>) and presents it as components for your JavaScript framework of choice–provided that it’s React, apparently.</p> <p>This is, in my case, more helpful than that description sounds. Without Amplify I would have undoubtedly been pushed into a place of rolling my own API calls, done so insecurely, and been left in a place of embarrassing myself more than I usually do, which is via the words coming out of my mouth where other people can hear them.</p> <p>I hit my fair share of snags along the way; I’ve given the Amplify product team roughly two dozen distinct pieces of feedback about the travails I encountered. I’m not going to rehash them here, for the simple reason that it doesn’t seem quite fair. As I write this, they’re in the midst of a <a href="https://aws.amazon.com/blogs/mobile/join-us-for-a-week-of-aws-amplify-launches/">week of Amplify launches</a>; it’s probable that some of the painful parts I slammed into will no longer be an issue before you read these words. 
Further, a lot of the issues I smacked into are doubtless caused by my frontend naivety, and are the digital equivalent of me mouthing off about the laptop you’ve sold me because it makes for a crappy hammer.</p> <p>Once I grasped the basics, Amplify was a significant simplifying factor to building this thing. It’s worth watching in the months ahead.</p> <h2 class="wp-block-heading" id="h-generative-ai">Generative AI</h2> <p>I didn’t start in a vacuum; that’s a terrible idea. One of the tools in my toolbelt is increasingly ChatGPT (pronounced “Chat Gippety”). It’s a wellspring of creativity for me; I view it as something of a “writers’ room” contributor to some forms of content generation. You can think of it as a creative springboard.</p> <p>That said, I think you’re doing yourself a disservice if you pass over creative control of things to generative AI. If you let it speak on your behalf without reviewing its output first you’re almost certainly acting a fool at this point, and you’re risking your reputation. But where it shines is in breaking a task into steps, building a methodical approach, and in its sheer breadth of coverage.</p> <p>I also use GitHub’s Copilot and Copilot Chat. I strongly suspect that one of the reasons that engineers are raving about the future of generative AI more so than most folks is that, unlike human languages, programming languages are themselves syntactically rigid. 
There’s far less ambiguity, meaning that computers are far more effective at writing code than they are writing customer service responses.</p> <p>With a combination of these two products, I was able to start with a general approach of questions (“How do I build a webapp?”), refine what I want to do further (“How do I build a webapp in React that uses AWS Amplify as the backend?”), and ultimately have it spit out a prototype (“Write the code for a webapp that authenticates users, offers a list of scavenger hunt items, and lets each user upload a picture for the item that they’ve found.”). It wasn’t nearly as smooth as that might make it sound; over the course of this experiment, I asked ChatGPT dozens of questions and spent more time last week talking to Copilot than I did my wife. But by the end of the process, I had something that worked–but also had a few flaws.</p> <p>The most obvious of those is one you might run into yourself if you attempt to build a computer program entirely via copying and pasting code from Stack Overflow, which is basically an industry-wide best practice at this point. They don’t call them “Full Stack Overflow developers” for nothing. Each code segment you slap in with your meaty paste-paws is written by someone else; it uses different programming idioms, it’s stylistically different than what surrounds it, and you end up with a Frankenstein’s monster of a program by the time you’re done. So it “worked,” but it also looked like something that was put together by multiple developers who weren’t allowed to talk to each other. 
That only works when you’re attempting to build something out of several AWS services.</p> <p>Clearly, something had to be done to fix this, lest I end up in a death spiral of unmaintainability.</p> <p>And that’s where Danny Banks factors in.</p> <h2 class="wp-block-heading" id="h-danny-banks">Danny Banks</h2> <p>I asked for a bit of best practice guidance along with assistance resolving a particularly odd error in the <a href="https://aws.amazon.com/developer/community/community-builders/">AWS Community Builders</a> Slack team, and was directed to AWS Principal Design Technologist <a href="https://dbanks.design/">Danny Banks</a>. Danny is, to put it quite simply, a mensch. He helped me work through the error I was encountering–but he didn’t stop there. Presumably, because he didn’t want ever again to have to see something as eye-searingly awful as the code I’d written, he set about refactoring my “technically, it does run without immediately erroring out” code into something that’s almost elegant. I want to be very clear here: he had no insight or input into the <em>content</em> of this ridiculous scavenger hunt, and I would be most displeased to learn that his instinct to help a customer was in any way held against him. I repeat: he is a mensch.</p> <p>More than that, he’s a perfect example of what many have been saying around the generative AI space: senior engineers aren’t being replaced by computers. Rather, they’re being accelerated by robot helpers, but their judgment and experience still factor heavily into making systems usable. All jokes aside, you’re not likely to achieve the result you’re after by simply copying and pasting other people’s code into place–and when something doesn’t work the way you’d expect, it’s very hard to give an AI assistant the context of the entire scope of the project and problem that you’re working on. 
This is the programmer’s expression of the trouble with plagiarizing something for an assignment: you won’t be able to talk about it in any real depth.</p> <h2 class="wp-block-heading" id="h-seek-and-ye-shall-find">Seek and ye shall find</h2> <p>Overall, I’m very pleased with how this played out. I learned a lot along the way, and this has gone a long way towards shoring up my enthusiasm for applying Generative AI to modern principles, my belief that there’s still a core of Customer Obsession running through Amazon, and that every AWS service is in fact for someone–even though I don’t fit the customer profile for an awful lot of them.</p> <p>Now please: enjoy re:Invent, and happy hunting!</p> <p>The post <a href="https://www.lastweekinaws.com/blog/generative-ai-builds-a-reinvent-scavenger-hunt/">Generative AI Builds a re:Invent Scavenger Hunt</a> appeared first on <a href="https://www.lastweekinaws.com">Last Week in AWS</a>.</p> ]]></content:encoded> </item> <item> <title>How to Stop Feeding AWS’s AI With Your Data</title> <link>https://www.lastweekinaws.com/blog/how-to-stop-feeding-awss-ai-with-your-data/</link> <dc:creator><![CDATA[Corey Quinn]]></dc:creator> <pubDate>Wed, 08 Nov 2023 15:30:00 +0000</pubDate> <category><![CDATA[Uncategorized]]></category> <guid isPermaLink="false">https://www.lastweekinaws.com/?p=14361</guid> <description><![CDATA[<p>AWS may be using your data to train its AI models, and you may have unwittingly consented to it. 
Prepare to jump through a series of complex hoops to stop it.</p> <p>The post <a href="https://www.lastweekinaws.com/blog/how-to-stop-feeding-awss-ai-with-your-data/">How to Stop Feeding AWS’s AI With Your Data</a> appeared first on <a href="https://www.lastweekinaws.com">Last Week in AWS</a>.</p> ]]></description> <content:encoded><![CDATA[ <p>AWS has been making a lot of <a href="https://aws.amazon.com/generative-ai/">noise about generative AI</a>, emphasizing risk mitigation and the need for control over your data.</p> <p>Unlike its competitors, AWS doesn’t train its models on “the entire internet, regardless of various intellectual property restrictions.” This is laudable! (Though unlike the other large cloud providers, AWS currently doesn’t offer indemnification against intellectual property infringement claims when using its AI services, but that’s not the bone I wish to pick today.)</p> <p>What I want to draw your attention to is the hidden catch AWS clearly hopes you won’t notice: It <em>is</em> training its AI services on your usage of a subset of their own cloud services.</p> <h2 class="wp-block-heading" id="h-aws-s-data-paradox">AWS’s data paradox</h2> <p>AWS has long treated your data as sacrosanct, and I’ve found it very hard to argue with the company on this point. They don’t snoop into your S3 buckets to see what data you’re hosting, nor do they tailor the AWS customer console experience to individual customers in any meaningful way (even when they perhaps should!).</p> <p>This position supports AWS’s leadership principles of Earning Trust and Customer Obsession. AWS even specifically commits to Transparency on its <a href="https://aws.amazon.com/machine-learning/responsible-ai/">Responsible AI</a> page, stating that it’s “Communicating information about an AI system so stakeholders can make informed choices about their use of the system.”</p> <p>The truth is that AWS is training its AI models on your use of a subset of their services. 
Moreover, it’s been doing this for <a href="https://www.theregister.com/2020/07/15/aws_ai_data/">quite a while</a>.</p> <p>As per <a href="https://aws.amazon.com/service-terms/">AWS’s Service Terms</a>, the data it hoovers up for its own AI training applies to Amazon CodeGuru Profiler, Amazon CodeWhisperer Individual, Amazon Comprehend, Amazon Lex, Amazon Polly, Amazon Rekognition, Amazon Textract, Amazon Transcribe, and Amazon Translate.</p> <p>Fortunately, this is clearly disclosed when you first use these services.</p> <p>Ha ha! I am, of course, joking.</p> <p>Rather than presenting its AI training on your usage to you front and center, so you can make an informed decision, this little nugget of trivia is buried deep within the terms of service. A quick spot check of several of these services shows that this disclosure isn’t presented to the user in any meaningful way. AWS also suffers an overdose of irony by stating in its terms of service, “You will not, and will not allow any third-party to, use the AI Services to, directly or indirectly, develop or improve a similar or competing product or service.”</p> <p>Even should you be fine with volunteering for a $1.252 trillion company, you probably want to make sure that you notify your own customers that some of this data will be processed outside of the regions your account operates within. This is very much “check with your attorneys if this might be a problem for your business” territory.</p> <h2 class="wp-block-heading" id="h-the-process-to-opt-out-of-aws-s-ai-model-training">The process to opt out of AWS’s AI model training</h2> <p>OK, OK. You’re aware of the issue <em>now</em>, and you realize you don’t want to let AWS get free training from your use of their services, because there’s very clearly value there for someone, and it’s most unlikely they’re going to give you that value for free in return for your training data. 
Or maybe you’re worried about it taking that data outside of the region you thought it was bound within. So, like most sensible companies, you want to opt out of this.</p> <p>Good thing there’s a clearly marked organizationwide opt-out switch in the console.</p> <p>Ha ha! I am, of course, joking again.</p> <p>If you want to opt your organization out of AWS’s AI training, you first have to enable AI opt-out policies in your org, which is a switch flip in the console.</p> <p>Next, Amazon has <a href="https://docs.aws.amazon.com/organizations/latest/userguide/orgs_manage_policies_ai-opt-out.html">modified its own management policy language</a>, so you have to go look some stuff up unless you want to paste random things in all william-nilliam. You need to craft your own policy; Amazon gives an example that’s polluted with “helpful” annotations, which means you cannot copy and paste the example itself, in which you opt out of everything org-wide. Here it is, all cleaned up:</p> <pre class="wp-block-code"><code>{ "services": { "@@operators_allowed_for_child_policies": ["@@none"], "default": { "@@operators_allowed_for_child_policies": ["@@none"], "opt_out_policy": { "@@operators_allowed_for_child_policies": ["@@none"], "@@assign": "optOut" } } } }</code></pre> <p>AWS helpfully gives additional opt-out examples, as if there exists a universe in which you’d want to give Amazon access to <em>some</em> of your AI service usage for free, but not all of it. This is a great example of busywork for whoever was tasked with creating these examples that will never, ever be used.</p> <p>As of this writing, you can opt out of the above-mentioned services, <a href="https://docs.aws.amazon.com/organizations/latest/userguide/orgs_manage_policies_ai-opt-out_syntax.html">as well as</a>: Supply Chain by Amazon, Amazon Chime SDK Voice Analytics, Amazon DataZone, Amazon Connect, Amazon Fraud Detector, Amazon GuardDuty, Amazon QuickSight Q, and Amazon Security Lake. 
None of these services is explicitly named in section 50.3 of the Amazon service terms, so it makes me wonder why they were left out. I’ll give the benefit of the doubt and assume simple oversight, but then again, lawyers aren’t exactly known for missing things.</p> <p><em>“Now, that wasn’t so bad …”</em> , you might be thinking.</p> <p>Slow down there, hasty pudding. You are nowhere near done.</p> <p>Next, you have to take that policy you crafted and attach it to your organization’s root OU. “What the hell is that?” asks the obviously small minority of readers who don’t have a Directory Services background (that minority of folks who didn’t grow up having nightmares about LDAP rounds to “nearly everybody”). You get to hunt around in the console a bit to figure out how this works; alternately, you can use a handy <a href="https://github.com/gblues/aws-ml-opt-out">Terraform module</a> or <a href="https://blog.karims.cloud/2020/08/09/aws-ai-opt-out-copy.html">Python script</a> if that’s more your style.</p> <p>Lastly, you have to validate that the assembly of all the various policies in your org does the actual thing that you want, as the complexity of this policy language means that the interplay might not work out exactly the way you expect.</p> <p>Finally, a few hefty steps and much research time later, you’ve opted your org out of AWS’s AI training. Probably. I don’t see a way to validate that this policy is doing what you think it’s doing.</p> <h2 class="wp-block-heading" id="h-a-departure-from-aws-principles">A departure from AWS principles</h2> <p>Why is this not a simple switch flip in the console? It clearly <em>could be</em>, making me wonder if this whole rigamarole is intentional. AWS’s approach here, far from being customer-obsessed, trustworthy, or transparent, seems to be mired in obfuscation and self-interest. 
It feels to me to be decidedly underhanded.</p> <p>As it stands, AWS may be using your data to train its AI models, and you may have unwittingly consented to it. If you wish to prevent or stop this, be prepared to jump through a series of complex hoops.</p> <p>The post <a href="https://www.lastweekinaws.com/blog/how-to-stop-feeding-awss-ai-with-your-data/">How to Stop Feeding AWS’s AI With Your Data</a> appeared first on <a href="https://www.lastweekinaws.com">Last Week in AWS</a>.</p> ]]></content:encoded> </item> <item> <title>The New Frontier of Cloud Economics: Why AWS Costs Are a Weighty Issue</title> <link>https://www.lastweekinaws.com/blog/the-new-frontier-of-cloud-economics-why-aws-costs-are-a-weighty-issue/</link> <dc:creator><![CDATA[Corey Quinn]]></dc:creator> <pubDate>Wed, 25 Oct 2023 14:30:00 +0000</pubDate> <category><![CDATA[Uncategorized]]></category> <guid isPermaLink="false">https://www.lastweekinaws.com/?p=14335</guid> <description><![CDATA[<p>AWS re:Invent looms larger on the calendar with each passing day, promising not just an avalanche of new services but also--let's face it--some truly perplexing names. However, the oddity of AWS service names is low-hanging fruit. The true enigma lies in their labyrinthine pricing dimensions.</p> <p>The post <a href="https://www.lastweekinaws.com/blog/the-new-frontier-of-cloud-economics-why-aws-costs-are-a-weighty-issue/">The New Frontier of Cloud Economics: Why AWS Costs Are a Weighty Issue</a> appeared first on <a href="https://www.lastweekinaws.com">Last Week in AWS</a>.</p> ]]></description> <content:encoded><![CDATA[ <p>AWS re:Invent looms larger on the calendar with each passing day, promising not just an avalanche of new services but also–let’s face it–some truly perplexing names. However, the oddity of AWS service names is low-hanging fruit. 
The true enigma lies in their labyrinthine pricing dimensions.</p> <p>Gone are the days when “instance hours” and “gigabyte months” were the only units of cloud measurement we had to worry about. Now, we grapple with constructs like “I/O Operations” and “LCUs,” which depend on multiple dimensions tied to your choice of Load Balancer. The downside of these obscure metrics is not their complexity per se, but their disconnect from any intuitive understanding of application resource consumption. More often than not, the only way to gauge what a workload will cost on AWS is to deploy it, let it run, and then consult your bill. This is hardly sustainable.</p> <p>Amazon isn’t completely oblivious to this struggle. Earlier this year, they launched <a href="https://aws.amazon.com/about-aws/whats-new/2023/05/amazon-aurora-i-o-optimized/">Amazon Aurora I/O Optimized</a>, a variation that eliminates the fee-per-million-I/O-requests for a modest 30% increase in instance cost. The simplicity in pricing was enough to attract customers like Open Raven, who not only achieved better cost predictability but also <a href="https://aws.amazon.com/blogs/database/unlocking-cost-optimization-open-ravens-journey-with-amazon-aurora-i-o-optimized-leads-to-60-savings/">enjoyed a 60% reduction in their expenses</a>.</p> <p>In anticipation of the next wave of pricing complexity, I aim to outdo AWS by inventing an even more absurd pricing metric: the Terabyte-Pound-Month.</p> <p>AWS’s lineup includes physical devices like Snowcones, which fit on your desk; Snowballs, which fit in your office; Snowmobiles, which fit in your loading dock; and Outposts, which fit in your data centers. These offerings come with a fixed monthly charge, storage capacity, and–since they’re physically shipped–a weight. 
Disregarding the anomaly of the first month’s charges, let’s examine cost through the lens of weight, yielding our new metric: the Terabyte-Pound-Month.</p> <figure class="wp-block-table"><table><thead><tr><th>Offering</th><th>Cost per Terabyte-Pound-Month</th></tr></thead><tbody><tr><td>Snowcone</td><td>$4.17</td></tr><tr><td>Snowball</td><td>$0.23</td></tr><tr><td>Snowmobile</td><td>$0.0000735294</td></tr><tr><td>Outpost Rack (OR-HUZEI16)</td><td>$0.90</td></tr></tbody></table></figure> <p>Before you take these numbers as absolute gospel, please note that they are representative configurations; for numbers specific to your exact requirements, please reach out to your AWS Account Team with what is guaranteed to be the most surreal request they get all year. Also be sure to note your operating altitude, as these values are normalized for the product weights at sea level.</p> <p>By standardizing costs in terms of weight, AWS customers can now more easily calculate their monthly storage cost-to-weight ratio.
This new metric will undoubtedly make it simpler for customers to plan their infrastructure–even down to ensuring their floors can withstand the hardware they’re deploying.</p> <p>I’ll <a href="https://requinnvent.com">see you at re:Invent</a>, where I can’t wait to learn more.</p> <p>The post <a href="https://www.lastweekinaws.com/blog/the-new-frontier-of-cloud-economics-why-aws-costs-are-a-weighty-issue/">The New Frontier of Cloud Economics: Why AWS Costs Are a Weighty Issue</a> appeared first on <a href="https://www.lastweekinaws.com">Last Week in AWS</a>.</p> ]]></content:encoded> </item> <item> <title>The Missed Opportunity: AWS, re:Invent, and the Community That Cared</title> <link>https://www.lastweekinaws.com/blog/the-missed-opportunity-aws-reinvent-and-the-community-that-cared/</link> <dc:creator><![CDATA[Corey Quinn]]></dc:creator> <pubDate>Tue, 17 Oct 2023 16:58:29 +0000</pubDate> <category><![CDATA[Uncategorized]]></category> <guid isPermaLink="false">https://www.lastweekinaws.com/?p=14284</guid> <description><![CDATA[<p>The AWS re:Invent session tracker leaves much to be desired, a point that many in the community have lamented for years. Its glaring shortcomings range from the absence of a calendar view to a lackluster search function and the inability to share links to individual sessions. Frustrated attendees have long been in need of a better solution, and several community members rose to the challenge.</p> <p>The post <a href="https://www.lastweekinaws.com/blog/the-missed-opportunity-aws-reinvent-and-the-community-that-cared/">The Missed Opportunity: AWS, re:Invent, and the Community That Cared</a> appeared first on <a href="https://www.lastweekinaws.com">Last Week in AWS</a>.</p> ]]></description> <content:encoded><![CDATA[ <pre class="wp-block-code"><code>An Update to this post on 10/18/2023: Immediately after Luc's original tweet went out, I was in contact with Amazon. 
A few hours later, AWS provided this statement: > AWS has confirmed that they later clarified with these two community developers to remove session capacity information from their session guides. They confirmed they did this because the room capacities will change as they make final room decisions based on session demand. In my personal opinion, while I appreciate the public statement and the private apologies I've heard they've given to Luc and Raphael, this could have been entirely avoided by simply asking for what they wanted (session capacity information removed) rather than what they asked for (taking down the entire site). This mistake on Amazon's part has caused a lot of harm that can't be fixed by an apology, unfortunately. As of October 17, 2023, <a href="https://reinvent-23.vercel.app/">Raphael's session tracker</a> is back up; however, Luc's is permanently gone, as he deleted the database and backups out of an abundance of caution combined with the very understandable panicked reaction when a trillion-dollar company scares the bejeezus out of you. I still believe the timing of the takedown notice is suspicious (leading to the trackers being taken offline just in time for the initial rush of registrations). And then there is the untold damage AWS has done to its own community relations, with the inherent chilling effect that this boneheaded foray into "threatening your most passionate community members" is doubtless going to have. As of this writing, the public catalog API returns session capacity information, which is specifically what AWS incorrectly claimed was only available via credentialed access–and what started this whole fiasco in the first place.</code></pre> <p>The AWS re:Invent session tracker leaves much to be desired, a point that many in the community have lamented for years. Its glaring shortcomings range from the absence of a calendar view to a lackluster search function and the inability to share links to individual sessions.
Frustrated attendees have long been in need of a better solution, and several community members rose to the challenge. Notable contributors include AWS Serverless Hero <a href="https://x.com/donkersgood">Luc van Donkersgoed</a> and <a href="https://twitter.com/RaphaelManke">Raphael Manke</a>, who have developed alternative session viewers to enhance the conference experience. These third-party solutions were thriving until AWS abruptly pulled the plug this morning with a barrage of Cease & Desist emails.</p> <p>Here’s the text of one such email:</p> <blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"> <p>We were made aware of the version of the re:Invent schedule you made available at https://github.com/donkersgoed/reinvent-2023-session-fetcher. Per our AWS Site Terms, this is not an authorized use of AWS site content. We gate information that you made available here for registered attendees only, and would like to request that you remove the content.</p> <p>Thank you, AWS events customer support</p> </blockquote> <p>(Yes, that’s a cease & desist email; there’s no standard form, and the threat of lawyers ruining your year need not be explicit. AWS isn’t really “requesting” here.)</p> <p>This move is bewildering on multiple fronts, displaying not just a failure in customer service and a complete turnabout on the Leadership Principle of “Customer Obsession,” but also raising serious questions about AWS’s priorities and understanding of its own systems and community dynamics.</p> <p>Firstly, let’s address the fact that one of the recipients of these letters, Luc, is an AWS Serverless Hero. This title signifies an individual’s leadership within the AWS community, yet AWS’s approach here is impersonal and intimidating. Instead of discussing the matter privately, they issue a generic legal warning. 
This type of interaction undoubtedly makes other Community Heroes reconsider whether their volunteer efforts for a trillion-dollar enterprise are well-placed.</p> <p>Next, AWS’s assertion that the information is “gated” seems off the mark. The APIs used by these alternative trackers are publicly accessible, contradicting the claim that the data is restricted to registered attendees. This blunder highlights failures not only in customer communication but also in basic security protocols.</p> <p>Finally, it’s worth scrutinizing AWS’s lack of progress on their official session scheduler, which has seen no significant improvement in nearly a decade. If part-time hobbyists can outperform a trillion-dollar company in delivering a superior customer experience, it brings AWS’s commitment into question. One would assume that helping conference attendees effectively plan their experience would be a priority. AWS appears unbothered by its own shortcomings but takes issue with third parties filling the gaps. This action doesn’t just display a lack of customer obsession; it also raises the question: what harm, exactly, is AWS suffering here?</p> <p>By shutting down these community-driven solutions, AWS is not just shooting itself in the foot, it’s disheartening its most engaged and active users. 
A reconsideration of this decision would serve not just AWS’s reputation, but also the community that supports it.</p> <p>The post <a href="https://www.lastweekinaws.com/blog/the-missed-opportunity-aws-reinvent-and-the-community-that-cared/">The Missed Opportunity: AWS, re:Invent, and the Community That Cared</a> appeared first on <a href="https://www.lastweekinaws.com">Last Week in AWS</a>.</p> ]]></content:encoded> </item> <item> <title>The Cloud Devil You Know</title> <link>https://www.lastweekinaws.com/blog/the-cloud-devil-you-know/</link> <dc:creator><![CDATA[Corey Quinn]]></dc:creator> <pubDate>Wed, 11 Oct 2023 14:30:00 +0000</pubDate> <category><![CDATA[Uncategorized]]></category> <guid isPermaLink="false">https://www.lastweekinaws.com/?p=14272</guid> <description><![CDATA[<p>My Route53 database is humming along nicely, my podcast interview backlog is full, and I've outsourced my thinking to ChatGPT, so I have some unprecedented free time to build a side project. Awesome! What cloud provider should I use?</p> <p>The post <a href="https://www.lastweekinaws.com/blog/the-cloud-devil-you-know/">The Cloud Devil You Know</a> appeared first on <a href="https://www.lastweekinaws.com">Last Week in AWS</a>.</p> ]]></description> <content:encoded><![CDATA[ <p>My Route53 database is humming along nicely, my <a href="https://screaminginthecloud.com">podcast interview backlog</a> is full, and I’ve outsourced my thinking to ChatGPT, so I have some unprecedented free time to build a side project. Awesome! What cloud provider should I use?</p> <p>The obvious and correct answer is “the one you’re already familiar with,” which for me and many others is AWS (due in no small part to their 5-year head start). 
But if AWS isn’t an option for whatever reason, and we turn the decision into an open field, a whole mess of questions arise.</p> <h2 class="wp-block-heading" id="h-reliability">Reliability</h2> <p>To start, I’m going to care if the potential provider’s offering is down for large swaths of time. Predicting that is deceptively challenging. Even with AWS, if you search for common outage terms, you’ll see complaining on various forums, dire warnings to avoid us-east-1, and news articles that make it sound like the entire environment is held together with spit and baling wire. I assure you, there’s no data center you’re going to build with anything approaching cloud provider availability—and that’s unlikely to be a design goal for most use cases. But if you’re picking a new provider for which you have no track record, you can’t reasonably make that assertion.</p> <h2 class="wp-block-heading" id="h-provisioning-time">Provisioning Time</h2> <p>One of the best things about AWS is that it made provisioning rapidly an expectation rather than the exception. I was recently reminded of this when paying for a cloud-hosted gaming PC through another company (because it’s Starfield quarter for me): spinning up the computer took roughly an hour. “Oh, right,” I said. “This used to be the norm.” For a long time, Akamai had what I’ve dismissively referred to as a “Jason API,” since changes apparently were turned into a ServiceNow ticket for some guy named Jason to take action on. That’s kind of a problem when it comes to rapid iteration, autoscaling, and having to jump through hoops to reconfigure things when you realize you’ve gone down a wrong path and need to change some stuff. If setting up a new database instance takes three hours, I’m not going to be making a whole lot of changes—so the database I start with is the one I’m keeping. 
If the RDS team is suddenly feeling uncomfortable about how long it takes to restore a database from a snapshot, good: it’s been the single longest downtime-causing part of any VPC migration project I’ve ever done.</p> <h2 class="wp-block-heading" id="h-pricing-and-billing">Pricing and Billing</h2> <p>The problem with cloud pricing isn’t necessarily the dimensions they charge for, but rather that there are so many of them—and how they interact isn’t easy to predict. Over time, you gain a familiarity with them where you can expect a small EC2 instance to cost you $7 or so per month, and you know that 10GB of storage will cost you about a quarter in S3. You can sort of assume that the other providers are pricing reasonably similarly, but there’s a big difference between “safe assumption” and “bet the financial viability of your new business.”</p> <h2 class="wp-block-heading" id="h-how-it-fails">How It Fails</h2> <p>When an AWS service takes an outage, the way it manifests is pretty well known. You start seeing things not work, the AWS status page remains green, Down Detector shows a sharp spike in error rates, social media starts buzzing, seasons pass as summer becomes winter becomes summer again, the AWS status page shows “increased error rates,” and so on. It’s frustrating, but you know what’s going on. Services don’t assure you that they have your transactions safely recorded and then drop them on the floor and AWS’s recovery processes don’t turn your production environment into a pristine parking lot. The key to safely running something is to know how it degrades when it breaks, and the only way to learn that is over long periods of time.</p> <h2 class="wp-block-heading" id="h-trust-and-safety">Trust and Safety</h2> <p>If your provider thinks something suspicious is going on, how are they going to handle it? 
Are they going to <a href="https://fortune.com/2016/08/26/cloud-computing-lessons-google/">turn your entire environment off</a> if one of your hosts gets compromised? If a payment doesn’t go through because a credit card expires, are they going to reach out so you can fix it or disable your site after the first transaction fails to clear? If user-generated content violates their rules, are they going to reach out to you about your bad actor customer, or are they going to assume that all of your users are de facto “you and your company” and treat you as the problem?</p> <h2 class="wp-block-heading" id="h-their-own-providers">Their Own Providers</h2> <p>Not only do you have to run this gauntlet of evaluating a provider, but the provider has to evaluate their own vendors through the same lens. For example, Wasabi took <a href="https://www.bleepingcomputer.com/news/security/wasabi-cloud-storage-service-knocked-offline-for-hosting-malware/">a thirteen-hour outage</a> due to choosing GoDaddy for domain services instead of a real company that understands how business works. A Wasabi customer uploaded ToS-violating content, and rather than following a communication process with the customer, GoDaddy decided to turn off their entire domain for thirteen hours. This could have been avoided via the simple rule of not doing enterprise-scale business with a company whose name contains the word “Daddy,” but that’s only obvious the second time.</p> <h2 class="wp-block-heading" id="h-community">Community</h2> <p>One of the greatest evaluation criteria to use is quite simply the strength of the provider’s community. That encompasses a lot, but can be distilled down to “if I’m attempting to do something, can I find blog posts and guidance from this provider’s customers on how to do this thing?” Eventually you’re going to encounter something where the answer is “no, I cannot.”
If it’s attempting to use their DNS service as a globally distributed database, it’s probably a good sign that you Should Not Be Doing whatever the hell it is you’re attempting to do, because there’s almost certainly a better way. If it’s “put three web servers behind a load balancer,” you’re going to start wondering if the “cloud” you’re evaluating has any customers whatsoever. You don’t want to spend your time trailblazing things that have long since become undifferentiated heavy lifting via other providers.</p> <p>More importantly, you want to make sure that there are lots of other customers using the provider you choose. Yes, I trust that my $450 a month in AWS spend is valued by the company, but there’s also something reassuring about having giant, multinational banks depending upon the same services that I’m using when it comes to lighting a fire under a provider’s urgency about service disruption. This also unlocks community knowledge, the sorts of things that “everybody knows” about a provider— or at least, they sure do the second time. “I’m starting out with AWS, what should I be aware of?” means you’re about to get absolutely firehosed off of your chair with the deluge of tips you’re about to receive. The one thing you won’t hear is “what’s AWS?”</p> <h2 class="wp-block-heading" id="h-conclusion">Conclusion</h2> <p>Collectively, these points create a very high bar for a relatively new provider to surmount. Some providers have cleared this bar. Google Cloud has, after a shaky first few years. So has Azure if you <a href="https://www.insiderintelligence.com/content/microsoft-azure-confronts-security-challenges">don’t give a single damn about security</a>. But other providers remain a significant question mark in many of these categories. 
Most of us don’t want to have to think about all of these categories all over again; we’d rather save our energies for our own unique business challenges rather than conducting vendor evaluations that are, frankly, exhausting. And so the big cloud providers get bigger, and the gap widens between the hyperscalers and everyone else.</p> <p>The post <a href="https://www.lastweekinaws.com/blog/the-cloud-devil-you-know/">The Cloud Devil You Know</a> appeared first on <a href="https://www.lastweekinaws.com">Last Week in AWS</a>.</p> ]]></content:encoded> </item> <item> <title>Why Your CPU-Based Utilisation Metric is Absolute Nonsense</title> <link>https://www.lastweekinaws.com/blog/why-your-cpu-based-utilisation-metric-is-absolute-nonsense/</link> <dc:creator><![CDATA[Corey Quinn]]></dc:creator> <pubDate>Wed, 13 Sep 2023 14:30:00 +0000</pubDate> <category><![CDATA[Uncategorized]]></category> <guid isPermaLink="false">https://www.lastweekinaws.com/?p=14184</guid> <description><![CDATA[<p>Picture this: You’re in your swivel chair, feet propped up on your standing desk because you are a glorious acrobat, and you’re looking over your company’s Amazon EC2 fleet utilization report. You’re captivated by the custom colorful dashboard, carefully tuned to a 1st-grade reading level. You see the overall number in its soft, non-threatening font, […]</p> <p>The post <a href="https://www.lastweekinaws.com/blog/why-your-cpu-based-utilisation-metric-is-absolute-nonsense/">Why Your CPU-Based Utilisation Metric is Absolute Nonsense</a> appeared first on <a href="https://www.lastweekinaws.com">Last Week in AWS</a>.</p> ]]></description> <content:encoded><![CDATA[ <p>Picture this: You’re in your swivel chair, feet propped up on your standing desk because you are a glorious acrobat, and you’re looking over your company’s Amazon EC2 fleet utilization report. You’re captivated by the custom colorful dashboard, carefully tuned to a 1st-grade reading level. 
You see the overall number in its soft, non-threatening font, and you say to yourself, “We’re operating at 70% capacity — we’re golden!”</p> <p>That metric is usually nothing more than a feel-good security blanket that doesn’t give you better insight into the efficiency of your spend. Why? Because the number at the top of your report is simply your CPU utilization, which is being passed off as a standalone metric for fleet utilization. It’s the cloud’s equivalent of ordering a pizza based on the box’s size, without giving a second thought to what’s actually inside.</p> <p>What’s more disturbing is the number of cost optimization vendors who similarly adhere to that metric without context; it’s in danger of becoming a de facto “best practice” that will lead you down the primrose path.</p> <h2 class="wp-block-heading" id="h-the-limits-of-cpu-utilization-metrics">The limits of CPU utilization metrics</h2> <p>CPU utilization tells you, perhaps obviously, how much of your CPU resources are being used at any given time. Any cloud provider can easily query, “How busy is the compute on this instance?”</p> <p>In a theoretical world where disk, RAM, burst capacity, network throughput, and latency are all irrelevant, then — yes! OK! terrific! — the question of utilization would strictly come down to how many CPU cores can you throw at the problem, and then using CPU as a proxy for utilization is great. If that’s you, stop reading now, go <a href="https://store.lastweekinaws.com/">buy some Last Week in AWS mugs</a> or something, and carry on with your charmed existence. If it isn’t you, keep reading.</p> <p>Huh, we just lost a couple HPC folks and several of the more naive analyst firms, but the rest of you are all still here. Imagine that …</p> <p>For the rest of us, the problem is that CPU utilization is a single data point that doesn’t tell much of a story. 
It’s impossible to say at first glance whether your CPU numbers are worrying or an indicator that all is well, regardless of what the actual numbers are.</p> <p><br>High CPU usage could mean:</p> <ul class="wp-block-list"> <li>your applications are working efficiently, or</li> <li>they’re straining under the load, desperately crying out for relief.</li> </ul> <p>Low CPU usage could mean:</p> <ul class="wp-block-list"> <li>your instances are idling, wasting precious cloud dollars,</li> <li>your applications are well-optimized and aren’t CPU bound, or</li> <li>you need idle capacity to burst into when a bunch of your users all show up at once.<br></li> </ul> <p>The CPU utilization metric blissfully ignores other critical aspects of your instances’ operation, such as network activity, disk I/O, and memory usage. A high CPU usage with low network activity could signal a performance bottleneck that leads to a data-starved instance, or it could be an application that barely needs to talk to other things on the internet. A low CPU usage with high memory utilization could mean your application is inefficiently coded, or that it’s a database that lives in RAM for latency purposes.</p> <h2 class="wp-block-heading" id="h-the-risks-of-relying-on-cpu-metrics">The risks of relying on CPU metrics</h2> <p>This reductionism of cloud instance health to CPU utilization stems from its ease of access. It’s readily available, easy to measure, and undeniably simplistic to interpret. Cloud providers can grab it via API, slap it onto a pretty graph, and voila, they’ve got themselves a utilization report. And the resulting CPU metric seems to level the playing field to reason about workloads that are remarkably diverse, making it easier to benchmark yourself against other companies (<a href="https://www.duckbillgroup.com/blog/why-benchmarks-miss-the-mark-for-aws-spend/">which you <em>should not</em> do</a>). 
But easy access doesn’t equal quality insight.</p> <p>Take a look at the fact that a c7g.large instance in EC2 is about <a href="https://aws.amazon.com/ec2/pricing/on-demand/">6% more expensive</a> than a c6g.large instance. Amazon points out that the price/performance of that instance means you get improved price/performance, but that assumes an awful lot of things about your workload. If you need a cluster of 10 nodes to chew on a problem because that’s how your application works, then your cluster just got 6% more expensive if you upgrade to the latest generation — without a clear upside benefit that accrues to you.</p> <h2 class="wp-block-heading" id="h-how-to-actually-determine-your-fleet-utilization">How to actually determine your fleet utilization</h2> <p>A nuanced approach, taking into account a bouquet of metrics including network I/O, disk read/write speeds, and memory usage alongside CPU utilization, provides a holistic picture of your cloud instance fleet. Those metrics require a lot more insight into the environment and, in the case of memory, an agent running on the actual instances themselves. Cloud providers <em>could</em> deliver these kinds of nuanced reports, but the effort required from them is likely too high.</p> <p>So, next time you’re in your swivel chair, resist the temptation to rely solely on the CPU utilization column. Dive deeper, venture beyond, and ask probing questions. In so doing, uncover the true health of your server fleets. Because in the world of cloud economics, ignorance isn’t bliss; it’s just expensive.</p> <p>The post <a href="https://www.lastweekinaws.com/blog/why-your-cpu-based-utilisation-metric-is-absolute-nonsense/">Why Your CPU-Based Utilisation Metric is Absolute Nonsense</a> appeared first on <a href="https://www.lastweekinaws.com">Last Week in AWS</a>.</p> ]]></content:encoded> </item> </channel> </rss>