Matt Garman, Amazon Web Services’ CEO, took the stage in Las Vegas at AWS re:Invent 2025 before a packed keynote hall, with nearly 2 million watching online (including a crowd streaming from inside Fortnite), and delivered what might be AWS’s most consequential keynote in years. Not because of the sheer volume of announcements (there were 25 product launches crammed into the final ten minutes alone), but because of what sat underneath them: a bet that software development as we know it is about to fundamentally restructure itself around autonomous agents.
AWS re:Invent 2025 wasn’t subtle about its ambitions. Garman opened with the numbers: AWS is now a $132 billion business, growing at 20% year-over-year, with $22 billion in absolute growth over the last 12 months. That’s larger than the annual revenue of more than half the Fortune 500. Amazon Bedrock is powering AI inference for over 100,000 companies. The infrastructure story is relentless: 38 regions, 120 availability zones, 3.8 GW of data centre capacity added in the last year alone, and a private network spanning 9 million kilometres of terrestrial and subsea cable.
But the infrastructure flex was table stakes. The real story was agents, and specifically what AWS is calling frontier agents: autonomous, massively scalable, long-running systems that can work for hours or days without human intervention. Garman framed this as an inflection point, the moment when AI stops being a technical curiosity and starts delivering measurable business returns. “I believe that in the future, there’s going to be billions of agents inside of every company and across every imaginable field,” he said.
The keynote introduced three frontier agents designed to prove the concept. The Kiro autonomous agent integrates directly into development workflows, maintaining context across repositories and automating tasks like library upgrades across 15 microservices simultaneously. The AWS security agent embeds security review into the development process from design documents through to on-demand penetration testing. The AWS DevOps agent monitors incidents, correlates telemetry across multi-cloud environments, and resolves issues before engineers even log on.
Garman shared an internal Amazon story to illustrate the potential: a team originally scoped at 30 developers working for 18 months completed a significant rearchitecture with six people in 76 days using Kiro. Not a 10% or 20% efficiency gain, but something closer to a 5x productivity leap. The team’s insight wasn’t just about using the tool; it was about restructuring workflows around what agents do well: directing broad outcomes instead of babysitting tasks, scaling out concurrent workloads, and letting agents run autonomously overnight.
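A back-of-envelope check on those figures is instructive, taking the keynote’s numbers at face value. The ratio you get depends on what you count: calendar time compresses roughly 7x (18 months down to 76 days), while the person-effort gap is far larger, which may be why the “5x” framing reads conservative.

```python
# Back-of-envelope check on the keynote's Kiro anecdote, using only
# the figures Garman gave. "5x" was his framing; the raw person-month
# arithmetic tells a more extreme story.
scoped_effort = 30 * 18             # 30 developers x 18 months, in person-months
actual_effort = 6 * 76 / 30         # 6 people x 76 days, in ~person-months
print(scoped_effort)                # 540 person-months as originally scoped
print(round(actual_effort, 1))      # ~15.2 person-months actually spent
print(round(scoped_effort / actual_effort, 1))  # ~35.5x effort ratio
```

None of this says which ratio is the “right” one; it only shows how much the headline number depends on whether you measure calendar compression or total effort.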
It’s a compelling narrative, but it’s also deeply self-interested. AWS is building the infrastructure, the tooling, and the agents themselves, which means every layer of this stack generates revenue. The company announced that all of Amazon has standardised on Kiro as its official AI development environment, a signal to enterprises that AWS is eating its own dog food. But it’s also a signal that AWS sees agentic development as the next platform lock-in opportunity.
The hardware announcements were equally aggressive. Garman unveiled Trainium 3 UltraServers, now generally available, featuring the industry’s first three-nanometer AI chip in the AWS Cloud. Trainium 3 delivers 4.4x more compute than Trainium 2, 3.9x the memory bandwidth, and five times more AI tokens per megawatt of power. AWS also previewed Trainium 4, promising six times the FP4 compute performance and four times more memory bandwidth compared to Trainium 3.
This is AWS doubling down on custom silicon as a competitive moat. Garman pointed out that the majority of inference running in Amazon Bedrock today is already powered by Trainium, and that AWS has deployed over 1 million Trainium chips to date. The company ramped Trainium 2 volumes four times faster than any previous AI chip, and it’s now a multi-billion dollar business. The pitch is clear: if you’re serious about AI at scale, AWS’s vertically integrated stack (custom chips, custom networking, custom software) delivers better performance and cost efficiency than stitching together third-party components.
AWS also announced new NVIDIA-powered instances: the P6e GB300, featuring NVIDIA’s latest GB300 NVL72 systems for the most demanding AI workloads. OpenAI runs clusters of EC2 UltraServers with hundreds of thousands of chips on AWS, supporting ChatGPT and training next-generation models. NVIDIA itself runs its large-scale GenAI cluster, Project Ceiba, on AWS. These partnerships validate AWS’s operational reliability at GPU scale, something Garman emphasised repeatedly: AWS sweats the details, debugging BIOS issues to prevent GPU reboots while other providers just accept node failures as inevitable.
The model layer saw significant expansion. Amazon Nova 2 launched with four new models: Nova 2 Lite (fast, cost-effective reasoning), Nova 2 Pro (frontier-level intelligence for complex workloads), Nova 2 Sonic (real-time conversational AI), and Nova 2 Omni (the industry’s first reasoning model supporting text, image, video, and audio input with text and image output). Garman positioned Nova as delivering industry-leading price performance, with Nova 2 Lite comparing favourably to Claude Haiku, GPT-4o mini, and Gemini 2.0 Flash across instruction following, tool calling, and code generation benchmarks.
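In practice, Bedrock models like these are typically reached through the Bedrock Runtime Converse API. A minimal sketch with boto3 might look like the following; the model ID is a placeholder, not a confirmed Nova 2 identifier, so check the Bedrock console for the real one:

```python
# Minimal sketch of calling a Bedrock model via the Converse API.
# MODEL_ID is a placeholder for illustration, not a confirmed Nova 2 ID.
MODEL_ID = "amazon.nova-2-lite-v1:0"  # assumption -- verify in the Bedrock console

def build_request(prompt: str) -> dict:
    """Build the Converse API payload for a single-turn prompt."""
    return {
        "modelId": MODEL_ID,
        "messages": [
            {"role": "user", "content": [{"text": prompt}]},
        ],
        "inferenceConfig": {"maxTokens": 512, "temperature": 0.2},
    }

def ask(prompt: str) -> str:
    """Send the prompt to Bedrock and return the model's text reply.
    boto3 is imported lazily so the payload helpers work without it."""
    import boto3  # requires AWS credentials and Bedrock model access
    client = boto3.client("bedrock-runtime")
    response = client.converse(**build_request(prompt))
    return response["output"]["message"]["content"][0]["text"]

if __name__ == "__main__":
    print(build_request("Summarise this incident report.")["modelId"])
```

The same payload shape works across Bedrock models, which is part of the platform pitch: swapping Nova for Claude or another hosted model is largely a change of `modelId`.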
But the most conceptually ambitious announcement was Amazon Nova Forge, introducing what AWS calls “open training models”. Nova Forge gives customers exclusive access to Nova training checkpoints, allowing them to blend proprietary data with Amazon-curated datasets at every stage of model training. The resulting models, called novellas, deeply understand company-specific information without forgetting foundational capabilities. Reddit used Forge to build a content moderation model that finally met their accuracy and cost efficiency targets after fine-tuning existing models failed. Sony adopted Forge to increase compliance review efficiency by 100x.
This is AWS positioning itself not just as infrastructure or model provider, but as the platform where enterprises train proprietary foundation models. It’s a direct challenge to the “one model to rule them all” narrative and a bet that domain-specific models trained on company data will outperform general-purpose alternatives for mission-critical applications.
Amazon Bedrock AgentCore received major updates. AWS introduced Policy, providing real-time deterministic controls for how agents interact with enterprise tools and data. Policies are written in natural language, converted to Cedar (an open-source authorisation language), and evaluated in milliseconds before agents access APIs, Lambda functions, or third-party services. AWS also launched AgentCore Evaluations, which continuously inspects agent behaviour for quality dimensions like correctness, helpfulness, and harmfulness, with 13 pre-built evaluators and support for custom scoring.
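AWS hasn’t published the exact entity schema AgentCore uses, but Cedar itself is open source, and a raw policy of the kind the natural-language layer might compile down to looks roughly like this (the entity types and attributes here are illustrative assumptions, not AgentCore’s actual schema):

```cedar
// Illustrative Cedar policy: let a hypothetical support agent invoke a
// refund tool, but only below a spending threshold and outside production.
permit (
    principal == Agent::"support-agent",
    action == Action::"InvokeTool",
    resource == Tool::"refund-api"
)
when {
    context.amount <= 500 &&
    context.environment != "production"
};
```

Because Cedar policies are declarative and evaluated deterministically, the “milliseconds before agents access APIs” claim is plausible: authorisation is a policy lookup, not another model call.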
These aren’t just features; they’re attempts to solve the trust problem that’s keeping enterprises from deploying agents to high-stakes use cases. Garman framed it explicitly: “Most customers feel that they’re blocked from being able to deploy agents to their most valuable critical use cases.” Policy and Evaluations are AWS’s answer, giving organisations the confidence that agents will stay within defined boundaries and behave appropriately in production.
AWS also announced AI Factories: dedicated AI infrastructure deployed in customers’ own data centres for exclusive use. Each deployment operates like a private AWS region, providing access to Trainium UltraServers, NVIDIA GPUs, SageMaker, and Bedrock while meeting stringent compliance and sovereignty requirements. The concept originated from a partnership with HUMAIN, Saudi Arabia’s AI innovation company, and AWS is now offering it to government organisations and large enterprises globally.
The partnerships showcased throughout the keynote were strategic. Sony’s Chief Digital Officer, John Kodera, discussed how AWS supports 129 million PlayStation gamers and powers the Sony Engagement Platform, processing 760TB of data from over 500 sources across Sony Group. Adobe CEO Shantanu Narayen highlighted how AWS infrastructure underpins Adobe Firefly (29 billion assets generated), Acrobat Studio, and Experience Platform (35 trillion segment evaluations, 70 billion profile activations per day). Writer CEO May Habib demonstrated how her company uses AWS SageMaker HyperPod and P5 instances to train Palmyra X5, their frontier model, completing training runs in a third of the time with 90% more reliability.
These weren’t just customer testimonials; they were validation that AWS’s full-stack approach (custom silicon, managed services, agentic platforms) is being adopted by companies building the next generation of AI-powered products. Adobe integrating Bedrock guardrails and models directly into its platform, Sony using Nova Forge for compliance workflows, and Lila Sciences using AWS compute to build AI science factories all point to AWS becoming the underlying substrate for enterprise AI at scale.
AWS also launched Amazon Quick, a consumer-grade AI experience for corporate employees that brings together structured data (BI, databases), app data (Microsoft 365, Salesforce, Jira), and unstructured data (documents, files) into a unified interface with deep research capabilities and customisable Quick Flows for automating repetitive tasks. Hundreds of thousands of Amazon employees are already using Quick internally, with teams reporting tasks completed in one-tenth the time. It’s AWS’s pitch for becoming the operating system for enterprise knowledge work.
The keynote’s final sprint covered 25 product launches in ten minutes: new X-family large memory instances, C8a instances with 30% higher performance, M8azn instances with the fastest CPU clock frequency in the cloud, EC2 M3 Ultra Mac and M4 Max Mac instances, Lambda durable functions for long-running workloads, S3 maximum object size increased to 50TB, S3 Tables with Intelligent-Tiering (up to 80% cost savings), S3 Vectors for storing trillions of vector embeddings at 90% lower cost, GPU-accelerated vector indices in OpenSearch (10x faster indexing at one-fourth the cost), GuardDuty extended threat detection for ECS and EC2, unified CloudWatch data store for operational and security logs, RDS storage capacity increased to 256TB for SQL Server and Oracle, and Database Savings Plans offering up to 35% savings across all database services.
It was an overwhelming volume of announcements, deliberately so. AWS wanted to demonstrate that its pace of innovation extends far beyond AI and agents into every layer of the cloud stack. But the subtext was clear: AWS is building an integrated platform where infrastructure, custom silicon, managed services, foundation models, and autonomous agents all work together, and the commercial logic is that enterprises will find it easier to buy the entire stack from one provider than assemble equivalent capabilities from multiple vendors.
The bet AWS is making with frontier agents is that software development, security, and operations are about to be radically reorganised around autonomous systems that can work independently for extended periods. If that’s true, then AWS’s early lead in building the infrastructure, tooling, and agents themselves could be enormously valuable. If it’s not, or if enterprises remain sceptical about ceding that much autonomy to AI systems, then AWS has spent considerable capital building a platform for a future that hasn’t arrived yet.
Garman’s closing message was unambiguous: “We want you to move fast, so you have a broader set of capabilities to build for your own customers.” The implication is that AWS’s customers’ success depends on adopting these tools, and that hesitation means falling behind competitors who’ve already started restructuring their workflows around agents. It’s a sales pitch wrapped in a product vision, which is exactly what a keynote at AWS re:Invent 2025 should be. The question is whether enterprises will buy it, and whether the productivity gains AWS is promising will materialise at the scale required to justify the investment.


