AWS Adjusts EC2 Capacity Block Pricing Amid GPU Demand
Amazon Web Services has increased prices for some EC2 Capacity Blocks, potentially impacting enterprises with large-scale machine learning workloads.
Jan 6, 2026
Amazon Web Services has adjusted the pricing structure for select Elastic Compute Cloud (EC2) Capacity Blocks for machine learning offerings, with increases of approximately 15%. This change primarily affects P5 instances powered by Nvidia H100 and H200 Tensor Core GPUs. Industry experts attribute the hike to robust market demand and limited supply of high-end GPUs, leading to a scarcity premium for guaranteed compute resources. While the immediate impact on existing workloads may be minimal due to strong data gravity and complex migration processes, new AI initiatives on AWS could experience higher costs.

Amazon Web Services (AWS) has announced an updated pricing structure for certain Elastic Compute Cloud (EC2) Capacity Blocks designed for machine learning. This adjustment, which sees an increase of approximately 15%, is expected to influence enterprises managing extensive machine learning operations.
These Capacity Blocks let clients reserve access to high-performance computing resources for a predetermined future start date. Businesses can reserve accelerated compute instances in clusters ranging from one to 64 instances, incorporating up to 512 GPUs or 1024 Trainium chips. Reservations run for up to six months and can be made no more than eight weeks in advance.
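The cluster arithmetic above can be sanity-checked with a short sketch. The per-instance accelerator counts (8 GPUs or 16 Trainium chips) are inferred from the article's totals divided by the 64-instance maximum; the helper function is illustrative, not an AWS API:

```python
# Illustrative check of the EC2 Capacity Block cluster sizes described above.
# Per-instance counts (8 GPUs, 16 Trainium chips) are assumptions derived
# from the article's figures, not taken from an AWS API.

def cluster_accelerators(instances: int, per_instance: int) -> int:
    """Total accelerators in a Capacity Block reservation."""
    if not 1 <= instances <= 64:
        raise ValueError("Capacity Blocks span 1 to 64 instances")
    return instances * per_instance

# Maximum GPU cluster: 64 instances x 8 GPUs each
print(cluster_accelerators(64, 8))   # 512
# Maximum Trainium cluster: 64 instances x 16 chips each
print(cluster_accelerators(64, 16))  # 1024
```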
It is worth noting that in June of the previous year, AWS had actually reduced prices by up to 45% for EC2 Nvidia GPU-accelerated instances across its P4 and P5 offerings. AWS has not yet provided comment on this latest pricing change.
Rising Costs for P5 Capacity Blocks
The Capacity Blocks encompass various EC2 instances, including the P6 series, which utilizes the newest Nvidia Blackwell GPUs. Also included are P5 instances, powered by Nvidia H100 and H200 Tensor Core GPUs, and P4 instances, which leverage Nvidia A100 Tensor Core GPUs. The recent price increases are specifically observed across P5 Capacity Blocks.
For instance, the p5e.48xlarge instance, equipped with eight Nvidia H200 accelerators, has seen its effective hourly rate per instance in the US East (Ohio) region climb from $34.608 to $39.799, an increase of roughly 15%.
Similarly, the p5en.48xlarge instance in the same region has jumped from $36.184 to $41.612. The adjustment is consistent across several international regions, including Stockholm, London, and Spain in Europe, as well as Jakarta, Mumbai, Tokyo, and Seoul in the Asia Pacific. Customers in US West (N. California) face even higher rates, now paying $49.749 instead of $43.26 for p5e.48xlarge, and $52.015 instead of $45.23 for p5en.48xlarge.
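Taken together, the quoted figures all correspond to the same roughly 15% uplift, which can be verified directly from the per-instance hourly rates cited above:

```python
# Verify that the quoted Capacity Block price changes all reflect ~15%.
# (old, new) pairs are the per-instance hourly rates cited in the article.
old_new = {
    "p5e.48xlarge (US East, Ohio)":     (34.608, 39.799),
    "p5en.48xlarge (US East, Ohio)":    (36.184, 41.612),
    "p5e.48xlarge (US West, N. Cal.)":  (43.26, 49.749),
    "p5en.48xlarge (US West, N. Cal.)": (45.23, 52.015),
}

for name, (old, new) in old_new.items():
    pct = (new - old) / old * 100
    print(f"{name}: +{pct:.1f}%")  # each line prints +15.0%
```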
In contrast, pricing for P6e instances, such as the configuration with 72 Nvidia B200 accelerators in the Dallas Local Zone, remains unchanged at $761.904. This stability in P6e pricing suggests the adjustments are highly targeted at specific, high-demand GPU configurations.
Pareekh Jain, CEO at EIIRTrend & Pareekh Consulting, offered a clear rationale for these price increases. He stated that the most plausible explanation is rooted in supply-and-demand market dynamics. As demand for advanced H100 and H200 GPUs consistently outstrips supply, AWS is effectively imposing a scarcity premium on its guaranteed inventory. This strategy lets AWS recoup the higher infrastructure and capital expenditures of securing urgent capacity from the customers who need it, rather than spreading those costs across its entire capacity portfolio.
The Competition for Guaranteed GPU Capacity
The assurance of consistent access to GPU clusters is becoming a critical factor for enterprises. It allows them to mitigate risks in AI infrastructure planning and build resilience against potential future supply chain disruptions. Recognizing the intense demand for high-end GPUs, which has led to a scarcity of Nvidia H100 and H200 units, major cloud providers are increasingly offering guaranteed capacity solutions to their clientele.
Beyond AWS, other prominent cloud providers such as Google Cloud and Microsoft Azure offer comparable guaranteed-capacity options, though these are generally structured within more conventional reservation models and scheduling frameworks. This differentiation highlights a competitive landscape where securing scarce GPU resources is paramount.
Google Cloud, for instance, has introduced a calendar-based scheduling tool. This innovation allows customers to reserve GPU capacity in fixed blocks well in advance, a mechanism that shares similarities with the AWS Capacity Blocks. Sanchit Vir Gogia, CEO and chief analyst at Greyhound Research, noted that while the underlying function is similar, Google’s framing differs. Google integrates it into a broader resource scheduler rather than presenting it as a premium SKU, implying a competitive focus on scheduling efficiency over dynamic pricing. Moreover, Google’s ability to direct some workloads to TPUs instead of GPUs provides an added layer of system flexibility.
Microsoft Azure, conversely, places a greater emphasis on regional capacity reservations. Gogia explained that these reservations enable customers to secure specific virtual machine types within particular zones. Azure’s model tends to cater to long-term planning and significant enterprise commitments. While clients benefit from guaranteed capacity, they often incur costs for maintaining these resources, irrespective of actual usage. This represents a distinct premium, less focused on hourly rates and more on commitment duration.
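The practical difference between the two premium models described above can be made concrete with a hedged sketch. All rates and utilization figures here are hypothetical, chosen only to illustrate commitment-based versus hourly pricing:

```python
# Hypothetical comparison of the two pricing models described above.
# Rates ($/hour) and utilization are illustrative assumptions, not
# published cloud prices.

def effective_hourly(rate: float, committed_hours: float, used_hours: float) -> float:
    """Effective cost per *used* hour under a pay-regardless commitment:
    the full committed bill is spread over only the hours actually used."""
    return rate * committed_hours / used_hours

committed = 720   # a one-month (30-day) regional reservation
used = 432        # only 60% actual utilization

# At 60% utilization, a $40/hr commitment effectively costs ~$66.67
# per used hour, because the idle 40% is still billed.
print(round(effective_hourly(40.0, committed, used), 2))  # 66.67
```

This is why commitment-style reservations represent a premium tied to duration rather than hourly rate: the lower the utilization, the higher the effective cost of each hour actually consumed.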
Price Hikes and Their Limited Immediate Impact
Industry experts indicate that these specialized offerings for guaranteed GPU capacity constitute a relatively small fraction of total cloud spending. However, they command a disproportionately significant share of strategic investments in artificial intelligence initiatives. This imbalance underscores the strategic importance of these high-performance compute resources despite their limited contribution to overall cloud expenditure.
While AWS is the first to publicly announce these price increases, Gogia suggests that other cloud providers may implement similar adjustments, albeit through different mechanisms. Neither Microsoft nor Google has yet responded to inquiries about pricing strategies for these specialized services, leaving room for speculation about future market movements. The competitive landscape suggests that all major players are navigating similar supply-demand pressures.
Jain further elaborated that for the majority of enterprises, these price hikes are unlikely to prompt immediate large-scale migration of existing workloads. EC2 Capacity Blocks typically account for a modest portion of total GPU expenditure. Furthermore, factors such as strong data gravity, established MLOps stacks, stringent compliance controls, and specialized skill sets continue to firmly anchor current workloads on AWS. The complexity and time required to transition mature machine learning training stacks off AWS mean that the primary impact will likely be felt in the budgeting and deployment of new AI workloads rather than a rapid exodus of existing ones.