Red Hat expands partnership with AWS for enhanced enterprise-grade generative artificial intelligence

Ashesh Badani, Senior Vice President and Chief Product Officer

Red Hat has announced an expanded collaboration with Amazon Web Services (AWS) aimed at improving enterprise-grade generative AI (gen AI) capabilities on AWS. The partnership focuses on using Red Hat AI in conjunction with AWS’s custom AI chips, Trainium and Inferentia, to offer customers more options and greater efficiency for deploying production AI workloads.

According to the announcement, Red Hat will provide its AI Inference Server, built on the vLLM framework, to run on AWS's Trainium and Inferentia chips. This is intended to create a unified inference layer that can support any gen AI model. The companies state that this approach may deliver 30-40% better price performance than current GPU-based Amazon EC2 instances.
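
To make the "unified inference layer" idea concrete, the sketch below uses vLLM's offline Python API to load a model and generate text. It is a minimal illustration only: the model name, the Neuron device selector, and the tensor-parallel setting are assumptions for this example, not details from the announcement; the Red Hat AI Inference Server and vLLM documentation are the authoritative sources for backend configuration.

```python
# Minimal sketch: serving a model through vLLM's offline Python API.
# The "device" value for AWS Neuron (Trainium/Inferentia) and the
# tensor_parallel_size are assumptions for illustration only.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",  # hypothetical model choice
    device="neuron",                           # assumed Neuron backend selector
    tensor_parallel_size=2,                    # assumed split across two NeuronCores
)

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(
    ["Summarize the benefits of purpose-built AI accelerators."], params
)
print(outputs[0].outputs[0].text)
```

Because vLLM abstracts the accelerator behind a backend selection, the same generation code can in principle target GPUs or Neuron devices, which is the portability the companies are pointing to.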

The companies also collaborated to develop an AWS Neuron operator for Red Hat OpenShift platforms, which gives customers a more seamless path to running AI workloads on AWS accelerators. This integration aims to make it easier for users of Red Hat OpenShift, Red Hat OpenShift AI, and Red Hat OpenShift Service on AWS to deploy high-performance AI applications.
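
In practice, an operator like this typically exposes the accelerators to the cluster scheduler as an extended resource that workloads can request. The sketch below, using the Kubernetes Python client, shows what such a request might look like; the resource name "aws.amazon.com/neuron", the container image, and the namespace are assumptions for illustration, not details confirmed by the announcement.

```python
# Minimal sketch: asking an OpenShift/Kubernetes scheduler for a Neuron device,
# assuming the Neuron operator's device plugin advertises "aws.amazon.com/neuron".
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() when running in-cluster

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="neuron-inference-demo"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="inference",
                image="registry.example.com/ai-inference-server:latest",  # hypothetical image
                resources=client.V1ResourceRequirements(
                    # Request one Neuron device so the pod lands on an accelerator node.
                    limits={"aws.amazon.com/neuron": "1"}
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```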

Further steps include broadening access and simplifying deployment by supporting AWS's high-capacity accelerators for Red Hat customers running on AWS. Recently, Red Hat released the amazon.ai Certified Ansible Collection for its Ansible Automation Platform, designed to help orchestrate AI services on AWS.

Red Hat and AWS are also working together on upstream community contributions by optimizing an AWS AI chip plugin integrated into vLLM. As a leading commercial contributor to vLLM, Red Hat is committed to enabling this technology on AWS infrastructure. The vLLM project forms the basis of llm-d, an open source initiative now available as a supported feature in Red Hat OpenShift AI 3.

Joe Fernandes, vice president and general manager of the AI Business Unit at Red Hat, said: “By enabling our enterprise-grade Red Hat AI Inference Server, built on the innovative vLLM framework, with AWS AI chips, we’re empowering organizations to deploy and scale AI workloads with enhanced efficiency and flexibility. Building on Red Hat’s open source heritage, this collaboration aims to make generative AI more accessible and cost-effective across hybrid cloud environments.”

Colin Brace, vice president at Annapurna Labs within AWS, added: “Enterprises demand solutions that deliver exceptional performance, cost efficiency, and operational choice for mission-critical AI workloads. AWS designed its Trainium and Inferentia chips to make high-performance AI inference and training more accessible and cost-effective. Our collaboration with Red Hat provides customers with a supported path to deploying generative AI at scale, combining the flexibility of open source with AWS infrastructure and purpose-built AI accelerators to accelerate time-to-value from pilot to production.”

Jean-François Gamache of CAE noted: “Modernizing our critical applications with Red Hat OpenShift Service on AWS marks a significant milestone in our digital transformation. This platform supports our developers in focusing on high-value initiatives – driving product innovation and accelerating AI integration across our solutions. Red Hat OpenShift provides the flexibility and scalability that enable us to deliver real impact, from actionable insights through live virtual coaching to significantly reducing cycle times for user-reported issues.”

Anurag Agrawal from Techaisle commented: “As AI inference costs escalate, enterprises are prioritizing efficiency alongside performance. This collaboration exemplifies Red Hat’s ‘any model, any hardware’ strategy by combining its open hybrid cloud platform with the distinct economic advantages of AWS Trainium and Inferentia. It empowers CIOs to operationalize generative AI at scale, shifting from cost-intensive experimentation to sustainable, governed production.”

Industry research predicts increased adoption of custom silicon, such as Arm processors and specialized AI chips, by 2027 as organizations seek better performance and lower costs.

Availability is phased: the AWS Neuron community operator is available now in the Red Hat OpenShift OperatorHub, while developer preview support for running the Red Hat AI Inference Server on AWS's custom chips is expected in January 2026.

For further details about this partnership or demonstrations of these technologies in action, interested parties can visit booth #839 at AWS re:Invent 2025.


