Technology major IBM and Advanced Micro Devices (AMD) have joined forces to offer AMD Instinct MI300X accelerators as a service on IBM Cloud.
This AI accelerator service is expected to be available in the first half of 2025.
With the new offering, IBM and AMD aim to boost performance and efficiency for generative artificial intelligence (AI) models and high-performance computing applications for business customers.
The partnership will integrate support for AMD Instinct MI300X accelerators into IBM’s watsonx AI and data platform, along with AI inferencing capabilities in Red Hat Enterprise Linux.
AMD Instinct MI300X accelerators, featuring 192GB of HBM3, are designed to handle large-scale model inferencing and fine-tuning tasks.
The large memory capacity of the AMD Instinct MI300X accelerators can help customers run larger models with fewer GPUs, potentially cutting inferencing costs.
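As a rough illustration of the memory argument, the sketch below estimates the minimum number of accelerators needed just to hold a model's weights. Only the 192GB figure comes from the announcement. The 70-billion-parameter model, the 16-bit precision, and the 80GB comparison device are hypothetical assumptions, and the function name is illustrative. The estimate ignores the KV cache, activations, and runtime overhead, so real deployments need more headroom.

```python
import math

HBM_PER_MI300X_GB = 192  # HBM3 capacity cited in the article


def min_gpus_for_weights(params_billions, bytes_per_param, hbm_gb=HBM_PER_MI300X_GB):
    """Lower bound on accelerators needed to hold the model weights alone.

    Assumes 1 GB = 1e9 bytes, so billions of parameters times bytes per
    parameter gives the weight footprint in GB directly.
    """
    weights_gb = params_billions * bytes_per_param
    return math.ceil(weights_gb / hbm_gb)


# Hypothetical 70B-parameter model at 16-bit precision: 140 GB of weights.
print(min_gpus_for_weights(70, 2))             # 192 GB accelerator -> 1
print(min_gpus_for_weights(70, 2, hbm_gb=80))  # hypothetical 80 GB accelerator -> 2
```

Under these assumptions, the weights fit on a single 192GB accelerator but would need to be split across two 80GB devices, which is the kind of saving the companies are pointing to.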
The integration is expected to provide watsonx clients with enhanced AI infrastructure to scale workloads across hybrid cloud environments.
Generative AI inferencing workloads involve computational tasks using trained generative AI models to produce outputs such as text, images, audio, or video.
These workloads encompass processes where live data is fed into models to generate content, predictions, or solutions.
They typically require significant computational power and efficiency to handle complex operations, especially in real-time applications.
AMD executive vice president and chief commercial officer Philip Guido said: “As enterprises continue adopting larger AI models and datasets, it is critical that the accelerators within the system can process compute-intensive workloads with high performance and flexibility to scale.
“Our collaboration with IBM Cloud will aim to allow customers to execute and scale Gen AI inferencing without hindering cost, performance or efficiency.”
IBM Cloud general manager Alan Peacock said: “Leveraging AMD’s accelerators on IBM Cloud will give our enterprise clients another option to scale to meet their enterprise AI needs, while also aiming to help them optimize cost and performance.”
Earlier in November 2024, reports surfaced that AMD plans to cut its global workforce by 4%, affecting around 1,000 employees.
The move aims to bolster AMD’s position in the AI chip market to better compete with industry leader NVIDIA.