AI Generated by Fortune India
OpenAI, Broadcom unveil first custom AI inference chip; target deployment by end-2026 after nine-month development cycle
June 24, 2026, 21:18 IST

Jalapeño, OpenAI’s first in-house AI accelerator, is designed specifically for large language model inference and could improve compute efficiency as the company seeks greater control over its AI infrastructure stack.

OpenAI has unveiled its first custom-designed AI inference chip, marking a significant step in the company's efforts to reduce reliance on third-party hardware and gain greater control over the infrastructure powering its rapidly expanding artificial intelligence services.

The chip, named Jalapeño, was developed in partnership with semiconductor giant Broadcom and systems manufacturer Celestica. OpenAI said engineering samples are already running machine-learning workloads in laboratory environments, including GPT-5.3 Codex Spark, with commercial deployment targeted by the end of 2026.

The announcement comes as AI companies race to secure computing capacity amid soaring demand for generative AI applications. Inference—the process of generating responses from trained AI models—has emerged as one of the industry's biggest cost centres as services such as ChatGPT scale to hundreds of millions of users.

Built specifically for AI inference

Unlike general-purpose AI accelerators adapted for multiple workloads, Jalapeño has been designed from the ground up for large language model (LLM) inference. OpenAI said the architecture was optimised around compute, memory, networking and data-movement requirements observed across products such as ChatGPT, Codex and its API platform.

OpenAI has not yet disclosed benchmark results, but early testing indicates the processor could deliver significantly higher performance per watt than current state-of-the-art alternatives. A detailed technical performance report is expected in the coming months.

“Jalapeño is part of our long-term full-stack infrastructure strategy to make compute more abundant, resulting in AI which is faster, more reliable, more affordable for people and businesses,” said Greg Brockman, president and co-founder of OpenAI.

Richard Ho, who leads OpenAI’s hardware programme, said the chip was designed using insights gathered from close collaboration between OpenAI researchers and engineering teams. “Based on early testing, Jalapeño will efficiently execute our most important workloads close to the hardware’s theoretical limits,” he said.

Nine-month development cycle

OpenAI and Broadcom said the chip moved from initial design to manufacturing tape-out in just nine months, a timeline the companies described as among the fastest achieved for a high-performance advanced semiconductor programme.

The companies also revealed that OpenAI’s own AI models were used to assist portions of the chip design and optimisation process, highlighting a growing trend of AI being deployed to accelerate hardware development.

Part of a broader infrastructure push

Jalapeño is the first processor in a planned multi-generation compute platform being jointly developed by OpenAI, Broadcom and Celestica. Broadcom will contribute silicon implementation and networking technologies, including its Tomahawk networking platform, while Celestica will provide board, rack and system-level integration.

“Our collaboration with OpenAI represents a fundamental commitment to scaling the physical infrastructure required for the next decade of AI,” said Broadcom CEO Hock Tan. He added that the companies are targeting deployment in gigawatt-scale data centres with Microsoft and other partners beginning in 2026.

The move underscores a broader shift among leading AI developers towards building custom silicon as they seek to improve performance, lower operating costs and secure long-term access to computing resources.