OpenAI has unveiled its first custom-designed AI inference chip, marking a significant step in the company's efforts to reduce reliance on third-party hardware and gain greater control over the infrastructure powering its rapidly expanding artificial intelligence services.
The chip, named Jalapeño, was developed in partnership with semiconductor giant Broadcom and systems manufacturer Celestica. OpenAI said engineering samples are already running machine-learning workloads, including GPT-5.3 Codex Spark, in laboratory environments, with commercial deployment targeted by the end of 2026.
The announcement comes as AI companies race to secure computing capacity amid soaring demand for generative AI applications. Inference, the process of generating responses from trained AI models, has emerged as one of the industry's biggest cost centres as services such as ChatGPT scale to hundreds of millions of users.
Unlike general-purpose AI accelerators adapted for multiple workloads, Jalapeño has been designed from the ground up for large language model (LLM) inference. OpenAI said the architecture was optimised around compute, memory, networking and data-movement requirements observed across products such as ChatGPT, Codex and its API platform.
While the company has yet to disclose benchmark results, early testing indicates the processor could deliver significantly higher performance per watt than current state-of-the-art alternatives. A detailed technical performance report is expected in the coming months.
“Jalapeño is part of our long-term full-stack infrastructure strategy to make compute more abundant, resulting in AI which is faster, more reliable, more affordable for people and businesses,” said Greg Brockman, president and co-founder of OpenAI.
Richard Ho, who leads OpenAI’s hardware programme, said the chip was designed using insights gathered from close collaboration between OpenAI researchers and engineering teams. “Based on early testing, Jalapeño will efficiently execute our most important workloads close to the hardware’s theoretical limits,” he said.
OpenAI and Broadcom said the chip moved from initial design to manufacturing tape-out in just nine months, a timeline the companies described as among the fastest achieved for a high-performance advanced semiconductor programme.
The companies also revealed that OpenAI’s own AI models were used to assist portions of the chip design and optimisation process, highlighting a growing trend of AI being deployed to accelerate hardware development.
Jalapeño is the first processor in a planned multi-generation compute platform being jointly developed by OpenAI, Broadcom and Celestica. Broadcom will contribute silicon implementation and networking technologies, including its Tomahawk networking platform, while Celestica will provide board, rack and system-level integration.
“Our collaboration with OpenAI represents a fundamental commitment to scaling the physical infrastructure required for the next decade of AI,” said Broadcom CEO Hock Tan. He added that the companies are targeting deployment in gigawatt-scale data centres with Microsoft and other partners beginning in 2026.
The move underscores a broader shift among leading AI developers towards building custom silicon as they seek to improve performance, lower operating costs and secure long-term access to computing resources.