US artificial intelligence company OpenAI and semiconductor firm Broadcom have unveiled Jalapeño, OpenAI's first custom processor designed specifically for large language model inference, marking the company's expansion into hardware as it seeks greater control over the infrastructure underpinning ChatGPT, Codex and future agent-based products.
The development, reported by Edtech Innovation Hub, was completed with manufacturing partner Celestica and progressed from initial design to tape-out in nine months, described by the partners as an unusually short development cycle for a high-performance application-specific integrated circuit.
Engineering samples are currently running machine learning workloads at intended production frequency and power, including GPT-5.3-Codex-Spark. Initial deployment is planned through data centre partners including Microsoft by the end of 2026, with the partners intending to scale the system to gigawatt capacity across several chip generations.
Richard Ho, who leads OpenAI's hardware programme, said: "Jalapeño was designed from the ground up for LLM inference using detailed insights from our close collaboration with OpenAI researchers. We optimised the architecture around the kernels, memory movement, networking, and serving patterns that matter most for frontier AI models."
Unlike general-purpose accelerators adapted for AI workloads, Jalapeño was architected specifically around the compute, memory movement, networking and serving patterns used by large language models. OpenAI says early testing indicates the chip will deliver substantially better performance per watt than current leading accelerators, with a detailed technical report due in the coming months.
Broadcom contributed silicon implementation, networking and connectivity technology including Tomahawk networking silicon for connecting large numbers of accelerators within data centre systems. Celestica is supporting board design, rack integration, system assembly and production infrastructure.
Hock Tan, President and CEO of Broadcom, said: "This is just the beginning of a multi-generation roadmap. By co-developing our industry-leading silicon directly with OpenAI, we are enabling the deployment of gigawatt scale data centres with Microsoft and other partners beginning in 2026."
OpenAI says greater inference efficiency could support faster responses, longer Codex tasks, more reliable service during peak demand and lower costs for developers building on its models. The chip is designed with flexibility to support current and future large language models across the wider AI industry, not solely OpenAI's own products.
Find out more about the Jalapeño chip, its architecture and OpenAI's hardware expansion strategy in the full coverage.




.png)

