OpenAI has released two open-weight language models—gpt-oss-120b and gpt-oss-20b—marking its first open-weight model release since GPT-2 in 2019. The models are designed to offer strong reasoning performance at lower cost and can be run efficiently on consumer and enterprise hardware. gpt-oss-120b runs on a single 80GB Nvidia GPU, while gpt-oss-20b supports on-device use with as little as 16GB of memory.
How do these models work?
Both models use a mixture-of-experts (MoE) architecture, a machine learning approach that divides a model into specialised sub-networks ("experts"). A routing mechanism activates only a few relevant experts for each input, which reduces computation during training and speeds up inference, even in models with billions of parameters—making these models efficient to deploy.
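The routing idea can be illustrated with a minimal sketch. This toy example is not OpenAI's implementation—the dimensions, gating matrix, and expert networks below are all hypothetical—but it shows the core mechanic: score all experts, keep only the top-K, and combine their outputs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes: input dim, hidden dim, number of experts, experts used per token.
D, H, E, K = 8, 16, 4, 2

# A gating (router) matrix and one tiny two-layer MLP per expert.
W_gate = rng.normal(size=(D, E))
experts = [(rng.normal(size=(D, H)), rng.normal(size=(H, D))) for _ in range(E)]

def moe_forward(x):
    """Route a single token through only the top-K experts."""
    logits = x @ W_gate                # one router score per expert
    top = np.argsort(logits)[-K:]     # indices of the K highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()          # softmax over the selected experts only
    out = np.zeros_like(x)
    for w, i in zip(weights, top):
        w_in, w_out = experts[i]
        out += w * (np.tanh(x @ w_in) @ w_out)  # weighted sum of active experts
    return out

token = rng.normal(size=D)
y = moe_forward(token)
print(y.shape)  # prints (8,)
```

Note that only 2 of the 4 experts' weights are touched per token here; at scale, this is why an MoE model with a large total parameter count can have a much smaller active parameter count per input.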
These models are optimised for tasks like reasoning, tool use, chain-of-thought processing, and instruction following. OpenAI has trained them using advanced techniques including reinforcement learning and fine-tuning approaches borrowed from its proprietary systems. The models support multiple reasoning effort levels, allowing trade-offs between performance and latency.
Evaluation results show that gpt-oss-120b matches or outperforms OpenAI's o4-mini on several benchmarks, including competition-level coding, mathematics, and health-related tasks. The smaller gpt-oss-20b achieves results comparable to o3-mini and is suitable for edge applications.
While OpenAI has made the models' weights and code available under the Apache 2.0 license, it has withheld proprietary details such as routing mechanisms and training methods to protect intellectual property. The Apache 2.0 license is a permissive open-source license that allows users to use, modify, and distribute software freely, including for commercial use.
What is the reason behind open sourcing now?
In a lengthy post on X, OpenAI CEO Sam Altman wrote, “we believe this is the best and most usable open model in the world. We're excited to make this model, the result of billions of dollars of research, available to the world to get AI into the hands of the most people possible. Although we believe most people will want to use a convenient service like ChatGPT, people should be able to directly control and modify their own AI when they need to, and the privacy benefits are obvious. As part of this, we are quite hopeful that this release will enable new kinds of research and the creation of new kinds of products. OpenAI’s mission is to ensure AGI that benefits all of humanity. To that end, we are excited for the world to be building on an open AI stack created in the United States, based on democratic values, available for free to all and for wide benefit.”
The release is positioned as a response to increasing competition from open-source models developed in China and elsewhere, including DeepSeek and Qwen.
What happens to safety?
OpenAI says it conducted adversarial testing by fine-tuning the models on sensitive domains to assess misuse potential. The results, reviewed by external experts, indicated that even when fine-tuned for harmful tasks, the models remained within acceptable safety thresholds.
Researcher Eric Wallace added that the company ran a ‘first-of-its-kind safety analysis’ on the newly released models prior to their public announcement, in which the models were fine-tuned ‘intentionally to maximise their bio and cyber capabilities.’
How has access been rolled out?
The models are now accessible on platforms like Hugging Face and Ollama and are optimised for deployment across various hardware and cloud environments, including those from Microsoft, AWS, and Qualcomm.
“We’ve designed these models to be flexible and easy to run anywhere—locally, on-device, or through third-party inference providers. To support this, we partnered ahead of launch with leading deployment platforms such as Azure, Hugging Face, vLLM, Ollama, llama.cpp, LM Studio, AWS, Fireworks, Together AI, Baseten, Databricks, Vercel, Cloudflare, and OpenRouter to make the models broadly accessible to developers. On the hardware side, we worked with industry leaders including NVIDIA, AMD, Cerebras, and Groq to ensure optimized performance across a range of systems,” the company said in a statement.
Microsoft is also set to release GPU-optimised versions of the gpt-oss-20b model for Windows devices.
Qualcomm confirmed that gpt-oss-20b supports on-device chain-of-thought reasoning using Snapdragon processors, enabling faster and more private AI applications.
“We see this moment as a turning point: a glimpse into the future of AI where even rich assistant-style reasoning will be local. It also shows the maturity of the AI ecosystem, where open-source innovation from leaders like OpenAI can be harnessed in real-time by partners and developers utilising Snapdragon processors. OpenAI's gpt-oss-20b will enable devices to leverage on-device inference, offering benefits in terms of privacy and latency, while complementing cloud solutions via AI agents,” Qualcomm wrote in a blogpost on Wednesday.