arxiv:2508.10925

gpt-oss-120b & gpt-oss-20b Model Card

Published on Aug 8

Abstract

AI-generated summary

Two open-weight reasoning models, gpt-oss-120b and gpt-oss-20b, use an efficient mixture-of-experts transformer architecture and achieve strong results on mathematics, coding, and safety benchmarks. Both are released under the Apache 2.0 license.

We present gpt-oss-120b and gpt-oss-20b, two open-weight reasoning models that push the frontier of accuracy and inference cost. The models use an efficient mixture-of-experts transformer architecture and are trained using large-scale distillation and reinforcement learning. We optimize the models for strong agentic capabilities (deep research browsing, Python tool use, and support for developer-provided functions), while using a rendered chat format that enables clear instruction following and role delineation. Both models achieve strong results on benchmarks spanning mathematics, coding, and safety. We release the model weights, inference implementations, tool environments, and tokenizers under an Apache 2.0 license to enable broad use and further research.
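
To illustrate why a mixture-of-experts layer can be cheap at inference time, the sketch below routes one token to only a few experts and mixes their outputs. It is a generic top-k softmax router written for illustration. The abstract does not describe the routing scheme, so the function names (`moe_forward`, `make_scaling_expert`), the toy experts, and the choice of `top_k=2` are all assumptions, not the released implementation.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Expert = Callable[[Sequence[float]], Vector]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_scaling_expert(factor: float) -> Expert:
    """Hypothetical toy expert: multiplies every input component by `factor`."""
    return lambda x: [factor * xi for xi in x]


def moe_forward(
    x: Sequence[float],
    gate_weights: Sequence[Sequence[float]],
    experts: Sequence[Expert],
    top_k: int = 2,  # assumed value, chosen only for this example
) -> Tuple[Vector, List[int]]:
    """Route one token through the top_k highest-scoring experts."""
    # One router logit per expert: dot product of the token with that expert's gate row.
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    # Select the indices of the top_k experts, highest logit first.
    chosen = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Renormalize the router scores over the selected experts only.
    weights = softmax([logits[i] for i in chosen])
    # Only the selected experts are evaluated; the rest cost nothing for this token.
    out = [0.0] * len(x)
    for weight, idx in zip(weights, chosen):
        for d, value in enumerate(experts[idx](x)):
            out[d] += weight * value
    return out, chosen


# Toy setup: four experts, a 2-dimensional token, and a fixed gate matrix.
experts = [make_scaling_expert(f) for f in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [-1.0, 0.0]]
output, chosen = moe_forward([1.0, 0.0], gate_weights, experts, top_k=2)
print(chosen, output)
```

In this toy run only two of the four experts are evaluated for the token, which is the source of the efficiency: total parameter count grows with the number of experts, while per-token compute grows only with `top_k`.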