arXiv:2508.10925

gpt-oss-120b & gpt-oss-20b Model Card

Published on Aug 8, 2025

Abstract

AI-generated summary: Two open-weight reasoning models, gpt-oss-120b and gpt-oss-20b, use an efficient mixture-of-experts transformer architecture and achieve strong performance across a range of benchmarks while being released under an open license.

We present gpt-oss-120b and gpt-oss-20b, two open-weight reasoning models that push the frontier of accuracy and inference cost. The models use an efficient mixture-of-experts transformer architecture and are trained with large-scale distillation and reinforcement learning. We optimize the models for strong agentic capabilities (deep-research browsing, Python tool use, and support for developer-provided functions), all while using a rendered chat format that enables clear instruction following and role delineation. Both models achieve strong results on benchmarks spanning mathematics, coding, and safety. We release the model weights, inference implementations, tool environments, and tokenizers under the Apache 2.0 license to enable broad use and further research.
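Because the weights, tokenizers, and inference code are openly released, the models can be run with standard open-source tooling. The sketch below loads the smaller model and applies its chat template, which is how the rendered chat format's role delineation is exposed in practice. This is a minimal illustration, assuming the Hugging Face transformers library and the checkpoint name "openai/gpt-oss-20b"; neither detail comes from the abstract itself.

    # Minimal sketch (not from the paper): load the open weights with the
    # Hugging Face transformers library. The checkpoint name
    # "openai/gpt-oss-20b" and all generation settings are assumptions.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "openai/gpt-oss-20b"  # assumed Hub checkpoint name
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    # The rendered chat format is applied through the tokenizer's chat
    # template, which delineates system/user/assistant roles before generation.
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain mixture-of-experts routing briefly."},
    ]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    # Decode only the newly generated tokens, skipping the prompt.
    output_ids = model.generate(input_ids, max_new_tokens=256)
    print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:],
                           skip_special_tokens=True))

Here device_map="auto" spreads layers across available devices via accelerate; on a single GPU you could instead move the model to the device explicitly.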

Models citing this paper: 7

Datasets citing this paper: 0

Spaces citing this paper: 1,375

Collections including this paper: 0
