The new OpenAI Open-Weight models are here

Yesterday, OpenAI released its new open-weight models, gpt-oss, under the Apache 2.0 licence. They come in two sizes: *gpt-oss-120b* is comparable to OpenAI o4-mini, *gpt-oss-20b* to OpenAI o3-mini. gpt-oss-20b can run on devices with 16 GB of RAM or more, making it suitable for local inference or rapid iteration without costly infrastructure. Even the large gpt-oss-120b model runs fast enough on my Mac laptop with 64 GB of RAM.
Note
There are a number of ways to run these models. For my first attempts, I used LM Studio to install openai/gpt-oss-20b. It consumes just over 11 GB with reasoning=medium and processes approximately 55 tokens/second.
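Once the model is loaded, LM Studio serves it through an OpenAI-compatible HTTP API. A minimal sketch using only the Python standard library (the port 1234 is LM Studio's default; whether the server honours the `reasoning_effort` field is an assumption on my part):

```python
import json
import urllib.request


def build_chat_request(prompt: str, model: str = "openai/gpt-oss-20b") -> dict:
    """Build an OpenAI-style chat completion payload for a local server."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        # Matches the reasoning=medium setting mentioned above
        # (assumption: support depends on the server/runtime).
        "reasoning_effort": "medium",
    }


def ask_local_model(prompt: str, base_url: str = "http://localhost:1234/v1") -> str:
    """Send the prompt to the local OpenAI-compatible endpoint, return the reply."""
    payload = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]
```

The same sketch works against any other OpenAI-compatible runtime (Ollama, vLLM, etc.) by changing `base_url`.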
The publication on how the models were trained also provides interesting insights: gpt-oss-120b & gpt-oss-20b Model Card (PDF, 5.1 MB). The models were specifically trained to use web browsers and Python tools more effectively:
- A browsing tool lets the model search for and open content on the web.
- A Python tool executes code in a stateful Jupyter Notebook environment.
There is also a section on using Python tools in the openai/gpt-oss repository.
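"Stateful" here means that variables survive across tool calls, like cells in a notebook. A toy sketch of my own to illustrate the idea (not OpenAI's actual implementation):

```python
import contextlib
import io


class StatefulPython:
    """Toy sketch of a stateful Python tool: every call runs in the same
    namespace, so a variable defined in one call is visible in the next,
    just like cells in a Jupyter notebook."""

    def __init__(self) -> None:
        self.namespace: dict = {}

    def run(self, code: str) -> str:
        """Execute code in the shared namespace and return captured stdout."""
        buf = io.StringIO()
        with contextlib.redirect_stdout(buf):
            exec(code, self.namespace)  # state persists between calls
        return buf.getvalue()
```

Usage: after `tool.run("x = 21")`, a later `tool.run("print(x * 2)")` still sees `x` and prints `42`.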
Finally, OpenAI Harmony has also been released under the Apache 2 licence. It is inspired by their new Responses API. The format is described in OpenAI Harmony Response Format. It contains some exciting concepts:
- A fine-grained role model with the roles `system`, `developer`, `user`, `assistant`, and `tool`.
- Three output channels: `analysis`, `commentary`, and `final`. In the graphical user interface, usually only the `final` channel is visible; `analysis` is used for the chain of thought, and `commentary` for tool calls.
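The channel separation can be illustrated with a small filter that keeps only what a chat UI would display. The data structure is my own simplification; the actual wire format is defined by the openai-harmony library:

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str      # system, developer, user, assistant, or tool
    channel: str   # analysis, commentary, or final
    content: str


def visible_in_ui(messages: list[Message]) -> list[str]:
    """Return only final-channel content, as a chat UI typically would."""
    return [m.content for m in messages if m.channel == "final"]


conversation = [
    Message("assistant", "analysis", "Let me think step by step ..."),
    Message("assistant", "commentary", '{"tool": "browser.search"}'),
    Message("assistant", "final", "Here is the answer."),
]
```

Here `visible_in_ui(conversation)` drops the chain of thought and the tool call, leaving only the final answer.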
I have not yet tested how well tool calls work with these local models. So far, I have been rather disappointed in this regard, probably because I only managed to execute individual calls, whereas with Claude dozens of tools are called in a single session.