OpenAI releases new developer features (API)
As part of its DevDay conference, OpenAI announced new developer features and made them available immediately. The October 1 release includes:
Realtime API: An API for embedding voice functionality into applications, with both audio input and output. The connection is made over a WebSocket. For now it runs on GPT-4o (or rather, the special new gpt-4o-realtime-preview), but they promise to provide a 4o-mini version soon as well. You can see the prices in this article; one cannot say they are very low, so the mini will surely be in demand among those who care about cost. I can't say anything about availability yet: the platform promises to enable this feature later, and I haven't tried the WebSocket myself.
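A minimal sketch of what a Realtime API connection looks like, assuming the third-party `websockets` package and the beta endpoint and headers from the preview documentation (the endpoint URL, the `OpenAI-Beta: realtime=v1` header, and the `session.update` event shape are as documented at release and may change):

```python
# Sketch of a Realtime API connection over a WebSocket (preview API;
# endpoint, headers, and event names may change).
import asyncio
import json
import os

URL = "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview"


def session_update_event(instructions: str) -> dict:
    """Build a session.update event asking for text + audio responses."""
    return {
        "type": "session.update",
        "session": {
            "instructions": instructions,
            "modalities": ["text", "audio"],
        },
    }


async def main() -> None:
    import websockets  # third-party: pip install websockets

    headers = {
        "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
        "OpenAI-Beta": "realtime=v1",
    }
    async with websockets.connect(URL, extra_headers=headers) as ws:
        event = session_update_event("You are a helpful voice assistant.")
        await ws.send(json.dumps(event))
        # The server pushes events back (audio deltas, transcripts, etc.):
        async for raw in ws:
            print(json.loads(raw)["type"])

# To actually connect (needs a valid API key): asyncio.run(main())
```

Events in both directions are plain JSON messages over the socket, so the same pattern works from any language with a WebSocket client.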
Model distillation (documentation): a fine-tuning tool for cheaper models (GPT-4o mini) that uses outputs generated by more advanced models (o1-preview and GPT-4o).
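The workflow starts by storing the teacher model's completions so they can later be selected as training data. A sketch of the request parameters involved, assuming the `store` and `metadata` parameters of the Chat Completions API as described in the distillation docs (the prompt and metadata values here are illustrative):

```python
# Sketch of step one of distillation: capture outputs of a strong model
# (the "teacher") so they can be reused to fine-tune gpt-4o-mini.
def build_teacher_request(question: str) -> dict:
    """Request kwargs for the teacher model, with storage enabled so the
    completion can later be picked as distillation training data."""
    return {
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": question}],
        "store": True,  # persist the completion on the platform
        "metadata": {"task": "distill-demo"},  # tag for filtering later
    }

# With the OpenAI SDK the call would look like (not executed here):
#   from openai import OpenAI
#   client = OpenAI()
#   resp = client.chat.completions.create(**build_teacher_request("2+2?"))
# Stored completions can then be filtered by metadata in the platform UI
# and exported as a fine-tuning dataset for gpt-4o-mini.
```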
Image fine-tuning: GPT-4o fine-tuning data can now include not only text but also images, passed either as links (URLs) or as base64. In addition to the documentation, there is also a short article.
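A sketch of what one training example in the fine-tuning JSONL file looks like. The message shape mirrors the regular chat format with `image_url` content parts; the URL, system prompt, and label below are placeholders:

```python
# Sketch of one JSONL training example for vision fine-tuning of GPT-4o.
import json


def training_example(image_url: str, label: str) -> str:
    """Serialize one JSONL line for an image-labeling-style dataset."""
    record = {
        "messages": [
            {"role": "system", "content": "Identify the object in the image."},
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            },
            {"role": "assistant", "content": label},
        ]
    }
    return json.dumps(record)


line = training_example("https://example.com/cat.jpg", "cat")
# Images can also be inlined as base64 data URLs, e.g.
# "data:image/jpeg;base64,<...>" in place of the https URL.
```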
Evaluations: a tool for automated testing of prompt and model quality, implemented in the platform. Among other things, it is used as part of the distillation workflow described above.
Prompt caching: a mechanism that reduces the cost of API calls (by up to half) in some scenarios. Documentation.
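Caching applies automatically to long prompts and keys on an exact prefix match, so the practical takeaway from the documentation is to put the static part (system prompt, few-shot examples, tool definitions) first and the variable part last. A sketch of that structure, with illustrative names:

```python
# Sketch of structuring requests to benefit from prompt caching.
# The cache matches on an exact prompt prefix, so the long static
# part should come first and the per-request part last.
LONG_SYSTEM_PROMPT = "You are a support agent. " + "Policy text... " * 100


def build_messages(user_query: str) -> list:
    """Static, cacheable prefix first; variable suffix last."""
    return [
        {"role": "system", "content": LONG_SYSTEM_PROMPT},  # identical every call
        {"role": "user", "content": user_query},            # varies per call
    ]

# After a call, the response's usage block reports how much of the
# prompt was served from cache, e.g.:
#   response.usage.prompt_tokens_details.cached_tokens
```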
AI assistants for generating system prompts and JSON schemas for function calling. For playground chat system prompts, it looks something like this: