Reduce latency and costs when processing high volumes of API requests
GPT-4o offers faster inference and better token efficiency than earlier models, lowering per-request costs and response times at scale.

New tools for building agents
GPT-4o (“o” for “omni”) is our versatile, high-intelligence flagship model. It accepts both text and image inputs, and produces text outputs (including Structured Outputs). It is the best model for most tasks, and is our most capable model outside of our o-series models.
GPT-4o natively processes images, text, and other modalities in a single model, enabling developers to build applications that understand and reason across multiple input types.
The Responses API allows developers to define and enforce response schemas, ensuring AI outputs conform to required data structures for reliable downstream processing.
The Agents SDK enables developers to create autonomous agents using GPT-4o that can plan, execute, and iterate on complex workflows without manual intervention at each step.
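
The schema enforcement described for the Responses API can be illustrated with a minimal sketch. It only assembles the request arguments and checks a returned JSON string locally, without calling the API. The `TICKET_SCHEMA` fields and the `build_request` and `conforms` helpers are hypothetical names introduced for illustration, and the `text.format` request shape is an assumption that should be checked against the current API reference.

```python
import json

# Hypothetical schema for illustration; these field names are not from the source.
TICKET_SCHEMA = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "priority": {"type": "integer"},
    },
    "required": ["title", "priority"],
    "additionalProperties": False,
}


def build_request(prompt: str, schema: dict, name: str = "ticket") -> dict:
    """Assemble keyword arguments for a schema-constrained request.

    Assumption: the request carries the schema in a `text.format` entry of
    type `json_schema`; verify this shape against the API reference.
    """
    return {
        "model": "gpt-4o",
        "input": prompt,
        "text": {
            "format": {
                "type": "json_schema",
                "name": name,
                "schema": schema,
                "strict": True,
            }
        },
    }


# Maps the JSON Schema type names this sketch understands to Python types.
_PY_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


def conforms(raw_json: str, schema: dict) -> bool:
    """Local check for flat object schemas only (not a full JSON Schema validator)."""
    try:
        data = json.loads(raw_json)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False
    props = schema.get("properties", {})
    # Every required key must be present.
    if any(key not in data for key in schema.get("required", [])):
        return False
    # Reject unknown keys when the schema forbids them.
    if schema.get("additionalProperties") is False and any(k not in props for k in data):
        return False
    for key, value in data.items():
        expected = props.get(key, {}).get("type")
        py_type = _PY_TYPES.get(expected)
        if py_type is None:
            continue  # Type not covered by this sketch; skip the check.
        # bool is a subclass of int in Python, so reject it for numeric fields.
        if isinstance(value, bool) and expected != "boolean":
            return False
        if not isinstance(value, py_type):
            return False
    return True
```

Assuming the official `openai` Python package, the arguments would be passed along the lines of `client.responses.create(**build_request(prompt, TICKET_SCHEMA))`, with `conforms` applied to the returned text as a defensive check before downstream processing.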