OpenAI has announced the launch of its latest compact generative model, GPT-4o mini, which is designed to be more resource-efficient and cost-effective compared to the standard GPT-4o model. This makes it easier for developers to incorporate AI technology into a wider range of products.
This upgrade not only benefits developers and apps, but also enhances the capabilities and removes limitations for the free version of ChatGPT. GPT-4o mini is now accessible to users on the Free, Plus, and Team tiers through the ChatGPT web and app starting today. ChatGPT Enterprise subscribers will receive access next week, with GPT-4o mini replacing the current small model, GPT-3.5 Turbo, for end users.
While the older model is still available to developers through the API, the company plans to eventually retire it without specifying a date. According to OpenAI, users who hit their limit will now be downgraded to GPT-4o mini instead of GPT-3.5, providing a significant advantage for those who have not upgraded to ChatGPT Plus.
We’re continuing to make advanced AI accessible to all with the launch of GPT-4o mini, now available in the API and rolling out on ChatGPT today. https://t.co/sTxtOfUapJ
— OpenAI (@OpenAI) July 18, 2024
Based on data from Artificial Analysis, OpenAI’s latest AI model achieved a score of 82% on the MMLU reasoning benchmark, surpassing Gemini 1.5 Flash by 3% and Claude 3 Haiku by 7%. The highest MMLU benchmark to date was set by Gemini Ultra, Google’s top AI model, with a score of 90%.
Additionally, OpenAI claims that GPT-4o mini is 60% more cost-effective to operate than GPT-3.5 Turbo. Developers will pay 15 cents per million input tokens and 60 cents per million output tokens. According to CNBC, GPT-4o mini is considered “the most capable and cost-efficient small model available today.”
The cost savings from GPT-4o mini stem from the fact that not all AI tasks require the full capabilities of larger models like GPT, Claude, or Gemini. Smaller models like Google’s Gemini 1.5 Flash, Meta’s Llama 3 8b, and Anthropic’s Claude 3 Haiku are more suited for simple, high-volume tasks, offering faster and more cost-effective performance than their larger counterparts.
OpenAI states that GPT-4o mini retains the same context window size of 128,000 tokens and knowledge cutoff of October 2023 as the full-size version, although the exact size of the new model was not specified. While the model API currently supports text and vision capabilities, video and audio functionalities will be added in the future.
This announcement follows on the heels of OpenAI’s update on Voice Mode for GPT-4o, with a smaller alpha release slated for late July and a wider rollout planned for fall.