OpenAI's next-gen model GPT-5 debuts, with responses up to 80% less likely to contain factual errors than those of OpenAI o3.
OpenAI has launched GPT-5, its most capable model yet. The new model posts measurable gains in accuracy, reasoning, and coding benchmarks over the company's earlier reasoning model, OpenAI o3.
The improvements are most visible in coding, reasoning, and problem-solving tasks. On the SWE-bench Verified coding benchmark, for instance, GPT-5 scores 74.9%, up from o3's 69.1%.
On GPQA Diamond, a set of PhD-level science questions, GPT-5 scores 85.7% and GPT-5 Pro 89.4%, both ahead of o3’s 83.3%. On the AIME ’25 math benchmark, GPT-5 reaches 94.6% compared with o3’s 88.9%, indicating stronger reasoning and mathematical skill.
Efficiency is another area where GPT-5 shines. It delivers similar or better answers with fewer tokens than o3, and it features a new "real-time router" that dynamically balances quick responses against deeper reasoning, streamlining user interaction.
GPT-5 also boasts improved reliability and fewer logic errors on advanced academic questions. On price, generating output costs 25% more than with o3, while input is about 37% cheaper, so whether GPT-5 works out cheaper overall depends on the mix of input and output in a given workload.
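To make the pricing trade-off concrete, here is a minimal sketch of the arithmetic. It assumes the two percentages apply to per-token prices (output 25% higher, input 37% lower than o3). The function names and the `price_ratio` parameter, which is how many times more an output token costs than an input token on the older model, are hypothetical, and the default of 4.0 is a placeholder rather than a figure from OpenAI.

```python
def relative_cost(input_tokens, output_tokens, price_ratio=4.0,
                  input_factor=0.63, output_factor=1.25):
    """Return the GPT-5 cost as a fraction of the o3 cost for one workload.

    price_ratio   -- assumed o3 output-token price / input-token price (placeholder)
    input_factor  -- GPT-5 input price relative to o3 (37% cheaper -> 0.63)
    output_factor -- GPT-5 output price relative to o3 (25% dearer -> 1.25)
    """
    # Costs are measured in units of one o3 input token.
    baseline = input_tokens + output_tokens * price_ratio
    if baseline <= 0:
        raise ValueError("workload must contain at least one token")
    new = (input_tokens * input_factor
           + output_tokens * price_ratio * output_factor)
    return new / baseline


def break_even_input_per_output(price_ratio=4.0,
                                input_factor=0.63, output_factor=1.25):
    """Input tokens needed per output token for the two costs to be equal."""
    # Savings on input must offset the surcharge on output:
    #   input * (1 - input_factor) = output * price_ratio * (output_factor - 1)
    return price_ratio * (output_factor - 1.0) / (1.0 - input_factor)


if __name__ == "__main__":
    ratio = break_even_input_per_output()
    print(f"break-even input tokens per output token: {ratio:.2f}")
    print(f"input-heavy workload:  {relative_cost(10_000, 500):.2f}x")
    print(f"output-heavy workload: {relative_cost(500, 10_000):.2f}x")
```

Under these assumptions, workloads with more input tokens per output token than the break-even ratio come out cheaper on GPT-5, and output-heavy workloads come out more expensive.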
Users describe GPT-5 as a more streamlined, less verbose successor to o3, with fewer hallucinations and better performance when coding simple apps and simulations.
OpenAI has rolled out four new optional personalities for its chatbot: cynic, robot, listener, and nerd. GPT-5 is particularly effective in health-related queries, according to OpenAI CEO Sam Altman.
GPT-5 is also less prone to hallucinations and factual errors, and it delivers improvements in coding, writing, math, and visual perception.
OpenAI has implemented evaluations to test for deceptive behavior by its models. In those tests, the deception rate fell from 4.8% for o3 to 2.1% for GPT-5's reasoning responses.
GPT-5 is a collection of models rather than a single model, and it is available now in ChatGPT to Free, Plus, and Pro users. The routing model is continuously trained on new input signals so that it gets better at deciding which model should handle a request and when to trigger reasoning.
On accuracy, GPT-5's responses are 45% less likely to contain a factual error than GPT-4o's and 80% less likely than OpenAI o3's. Separately, OpenAI this week released its first open-weight models since GPT-2.
In summary, GPT-5 offers a more efficient, reliable, and accurate AI chatbot experience, making it a significant leap forward in the field of artificial intelligence.
- GPT-5, the latest model from OpenAI, demonstrates significant improvements in coding, reasoning, and problem-solving tasks, excelling in areas such as the SWE-bench Verified coding benchmark and math benchmarks like AIME '25.
- With a new "real-time router" system, GPT-5 delivers similar or better answers with fewer tokens than o3, streamlining user interaction by balancing quick responses with deeper reasoning.
- OpenAI has implemented evaluations to test for deceptive behavior by its models; the deception rate fell from 4.8% for o3 to 2.1% for GPT-5's reasoning responses.
- GPT-5, a collection of models, is less prone to hallucinations and factual errors, and delivers improvements in coding, writing, math, and visual perception. It is currently available in ChatGPT to Free, Plus, and Pro users.