Revolution in language models: OpenAI's GPT-5 boasts approximately 80% fewer hallucinations.
OpenAI's New Model, GPT-5, Delivers Significant Improvements
OpenAI has unveiled its most advanced model yet, GPT-5. The company presents it as a personal expert that can write applications on demand and that runs more efficiently than its predecessors.
GPT-5 showcases impressive improvements in various areas, including coding, writing, mathematics, and visual perception.
Coding Advancements
In coding, GPT-5 excels at complex front-end code generation and at debugging larger repositories. It scores 74.9% on SWE-bench Verified and 88% on Aider Polyglot. The model also improves instruction-following and agentic tool use, reliably executing multi-step, context-changing coding requests end-to-end.
Writing Enhancements
GPT-5 introduces new personalities (cynic, robot, listener, nerd) for more natural and context-appropriate interactions. It cuts over-agreeable responses by more than 50% and is more honest about its limitations. Custom instructions and tone or style switching now work without complex prompt engineering.
Mathematical Progress
GPT-5 attains state-of-the-art performance on math benchmarks, scoring 94.6% on AIME 2025 without tools. It completes complex reasoning tasks using 50-80% fewer output tokens, improving efficiency while maintaining accuracy.
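To make the 50-80% figure concrete, here is a minimal arithmetic sketch. The function name `reduced_token_range` and the 10,000-token baseline are hypothetical illustrations, not measurements from OpenAI; only the 50% and 80% bounds come from the text above.

```python
def reduced_token_range(baseline_tokens: int,
                        low_cut_pct: int = 50,
                        high_cut_pct: int = 80) -> tuple[int, int]:
    """Return (fewest, most) output tokens after a low..high percent reduction.

    Integer percentages are used so the arithmetic stays exact.
    """
    fewest = baseline_tokens * (100 - high_cut_pct) // 100  # largest reduction
    most = baseline_tokens * (100 - low_cut_pct) // 100     # smallest reduction
    return fewest, most


# Hypothetical baseline: a task that previously needed 10,000 output tokens.
print(reduced_token_range(10_000))  # (2000, 5000)
```

Under that assumed baseline, the same task would need between 2,000 and 5,000 output tokens.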
Visual Perception and Multimodal Understanding
GPT-5 excels in multimodal benchmarks involving visual, video, spatial, and scientific reasoning, with an 84.2% score on the MMMU benchmark. This stronger multimodal ability lets it accurately interpret charts, photos, diagrams, and other non-text inputs for more effective reasoning over visuals.
Reasoning and Factuality
GPT-5 makes approximately 80% fewer factual errors than its predecessor, which improves trustworthiness, especially for code, data, and decision-making applications. It sets a new state of the art on GPQA (88.4% without tools) and on internal benchmarks involving complex, economically valuable knowledge work across diverse professions.
Ease of Use and Efficiency
GPT-5 features a real-time router that automatically selects the appropriate tool or model for tasks, removing the previous need for users to switch models manually. It completes deep reasoning tasks faster, using fewer tokens, and offers new API control parameters for verbosity and reasoning effort.
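The verbosity and reasoning-effort controls could appear in a request roughly as sketched below. The article only says that such parameters exist, so everything specific here is an assumption: the helper name `build_request`, the model identifier `"gpt-5"`, the nested field names, and the sets of allowed values. The sketch only builds a request body as a plain dictionary and sends nothing over the network; check the official API reference before relying on any of these names.

```python
from typing import Any

# Assumed value sets; the article does not list the accepted values.
ALLOWED_EFFORT = {"minimal", "low", "medium", "high"}
ALLOWED_VERBOSITY = {"low", "medium", "high"}


def build_request(prompt: str,
                  effort: str = "medium",
                  verbosity: str = "medium") -> dict[str, Any]:
    """Build a request body with reasoning-effort and verbosity settings.

    The field layout is an assumption made for illustration.
    """
    if effort not in ALLOWED_EFFORT:
        raise ValueError(f"unknown reasoning effort: {effort!r}")
    if verbosity not in ALLOWED_VERBOSITY:
        raise ValueError(f"unknown verbosity: {verbosity!r}")
    return {
        "model": "gpt-5",                    # assumed model identifier
        "input": prompt,
        "reasoning": {"effort": effort},     # how much the model deliberates
        "text": {"verbosity": verbosity},    # how long the answer should be
    }


request = build_request("Summarize this diff.", effort="minimal", verbosity="low")
print(request["reasoning"], request["text"])
```

Validating both settings up front means a typo surfaces as a local `ValueError` rather than as a rejected API call.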
These advancements make GPT-5 more capable, efficient, and versatile than any prior OpenAI model for complex coding, writing, math problem-solving, and multimodal understanding.
- GPT-5 brings significant advances in coding, writing, mathematics, and visual perception.
- In coding, GPT-5 improves front-end code generation, debugging of larger repositories, and follow-through on complex, context-changing requests.
- GPT-5 achieves state-of-the-art results on math benchmarks and excels on multimodal benchmarks, interpreting charts, photos, diagrams, and other non-text inputs for more effective reasoning over visuals.