OpenAI finally reveals GPT-4.5 AI model, but it is less capable than competitors in certain tasks
In this post: OpenAI has finally unveiled its largest AI model, GPT-4.5, code-named ‘Orion’. GPT-4.5 lags behind competitors like Anthropic’s Claude 3.7 Sonnet in academic tasks. It produces warmer responses than GPT-4o and o3-mini.
OpenAI has unveiled GPT-4.5, code-named Orion, marking what the company calls its largest model so far. Many in the tech community have waited eagerly for the next step in the series of GPT models, which have previously demonstrated dramatic leaps in writing, math, coding, and other fields.
The company is rolling out GPT-4.5 in stages. Subscribers to the $200-a-month ChatGPT Pro tier gain immediate access under a research preview. Developers on paid tiers of OpenAI’s API can also access GPT-4.5 right away. ChatGPT Plus and ChatGPT Team customers are next in line, with an OpenAI spokesperson saying that the new model should become available to them sometime next week. The staggered release, according to OpenAI, is partly due to the massive computing demands behind this “giant” system.
In tech circles, the arrival of GPT-4.5 has been viewed as a test of whether traditional training methods (mainly scaling up the amount of data and computing resources) would continue to produce major performance gains. Until now, the GPT series has followed a fairly predictable pattern. GPT-1, GPT-2, GPT-3, and GPT-4 each saw remarkable jumps in capability as OpenAI applied more computing power and fed in more training data.
In each generation, benchmarks across mathematics, writing proficiency, coding, and other categories climbed dramatically. GPT-4.5 aims to continue this trend with what the company describes as “a deeper world knowledge” and “higher emotional intelligence.” But at the same time, GPT-4.5’s results on certain tests indicate the returns from simply scaling up may be leveling off.
The initial features and limitations of GPT-4.5
OpenAI is careful to point out that GPT-4.5 should not be seen as a direct substitute for GPT-4o. GPT-4.5 includes advanced functionalities such as support for file and image uploads and ChatGPT’s canvas tool for creative outputs. However, it currently does not support ChatGPT’s recently introduced two-way voice mode.
Early evaluations run by OpenAI and other researchers reveal that GPT-4.5 outperforms GPT-4o in several testing categories. For instance, on the SimpleQA benchmark—a test designed to measure how well a model can answer straightforward factual questions—GPT-4.5 posted higher accuracy scores than GPT-4o and also outperformed OpenAI’s o1 and o3-mini reasoning models. According to the company, GPT-4.5 “hallucinates” less frequently than many other systems, meaning it is less prone to generating content that diverges from real information.

In coding assessments, the results are more mixed. On the SWE-Bench Verified benchmark, GPT-4.5 roughly matches GPT-4o and o3-mini but does not surpass them. This places GPT-4.5 below both OpenAI’s deep research model and Anthropic’s Claude 3.7 Sonnet.

On a different coding test known as SWE-Lancer, GPT-4.5 performs better than GPT-4o and o3-mini but still lags behind deep research.

GPT-4.5’s performance also falls short on challenging academic benchmarks. On AIME and GPQA, it does not match the results of leading reasoning models such as o3-mini, DeepSeek’s R1, or Anthropic’s Claude 3.7 Sonnet. Yet GPT-4.5 matches or sometimes beats leading models that are not classified as “reasoning” systems, which suggests it retains robust math and science capabilities.
OpenAI has also touted GPT-4.5’s strengths in less quantifiable areas. The company says GPT-4.5 can better grasp human intentions and produce replies that feel warmer, more natural, and more socially aware.
An informal test involved the prompt, “I’m going through a tough time after failing a test.” While GPT-4o and o3-mini offered useful information, GPT-4.5 was said to respond with greater empathy and emotional sensitivity.
“[W]e look forward to gaining a more complete picture of GPT-4.5’s capabilities through this release,” OpenAI wrote in the blog post, “because we recognize academic benchmarks don’t always reflect real-world usefulness.”

Scaling laws under scrutiny
GPT-4.5 was built with the same unsupervised training strategy used for prior GPT versions, a strategy that has thus far proven reliable. However, its limited performance on certain high-level benchmarks could be a sign that the industry’s traditional “scaling laws” may be losing steam.
Ilya Sutskever, a co-founder and former chief scientist at OpenAI, remarked in December that “we’ve achieved peak data” and that “pre-training as we know it will unquestionably end.” At the time, he hinted that future gains would hinge on other methods, such as systems that can reason more deeply about problems rather than simply memorizing massive swaths of information.

GPT-4.5 was apparently “incredibly expensive to train,” as mentioned in its white paper, and rumors circulated for months that OpenAI had delayed the release multiple times due to performance and cost hurdles. Even so, GPT-4.5 alone does not appear to surpass specialized reasoning models from competitors on many advanced tasks. The company itself regards it as another development milestone on the road to combining GPT technology with its “o” reasoning systems, an integration expected to begin with the launch of GPT-5 later this year.
Comments from CEO Sam Altman on GPU shortages
OpenAI CEO Sam Altman took to X (formerly Twitter) to explain why the rollout of the latest model is happening in phases. “We’ve been growing a lot and are out of GPUs,” Altman wrote, calling GPT-4.5 “giant” and “expensive” and warning that the company would need “tens of thousands” more GPUs before opening up the model to the rest of the user base.
He elaborated further: “We will add tens of thousands of GPUs next week and roll it out to the Plus tier then … This isn’t how we want to operate, but it’s hard to perfectly predict growth surges that lead to GPU shortages.”
Due to its large size, GPT-4.5 is also very expensive to use. OpenAI charges $75 per million input tokens and $150 per million tokens generated by the model. That is 30x the input price and 15x the output price of the GPT-4o model.
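To put the per-token prices quoted in this section in concrete terms, here is a rough Python sketch of what a single request might cost. The GPT-4.5 prices are the ones stated in the article; the GPT-4o prices are not quoted directly and are back-computed here from the stated 30x and 15x ratios. The function name and the example token counts are hypothetical, chosen only for illustration.

```python
# Prices in USD per million tokens.
# GPT-4.5 figures are quoted in the article; the GPT-4o figures are an
# assumption, derived from the article's stated 30x / 15x ratios.
GPT45_INPUT, GPT45_OUTPUT = 75.0, 150.0
GPT4O_INPUT, GPT4O_OUTPUT = GPT45_INPUT / 30, GPT45_OUTPUT / 15


def request_cost(input_tokens, output_tokens, input_price, output_price):
    """Return the USD cost of one request at per-million-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 10,000 input tokens and 2,000 output tokens.
cost_45 = request_cost(10_000, 2_000, GPT45_INPUT, GPT45_OUTPUT)
cost_4o = request_cost(10_000, 2_000, GPT4O_INPUT, GPT4O_OUTPUT)

print(f"GPT-4.5: ${cost_45:.2f}")
print(f"GPT-4o:  ${cost_4o:.3f}")
print(f"Ratio:   {cost_45 / cost_4o:.1f}x")
```

Because the input and output multipliers differ, the overall cost ratio for a real workload depends on its mix of input and output tokens and will fall somewhere between 15x and 30x.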








