What if AI Doesn't Improve Significantly From This Point? - Cal Newport

Since the launch of ChatGPT in late 2022, it has been hard to avoid feeling a mix of excitement and anxiety about the potential effects of generative AI. That response has been fueled in part by the bold statements of tech CEOs, whose rhetoric has grown increasingly exaggerated.

“AI is beginning to outperform humans in nearly all intellectual tasks,” stated Anthropic CEO Dario Amodei in an interview with Anderson Cooper. He warned that within the next one to five years, half of entry-level white-collar positions could be “eliminated,” pushing unemployment rates as high as 20%, a peak reminiscent of the Great Depression.

Simultaneously, OpenAI’s Sam Altman claimed that AI is now competitive with individuals holding PhDs, prompting one outlet to ask, “What opportunities are left for graduates?”

Not wanting to be outdone, Mark Zuckerberg asserted that superintelligence is now “within reach.” (His investors are likely hoping he’s correct, as he is reportedly offering compensation packages of up to $300 million to attract top AI talent to Meta.)

However, two weeks ago, OpenAI released its much-anticipated GPT-5, a large language model that many expected to deliver advances on the scale of those introduced by GPT-3 and GPT-4. Instead, the product turned out to be merely fine.

GPT-5 demonstrated slight improvements over its predecessors in some areas but performed worse in others. It included various usability enhancements, but some users found them irritating. Within days, over 4,000 ChatGPT users had signed a change.org petition asking OpenAI to restore access to the previous model, GPT-4o, which they preferred to the latest version. An early YouTube reviewer called GPT-5 a product that “was hard to complain about,” a characterization better suited to the iPhone 16 than to a groundbreaking technology. The AI expert Gary Marcus, who had anticipated this outcome for years, summed up his initial impressions of GPT-5 as “overdue, overhyped, and underwhelming.”

This leads to a crucial question that, until recently, few had contemplated: Could it be that the AI we currently use is essentially as good as it will be for some time?

In my latest article for The New Yorker, published last week, I attempted to answer this question. In doing so, I uncovered a technical story not widely known outside the AI community. The major performance gains of the GPT-3 and GPT-4 language models came from a process known as pretraining, in which a model processes an immense amount of text, effectively teaching itself to become smarter. The improvements in both models stemmed from increasing both their size and the volume of text they were pretrained on.
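
To make this concrete, here is a minimal sketch, in Python with PyTorch, of the next-token-prediction objective that pretraining optimizes. Real systems apply the same idea with transformer networks and trillions of tokens; this toy version uses a byte-level vocabulary, a lookup-table model, and a single sentence of “corpus”:

```python
import torch
import torch.nn as nn

# Toy corpus with a byte-level "tokenizer": each byte is a token ID (0-255).
text = b"the quick brown fox jumps over the lazy dog "
tokens = torch.tensor(list(text), dtype=torch.long)

# A deliberately tiny stand-in for a language model: embedding table + linear head.
vocab_size, dim = 256, 32
model = nn.Sequential(nn.Embedding(vocab_size, dim), nn.Linear(dim, vocab_size))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

# The pretraining objective: given each token, predict the token that follows it.
for step in range(300):
    logits = model(tokens[:-1])                 # predictions for positions 1..n
    loss = nn.functional.cross_entropy(logits, tokens[1:])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(f"final next-token loss: {loss.item():.3f}")
```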

However, after the release of GPT-4, AI companies began to realize that this approach was becoming less effective. They continued to expand model size and training intensity but experienced diminishing returns in capability.
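
One way to see why returns diminish: the empirical scaling laws that guided this era fit loss as a power law in model size, so each tenfold increase in parameters buys a shrinking absolute improvement. The sketch below plugs in the rough constants reported by Kaplan et al. (2020) purely for illustration; the exact numbers are not the point, the flattening shape is:

```python
# Illustrative power-law scaling, in the style of Kaplan et al. (2020):
# predicted loss L(N) = (N_c / N) ** alpha for a model with N parameters.
# The constants below are that paper's rough fits, used only to show the shape.
ALPHA, N_C = 0.076, 8.8e13

for n_params in (1e9, 1e10, 1e11, 1e12, 1e13):
    loss = (N_C / n_params) ** ALPHA
    print(f"{n_params:.0e} params -> predicted loss {loss:.2f}")
```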

As a result, starting around last fall, these companies shifted their focus to post-training techniques, which involve refining a pretrained model to perform better on specific tasks. This allowed AI companies to continue reporting advancements in their products’ capabilities, but these improvements became more narrowly focused.
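
As one concrete example of what post-training can look like, here is a minimal sketch of supervised fine-tuning, reusing the toy setup from the pretraining sketch above. The loss is the same next-token objective, but it is computed only on the response half of a few curated prompt/response pairs; the pairs here are invented placeholders, not anyone’s actual recipe, and in practice you would start from the pretrained weights rather than a fresh model:

```python
import torch
import torch.nn as nn

# Invented placeholder examples of the task behavior being instilled.
pairs = [(b"Q: capital of France? A:", b" Paris"),
         (b"Q: 2 + 2 = ? A:", b" 4")]

# Same toy architecture as before; real fine-tuning starts from
# pretrained weights instead of a randomly initialized model.
vocab_size, dim = 256, 32
model = nn.Sequential(nn.Embedding(vocab_size, dim), nn.Linear(dim, vocab_size))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

for epoch in range(100):
    for prompt, response in pairs:
        seq = torch.tensor(list(prompt + response), dtype=torch.long)
        logits = model(seq[:-1])              # predict each next byte
        targets = seq[1:].clone()
        targets[: len(prompt) - 1] = -100     # mask prompt tokens from the loss
        loss = nn.functional.cross_entropy(logits, targets, ignore_index=-100)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```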

I illustrated this transition in my article with a metaphor involving a car:

“Pre-training can be likened to producing the vehicle; post-training enhances its performance. Researchers had predicted that expanding the pre-training process would increase the power of the vehicles produced; if GPT-3 was a sedan, then GPT-4 was a sports car. Once this progression stalled, the industry shifted its focus to optimizing the performance of the cars they had already built.”

The result was a bewildering array of models with obscure names (o1, o3-mini, o3-mini-high, o4-mini-high), each featuring tailored post-training upgrades. These models posted announced improvements on specific benchmarks but no longer delivered the substantial leaps in practical capability that were once anticipated. As Gary Marcus noted, “I don’t hear a lot of companies using AI saying that 2025 models are significantly more useful to them than 2024 models, even though the 2025 models perform better on benchmarks.”

It appears that the post-training approach may yield marginal improvements in products, but not the substantial advancements in ability that would be required to substantiate the more extravagant predictions made by tech CEOs.

This does not mean, of course, that generative AI tools lack value. They can be quite impressive, particularly in aiding computer programming (though perhaps not to the extent some expected), conducting smart searches, or creating custom tools to analyze extensive texts. However, this presents a different perspective from the idea that AI is “better than humans at nearly all intellectual tasks.”

For further insights into this story, including a specific prediction of what to expect from this technology in the near future, you can read the complete article. Meanwhile, it seems prudent to shift your focus away from the increasingly hyperbolic pronouncements of tech CEOs.
