I'm seeing quite a few posts online evaluating models like o1 or r1 based on their ability to perform some arbitrary user tasks. I'm also seeing that many of these posts then extrapolate this to suggest that AGI or ASI is not imminent.

  • o1 is incapable of interpreting some board game state and therefore AGI is not imminent.

  • o1 can't fix some asynchronous programming bug, so AGI is not imminent.

  • o1 requires thousands of dollars of compute, so AGI is not imminent.

While I agree that achieving ASI will most likely not happen in 2025, it's crucial to keep in mind that the folks working on foundation models only need to excel at a single specific task in order to change the game entirely. That task is novel AI research.

When a sufficiently advanced language model becomes capable of researching ways to improve itself, it unlocks the ability to enhance not just its core functionality but also an effectively unlimited range of potential use cases: the AI singularity, if done correctly. Therefore, I think it is important to remember that the current generation of models doesn't need to be close to AGI in order for AGI to be close. In fact, I think that most of the current use cases, even benchmarks like ARC-AGI and FrontierMath, are essentially toy examples meant to show progress. Effectively, a sprint demo for OpenAI engineers.

What I believe OpenAI is prioritizing behind the scenes is enabling reasoning models to generate novel, meaningful improvements that they can directly apply back to their own work. I hypothesize that the real turning point will come when OpenAI releases a new AI research paper and, in the author section, there will be only one name: ChatGPT.

Look at some of the work already being explored in "Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers." This study shows that LLMs can generate ideas judged as more novel than those produced by expert human researchers, a finding that is statistically significant and holds across multiple tests.

If OpenAI reaches the stage where an autonomous research agent can make meaningful contributions to the AI ecosystem (generating novel ideas, improving itself, and sharing these improvements with the world), I think only then will the technology be taken seriously by skeptics.