



  • The accuracy achieved by the top-scoring AI on the world’s hardest benchmark has improved by 183% in just two weeks
  • ChatGPT o3-mini now scores up to 13% accuracy depending on capacity
  • OpenAI Deep Research obliterates competition with 26.6% accuracy result
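To put the 183% figure in context, here is a rough sketch of the arithmetic, assuming the improvement is measured relative to the previous top score and that 26.6% is the new one. The helper names are made up for illustration, and the earlier score is only what those two numbers imply, not a figure quoted in this article.

```python
def implied_baseline(new_score: float, relative_gain_pct: float) -> float:
    """Earlier score implied by a new score and a relative gain (both in percent)."""
    return new_score / (1 + relative_gain_pct / 100)


def relative_gain_pct(old_score: float, new_score: float) -> float:
    """Relative improvement from old_score to new_score, in percent."""
    return (new_score - old_score) / old_score * 100


# Figures taken from the bullets above: 26.6% accuracy, 183% improvement.
baseline = implied_baseline(26.6, 183)
print(f"Implied previous top score: {baseline:.1f}%")
print(f"Gain in percentage points: {26.6 - baseline:.1f}")
print(f"Relative gain check: {relative_gain_pct(baseline, 26.6):.0f}%")
```

In other words, a 183% relative improvement here corresponds to a gain of roughly 17 percentage points, which is why the top score is still well short of 100%.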

Humanity’s Last Exam, the world’s hardest AI exam, launched less than two weeks ago, and we’ve already seen a huge jump in accuracy, with ChatGPT o3-mini and now OpenAI’s Deep Research topping the leaderboard.

The AI benchmark, created by experts from around the world, contains some of the hardest reasoning problems and questions known to man. It’s so hard that when I previously wrote about Humanity’s Last Exam in the article linked above, I couldn’t even understand one of the questions, let alone answer it.


