dumber

  • Blog

    AI is dumber than you think

    OpenAI recently introduced SimpleQA, a new benchmark for evaluating the factual accuracy of large language models (LLMs) that underpin generative AI (genAI). Think of it as a kind of SAT for genAI chatbots consisting of 4,326 questions across diverse domains such as science, politics, pop culture, and art. Each question is designed to have one correct answer, which is verified by independent…

    Read More »
Back to top button
close