Lead
Richard Sutton, a 2024 Turing Award winner, said on June 1 that pure generative AI cannot conduct real scientific research because it cannot evaluate its own outputs.
Context
Speaking to The Decoder, Sutton highlighted a core weakness in current generative models: they generate novel ideas but have no built‑in mechanism to judge whether those ideas are correct or useful. He contrasted this with systems such as AlphaGo and AlphaProof, which embed evaluation loops that let the AI test and refine its own moves or proofs.
Impact
Without self‑evaluation, Sutton argues, AI‑generated novelty flashes briefly and then disappears, leaving no lasting contribution to science. The claim challenges the optimism that large language models will accelerate discovery, and it suggests that many AI‑assisted experiments may remain superficial.
What’s Next
Sutton calls for research that integrates evaluation feedback directly into generative pipelines. Future work may focus on hybrid architectures that combine creative generation with rigorous testing, a direction that could reshape how labs deploy AI in hypothesis generation and experimental design.