Date: 2025-10-15

Leading artificial intelligence (AI) models may perform impressively on basic scientific tasks, but they still fall short on deeper reasoning, a new study by researchers from IIT Delhi and Friedrich Schiller University (FSU) Jena, Germany, has found. Published in Nature Computational Science, the study reveals that although current AI systems can handle simple perception-based tasks with near-perfect accuracy, they fail to demonstrate genuine scientific understanding.

The research team, led by IIT Delhi Associate Professor N. M. Anoop Krishnan and FSU Jena Professor Kevin Maik Jablonka, developed MaCBench, the first comprehensive benchmark designed to assess how vision–language models perform on real-world chemistry and materials science tasks. Their results showed a striking paradox: while AI models performed nearly flawlessly at identifying laboratory equipment, they struggled with spatial reasoning, integrating cross-modal information, and multistep logical inference, all skills essential for authentic scientific discovery.