Date: 2025-10-15

In a major advance for biotechnology and bioinformatics, researchers from ETH Zurich have developed MetaGraph, a search engine that enables scientists to rapidly query massive public DNA and RNA databases in a manner akin to how Google indexes the web.

Modern sequencing technologies have produced vast repositories of raw genetic data, stored in archives such as the Sequence Read Archive (SRA) and the European Nucleotide Archive (ENA) and totalling on the order of 100 petabytes of information, Scitech Daily reports.

Until now, researchers could only search metadata or had to download entire datasets, making detailed searches expensive, slow, and computationally demanding.

MetaGraph overcomes these limitations by indexing raw sequences into a compressed, full-text structure that allows queries of genetic sequences themselves. According to Professor Gunnar Rätsch, data scientist at the Department of Computer Science at ETH Zurich, this tool offers “a kind of Google for DNA.”
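To give a rough sense of what searching the sequences themselves means, as opposed to searching metadata, the toy Python sketch below breaks each sequence into short overlapping fragments (k-mers) and records which samples contain each fragment. It is an illustration only and makes no claim about MetaGraph's actual data structures or compression; the sample names, sequences, and the functions `build_index` and `search` are invented for this example.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield every overlapping substring of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(samples, k):
    """Map each k-mer to the set of sample IDs whose sequence contains it."""
    index = defaultdict(set)
    for sample_id, seq in samples.items():
        for kmer in kmers(seq, k):
            index[kmer].add(sample_id)
    return index


def search(index, query, k, min_fraction=1.0):
    """Return {sample_id: fraction of the query's k-mers found in that sample}.

    Only samples matching at least min_fraction of the query k-mers are kept.
    """
    query_kmers = set(kmers(query, k))
    if not query_kmers:  # query shorter than k
        return {}
    counts = defaultdict(int)
    for kmer in query_kmers:
        for sample_id in index.get(kmer, ()):
            counts[sample_id] += 1
    return {
        sample_id: n / len(query_kmers)
        for sample_id, n in counts.items()
        if n / len(query_kmers) >= min_fraction
    }


# Hypothetical samples, invented purely for illustration.
samples = {
    "sample_A": "ACGTACGTGACCTGA",
    "sample_B": "TTGACCTGAAACGGT",
    "sample_C": "GGGGCCCCAAAATTTT",
}
index = build_index(samples, k=4)
matches = search(index, "GACCTGA", k=4)
print(sorted(matches))
```

Once such an index exists, a query is answered by lookups rather than by scanning or downloading every dataset. A production system would additionally need heavy compression and a link from each hit back to the sample's metadata, both of which this sketch leaves out.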

The researchers claim that their compression framework reduces data volumes by a factor of roughly 300 while preserving accuracy, enabling near-instant matches at very low computational cost, in some cases estimated at about USD 0.74 per megabase of query data.

Because it links raw sequence data and metadata in integrated graphs, the search engine is scalable: as the database grows, the additional computational burden grows more slowly than the volume of data. The tool already indexes nearly half of global public sequence data, with plans to incorporate the remainder by the end of the year.