University of Georgia food scientist Henk den Bakker is a member of an international team of researchers that has developed a way to quickly search massive amounts of microbial DNA data for specific genes, such as those that make bacteria resistant to drugs.
Using their combined knowledge of bacterial genetics and web search algorithms, the scientists built the Bitsliced Genomic Signature Index (BIGSI), a DNA search engine that serves a purpose similar to that of internet search engines like Google.
Described in a paper published in the February 2019 issue of Nature Biotechnology, the search engine could enable researchers and public-health agencies to perform fast searches of genome sequencing data to monitor the spread of genes in microbial populations.
This means scientists can now quickly determine how many bacterial strains, among the hundreds of thousands of bacterial genomes stored in databases, carry genes that, for example, make them harder to fight with traditional antibiotics.
“A lot of labs are currently sequencing the DNA of microorganisms and they enter this data in international public databases, such as GenBank in the U.S. These data produced by these sequencers are not a single genome sequence; instead they consist of hundreds of thousands of little genomic sequences, which researchers have to piece together to study the genome and individual genes,” said den Bakker, a researcher in the UGA Center for Food Safety on the UGA Griffin campus. “There are currently hundreds of thousands of data points, each representing a microbial strain. Like Google, the Bitsliced Genomic Signature Index (BIGSI) can show us which other bacteria share certain genes.”
For comparison, den Bakker recalled a time just a few years ago when he was part of a group of scientists working on strains of bacteria from France. It took a little less than a month of searching through the available data to find other strains that carried a particular resistance gene.
“Now it only takes seconds using our ultra-fast system of bacterial and viral genomic data,” he said. “We can now look for microbial resistance quicker, and we can see which ones spread more quickly or are resistant to sanitizers or even resistant to colistin, which is kind of a last-resort antibiotic medication.”
With today’s technology, the amount of microbial DNA scientists sequence doubles every two years. Until now, there was no practical way to search this massive amount of data.
BIGSI could prove extremely useful during outbreaks of foodborne illness. For example, it would be helpful during a food poisoning outbreak caused by a Salmonella strain containing a drug-resistance plasmid (a ‘hitchhiking’ DNA element that can spread drug resistance across different bacterial species). BIGSI would allow researchers to easily spot whether and when the plasmid has been seen before.
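For readers curious about what a "bitsliced" index can look like, the toy Python sketch below illustrates the general idea of a bit-sliced Bloom filter index. It is not the BIGSI code, and every name and parameter in it (K, M, NUM_HASHES, ToyBitslicedIndex, the example sequences) is a made-up assumption chosen for illustration. Each genome's short subsequences (k-mers) set a few bits in that genome's column, and a query gene is looked up by combining the matching rows, so all genomes are checked at once.

```python
import hashlib

# Toy parameters, chosen only for illustration (assumptions, not BIGSI's settings).
K = 5            # k-mer length
M = 4096         # number of rows (bits per Bloom filter)
NUM_HASHES = 3   # hash functions per k-mer


def kmers(seq, k=K):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def positions(kmer):
    """Map a k-mer to NUM_HASHES row numbers using seeded SHA-256."""
    rows = []
    for seed in range(NUM_HASHES):
        digest = hashlib.sha256(f"{seed}:{kmer}".encode()).digest()
        rows.append(int.from_bytes(digest[:8], "big") % M)
    return rows


class ToyBitslicedIndex:
    """Hypothetical toy index: one column per genome, one integer per row."""

    def __init__(self):
        self.names = []
        # rows[r] is an integer whose bit j is 1 if genome j set bit r.
        self.rows = [0] * M

    def add(self, name, seq):
        col = len(self.names)
        self.names.append(name)
        for km in kmers(seq):
            for r in positions(km):
                self.rows[r] |= 1 << col

    def search(self, query):
        """Return genomes that may contain every k-mer of the query.

        Bloom filters never miss a true match but can report false positives.
        """
        query_kmers = kmers(query)
        if not query_kmers:
            return []  # query shorter than K: nothing to look up
        hits = (1 << len(self.names)) - 1  # start with all genomes as candidates
        for km in query_kmers:
            for r in positions(km):
                hits &= self.rows[r]  # one AND narrows all genomes at once
        return [n for j, n in enumerate(self.names) if (hits >> j) & 1]


if __name__ == "__main__":
    gene = "ATGCGTACGTTAGC"  # made-up "resistance gene"
    index = ToyBitslicedIndex()
    index.add("strain_a", "TTTTT" + gene + "GGGGG")
    index.add("strain_b", "CCCCCAAAAACCCCCAAAAAGGGGG")
    index.add("strain_c", gene + "AAAA")
    print(index.search(gene))
```

Because a lookup touches only a handful of rows per k-mer regardless of how many genomes are stored, this kind of layout is one way a search can stay fast as the collection grows.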
“This search engine complements other existing tools and offers a solution that can scale to the vast amounts of data we’re now generating,” said Phelim Bradley, leader of the project and a bioinformatician at the European Bioinformatics Institute. “This means that the search will continue to work as the amount of data keeps growing. In fact, this was one of the biggest challenges we had to overcome. We were able to develop a search engine that can be used by anybody with an internet connection.”
The full article on this project can be viewed at www.nature.com/articles/s41587-018-0010-1. For more information about the UGA Center for Food Safety, visit cfs.caes.uga.edu.