Published on 03/03/22

New software from CAES improves accuracy of DNA sequence analysis

By Jennifer L Reynolds

UGA Assistant Professor Henk den Bakker leans in front of his computer monitor, which displays his Sepia software application.

Researchers from the University of Georgia’s Center for Food Safety have developed software that functions as an important step in improving the accuracy of DNA sequence analysis when testing for microbial contamination.

Sepia is a cutting-edge read classifier, written by College of Agricultural and Environmental Sciences Assistant Professor Henk den Bakker, that is out now as open-source software. And it should make genome sequencing much faster for researchers studying bacteria.

Bacterial chromosomes typically range in length from 1.5 million base pairs to roughly 9.5 million base pairs, but if researchers want to “read” the individual bases of a genome (the genome sequencing process), they can only do that in pieces of 150 to 10,000 base pairs using modern technology. These pieces are called “reads.”

Now imagine when researchers want to determine what types of microorganisms and viruses are present in a sample — such as in a nasal swab — and sequence the DNA of all those organisms. Presented with a mixture of DNA reads of a plethora of organisms, researchers use a tool called a “read classifier” to quickly sort through the reads and determine to what microorganisms they most likely belong.

Like other read classifiers, den Bakker’s new software works by cross-referencing the information from the sample to existing databases, but it is designed to address challenges in the process posed by potential errors in the taxonomic information available on some microorganisms or by a switch to a new taxonomic system altogether.
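For readers curious about what "cross-referencing" a read against a database can look like, here is a minimal sketch in Python. It is a toy illustration only and is not Sepia's actual method, which this article does not describe. It assumes a common general approach in which each read is broken into short overlapping fragments (often called k-mers) and assigned to the reference sharing the most fragments. The sequences, organism names and the fragment length `K` are all made up for the example.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every overlapping substring of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(references, k):
    """Map each k-mer to the set of reference names that contain it."""
    index = {}
    for name, seq in references.items():
        for kmer in kmers(seq, k):
            index.setdefault(kmer, set()).add(name)
    return index


def classify_read(read, index, k):
    """Return the reference sharing the most k-mers with the read, or None."""
    votes = Counter()
    for kmer in kmers(read, k):
        for name in index.get(kmer, ()):
            votes[name] += 1
    if not votes:
        return None  # no k-mer of the read is in the database
    return votes.most_common(1)[0][0]


# Hypothetical toy "genomes"; real bacterial genomes are millions of bases long.
references = {
    "organism_A": "ATGGCGTACGTTAGCCGATAGGCTA",
    "organism_B": "TTGACCGGTATTCCGGAAGTCCTGA",
}
K = 5  # assumed fragment length, chosen small to suit the toy data
index = build_index(references, K)

read_a = "GTACGTTAGCCG"        # taken from organism_A
read_b = "CGGTATTCCGGA"        # taken from organism_B
read_unknown = "AAAAAAAAAAAA"  # matches nothing in the toy database

result_a = classify_read(read_a, index, K)
result_b = classify_read(read_b, index, K)
result_unknown = classify_read(read_unknown, index, K)
```

In this sketch a read that shares no fragments with any reference simply goes unclassified. Production classifiers use far more compact data structures and assign reads to positions in a taxonomy rather than to a single named reference.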

Because bacteria are often single-celled microorganisms lacking physical distinctions, they are more difficult to classify than more complex organisms, such as mammals or reptiles. Researchers have only recently begun using DNA to determine the taxonomy of microorganisms. This means that the taxonomy in some of the databases that read classifiers pull information from is sometimes not in agreement with what similarities in DNA tell us.

“Only recently, in the last decade, we began sequencing these organisms and using the genetic data to build taxonomies. That’s very important because when we know things are genetically similar, a read classifier can use that information to make predictions,” den Bakker said.

Henk den Bakker holds new technology with monitor in the background.

Using these predictions, when the read classifier discovers an organism that is missing from the database, it can help researchers determine what that unidentified organism is most closely related to by comparing its genetic material to that of known microorganisms, he said.

When writing the software, den Bakker intentionally made it simple for the end user to make edits and corrections as needed to help address the problems with the taxonomy used in databases. Given the software's wide range of applications, much of his focus was on making it user-friendly, so that researchers can easily edit the taxonomy of the databases if they find an error.
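To make the idea of an editable taxonomy concrete, the sketch below stores a tiny, simplified taxonomy as child-to-parent links and shows how correcting a single link changes which organisms appear closely related. The data structure and function names are assumptions made for illustration and do not reflect Sepia's file formats or commands. The misplacement of Listeria is a deliberate error planted for the example.

```python
# Hypothetical, heavily simplified taxonomy stored as child -> parent links.
taxonomy = {
    "Bacteria": None,
    "Enterobacteriaceae": "Bacteria",
    "Listeriaceae": "Bacteria",
    "Salmonella": "Enterobacteriaceae",
    "Escherichia": "Enterobacteriaceae",
    "Listeria": "Enterobacteriaceae",  # deliberate error planted for the example
}


def lineage(taxon, parents):
    """Return the path from the root of the taxonomy down to taxon."""
    path = []
    while taxon is not None:
        path.append(taxon)
        taxon = parents[taxon]
    return path[::-1]


def closest_shared_ancestor(a, b, parents):
    """Return the most specific group that contains both a and b."""
    ancestors_of_a = set(lineage(a, parents))
    for taxon in reversed(lineage(b, parents)):
        if taxon in ancestors_of_a:
            return taxon
    return None


# With the error in place, Listeria and Salmonella appear to share a family.
before = closest_shared_ancestor("Listeria", "Salmonella", taxonomy)

# The correction is a single edit to one link.
taxonomy["Listeria"] = "Listeriaceae"

# After the fix, the two genera only meet at the top of this toy taxonomy.
after = closest_shared_ancestor("Listeria", "Salmonella", taxonomy)
fixed_lineage = lineage("Listeria", taxonomy)
```

Because a classifier reports results in terms of the taxonomy it is given, a wrong link like the one above would make unrelated organisms look like close relatives, which is why being able to correct the taxonomy matters.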

To test the software, den Bakker recruited the help of Lee Katz, a bioinformatician with the Centers for Disease Control and Prevention (CDC) and adjunct faculty member with the UGA Center for Food Safety. Katz tested how well the software detects genome contamination, a check in which researchers confirm that they have sequenced only the organism that they are interested in, and not a mixture of organisms. Based on his findings, Katz has suggested its use to CDC colleagues for metagenomics analysis.

Den Bakker anticipates that the software in its current form will function as a base model onto which he will build additional features. One such upcoming feature is designed to help protect patient confidentiality by removing human DNA from test results. Researchers will then be able to share the results of their research while simultaneously complying with health information privacy laws.

“For me, writing software is also exploring new data structures on a data science level — how to make these things more efficient. Writing it is more or less like starting an experiment in the lab,” den Bakker said.

The software is available now and is free to download on GitHub. More information on Sepia can be found in The Journal of Open Source Software. To hear a discussion on Sepia with Hendrik den Bakker, listen to episode 74 of the “Micro Binfie Podcast,” “Sepia With Henk – Soup Or Salad Yes”!

Jennifer Reynolds is the communications professional for UGA's Center for Food Safety.

Authors:

Jennifer L Reynolds

Experts/Sources:

Hendrik C Den Bakker
