Invited Speaker Australian Society for Microbiology Annual Scientific Meeting 2023

New tools to explore metagenomic data (95312)

Gene Tyson 1
  1. Queensland University of Technology, Woolloongabba, QLD, Australia

Advances in DNA sequencing and bioinformatics have dramatically increased the rate at which microbial genomes are recovered from metagenomic data. However, recovering the genomes of low-abundance microorganisms, and mobile genetic elements (MGEs), from metagenomic data poses both computational and scientific challenges. Recently, the team at the Centre for Microbiome Research developed three new tools, Ibis (Bin Chicken), CheckM2 and RecurM, to overcome these limitations.

Ibis uses single-copy marker genes to select metagenomes for coassembly, improving recovery of metagenome-assembled genomes (MAGs) from previously unrecovered microbial species (typically low-abundance microorganisms). We demonstrate that Ibis can dramatically increase the number and diversity of MAGs recovered from closely related habitats or from temporal metagenomic data from the same site.
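
The selection idea can be illustrated with a toy sketch. It is not the Ibis implementation: the function name, the input layout (sets of marker-gene sequence labels per sample) and the greedy pair ranking are all assumptions made for illustration. The sketch ranks pairs of metagenomes by how many marker-gene sequences they share that no recovered genome yet accounts for, on the reasoning that pooling reads from such samples should raise coverage of the missing lineages.

```python
from itertools import combinations


def rank_coassembly_pairs(sample_markers, recovered_markers, min_shared=1):
    """Rank sample pairs by shared marker sequences not yet matched by a genome.

    Hypothetical helper: sample_markers maps sample name -> set of marker labels,
    recovered_markers is the set of labels already represented by recovered genomes.
    """
    # Keep only marker sequences that no recovered genome accounts for.
    novel = {s: set(m) - set(recovered_markers) for s, m in sample_markers.items()}
    ranked = []
    for a, b in combinations(sorted(novel), 2):
        shared = novel[a] & novel[b]
        if len(shared) >= min_shared:
            ranked.append((a, b, len(shared)))
    # Most shared targets first; ties broken by sample name for determinism.
    ranked.sort(key=lambda t: (-t[2], t[0], t[1]))
    return ranked


# Made-up example data: three metagenomes and one already-recovered lineage.
samples = {
    "site1_t0": {"otu_A", "otu_B", "otu_C"},
    "site1_t1": {"otu_B", "otu_C", "otu_D"},
    "site2_t0": {"otu_A", "otu_E"},
}
recovered = {"otu_A"}
pairs = rank_coassembly_pairs(samples, recovered)
```

In this made-up example only the two time points from the same site share unrecovered markers, so they form the single candidate pair for coassembly.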

Assessing the quality of MAGs is a critical step prior to downstream analysis. CheckM2 is an improved method of predicting genome quality (completeness and contamination) of MAGs using machine learning. We show that CheckM2 accurately predicts genome quality for MAGs from novel lineages, including those with reduced genome sizes (e.g. symbionts), such as members of the Patescibacteria and the DPANN superphylum. CheckM2 provides accurate genome quality predictions across all bacterial and archaeal lineages, giving increased confidence when drawing biological conclusions from MAGs.

RecurM is a reference-free method for finding novel MGEs based on recurrent assembly. RecurM uniquely leverages between-sample information to find the same MGEs assembled across many different metagenomes. On synthetic data, RecurM outperforms all other existing tools in the accuracy of MGE recovery. Furthermore, we demonstrate its utility by discovering previously overlooked plasmids and viruses in publicly available metagenomic datasets.
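
The notion of recurrence across samples can be sketched as follows. This does not reproduce RecurM's actual algorithm: the k-mer Jaccard similarity is a stand-in for whatever sequence comparison the tool performs, and the function names, thresholds and example contigs are hypothetical. The sketch links near-identical contigs assembled in different samples and reports groups that span a minimum number of distinct metagenomes.

```python
from itertools import combinations


def kmers(seq, k=4):
    """Return the set of k-mers in a sequence (forward strand only)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recurrent_clusters(contigs, min_samples=2, min_similarity=0.9, k=4):
    """Group similar contigs from different samples.

    contigs: list of (sample, contig_id, sequence) tuples (hypothetical layout).
    Returns groups whose members come from at least min_samples samples.
    """
    profiles = [kmers(seq, k) for _, _, seq in contigs]
    parent = list(range(len(contigs)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(contigs)), 2):
        different_samples = contigs[i][0] != contigs[j][0]
        if different_samples and jaccard(profiles[i], profiles[j]) >= min_similarity:
            parent[find(i)] = find(j)

    clusters = {}
    for i, (sample, contig_id, _) in enumerate(contigs):
        clusters.setdefault(find(i), []).append((sample, contig_id))
    return [sorted(members) for members in clusters.values()
            if len({s for s, _ in members}) >= min_samples]


# Made-up contigs: one sequence assembled in three samples, one seen only once.
contigs = [
    ("s1", "c1", "ATGCGTACGTTAGC"),
    ("s2", "c7", "ATGCGTACGTTAGC"),
    ("s3", "c2", "ATGCGTACGTTAGC"),
    ("s1", "c9", "GGGGCCCCAAAATTTT"),
]
hits = recurrent_clusters(contigs, min_samples=3)
```

Requiring a group to span several samples is what makes the approach reference-free in this sketch: a contig is flagged because it recurs, not because it matches a known plasmid or virus.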

As the volume of available sequencing data continues to grow, these new tools will enhance our ability to recover high-quality microbial genomes and MGEs from diverse habitats.