A more complete picture: Capturing single nucleotide variant diversity in extended-spectrum beta-lactamase producing Escherichia coli using post-enrichment metagenomics

Abstract

Inferring transmission relies on accurately distinguishing between isolates from the same source and those from different sources, and high-quality genomic data are frequently used to model transmission scenarios. The post-enrichment metagenome sequencing (pe-MGS) method uses a sequencing approach to analyse the diversity of a target pathogen enriched by pre-culturing, and has been effectively used to analyse the transmission of nosocomial infections. However, a direct comparison of single nucleotide variant (SNV) call accuracy, cost and feasibility between single-colony whole-genome sequencing (sc-WGS) data and pe-MGS data for a clinically important antimicrobial-resistant bacterium, extended-spectrum beta-lactamase producing E. coli (ESBL-EC), is required before implementation in large-scale clinical studies. A spiked stool sample and rectal swabs from six study participants were pre-enriched in buffered peptone water and cultured on MacConkey agar with 1 mg/L cefotaxime. Seven single colonies were picked, and the remaining biomass of all colonies was collected from each plate, sequenced and analysed using the mSWEEP/mGEMS pipeline. We created a custom SNV calling workflow that allows heterozygous SNVs in a bacterial population, and found that the choice of reference changed the number of measurable SNV distances between the sc-WGS and pe-MGS data. Our custom workflow with a core-gene reference captured 99% of all the SNV calls from multiple sc-WGS data in the pe-MGS data of the same culture. The plate sweep method offers a feasible, cost-effective alternative to multiple single colony picks for describing within-host ESBL-EC diversity. The workflow we developed allows effective SNV calling from pe-MGS data, with calls comparable to those from multiple sc-WGS data from the same sample.
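The capture figure reported above (the share of sc-WGS SNV calls also present in the pe-MGS data of the same culture) can be illustrated with a minimal sketch. This is not the authors' workflow: the function names, the `(gene, position, alternate base)` representation of a call, and the toy values are all assumptions made for illustration. Heterozygous sites are modelled simply by allowing more than one alternate base at the same position in the plate sweep.

```python
from typing import Iterable, Set, Tuple

# Hypothetical representation of one SNV call:
# (core gene name, 1-based position, alternate base).
Snv = Tuple[str, int, str]


def pooled_snvs(colony_calls: Iterable[Set[Snv]]) -> Set[Snv]:
    """Union of the SNVs called in any single-colony genome from one plate."""
    pooled: Set[Snv] = set()
    for calls in colony_calls:
        pooled |= calls
    return pooled


def fraction_captured(colony_calls: Iterable[Set[Snv]], sweep_calls: Set[Snv]) -> float:
    """Share of pooled single-colony SNVs that also appear in the plate sweep."""
    pooled = pooled_snvs(colony_calls)
    if not pooled:
        raise ValueError("no single-colony SNVs to compare against")
    return len(pooled & sweep_calls) / len(pooled)


# Toy data (invented for illustration, not taken from the study).
colonies = [
    {("geneA", 10, "T"), ("geneB", 5, "G")},
    {("geneA", 10, "T"), ("geneC", 7, "A")},
]
# The sweep carries two alleles at geneB:5, i.e. a heterozygous site
# in the mixed population, plus one call not seen in any picked colony.
sweep = {("geneA", 10, "T"), ("geneB", 5, "G"), ("geneB", 5, "C"), ("geneD", 1, "A")}

captured = fraction_captured(colonies, sweep)
print(f"{captured:.2f}")
```

In this toy case two of the three pooled single-colony SNVs are found in the sweep. Calls that appear only in the sweep do not lower the score, which fits the expectation that a plate sweep may contain diversity beyond the picked colonies.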

Abbreviations

AMR, antimicrobial resistance; ESBL-EC, extended-spectrum beta-lactamase producing Escherichia coli; ST, sequence type; SNV, single nucleotide variant; BPW, buffered peptone water; sc-WGS, single-colony whole-genome sequencing; pe-MGS, post-enrichment metagenome sequencing

Impact statement

For bacterial species with high within-patient diversity and within-genome variation, such as the opportunistic pathogen E. coli, capturing the full diversity is essential to identify transmission events. Post-enrichment metagenomics (pe-MGS), in which the species of interest is pre-enriched from patient samples and all colonies are then sequenced together, promises to be a cost-effective and efficient method for capturing the full diversity. To estimate transmission events with high confidence and make the method applicable to hospital transmission studies, single-nucleotide variants (SNVs) have to be identified at a resolution equivalent to that achieved by single-colony whole-genome sequencing of all colonies. Here we present a proof-of-concept study on a set of stool samples and rectal swabs from healthy participants, in which we developed a new workflow and tested it against these control samples. All samples were analysed using both single-colony WGS (sc-WGS) and pe-MGS from the same plate, following pre-enrichment for the species and phenotype of interest (drug-resistant Escherichia coli). This direct comparison allowed us to assess the reconstruction of SNVs between the two approaches on clinically relevant sample types at the highest resolution. We show that, with our newly developed SNV calling workflow and a core-gene reference, the SNVs identified in the pe-MGS data are comparable to those in the sc-WGS sequence data. pe-MGS offers a cheaper and time-saving alternative to multiple sc-WGS and is thus more feasible to integrate into public health settings for routine surveillance. This would provide a robust basis to significantly improve our detection of transmission events, and thus our understanding of transmission routes, and allow for the targeted implementation of preventative measures.

Data summary

The short-read sequence data generated in this study have been submitted to the European Nucleotide Archive (ENA, https://www.ebi.ac.uk/ena/browser/view/PRJEB101999), and their individual accession numbers are listed in Supplementary Table 1. GitHub repository: https://github.com/joelewis101/TRACS-liverpool/tree/main/bioinformatics. All protocols developed and utilized in this study have been detailed and provided in the article and supplementary data files.
