SKiM: Accurately Classifying Metagenomic ONT Reads in Limited Memory
Abstract
Motivation
Oxford Nanopore Technologies’ devices, such as MinION, permit affordable, real-time DNA sequencing, and come with targeted sequencing capabilities. Such capabilities create new challenges for metagenomic classifiers that must be computationally efficient yet robust enough to handle potentially erroneous DNA reads, while ideally inspecting only a few hundred bases of a read. Currently available DNA classifiers leave room for improvement with respect to classification accuracy, memory usage, and the ability to operate in targeted sequencing scenarios.
Results
We present SKiM: Short K-mers in Metagenomics, a new lightweight metagenomic classifier designed for ONT reads. Compared to state-of-the-art classifiers, SKiM requires only a fraction of the memory to run, and can classify DNA reads with higher accuracy after inspecting only their first few hundred bases. To achieve this, SKiM introduces new data compression techniques to maintain a reference database built from short k-mers, and treats classification as a statistical testing problem.
Availability
SKiM source code, documentation and test data are available from: https://gitlab.com/SCoRe-Group/skim
Contact
tcschneg@buffalo.edu