The Tiling Algorithm - A general method for structural characterization of accurate long DNA sequence reads: application to AAV genome sequences.
Abstract
Adeno-Associated Virus (AAV), a common vector used in human gene therapy, is a challenging organism for DNA sequencing because its replication cycle results in structural rearrangements. In addition, the AAV manufacturing process can produce small fractions of viral particles containing host cell DNA and/or fragments of the helper plasmids. Pacific Biosciences (PacBio) long-read sequencers are capable of full-length, single-molecule sequencing of AAV viral genomes, but analyzing the resulting data is difficult. We present a simple algorithm that determines the arrangement of functional elements in single DNA molecules; the per-molecule results can then be aggregated to provide a sensitive measure of the population of sequences in a sample, including minor species. Using data from four publicly available datasets, we demonstrate that our algorithm is able to characterize nearly all of the species in the AAV samples.