Aird-MSI: a high compression rate and decompression speed format for mass spectrometry imaging data

Abstract

Mass spectrometry imaging has emerged as a pivotal tool in spatial metabolomics, yet its reliance on the imzML format poses critical challenges in data storage, transmission, and computational efficiency. While imzML ensures cross-platform compatibility, its weakly compressed binary architecture results in large file sizes and high parsing overhead, which hinders cloud-based analysis and real-time visualization.

This study introduces an enhanced Aird compression format optimized for spatial metabolomics through two innovations: (1) a dynamic combinatorial compression algorithm for integer-based encoding of m/z and intensity data; (2) a coordinate-separation storage strategy for rapid spatial indexing. Experimental validation on 47 public datasets demonstrated significant performance gains. Compared to imzML, Aird achieved a 70% reduction in storage footprint (mean compression ratio: 30.03%) while maintaining near-lossless data precision (F1-score = 99.26% at 0.1 ppm m/z tolerance). For high-precision-controlled datasets, Aird accelerated loading speeds by 15-fold in MZmine.
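To make the idea of integer-based encoding concrete, the following is a minimal sketch and not the Aird algorithm itself. It assumes a fixed scaling factor (`precision`, a hypothetical parameter), delta encoding of sorted m/z values, and zlib as a stand-in general-purpose compressor; the function names `encode_mz` and `decode_mz` are illustrative only, and the actual format selects among combinations of encoders dynamically.

```python
import struct
import zlib


def encode_mz(mz_values, precision=100_000):
    """Scale m/z floats to integers, delta-encode them, then compress.

    `precision` is an assumed scaling factor (5 decimal places here);
    it bounds the rounding error at 0.5 / precision.
    """
    ints = [round(mz * precision) for mz in mz_values]
    # Delta encoding: sorted m/z values yield small, repetitive differences.
    deltas = [b - a for a, b in zip([0] + ints, ints)]
    raw = struct.pack(f"<{len(deltas)}q", *deltas)
    return zlib.compress(raw)


def decode_mz(blob, precision=100_000):
    """Invert encode_mz: decompress, undo the deltas, rescale to floats."""
    raw = zlib.decompress(blob)
    deltas = struct.unpack(f"<{len(raw) // 8}q", raw)
    values, running = [], 0
    for d in deltas:
        running += d
        values.append(running / precision)
    return values


if __name__ == "__main__":
    spectrum = [100.00012, 100.50034, 250.12345]
    restored = decode_mz(encode_mz(spectrum))
    print(restored)
```

The round trip is lossy only up to the chosen scaling factor, which is one way to read "near-lossless" in this context: the error per value is bounded by half a unit of the integer grid.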

The Aird format addresses key bottlenecks in spatial metabolomics by balancing storage efficiency, computational speed, and analytical precision, and it reduces I/O latency for large cohorts. By achieving near-native feature detection accuracy, Aird establishes a robust infrastructure for translational applications, including disease biomarker discovery and pharmacokinetic imaging.
