Accounting for Structured Missingness in Canonical Correlation Analysis
Abstract
A particularly challenging form of missing data is structured missingness, in which sets of subjects are consistently missing data on sets of variables. For tabular data drawn from sub-studies or modalities, structured missingness can arise from non-participation in follow-up studies, which creates large blocks of missing data. Canonical Correlation Analysis (CCA) is a multivariate modelling tool commonly used to link two different sets of variables; in neuroimaging it has typically been used to find associations between imaging and non-imaging variables. Motivated by CCA, we propose a new method for covariance estimation from incomplete data that handles a mix of structured and unstructured missingness under the Missing at Random (MAR) assumption. We compare the proposed method with existing methodology on simulated data and on real data from subjects in the UK Biobank brain imaging cohort.
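To make the setting concrete, the following is a minimal sketch of a naive available-case baseline, not the method proposed in this paper. It simulates two sets of variables sharing one latent factor, removes the second set for a block of subjects (structured missingness) and a few scattered entries from the first set (unstructured missingness), estimates the covariance from pairwise-complete observations, and reads canonical correlations off the estimated covariance blocks. The function names, sample sizes, missingness rates, and the small ridge term are all illustrative assumptions.

```python
import numpy as np

def pairwise_covariance(data):
    """Available-case covariance; NaN marks a missing entry (hypothetical helper)."""
    observed = ~np.isnan(data)
    # Centre each variable on the mean of its observed values; zero out missing cells.
    centred = np.where(observed, data - np.nanmean(data, axis=0), 0.0)
    # Number of subjects observed jointly for every pair of variables.
    counts = observed.T.astype(float) @ observed.astype(float)
    return (centred.T @ centred) / (counts - 1.0)

def cca_from_covariance(S, p, ridge=1e-6):
    """Canonical correlations from a joint covariance with the first p variables as X."""
    q = S.shape[0] - p
    Sxx = S[:p, :p] + ridge * np.eye(p)  # ridge is an assumed stabiliser
    Syy = S[p:, p:] + ridge * np.eye(q)
    Sxy = S[:p, p:]
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    # Whitened cross-covariance: Lx^{-1} Sxy Ly^{-T}
    K = np.linalg.solve(Lx, Sxy)
    K = np.linalg.solve(Ly, K.T).T
    return np.linalg.svd(K, compute_uv=False)

rng = np.random.default_rng(0)
n, p, q = 500, 3, 2
z = rng.standard_normal((n, 1))            # shared latent factor
X = z + rng.standard_normal((n, p))        # e.g. non-imaging variables
Y = z + rng.standard_normal((n, q))        # e.g. imaging variables

# Structured missingness: the last 200 subjects have no Y data at all.
Y[300:, :] = np.nan
# Unstructured missingness: about 5% of X entries removed at random.
X[rng.random((n, p)) < 0.05] = np.nan

S = pairwise_covariance(np.hstack([X, Y]))
rho = cca_from_covariance(S, p)
print("estimated canonical correlations:", rho)
```

A pairwise-complete estimate of this kind is not guaranteed to be positive semidefinite, so the resulting canonical correlations can be unstable or even exceed one. The missingness here is also completely at random, a special case of MAR, so the sketch does not exercise the harder case in which missingness depends on observed values.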