WeightedPCA: PCA for heterogeneous quality samples

Documentation for WeightedPCA.

👋 This package is research code, and work on it is ongoing. If you are interested in using it in your own research, I'd love to hear from you and collaborate! Feel free to write: dahong67@wharton.upenn.edu

Please cite the following paper for this technique:

David Hong, Fan Yang, Jeffrey A. Fessler, Laura Balzano. "Optimally Weighted PCA for High-Dimensional Heteroscedastic Data", 2022. https://arxiv.org/abs/1810.12862

In BibTeX form:

@Misc{hyfb2022owp,
   title  = "Optimally Weighted PCA for High-Dimensional Heteroscedastic Data", 
   author = "David Hong and Fan Yang and Jeffrey A. Fessler and Laura Balzano",
   year   = 2022,
   url    = "https://arxiv.org/abs/1810.12862",
}

WeightedPCA.WeightedPCA — Module

Weighted PCA module. Provides weighted principal component analysis (PCA) for data with samples of heterogeneous quality (heteroscedastic noise).

WeightedPCA.wpca — Function

wpca(Y, i, weights=UniformWeights())

Compute the ith principal component of the data Y via weighted PCA with the given weights, i.e., the output is the ith eigenvector of the weighted sample covariance Σ_l w[l] Y[l]*Y[l]'. The data Y is a list of matrices Y[l] (each column is a sample), and every sample in Y[l] shares the weight w[l].

Choices for weights

UniformWeights() : uniform weights, i.e., w[l] = 1 [default]
InverseVarianceWeights([v]) : inverse noise variance weights, i.e., w[l] = 1/v[l]
OptimalWeights([v,λ]) : optimal weights for signal with variance λ, i.e., w[l] = 1/v[l] * 1/(1+v[l]/λ)

The weights can also be manually set by passing in an AbstractVector{<:Real}.
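As a rough illustration of the formula above, the following sketch computes the ith eigenvector of the weighted sample covariance directly with the LinearAlgebra standard library. It is not the package's implementation: the helper name wpca_sketch is hypothetical, the weights are passed explicitly as a vector, and it assumes that the ith principal component corresponds to the ith largest eigenvalue.

```julia
using LinearAlgebra

# Hypothetical helper (not part of WeightedPCA): returns the eigenvector of
# Σ_l w[l] * Y[l] * Y[l]' associated with the i-th largest eigenvalue.
function wpca_sketch(Y::AbstractVector{<:AbstractMatrix}, i::Integer, w::AbstractVector{<:Real})
    # Weighted sample covariance, summed over the matrices in Y
    C = sum(w[l] * (Y[l] * Y[l]') for l in eachindex(Y))
    # eigen on a Symmetric matrix returns eigenvalues in ascending order
    F = eigen(Symmetric(C))
    return F.vectors[:, end - i + 1]
end

# Two groups of samples in R^2 (each column is a sample)
Y = [[1.0 2.0; 0.0 0.0], [0.0 0.0; 1.0 1.0]]

u_uniform  = wpca_sketch(Y, 1, [1.0, 1.0])   # covariance is diag(5, 2)
u_weighted = wpca_sketch(Y, 1, [1.0, 10.0])  # covariance is diag(5, 20)
```

With uniform weights the first group dominates and the leading component aligns with the first coordinate axis; upweighting the second group moves it to the second axis. The sign of an eigenvector is arbitrary, so only its direction is meaningful.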

WeightedPCA.ComputedWeights — Type

ComputedWeights

Abstract supertype for weights that are computed from properties of the data.

WeightedPCA.UniformWeights — Type

UniformWeights <: ComputedWeights

Uniform weighting, i.e., w[l] = 1. Corresponds to conventional (unweighted) PCA.
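The correspondence with conventional PCA can be checked numerically: with w[l] = 1, the weighted sample covariance Σ_l Y[l]*Y[l]' equals the (unnormalized) sample covariance of all samples concatenated into one matrix. The sketch below uses only the LinearAlgebra standard library; the variable names are illustrative and not part of the package.

```julia
using LinearAlgebra

# Two groups of samples in R^2 (each column is a sample)
Y = [[1.0 2.0; 3.0 4.0], reshape([5.0, 6.0], 2, 1)]

# Uniformly weighted sum over the groups
C_blocks = sum(Yl * Yl' for Yl in Y)

# Conventional (unweighted) PCA works on all samples side by side
Yall  = reduce(hcat, Y)
C_all = Yall * Yall'

@assert C_blocks ≈ C_all
```

Since the two matrices agree, they have the same eigenvectors and therefore the same principal components.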

WeightedPCA.InverseVarianceWeights — Type

InverseVarianceWeights <: ComputedWeights

Inverse noise variance weighting, i.e., w[l] = 1/v[l].

Constructors

InverseVarianceWeights(v=noisevar) for known noise variances noisevar
InverseVarianceWeights() for unknown noise variances; noise variances will be estimated from data
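For known noise variances, the weights follow directly from the formula w[l] = 1/v[l]. The sketch below only evaluates that formula; the helper name inverse_variance_weights is hypothetical, and the estimation of unknown noise variances from data is not shown.

```julia
# Hypothetical helper (not part of WeightedPCA): w[l] = 1/v[l]
inverse_variance_weights(v::AbstractVector{<:Real}) = 1 ./ v

v = [1.0, 4.0]                    # noise variance of each group
w = inverse_variance_weights(v)   # noisier groups get smaller weights
```

Here the second group has four times the noise variance of the first, so it receives one quarter of the weight.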

WeightedPCA.OptimalWeights — Type

OptimalWeights <: ComputedWeights

Optimal weighting, i.e., w[l] = 1/v[l] * 1/(1+v[l]/λ).

Constructors

OptimalWeights(v=noisevar, λ=signalvar) for known noise variances noisevar and signal variance signalvar
OptimalWeights(λ=signalvar) for known signal variance signalvar; noise variances will be estimated from data
OptimalWeights(v=noisevar) for known noise variances noisevar; signal variance will be estimated from data
OptimalWeights() for unknown noise and signal variances; both will be estimated from data
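For known noise variances v and signal variance λ, the formula w[l] = 1/v[l] * 1/(1+v[l]/λ) can be evaluated directly. The sketch below does only that; the helper name optimal_weights is hypothetical, and the estimation of unknown variances from data is not shown.

```julia
# Hypothetical helper (not part of WeightedPCA):
# w[l] = 1/v[l] * 1/(1 + v[l]/λ)
optimal_weights(v::AbstractVector{<:Real}, λ::Real) = @. (1 / v) * 1 / (1 + v / λ)

v = [1.0, 4.0]              # noise variance of each group
w = optimal_weights(v, 2.0) # weights for signal variance λ = 2

# For a very strong signal the factor 1/(1 + v/λ) tends to 1
w_strong = optimal_weights(v, 1e12)
```

Compared with inverse variance weighting, the extra factor 1/(1+v[l]/λ) downweights noisy groups further. As λ grows relative to the noise variances, that factor approaches 1 and the weights approach 1/v[l].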