public class PCA extends java.lang.Object implements LinearProjection, java.io.Serializable
PCA is mostly used as a tool in exploratory data analysis and for making predictive models. PCA involves the calculation of the eigenvalue decomposition of a data covariance matrix or singular value decomposition of a data matrix, usually after mean centering the data for each attribute. The results of a PCA are usually discussed in terms of component scores and loadings.
As a linear technique, PCA serves several purposes: first, to decorrelate the original variables; second, to carry out data compression, where we pay decreasing attention to the numerical accuracy with which we encode the sequence of principal components; third, to reconstruct the original input data using a reduced number of variables according to a least-squares criterion; and fourth, to identify potential clusters in the data.
In certain applications, PCA can be misleading. PCA is heavily influenced by outliers in the data. In other situations, the linearity of PCA may be an obstacle to successful data reduction and compression.
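The description above (mean centering, then an eigenvalue decomposition of the covariance matrix) can be illustrated with a small plain-Java sketch. It is not this class's implementation: the class name `PcaSketch` and its helpers are hypothetical, the covariance divides by n - 1 (an assumption), and power iteration is swapped in as a simple way to obtain only the first principal component instead of a full eigen decomposition or SVD.

```java
import java.util.Arrays;

/** Illustrative sketch only; hypothetical helper, not part of the library. */
final class PcaSketch {
    /** Column means of the data, where each row is a sample. */
    static double[] colMeans(double[][] data) {
        int n = data.length, d = data[0].length;
        double[] mu = new double[d];
        for (double[] row : data)
            for (int j = 0; j < d; j++) mu[j] += row[j] / n;
        return mu;
    }

    /** Sample covariance matrix; dividing by n - 1 is an assumption of this sketch. */
    static double[][] covariance(double[][] data, double[] mu) {
        int n = data.length, d = mu.length;
        double[][] cov = new double[d][d];
        for (double[] row : data)
            for (int i = 0; i < d; i++)
                for (int j = 0; j < d; j++)
                    cov[i][j] += (row[i] - mu[i]) * (row[j] - mu[j]) / (n - 1);
        return cov;
    }

    /** Leading eigenvector of a symmetric matrix by power iteration. */
    static double[] leadingEigenvector(double[][] a, int iterations) {
        int d = a.length;
        double[] v = new double[d];
        // Start from a unit vector with equal entries; this fails only if it
        // happens to be orthogonal to the leading eigenvector.
        Arrays.fill(v, 1.0 / Math.sqrt(d));
        for (int it = 0; it < iterations; it++) {
            double[] w = new double[d];
            for (int i = 0; i < d; i++)
                for (int j = 0; j < d; j++) w[i] += a[i][j] * v[j];
            double norm = 0.0;
            for (double x : w) norm += x * x;
            norm = Math.sqrt(norm);
            if (norm == 0.0) break; // zero matrix, nothing to normalize
            for (int i = 0; i < d; i++) v[i] = w[i] / norm;
        }
        return v;
    }

    /** Rayleigh quotient v' A v, the eigenvalue estimate for a unit vector v. */
    static double rayleigh(double[][] a, double[] v) {
        double q = 0.0;
        for (int i = 0; i < v.length; i++)
            for (int j = 0; j < v.length; j++) q += v[i] * a[i][j] * v[j];
        return q;
    }

    public static void main(String[] args) {
        double[][] data = {{1, 2}, {2, 4}, {3, 6}, {4, 8}};
        double[] mu = colMeans(data);
        double[][] cov = covariance(data, mu);
        double[] pc1 = leadingEigenvector(cov, 50);
        System.out.println("center   = " + Arrays.toString(mu));
        System.out.println("loading  = " + Arrays.toString(pc1));
        System.out.println("variance = " + rayleigh(cov, pc1));
    }
}
```

In this toy data set every sample lies on the line y = 2x, so the first principal direction is parallel to (1, 2) and carries all of the variance.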
See Also: KPCA, PPCA, GHA, Serialized Form

Constructor and Description |
---|
PCA(double[] mu, double[] eigvalues, smile.math.matrix.Matrix loadings): Constructor. |
Modifier and Type | Method and Description |
---|---|
static PCA | cor(double[][] data): Fits principal component analysis with correlation matrix. |
static PCA | fit(double[][] data): Fits principal component analysis with covariance matrix. |
double[] | getCenter(): Returns the center of data. |
double[] | getCumulativeVarianceProportion(): Returns the cumulative proportion of variance contained in principal components, ordered from largest to smallest. |
smile.math.matrix.Matrix | getLoadings(): Returns the variable loading matrix, ordered from largest to smallest by corresponding eigenvalues. |
smile.math.matrix.Matrix | getProjection(): Returns the projection matrix. |
double[] | getVariance(): Returns the principal component variances, ordered from largest to smallest, which are the eigenvalues of the covariance or correlation matrix of learning data. |
double[] | getVarianceProportion(): Returns the proportion of variance contained in each principal component, ordered from largest to smallest. |
double[] | project(double[] x): Project a data point to the feature space. |
double[][] | project(double[][] x): Project a set of data to the feature space. |
PCA | setProjection(double p): Set the projection matrix with top principal components that contain (more than) the given percentage of variance. |
PCA | setProjection(int p): Set the projection matrix with given number of principal components. |
public PCA(double[] mu, double[] eigvalues, smile.math.matrix.Matrix loadings)
mu - the mean of samples.
eigvalues - the eigenvalues of principal components.
loadings - the matrix of variable loadings.

public static PCA fit(double[][] data)
data - training data of which each row is a sample. If the sample size is larger than the data dimension and cor = false, SVD is employed for efficiency. Otherwise, eigen decomposition on covariance or correlation matrix is performed.

public static PCA cor(double[][] data)
data - training data of which each row is a sample. If the sample size is larger than the data dimension and cor = false, SVD is employed for efficiency. Otherwise, eigen decomposition on covariance or correlation matrix is performed.

public double[] getCenter()
public smile.math.matrix.Matrix getLoadings()
public double[] getVariance()
public double[] getVarianceProportion()
public double[] getCumulativeVarianceProportion()
public smile.math.matrix.Matrix getProjection()
Specified by: getProjection in interface LinearProjection
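How the variance proportions relate to the eigenvalues, and how a percentage threshold could translate into a number of components, can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the library's code: `VarianceSketch` and its method names are hypothetical, the eigenvalues are assumed to be sorted from largest to smallest, and the threshold is assumed to be a fraction in (0, 1] that is met inclusively.

```java
/** Illustrative sketch only; hypothetical helper, not part of the library. */
final class VarianceSketch {
    /** Each eigenvalue divided by the sum of all eigenvalues. */
    static double[] proportion(double[] eigvalues) {
        double total = 0.0;
        for (double e : eigvalues) total += e;
        double[] prop = new double[eigvalues.length];
        for (int i = 0; i < prop.length; i++) prop[i] = eigvalues[i] / total;
        return prop;
    }

    /** Running sum of the proportions. */
    static double[] cumulative(double[] prop) {
        double[] cum = new double[prop.length];
        double sum = 0.0;
        for (int i = 0; i < prop.length; i++) {
            sum += prop[i];
            cum[i] = sum;
        }
        return cum;
    }

    /**
     * Smallest number of leading components whose cumulative proportion
     * reaches p. The inclusive comparison is an assumption of this sketch.
     */
    static int componentsFor(double[] cum, double p) {
        if (p <= 0.0 || p > 1.0)
            throw new IllegalArgumentException("p must be in (0, 1]: " + p);
        for (int k = 0; k < cum.length; k++)
            if (cum[k] >= p) return k + 1;
        return cum.length; // guards against rounding just below 1.0
    }

    public static void main(String[] args) {
        double[] eigvalues = {4.0, 2.0, 1.0, 1.0};
        double[] cum = cumulative(proportion(eigvalues));
        System.out.println("components for 80% variance: " + componentsFor(cum, 0.8));
    }
}
```

With a real PCA instance, the same threshold would simply be passed to setProjection(double p), and the proportions read back from getVarianceProportion() and getCumulativeVarianceProportion().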
public PCA setProjection(int p)
p - choose top p principal components used for projection.

public PCA setProjection(double p)
p - the required percentage of variance.

public double[] project(double[] x)
Specified by: project in interface LinearProjection
Specified by: project in interface Projection<double[]>

public double[][] project(double[][] x)
Specified by: project in interface LinearProjection
Specified by: project in interface Projection<double[]>
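
As a rough picture of what projecting to the feature space means for a linear projection, the sketch below centers a point and multiplies it by a projection matrix. It is written under assumptions and is not the library's code: `ProjectionSketch` is a hypothetical name, the matrix is taken to hold one principal direction per row, and the center is subtracted before the multiplication.

```java
/** Illustrative sketch only; hypothetical helper, not part of the library. */
final class ProjectionSketch {
    /**
     * Computes y = P (x - mu), where P has one principal direction per row
     * (p rows, d columns) and mu is the center of the training data.
     */
    static double[] project(double[][] projection, double[] mu, double[] x) {
        if (x.length != mu.length)
            throw new IllegalArgumentException("Invalid input vector size: " + x.length);
        double[] y = new double[projection.length];
        for (int i = 0; i < projection.length; i++)
            for (int j = 0; j < x.length; j++)
                y[i] += projection[i][j] * (x[j] - mu[j]);
        return y;
    }

    /** Projects each row of xs independently. */
    static double[][] project(double[][] projection, double[] mu, double[][] xs) {
        double[][] ys = new double[xs.length][];
        for (int r = 0; r < xs.length; r++) ys[r] = project(projection, mu, xs[r]);
        return ys;
    }

    public static void main(String[] args) {
        double s = 1.0 / Math.sqrt(2.0);
        double[][] projection = {{s, s}}; // a single direction along (1, 1)
        double[] mu = {1.0, 1.0};
        double[] y = project(projection, mu, new double[] {2.0, 2.0});
        System.out.println("projected coordinate: " + y[0]);
    }
}
```

With the class documented here, the equivalent steps would be PCA.fit(data), then setProjection(...) to choose the number of components, then project(x).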