public class PCA extends Object
Constructor | Description |
---|---|
PCA(INDArray dataset) | Create a PCA instance with calculated data: covariance, mean, eigenvectors, and eigenvalues. |
Modifier and Type | Method | Description |
---|---|---|
INDArray | convertBackToFeatures(INDArray data) | Takes data that has been transformed to the principal components about the mean and transforms it back into the original feature set. |
INDArray | convertToComponents(INDArray data) | Takes a set of data with one record per row, with the same number of features as the constructing data, and returns the data in the coordinates of the basis set about the mean. |
static INDArray[] | covarianceMatrix(INDArray in) | Returns the covariance matrix of a data set of many records, each with N features. |
double | estimateVariance(INDArray data, int ndims) | Estimates the variance of a single record with a reduced number of dimensions. |
INDArray | generateGaussianSamples(long count) | Generates count random samples with the same mean, variance, and eigenvectors/eigenvalues as the data set used to initialize the PCA object, each with the same number of features N. |
INDArray | getCovarianceMatrix() | |
INDArray | getEigenvalues() | |
INDArray | getEigenvectors() | |
INDArray | getMean() | |
static INDArray | pca_factor(INDArray A, double variance, boolean normalize) | Calculates the PCA vectors of a matrix for a given variance. |
static INDArray | pca_factor(INDArray A, int nDims, boolean normalize) | Calculates the PCA factors of a matrix for a given number of reduced features. Returns the factors used to scale observations; the result is a factor matrix that reduces (normalized) feature sets. |
static INDArray | pca(INDArray A, double variance, boolean normalize) | Calculates the PCA-reduced value of a matrix for a given variance. |
static INDArray | pca(INDArray A, int nDims, boolean normalize) | Calculates the PCA vectors of a matrix for a given number of reduced features and returns the reduced feature set: the projection of A onto the nDims principal components. To use the PCA, take A as the original feature set and project it onto a reduced set of features. |
static INDArray | pca2(INDArray in, double variance) | Performs a dimensionality reduction, including the principal components that cover a fraction of the total variance of the system. |
static INDArray[] | principalComponents(INDArray cov) | Calculates the principal component vectors and their eigenvalues (lambda) for the covariance matrix. |
INDArray | reducedBasis(double variance) | Returns a reduced basis set that covers a certain fraction of the variance of the data. |
public PCA(INDArray dataset)
dataset
- The set of data (records) of features. Each row is a data record and each column is a feature; every data record has the same number of features.

public INDArray reducedBasis(double variance)
variance
- The desired fraction of the variance to cover (0 to 1). The fraction actually covered by the returned basis will always be greater than this value.

public INDArray convertToComponents(INDArray data)
data
- Data with the same features as the data used to construct the PCA object

public INDArray convertBackToFeatures(INDArray data)
data
- Data with the same features as the data used to construct the PCA object, but expressed as the components

public double estimateVariance(INDArray data, int ndims)
data
- A single record with the same N features as the constructing data set
ndims
- The number of dimensions to include in the calculation

public INDArray generateGaussianSamples(long count)
count
- The number of samples to generate

public static INDArray pca(INDArray A, int nDims, boolean normalize)
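// factor is the matrix returned by pca_factor(A, nDims, normalize)
INDArray Areduced = A.mmul(factor);
// Map the reduced data back to the original feature space
INDArray Aoriginal = Areduced.mmul(factor.transpose());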
A
- the array of features; rows are results, columns are features. A will be changed.
nDims
- the number of components on which to project the features
normalize
- whether to normalize (adjust each feature to have zero mean)

public static INDArray pca_factor(INDArray A, int nDims, boolean normalize)
A
- the array of features; rows are results, columns are features. A will be changed.
nDims
- the number of components on which to project the features
normalize
- whether to normalize (adjust each feature to have zero mean)
See also: pca(INDArray, int, boolean)
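The projection and reconstruction steps described for pca and pca_factor can be illustrated without the library. The following is a minimal sketch in plain Java, not the ND4J implementation: the class name `ProjectionSketch`, the helper methods, the sample data, and the one-column factor matrix are all hypothetical. It assumes zero-mean data and a factor matrix with orthonormal columns.

```java
import java.util.Arrays;

final class ProjectionSketch {
    // Plain matrix product: (n x k) times (k x m) gives (n x m).
    static double[][] mmul(double[][] a, double[][] b) {
        int n = a.length, k = b.length, m = b[0].length;
        double[][] out = new double[n][m];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < m; j++)
                for (int p = 0; p < k; p++)
                    out[i][j] += a[i][p] * b[p][j];
        return out;
    }

    // Swaps rows and columns.
    static double[][] transpose(double[][] a) {
        double[][] t = new double[a[0].length][a.length];
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < a[0].length; j++)
                t[j][i] = a[i][j];
        return t;
    }

    public static void main(String[] args) {
        // Hypothetical data: 3 records, 2 features, already zero-mean.
        double[][] a = {{1, 1}, {-2, -2}, {1, 1}};
        // Hypothetical factor (2 features -> 1 component): unit vector along (1, 1).
        double s = Math.sqrt(0.5);
        double[][] factor = {{s}, {s}};

        double[][] reduced = mmul(a, factor);                    // 3 x 1
        double[][] restored = mmul(reduced, transpose(factor));  // 3 x 2

        System.out.println(Arrays.deepToString(reduced));
        System.out.println(Arrays.deepToString(restored));
    }
}
```

In this sample every record lies on the line spanned by the single component, so the reconstruction matches the input up to rounding. With data that has variance along the dropped directions, the reconstruction would only approximate the original.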
public static INDArray pca(INDArray A, double variance, boolean normalize)
A
- the array of features; rows are results, columns are features. A will be changed.
variance
- the amount of variance to preserve, as a fraction between 0 and 1
normalize
- whether to normalize (set features to have zero mean)
See also: pca(INDArray, int, boolean)
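The normalize flag is described as adjusting each feature to have zero mean. A minimal sketch of that adjustment in plain Java follows; it is not the ND4J code, and the class `CenteringSketch`, the method `centerColumns`, and the sample values are hypothetical. The in-place update is chosen to echo the note that A will be changed, though whether the library modifies A in exactly this way is an assumption.

```java
import java.util.Arrays;

final class CenteringSketch {
    // Subtracts each column's mean in place and returns the column means.
    static double[] centerColumns(double[][] a) {
        int rows = a.length, cols = a[0].length;
        double[] mean = new double[cols];
        for (double[] row : a)
            for (int j = 0; j < cols; j++)
                mean[j] += row[j];
        for (int j = 0; j < cols; j++)
            mean[j] /= rows;
        for (double[] row : a)
            for (int j = 0; j < cols; j++)
                row[j] -= mean[j];
        return mean;
    }

    public static void main(String[] args) {
        // Hypothetical data: 3 records (rows), 2 features (columns).
        double[][] a = {{1, 10}, {3, 20}, {5, 30}};
        double[] mean = centerColumns(a);
        System.out.println(Arrays.toString(mean));
        System.out.println(Arrays.deepToString(a));
    }
}
```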
public static INDArray pca_factor(INDArray A, double variance, boolean normalize)
A
- the array of features; rows are results, columns are features. A will be changed.
variance
- the amount of variance to preserve, as a fraction between 0 and 1
normalize
- whether to normalize (set features to have zero mean)
See also: pca(INDArray, double, boolean)
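The variance-based overloads keep enough components to preserve a requested fraction of the variance. One common way to pick that number from the eigenvalues is sketched below in plain Java. This is an illustration under assumptions, not the library's code: the class `VarianceCutoffSketch` and the method `componentsFor` are hypothetical, the eigenvalues are assumed to be sorted in descending order, and the `>=` comparison is a choice made here that may differ from what ND4J does.

```java
final class VarianceCutoffSketch {
    // Smallest k such that the k largest eigenvalues sum to at least
    // `fraction` of the total. Assumes eigenvalues are sorted descending.
    static int componentsFor(double[] eigenvaluesDescending, double fraction) {
        double total = 0;
        for (double v : eigenvaluesDescending)
            total += v;
        double running = 0;
        for (int k = 0; k < eigenvaluesDescending.length; k++) {
            running += eigenvaluesDescending[k];
            if (running / total >= fraction)
                return k + 1;
        }
        return eigenvaluesDescending.length;
    }

    public static void main(String[] args) {
        // Hypothetical eigenvalues: the components explain 60%, 30% and 10%.
        double[] eig = {6, 3, 1};
        System.out.println(componentsFor(eig, 0.5));
        System.out.println(componentsFor(eig, 0.8));
        System.out.println(componentsFor(eig, 0.95));
    }
}
```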
public static INDArray pca2(INDArray in, double variance)
in
- A matrix with one data point per row, where the columns are the N features
variance
- The desired fraction of the total variance required

public static INDArray[] covarianceMatrix(INDArray in)
in
- A matrix of vectors of fixed length N (N features) on each row

public static INDArray[] principalComponents(INDArray cov)
cov
- The covariance matrix (calculated with the covarianceMatrix(in) method)

public INDArray getCovarianceMatrix()
public INDArray getMean()
public INDArray getEigenvectors()
public INDArray getEigenvalues()
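To make the mean and covariance values exposed by these getters concrete, here is a minimal sketch in plain Java of how both can be computed from a data set with one record per row. It is not the ND4J implementation: the class `CovarianceSketch` and its methods are hypothetical, and dividing by n - 1 (the sample covariance) is an assumption, since this page does not state which normalization the library uses.

```java
import java.util.Arrays;

final class CovarianceSketch {
    // Mean of each column (feature) over all rows (records).
    static double[] columnMeans(double[][] data) {
        int n = data.length, d = data[0].length;
        double[] mean = new double[d];
        for (double[] row : data)
            for (int j = 0; j < d; j++)
                mean[j] += row[j];
        for (int j = 0; j < d; j++)
            mean[j] /= n;
        return mean;
    }

    // Sample covariance matrix (d x d); divides by n - 1 (assumption).
    static double[][] covariance(double[][] data) {
        int n = data.length, d = data[0].length;
        double[] mean = columnMeans(data);
        double[][] cov = new double[d][d];
        for (double[] row : data)
            for (int i = 0; i < d; i++)
                for (int j = 0; j < d; j++)
                    cov[i][j] += (row[i] - mean[i]) * (row[j] - mean[j]) / (n - 1);
        return cov;
    }

    public static void main(String[] args) {
        // Hypothetical data: 3 records, 2 features.
        double[][] data = {{1, 2}, {3, 6}, {5, 10}};
        System.out.println(Arrays.toString(columnMeans(data)));
        System.out.println(Arrays.deepToString(covariance(data)));
    }
}
```

The resulting matrix is symmetric, with the per-feature variances on the diagonal, which is the form of input that principalComponents(INDArray cov) is documented to accept.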
Copyright © 2020. All rights reserved.