public class LogisticRegression extends java.lang.Object implements SoftClassifier<double[]>, OnlineClassifier<double[]>
Goodness-of-fit tests such as the likelihood ratio test are available as indicators of model appropriateness, as is the Wald statistic to test the significance of individual independent variables.
Logistic regression has many analogies to ordinary least squares (OLS) regression. Unlike OLS regression, however, logistic regression does not assume a linear relationship between the raw values of the independent variables and the dependent variable, does not require normally distributed variables, does not assume homoscedasticity, and in general has less stringent requirements.

Compared with linear discriminant analysis, logistic regression has several advantages: it is more robust, since the independent variables need not be normally distributed or have equal variance in each group, and it does not assume a linear relationship between the independent variables and the response.

Logistic regression also has strong connections with neural networks and maximum entropy modeling. For example, binary logistic regression is equivalent to a one-layer, single-output neural network with a logistic activation function trained under log loss. Similarly, multinomial logistic regression is equivalent to a one-layer, softmax-output neural network.
Logistic regression estimation also obeys the maximum entropy principle, and thus logistic regression is sometimes called "maximum entropy modeling", and the resulting classifier the "maximum entropy classifier".
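As context for the lambda parameter that appears throughout the constructors and fit methods below: it is the weight of an L2 penalty added to the log-likelihood being maximized. A standard formulation of the regularized binary objective (the library's exact scaling of the penalty may differ) is

$$\max_{w}\ \sum_{i=1}^{n}\Big[y_i \log p(x_i) + (1 - y_i)\log\big(1 - p(x_i)\big)\Big] - \frac{\lambda}{2}\,\lVert w\rVert^2, \qquad p(x) = \frac{1}{1 + e^{-w^{\top} x}},\ \ y_i \in \{0,1\}.$$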
See Also: MLP, Maxent, LDA, Serialized Form

Constructor | Description
---|---
LogisticRegression(double[] w, double L, double lambda) | Constructor of binary logistic regression.
LogisticRegression(double L, double[][] W, double lambda) | Constructor of multi-class logistic regression.
LogisticRegression(double L, double[][] W, double lambda, smile.util.IntSet labels) | Constructor of multi-class logistic regression.
LogisticRegression(double L, double[] w, double lambda, smile.util.IntSet labels) | Constructor of binary logistic regression.
Modifier and Type | Method and Description
---|---
static LogisticRegression | fit(double[][] x, int[] y). Fits logistic regression.
static LogisticRegression | fit(double[][] x, int[] y, double lambda, double tol, int maxIter). Fits logistic regression.
static LogisticRegression | fit(double[][] x, int[] y, java.util.Properties prop). Fits logistic regression.
static LogisticRegression | fit(smile.data.formula.Formula formula, smile.data.DataFrame data). Fits logistic regression.
static LogisticRegression | fit(smile.data.formula.Formula formula, smile.data.DataFrame data, java.util.Properties prop). Fits logistic regression.
double | getLearningRate(). Returns the learning rate of stochastic gradient descent.
double | loglikelihood(). Returns the log-likelihood of the model.
int | predict(double[] x). Predicts the class label of an instance.
int | predict(double[] x, double[] posteriori). Predicts the class label of an instance and also calculates the a posteriori probabilities.
void | setLearningRate(double rate). Sets the learning rate of stochastic gradient descent.
void | update(double[] x, int y). Updates the classifier online with a new training instance.
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface OnlineClassifier: update

Methods inherited from interface Classifier: applyAsDouble, applyAsInt, f, predict
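The method summary above covers the typical workflow: fit a model from labeled data, then classify new instances. A minimal, self-contained sketch with toy data and default hyperparameters:

```java
import smile.classification.LogisticRegression;

public class Example {
    public static void main(String[] args) {
        // Toy data: two features per sample, binary class labels.
        double[][] x = {{0.2, 1.3}, {1.5, 0.4}, {0.1, 1.1}, {1.7, 0.2}};
        int[] y = {0, 1, 0, 1};

        // Fit with default hyperparameters, then classify a new instance.
        LogisticRegression model = LogisticRegression.fit(x, y);
        int label = model.predict(new double[] {0.3, 1.0});

        System.out.println("predicted class = " + label);
        System.out.println("log-likelihood  = " + model.loglikelihood());
    }
}
```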
public LogisticRegression(double[] w, double L, double lambda)

Constructor of binary logistic regression.

Parameters:
- L - the log-likelihood of the learned model.
- w - the weights.
- lambda - λ > 0 gives a "regularized" estimate of linear weights which often has superior generalization performance, especially when the dimensionality is high.

public LogisticRegression(double L, double[] w, double lambda, smile.util.IntSet labels)

Constructor of binary logistic regression.

Parameters:
- L - the log-likelihood of the learned model.
- w - the weights.
- lambda - λ > 0 gives a "regularized" estimate of linear weights which often has superior generalization performance, especially when the dimensionality is high.
- labels - the class labels.

public LogisticRegression(double L, double[][] W, double lambda)

Constructor of multi-class logistic regression.

Parameters:
- L - the log-likelihood of the learned model.
- W - the weights of the first k - 1 classes.
- lambda - λ > 0 gives a "regularized" estimate of linear weights which often has superior generalization performance, especially when the dimensionality is high.

public LogisticRegression(double L, double[][] W, double lambda, smile.util.IntSet labels)

Constructor of multi-class logistic regression.

Parameters:
- L - the log-likelihood of the learned model.
- W - the weights.
- lambda - λ > 0 gives a "regularized" estimate of linear weights which often has superior generalization performance, especially when the dimensionality is high.
- labels - the class labels.
public static LogisticRegression fit(smile.data.formula.Formula formula, smile.data.DataFrame data)

Fits logistic regression.

Parameters:
- formula - a symbolic description of the model to be fitted.
- data - the data frame of the explanatory and response variables.

public static LogisticRegression fit(smile.data.formula.Formula formula, smile.data.DataFrame data, java.util.Properties prop)

Fits logistic regression.

Parameters:
- formula - a symbolic description of the model to be fitted.
- data - the data frame of the explanatory and response variables.
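A sketch of the formula-based overload, assuming a data frame assembled in code; the column names ("x1", "x2", "y") and the data are purely illustrative, and the left-hand side of the formula names the response column:

```java
import smile.classification.LogisticRegression;
import smile.data.DataFrame;
import smile.data.formula.Formula;
import smile.data.vector.IntVector;

public class FormulaExample {
    public static void main(String[] args) {
        double[][] x = {{5.1, 3.5}, {6.2, 2.9}, {4.9, 3.0}, {6.7, 3.1}};
        int[] y = {0, 1, 0, 1};

        // Assemble a data frame whose columns are the predictors plus the
        // integer response; the column names here are illustrative only.
        DataFrame df = DataFrame.of(x, "x1", "x2").merge(IntVector.of("y", y));

        // "y" on the left-hand side of the formula marks it as the response.
        LogisticRegression model = LogisticRegression.fit(Formula.lhs("y"), df);
        System.out.println(model.predict(new double[] {5.0, 3.4}));
    }
}
```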
public static LogisticRegression fit(double[][] x, int[] y)

Fits logistic regression.

Parameters:
- x - training samples.
- y - training labels in [0, k), where k is the number of classes.

public static LogisticRegression fit(double[][] x, int[] y, java.util.Properties prop)

Fits logistic regression.

Parameters:
- x - training samples.
- y - training labels in [0, k), where k is the number of classes.

public static LogisticRegression fit(double[][] x, int[] y, double lambda, double tol, int maxIter)

Fits logistic regression.

Parameters:
- x - training samples.
- y - training labels in [0, k), where k is the number of classes.
- lambda - λ > 0 gives a "regularized" estimate of linear weights which often has superior generalization performance, especially when the dimensionality is high.
- tol - the tolerance for stopping iterations.
- maxIter - the maximum number of iterations.
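A sketch of the fully parameterized overload; the lambda, tol, and maxIter values are illustrative starting points rather than recommendations:

```java
import smile.classification.LogisticRegression;

public class RegularizedFit {
    public static void main(String[] args) {
        double[][] x = {{0.2, 1.3}, {1.5, 0.4}, {0.1, 1.1}, {1.7, 0.2}};
        int[] y = {0, 1, 0, 1};

        double lambda = 0.1;   // strength of the L2 ("regularization") penalty
        double tol = 1E-5;     // stop once improvement falls below this tolerance
        int maxIter = 500;     // hard cap on optimizer iterations

        LogisticRegression model = LogisticRegression.fit(x, y, lambda, tol, maxIter);
        System.out.println("log-likelihood = " + model.loglikelihood());
    }
}
```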
public void update(double[] x, int y)

Updates the classifier online with a new training instance.

Specified by: update in interface OnlineClassifier<double[]>

Parameters:
- x - training instance.
- y - training label.
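A sketch of online learning with update, combined with the learning-rate accessors documented below; the step size and the extra labeled instance are illustrative:

```java
import smile.classification.LogisticRegression;

public class OnlineUpdate {
    public static void main(String[] args) {
        double[][] x = {{0.2, 1.3}, {1.5, 0.4}, {0.1, 1.1}, {1.7, 0.2}};
        int[] y = {0, 1, 0, 1};
        LogisticRegression model = LogisticRegression.fit(x, y);

        // Choose an SGD step size, then fold in one newly labeled instance.
        model.setLearningRate(0.01);
        model.update(new double[] {0.4, 0.9}, 0);

        System.out.println("learning rate = " + model.getLearningRate());
    }
}
```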
public void setLearningRate(double rate)

Sets the learning rate of stochastic gradient descent.

Parameters:
- rate - the learning rate.

public double getLearningRate()

Returns the learning rate of stochastic gradient descent.

public double loglikelihood()

Returns the log-likelihood of the model.

public int predict(double[] x)

Predicts the class label of an instance.

Specified by: predict in interface Classifier<double[]>

Parameters:
- x - the instance to be classified.

public int predict(double[] x, double[] posteriori)

Predicts the class label of an instance and also calculates the a posteriori probabilities.

Specified by: predict in interface SoftClassifier<double[]>

Parameters:
- x - an instance to be classified.
- posteriori - the array to store the a posteriori probabilities on output.
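A sketch of soft classification with the two-argument predict; the posteriori array needs one entry per class and is filled on output:

```java
import smile.classification.LogisticRegression;

public class SoftPredict {
    public static void main(String[] args) {
        double[][] x = {{0.2, 1.3}, {1.5, 0.4}, {0.1, 1.1}, {1.7, 0.2}};
        int[] y = {0, 1, 0, 1};
        LogisticRegression model = LogisticRegression.fit(x, y);

        // One slot per class; predict fills it with the class probabilities.
        double[] posteriori = new double[2];
        int label = model.predict(new double[] {0.3, 1.0}, posteriori);

        System.out.printf("class %d with probability %.3f%n", label, posteriori[label]);
    }
}
```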