public abstract class LogisticRegression extends java.lang.Object implements SoftClassifier<double[]>, OnlineClassifier<double[]>
Goodness-of-fit tests such as the likelihood ratio test are available as indicators of model appropriateness, as is the Wald statistic to test the significance of individual independent variables.
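The two statistics mentioned above can be sketched in a few lines. This is a minimal illustration and not part of the Smile API: the class name `FitStatisticsSketch`, its method names, and the numeric inputs are all hypothetical. The likelihood ratio statistic compares the log-likelihoods of a reduced and a full model, and the Wald z statistic divides a coefficient estimate by its standard error.

```java
/** Illustrative sketch only; names and numbers are hypothetical, not Smile API. */
final class FitStatisticsSketch {
    /** Likelihood ratio statistic: 2 * (logL_full - logL_reduced). */
    static double likelihoodRatio(double logLikReduced, double logLikFull) {
        return 2.0 * (logLikFull - logLikReduced);
    }

    /** Wald z statistic for a single coefficient: estimate / standard error. */
    static double waldZ(double estimate, double standardError) {
        return estimate / standardError;
    }

    public static void main(String[] args) {
        // Hypothetical log-likelihoods of an intercept-only model and a full model.
        System.out.println(likelihoodRatio(-120.0, -110.0));
        // Hypothetical coefficient estimate and standard error.
        System.out.println(waldZ(1.2, 0.4));
    }
}
```

Under the null hypothesis, the likelihood ratio statistic is asymptotically chi-squared distributed, with degrees of freedom equal to the difference in the number of parameters between the two models.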
Logistic regression is analogous to ordinary least squares (OLS) regression in many respects. Unlike OLS regression, however, logistic regression does not assume a linear relationship between the raw values of the independent variables and the dependent variable, does not require normally distributed variables, does not assume homoscedasticity, and in general has less stringent requirements.
Compared with linear discriminant analysis, logistic regression has several advantages:
Logistic regression also has strong connections with neural networks and maximum entropy modeling. For example, binary logistic regression is equivalent to a one-layer, single-output neural network with a logistic activation function trained under log loss. Similarly, multinomial logistic regression is equivalent to a one-layer, softmax-output neural network.
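The neural network equivalence can be made concrete with a small sketch. The class `LogisticUnitSketch` below is hypothetical and independent of Smile. It shows the forward pass of a single logistic unit, the per-sample log loss, and the softmax used in the multinomial case; the weights and inputs are made-up values.

```java
import java.util.Arrays;

/** Illustrative sketch only; not part of the Smile API. */
final class LogisticUnitSketch {
    /** Logistic (sigmoid) activation. */
    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    /** Output of a one-layer, single-output network: sigmoid(w . x + b). */
    static double forward(double[] w, double b, double[] x) {
        double z = b;
        for (int i = 0; i < x.length; i++) {
            z += w[i] * x[i];
        }
        return sigmoid(z);
    }

    /** Log loss of one sample with label y in {0, 1} and predicted probability p. */
    static double logLoss(double p, int y) {
        return y == 1 ? -Math.log(p) : -Math.log(1.0 - p);
    }

    /** Softmax over k linear scores, the multinomial counterpart of the sigmoid. */
    static double[] softmax(double[] z) {
        // Subtract the maximum score for numerical stability.
        double max = Arrays.stream(z).max().orElse(0.0);
        double[] p = new double[z.length];
        double sum = 0.0;
        for (int i = 0; i < z.length; i++) {
            p[i] = Math.exp(z[i] - max);
            sum += p[i];
        }
        for (int i = 0; i < z.length; i++) {
            p[i] /= sum;
        }
        return p;
    }

    public static void main(String[] args) {
        double[] w = {0.5, -0.25};            // hypothetical weights
        double p = forward(w, 0.0, new double[] {2.0, 4.0});
        System.out.println(p);                // probability of class 1
        System.out.println(logLoss(p, 1));    // loss if the true label is 1
        System.out.println(Arrays.toString(softmax(new double[] {1.0, 1.0, 1.0})));
    }
}
```

Minimizing the summed log loss over the training set is the same as maximizing the log-likelihood of the logistic model, which is why the two formulations coincide.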
Logistic regression estimation also obeys the maximum entropy principle, and thus logistic regression is sometimes called "maximum entropy modeling", and the resulting classifier the "maximum entropy classifier".
See Also: GLM, MLP, Maxent, LDA, Serialized Form

Modifier and Type | Class and Description |
---|---|
static class | LogisticRegression.Binomial: Binomial logistic regression. |
static class | LogisticRegression.Multinomial: Multinomial logistic regression. |
Constructor and Description |
---|
LogisticRegression(int p, double L, double lambda, smile.util.IntSet labels): Constructor. |
Modifier and Type | Method and Description |
---|---|
double | AIC(): Returns the AIC score. |
static LogisticRegression.Binomial | binomial(double[][] x, int[] y): Fits binomial logistic regression. |
static LogisticRegression.Binomial | binomial(double[][] x, int[] y, double lambda, double tol, int maxIter): Fits binomial logistic regression. |
static LogisticRegression.Binomial | binomial(double[][] x, int[] y, java.util.Properties prop): Fits binomial logistic regression. |
static LogisticRegression.Binomial | binomial(smile.data.formula.Formula formula, smile.data.DataFrame data): Fits binomial logistic regression. |
static LogisticRegression.Binomial | binomial(smile.data.formula.Formula formula, smile.data.DataFrame data, java.util.Properties prop): Fits binomial logistic regression. |
static LogisticRegression | fit(double[][] x, int[] y): Fits logistic regression. |
static LogisticRegression | fit(double[][] x, int[] y, double lambda, double tol, int maxIter): Fits logistic regression. |
static LogisticRegression | fit(double[][] x, int[] y, java.util.Properties prop): Fits logistic regression. |
static LogisticRegression | fit(smile.data.formula.Formula formula, smile.data.DataFrame data): Fits logistic regression. |
static LogisticRegression | fit(smile.data.formula.Formula formula, smile.data.DataFrame data, java.util.Properties prop): Fits logistic regression. |
double | getLearningRate(): Returns the learning rate of stochastic gradient descent. |
double | loglikelihood(): Returns the log-likelihood of model. |
static LogisticRegression.Multinomial | multinomial(double[][] x, int[] y): Fits multinomial logistic regression. |
static LogisticRegression.Multinomial | multinomial(double[][] x, int[] y, double lambda, double tol, int maxIter): Fits multinomial logistic regression. |
static LogisticRegression.Multinomial | multinomial(double[][] x, int[] y, java.util.Properties prop): Fits multinomial logistic regression. |
static LogisticRegression.Multinomial | multinomial(smile.data.formula.Formula formula, smile.data.DataFrame data): Fits multinomial logistic regression. |
static LogisticRegression.Multinomial | multinomial(smile.data.formula.Formula formula, smile.data.DataFrame data, java.util.Properties prop): Fits multinomial logistic regression. |
void | setLearningRate(double rate): Sets the learning rate of stochastic gradient descent. |
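
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface SoftClassifier: predict, predict
Methods inherited from interface OnlineClassifier: update, update
Methods inherited from interface Classifier: applyAsDouble, applyAsInt, predict, predict, score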
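
The `lambda` argument accepted by the constructor and the fitting methods controls regularization of the linear weights. The exact penalty is not spelled out on this page, so the sketch below assumes a ridge (L2) penalty of the form (lambda / 2) * ||w||^2 that leaves the bias unpenalized. The class `PenalizedLikelihoodSketch` and the toy data are hypothetical and do not call Smile.

```java
/** Illustrative sketch only; assumes an L2 penalty, which this page does not confirm. */
final class PenalizedLikelihoodSketch {
    /** Numerically stable log(1 + exp(z)). */
    static double log1pExp(double z) {
        return z > 0 ? z + Math.log1p(Math.exp(-z)) : Math.log1p(Math.exp(z));
    }

    /** Binomial log-likelihood: sum over samples of y*z - log(1 + exp(z)), z = w . x + b. */
    static double logLikelihood(double[][] x, int[] y, double[] w, double b) {
        double ll = 0.0;
        for (int i = 0; i < x.length; i++) {
            double z = b;
            for (int j = 0; j < w.length; j++) {
                z += w[j] * x[i][j];
            }
            ll += y[i] * z - log1pExp(z);
        }
        return ll;
    }

    /** Objective to minimize: -logL + (lambda / 2) * ||w||^2 (bias excluded by assumption). */
    static double objective(double[][] x, int[] y, double[] w, double b, double lambda) {
        double penalty = 0.0;
        for (double wj : w) {
            penalty += wj * wj;
        }
        return -logLikelihood(x, y, w, b) + 0.5 * lambda * penalty;
    }

    public static void main(String[] args) {
        double[][] x = {{1.0}, {-1.0}};   // toy data, one feature
        int[] y = {1, 0};
        double[] w = {2.0};
        System.out.println(objective(x, y, w, 0.0, 0.0));  // unregularized
        System.out.println(objective(x, y, w, 0.0, 0.5));  // with penalty
    }
}
```

A larger `lambda` makes large weights more costly, which shrinks the estimate toward zero. That is the mechanism behind the improved generalization in high dimensions described in the parameter documentation.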

public LogisticRegression(int p, double L, double lambda, smile.util.IntSet labels)
p - the dimension of input data.
L - the log-likelihood of learned model.
lambda - λ > 0 gives a "regularized" estimate of linear weights which often has superior generalization performance, especially when the dimensionality is high.
labels - class labels

public static LogisticRegression.Binomial binomial(smile.data.formula.Formula formula, smile.data.DataFrame data)
formula - a symbolic description of the model to be fitted.
data - the data frame of the explanatory and response variables.

public static LogisticRegression.Binomial binomial(smile.data.formula.Formula formula, smile.data.DataFrame data, java.util.Properties prop)
formula - a symbolic description of the model to be fitted.
data - the data frame of the explanatory and response variables.

public static LogisticRegression.Binomial binomial(double[][] x, int[] y)
x - training samples.
y - training labels.

public static LogisticRegression.Binomial binomial(double[][] x, int[] y, java.util.Properties prop)
x - training samples.
y - training labels.

public static LogisticRegression.Binomial binomial(double[][] x, int[] y, double lambda, double tol, int maxIter)
x - training samples.
y - training labels.
lambda - λ > 0 gives a "regularized" estimate of linear weights which often has superior generalization performance, especially when the dimensionality is high.
tol - the tolerance for stopping iterations.
maxIter - the maximum number of iterations.

public static LogisticRegression.Multinomial multinomial(smile.data.formula.Formula formula, smile.data.DataFrame data)
formula - a symbolic description of the model to be fitted.
data - the data frame of the explanatory and response variables.

public static LogisticRegression.Multinomial multinomial(smile.data.formula.Formula formula, smile.data.DataFrame data, java.util.Properties prop)
formula - a symbolic description of the model to be fitted.
data - the data frame of the explanatory and response variables.

public static LogisticRegression.Multinomial multinomial(double[][] x, int[] y)
x - training samples.
y - training labels.

public static LogisticRegression.Multinomial multinomial(double[][] x, int[] y, java.util.Properties prop)
x - training samples.
y - training labels.

public static LogisticRegression.Multinomial multinomial(double[][] x, int[] y, double lambda, double tol, int maxIter)
x - training samples.
y - training labels.
lambda - λ > 0 gives a "regularized" estimate of linear weights which often has superior generalization performance, especially when the dimensionality is high.
tol - the tolerance for stopping iterations.
maxIter - the maximum number of iterations.

public static LogisticRegression fit(smile.data.formula.Formula formula, smile.data.DataFrame data)
formula - a symbolic description of the model to be fitted.
data - the data frame of the explanatory and response variables.

public static LogisticRegression fit(smile.data.formula.Formula formula, smile.data.DataFrame data, java.util.Properties prop)
formula - a symbolic description of the model to be fitted.
data - the data frame of the explanatory and response variables.

public static LogisticRegression fit(double[][] x, int[] y)
x - training samples.
y - training labels.

public static LogisticRegression fit(double[][] x, int[] y, java.util.Properties prop)
x - training samples.
y - training labels.

public static LogisticRegression fit(double[][] x, int[] y, double lambda, double tol, int maxIter)
x - training samples.
y - training labels.
lambda - λ > 0 gives a "regularized" estimate of linear weights which often has superior generalization performance, especially when the dimensionality is high.
tol - the tolerance for stopping iterations.
maxIter - the maximum number of iterations.

public void setLearningRate(double rate)
rate - the learning rate.

public double getLearningRate()
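
The learning rate matters for online updates, where the model is adjusted one sample at a time by stochastic gradient descent. The sketch below shows one plausible form of such a step for the binomial case. It is an assumption-laden illustration, not Smile's actual `update` implementation: the class `SgdUpdateSketch`, the weight layout (bias stored last), and the absence of a regularization term are all choices made here for brevity.

```java
/** Illustrative sketch only; not Smile's update implementation. */
final class SgdUpdateSketch {
    /**
     * One stochastic gradient ascent step on the log-likelihood of a single sample.
     * w has length x.length + 1, with the bias stored in the last slot (an assumption).
     */
    static void update(double[] w, double[] x, int y, double eta) {
        int p = x.length;
        double z = w[p];
        for (int j = 0; j < p; j++) {
            z += w[j] * x[j];
        }
        // Gradient of the per-sample log-likelihood with respect to z: y - sigmoid(z).
        double err = y - 1.0 / (1.0 + Math.exp(-z));
        for (int j = 0; j < p; j++) {
            w[j] += eta * err * x[j];
        }
        w[p] += eta * err;
    }

    public static void main(String[] args) {
        double[] w = new double[3];                       // two features plus bias
        update(w, new double[] {1.0, 2.0}, 1, 0.1);       // hypothetical sample, rate 0.1
        System.out.println(java.util.Arrays.toString(w));
    }
}
```

A larger rate moves the weights further on each sample, which speeds up adaptation but can make the estimate oscillate. A smaller rate gives steadier but slower updates.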
public double loglikelihood()
public double AIC()
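
The relationship between `loglikelihood()` and `AIC()` follows the standard definition AIC = 2k - 2 log L, where k is the number of free parameters. The sketch below applies that formula directly. The class `AicSketch` is hypothetical, and the parameter count used in the example (p weights plus one intercept for a binomial model) is an assumption rather than something this page states.

```java
/** Illustrative sketch only; the parameter count convention is an assumption. */
final class AicSketch {
    /** Akaike information criterion: 2k - 2 * logLikelihood. */
    static double aic(double logLikelihood, int numParameters) {
        return 2.0 * numParameters - 2.0 * logLikelihood;
    }

    public static void main(String[] args) {
        int p = 2;                       // hypothetical number of features
        double logL = -10.0;             // hypothetical fitted log-likelihood
        System.out.println(aic(logL, p + 1));
    }
}
```

Lower AIC values indicate a better trade-off between fit and model size when comparing models fitted to the same data.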