public class LogisticRegression extends java.lang.Object implements SoftClassifier<double[]>, OnlineClassifier<double[]>
Goodness-of-fit tests such as the likelihood ratio test are available as indicators of model appropriateness, as is the Wald statistic to test the significance of individual independent variables.
Logistic regression has many analogies to ordinary least squares (OLS) regression. Unlike OLS regression, however, logistic regression does not assume a linear relationship between the raw values of the independent variables and the dependent variable, does not require normally distributed variables, does not assume homoscedasticity, and in general has less stringent requirements.

Compared with linear discriminant analysis, logistic regression has several advantages: it is more robust, since the independent variables need not be normally distributed or have equal variance in each group, and it does not assume a linear relationship between the independent variables and the response.

Logistic regression also has strong connections with neural networks and maximum entropy modeling. For example, binary logistic regression is equivalent to a one-layer, single-output neural network with a logistic activation function trained under log loss. Similarly, multinomial logistic regression is equivalent to a one-layer, softmax-output neural network.
Logistic regression estimation also obeys the maximum entropy principle, and thus logistic regression is sometimes called "maximum entropy modeling", and the resulting classifier the "maximum entropy classifier".
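As context for the lambda parameter that appears throughout the constructors and fit methods below: it is the weight of an L2 penalty added to the log-likelihood being maximized. A standard formulation of the regularized binary objective (the library's exact scaling of the penalty may differ) is

$$\max_{w}\ \sum_{i=1}^{n}\Big[y_i \log p(x_i) + (1 - y_i)\log\big(1 - p(x_i)\big)\Big] - \frac{\lambda}{2}\,\lVert w\rVert^2, \qquad p(x) = \frac{1}{1 + e^{-w^{\top} x}},\ \ y_i \in \{0,1\}.$$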
See Also: MLP, Maxent, LDA, Serialized Form

Constructor | Description
---|---
LogisticRegression(double[] w, double L, double lambda) | Constructor of binary logistic regression.
LogisticRegression(double L, double[][] W, double lambda) | Constructor of multi-class logistic regression.
LogisticRegression(double L, double[][] W, double lambda, smile.util.IntSet labels) | Constructor of multi-class logistic regression.
LogisticRegression(double L, double[] w, double lambda, smile.util.IntSet labels) | Constructor of binary logistic regression.
Modifier and Type | Method and Description
---|---
static LogisticRegression | fit(double[][] x, int[] y). Fits logistic regression.
static LogisticRegression | fit(double[][] x, int[] y, double lambda, double tol, int maxIter). Fits logistic regression.
static LogisticRegression | fit(double[][] x, int[] y, java.util.Properties prop). Fits logistic regression.
static LogisticRegression | fit(smile.data.formula.Formula formula, smile.data.DataFrame data). Fits logistic regression.
static LogisticRegression | fit(smile.data.formula.Formula formula, smile.data.DataFrame data, java.util.Properties prop). Fits logistic regression.
double | getLearningRate(). Returns the learning rate of stochastic gradient descent.
double | loglikelihood(). Returns the log-likelihood of the model.
int | predict(double[] x). Predicts the class label of an instance.
int | predict(double[] x, double[] posteriori). Predicts the class label of an instance and also calculates the a posteriori probabilities.
void | setLearningRate(double rate). Sets the learning rate of stochastic gradient descent.
void | update(double[] x, int y). Updates the classifier online with a new training instance.
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface OnlineClassifier: update

Methods inherited from interface Classifier: applyAsDouble, applyAsInt, f, predict
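The method summary above covers the typical workflow: fit a model from labeled data, then classify new instances. A minimal, self-contained sketch with toy data and default hyperparameters:

```java
import smile.classification.LogisticRegression;

public class Example {
    public static void main(String[] args) {
        // Toy data: two features per sample, binary class labels.
        double[][] x = {{0.2, 1.3}, {1.5, 0.4}, {0.1, 1.1}, {1.7, 0.2}};
        int[] y = {0, 1, 0, 1};

        // Fit with default hyperparameters, then classify a new instance.
        LogisticRegression model = LogisticRegression.fit(x, y);
        int label = model.predict(new double[] {0.3, 1.0});

        System.out.println("predicted class = " + label);
        System.out.println("log-likelihood  = " + model.loglikelihood());
    }
}
```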
public LogisticRegression(double[] w, double L, double lambda)

Constructor of binary logistic regression.

Parameters:
- L - the log-likelihood of the learned model.
- w - the weights.
- lambda - λ > 0 gives a "regularized" estimate of linear weights which often has superior generalization performance, especially when the dimensionality is high.

public LogisticRegression(double L, double[] w, double lambda, smile.util.IntSet labels)

Constructor of binary logistic regression.

Parameters:
- L - the log-likelihood of the learned model.
- w - the weights.
- lambda - λ > 0 gives a "regularized" estimate of linear weights which often has superior generalization performance, especially when the dimensionality is high.
- labels - the class labels.

public LogisticRegression(double L, double[][] W, double lambda)

Constructor of multi-class logistic regression.

Parameters:
- L - the log-likelihood of the learned model.
- W - the weights of the first k - 1 classes.
- lambda - λ > 0 gives a "regularized" estimate of linear weights which often has superior generalization performance, especially when the dimensionality is high.

public LogisticRegression(double L, double[][] W, double lambda, smile.util.IntSet labels)

Constructor of multi-class logistic regression.

Parameters:
- L - the log-likelihood of the learned model.
- W - the weights.
- lambda - λ > 0 gives a "regularized" estimate of linear weights which often has superior generalization performance, especially when the dimensionality is high.
- labels - the class labels.
public static LogisticRegression fit(smile.data.formula.Formula formula, smile.data.DataFrame data)

Fits logistic regression.

Parameters:
- formula - a symbolic description of the model to be fitted.
- data - the data frame of the explanatory and response variables.

public static LogisticRegression fit(smile.data.formula.Formula formula, smile.data.DataFrame data, java.util.Properties prop)

Fits logistic regression.

Parameters:
- formula - a symbolic description of the model to be fitted.
- data - the data frame of the explanatory and response variables.
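A sketch of the formula-based overload, assuming a data frame assembled in code; the column names ("x1", "x2", "y") and the data are purely illustrative, and the left-hand side of the formula names the response column:

```java
import smile.classification.LogisticRegression;
import smile.data.DataFrame;
import smile.data.formula.Formula;
import smile.data.vector.IntVector;

public class FormulaExample {
    public static void main(String[] args) {
        double[][] x = {{5.1, 3.5}, {6.2, 2.9}, {4.9, 3.0}, {6.7, 3.1}};
        int[] y = {0, 1, 0, 1};

        // Assemble a data frame whose columns are the predictors plus the
        // integer response; the column names here are illustrative only.
        DataFrame df = DataFrame.of(x, "x1", "x2").merge(IntVector.of("y", y));

        // "y" on the left-hand side of the formula marks it as the response.
        LogisticRegression model = LogisticRegression.fit(Formula.lhs("y"), df);
        System.out.println(model.predict(new double[] {5.0, 3.4}));
    }
}
```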
public static LogisticRegression fit(double[][] x, int[] y)

Fits logistic regression.

Parameters:
- x - training samples.
- y - training labels in [0, k), where k is the number of classes.

public static LogisticRegression fit(double[][] x, int[] y, java.util.Properties prop)

Fits logistic regression.

Parameters:
- x - training samples.
- y - training labels in [0, k), where k is the number of classes.

public static LogisticRegression fit(double[][] x, int[] y, double lambda, double tol, int maxIter)

Fits logistic regression.

Parameters:
- x - training samples.
- y - training labels in [0, k), where k is the number of classes.
- lambda - λ > 0 gives a "regularized" estimate of linear weights which often has superior generalization performance, especially when the dimensionality is high.
- tol - the tolerance for stopping iterations.
- maxIter - the maximum number of iterations.
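A sketch of the fully parameterized overload; the lambda, tol, and maxIter values are illustrative starting points rather than recommendations:

```java
import smile.classification.LogisticRegression;

public class RegularizedFit {
    public static void main(String[] args) {
        double[][] x = {{0.2, 1.3}, {1.5, 0.4}, {0.1, 1.1}, {1.7, 0.2}};
        int[] y = {0, 1, 0, 1};

        double lambda = 0.1;   // strength of the L2 ("regularization") penalty
        double tol = 1E-5;     // stop once improvement falls below this tolerance
        int maxIter = 500;     // hard cap on optimizer iterations

        LogisticRegression model = LogisticRegression.fit(x, y, lambda, tol, maxIter);
        System.out.println("log-likelihood = " + model.loglikelihood());
    }
}
```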
public void update(double[] x, int y)

Updates the classifier online with a new training instance.

Specified by: update in interface OnlineClassifier<double[]>

Parameters:
- x - training instance.
- y - training label.
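A sketch of online learning with update, combined with the learning-rate accessors documented below; the step size and the extra labeled instance are illustrative:

```java
import smile.classification.LogisticRegression;

public class OnlineUpdate {
    public static void main(String[] args) {
        double[][] x = {{0.2, 1.3}, {1.5, 0.4}, {0.1, 1.1}, {1.7, 0.2}};
        int[] y = {0, 1, 0, 1};
        LogisticRegression model = LogisticRegression.fit(x, y);

        // Choose an SGD step size, then fold in one newly labeled instance.
        model.setLearningRate(0.01);
        model.update(new double[] {0.4, 0.9}, 0);

        System.out.println("learning rate = " + model.getLearningRate());
    }
}
```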
public void setLearningRate(double rate)

Sets the learning rate of stochastic gradient descent.

Parameters:
- rate - the learning rate.

public double getLearningRate()

Returns the learning rate of stochastic gradient descent.

public double loglikelihood()

Returns the log-likelihood of the model.

public int predict(double[] x)

Predicts the class label of an instance.

Specified by: predict in interface Classifier<double[]>

Parameters:
- x - the instance to be classified.

public int predict(double[] x, double[] posteriori)

Predicts the class label of an instance and also calculates the a posteriori probabilities.

Specified by: predict in interface SoftClassifier<double[]>

Parameters:
- x - an instance to be classified.
- posteriori - the array to store the a posteriori probabilities on output.
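A sketch of soft classification with the two-argument predict; the posteriori array needs one entry per class and is filled on output:

```java
import smile.classification.LogisticRegression;

public class SoftPredict {
    public static void main(String[] args) {
        double[][] x = {{0.2, 1.3}, {1.5, 0.4}, {0.1, 1.1}, {1.7, 0.2}};
        int[] y = {0, 1, 0, 1};
        LogisticRegression model = LogisticRegression.fit(x, y);

        // One slot per class; predict fills it with the class probabilities.
        double[] posteriori = new double[2];
        int label = model.predict(new double[] {0.3, 1.0}, posteriori);

        System.out.printf("class %d with probability %.3f%n", label, posteriori[label]);
    }
}
```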