GLM

java.lang.Object
- smile.glm.GLM

All Implemented Interfaces:

java.io.Serializable
```
public class GLM
extends java.lang.Object
implements java.io.Serializable
```
Generalized linear models. The generalized linear model (GLM) is a flexible generalization of ordinary linear regression that allows for response variables that have error distribution models other than a normal distribution. The GLM generalizes linear regression by allowing the linear model to be related to the response variable via a link function and by allowing the magnitude of the variance of each measurement to be a function of its predicted value.
In GLM, each outcome Y of the dependent variables is assumed to be generated from a particular distribution in an exponential family. The mean, μ, of the distribution depends on the independent variables, X, through:
```
     E(Y) = μ = g^-1(Xβ)
```
where E(Y) is the expected value of Y; Xβ is the linear predictor, a linear combination of the predictors X with the unknown parameters β; and g is the link function, a monotonic, differentiable function. The link function that transforms the mean to the natural parameter is called the canonical link.
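As a minimal sketch of the relation μ = g^-1(Xβ), the following plain-Java snippet evaluates the linear predictor for one observation and maps it to the mean with the inverses of three common links (identity, log, logit). It does not use the Smile API: the class `LinkSketch`, its method names, and the convention of storing the intercept as the last element of `beta` are assumptions made for illustration.

```java
/** Illustrative link functions; hypothetical helper, not part of smile.glm. */
final class LinkSketch {
    /** Linear predictor eta = x . beta + intercept, with the intercept stored last (assumed layout). */
    static double linearPredictor(double[] x, double[] beta) {
        double eta = beta[x.length];
        for (int i = 0; i < x.length; i++) {
            eta += x[i] * beta[i];
        }
        return eta;
    }

    /** Inverse of the identity link: mu = eta. */
    static double identityInverse(double eta) { return eta; }

    /** Inverse of the log link: mu = exp(eta). */
    static double logInverse(double eta) { return Math.exp(eta); }

    /** Inverse of the logit link: mu = 1 / (1 + exp(-eta)). */
    static double logitInverse(double eta) { return 1.0 / (1.0 + Math.exp(-eta)); }

    /** The logit link itself: eta = log(mu / (1 - mu)). */
    static double logit(double mu) { return Math.log(mu / (1.0 - mu)); }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0};
        double[] beta = {0.5, -0.25, 0.0};      // two slopes, then the intercept
        double eta = linearPredictor(x, beta);  // 0.5*1.0 - 0.25*2.0 + 0.0 = 0.0
        System.out.println("eta            = " + eta);
        System.out.println("identity link  : mu = " + identityInverse(eta));
        System.out.println("log link       : mu = " + logInverse(eta));
        System.out.println("logit link     : mu = " + logitInverse(eta));
    }
}
```

The same linear predictor yields a different mean under each link, which is why the link is part of the model specification rather than of the data.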
In this framework, the variance is typically a function, V, of the mean:
```
     Var(Y) = V(μ) = V(g^-1(Xβ))
```
It is convenient if V follows from an exponential family of distributions, but it may simply be that the variance is a function of the predicted value, such as V(μ_i) = μ_i for the Poisson, V(μ_i) = μ_i(1 - μ_i) for the Bernoulli, and V(μ_i) = σ² (i.e., constant) for the normal.
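The three variance functions named above can be written out directly. This is a small plain-Java sketch; the class `VarianceSketch` and its method names are hypothetical and are not taken from the Smile API.

```java
/** Illustrative variance functions V(mu); hypothetical helper, not part of smile.glm. */
final class VarianceSketch {
    /** Poisson: the variance equals the mean. */
    static double poisson(double mu) { return mu; }

    /** Bernoulli: the variance is mu * (1 - mu), largest at mu = 0.5. */
    static double bernoulli(double mu) { return mu * (1.0 - mu); }

    /** Normal: the variance is the constant sigma^2, whatever the mean. */
    static double normal(double mu, double sigma2) { return sigma2; }

    public static void main(String[] args) {
        double[] means = {0.1, 0.5, 0.9};
        for (double mu : means) {
            System.out.println("mu = " + mu
                    + "  Poisson V = " + poisson(mu)
                    + "  Bernoulli V = " + bernoulli(mu)
                    + "  Normal V = " + normal(mu, 2.0));
        }
    }
}
```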
The unknown parameters, β, are typically estimated with maximum likelihood, maximum quasi-likelihood, or Bayesian techniques.
See Also:

Serialized Form

Field Summary

Fields

| Modifier and Type | Field | Description |
| --- | --- | --- |
| `protected double[]` | `beta` | The linear weights. |
| `protected double` | `deviance` | The deviance = 2 * (LogLikelihood(Saturated Model) - LogLikelihood(Proposed Model)). |
| `protected double[]` | `devianceResiduals` | The deviance residuals. |
| `protected int` | `df` | The degrees of freedom of the residual deviance. |
| `protected smile.data.formula.Formula` | `formula` | The symbolic description of the model to be fitted. |
| `protected double` | `loglikelihood` | Log-likelihood. |
| `protected Model` | `model` | The model specifications (link function, deviance, etc.). |
| `protected double[]` | `mu` | The fitted mean values. |
| `protected double` | `nullDeviance` | The null deviance = 2 * (LogLikelihood(Saturated Model) - LogLikelihood(Null Model)). |
| `protected double[][]` | `ztest` | The coefficients, their standard errors, z-scores, and p-values. |

Constructor Summary

Constructors

| Constructor | Description |
| --- | --- |
| `GLM(smile.data.formula.Formula formula, java.lang.String[] predictors, Model model, double[] beta, double loglikelihood, double deviance, double nullDeviance, double[] mu, double[] residuals, double[][] ztest)` | Constructor. |

Method Summary


| Modifier and Type | Method | Description |
| --- | --- | --- |
| `double` | `AIC()` | Returns the AIC score. |
| `double` | `BIC()` | Returns the BIC score. |
| `double[]` | `coefficients()` | Returns an array of size (p+1) containing the linear weights of the model, where p is the dimension of the feature vectors. |
| `double` | `deviance()` | Returns the deviance of the model. |
| `double[]` | `devianceResiduals()` | Returns the deviance residuals. |
| `static GLM` | `fit(smile.data.formula.Formula formula, smile.data.DataFrame data, Model model)` | Fits the generalized linear model with IWLS (iteratively reweighted least squares). |
| `static GLM` | `fit(smile.data.formula.Formula formula, smile.data.DataFrame data, Model model, double tol, int maxIter)` | Fits the generalized linear model with IWLS (iteratively reweighted least squares). |
| `static GLM` | `fit(smile.data.formula.Formula formula, smile.data.DataFrame data, Model model, java.util.Properties prop)` | Fits the generalized linear model with IWLS (iteratively reweighted least squares). |
| `double[]` | `fittedValues()` | Returns the fitted mean values. |
| `double` | `loglikelihood()` | Returns the log-likelihood of the model. |
| `double[]` | `predict(smile.data.DataFrame df)` | Predicts the mean response. |
| `double` | `predict(smile.data.Tuple x)` | Predicts the mean response. |
| `java.lang.String` | `toString()` | |
| `double[][]` | `ztest()` | Returns the z-test of the coefficients (including the intercept). |

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

- Field Detail
  - formula
```
protected smile.data.formula.Formula formula
```
    The symbolic description of the model to be fitted.
  - model
```
protected Model model
```
    The model specifications (link function, deviance, etc.).
  - beta
```
protected double[] beta
```
    The linear weights.
  - ztest
```
protected double[][] ztest
```
    The coefficients, their standard errors, z-scores, and p-values.
  - mu
```
protected double[] mu
```
    The fitted mean values.
  - nullDeviance
```
protected double nullDeviance
```
    The null deviance = 2 * (LogLikelihood(Saturated Model) - LogLikelihood(Null Model)).
    The saturated model, also referred to as the full model or maximal model, allows a different mean response for each group of replicates. One can think of the saturated model as having the most general possible mean structure for the data since the means are unconstrained.
    The null model assumes that all observations have the same distribution with a common parameter. Like the saturated model, the null model does not depend on predictor variables. While the saturated model is the most general model, the null model is the most restricted model.
  - deviance
```
protected double deviance
```
    The deviance = 2 * (LogLikelihood(Saturated Model) - LogLikelihood(Proposed Model)).
  - devianceResiduals
```
protected double[] devianceResiduals
```
    The deviance residuals.
  - df
```
protected int df
```
    The degrees of freedom of the residual deviance.
  - loglikelihood
```
protected double loglikelihood
```
    Log-likelihood.
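The `deviance`, `nullDeviance`, and `devianceResiduals` fields above can be illustrated for a Poisson response, where the deviance has the closed form 2 Σ [y log(y/μ) - (y - μ)]. The sketch below is plain Java with hypothetical names (`DevianceSketch` and its methods); it uses the standard Poisson formulas, and whether the library computes these quantities in exactly this way is an assumption. The null model is taken to be the intercept-only fit, whose fitted mean is the sample mean of y.

```java
/** Illustrative Poisson deviance calculations; hypothetical helper, not part of smile.glm. */
final class DevianceSketch {
    /** Contribution of one observation: 2 * (y * log(y / mu) - (y - mu)), with y * log(y / mu) = 0 when y = 0. */
    static double unitDeviance(double y, double mu) {
        double term = y > 0 ? y * Math.log(y / mu) : 0.0;
        return 2.0 * (term - (y - mu));
    }

    /** Deviance of a proposed model with fitted means mu. */
    static double deviance(double[] y, double[] mu) {
        double sum = 0.0;
        for (int i = 0; i < y.length; i++) {
            sum += unitDeviance(y[i], mu[i]);
        }
        return sum;
    }

    /** Deviance residuals: sign(y - mu) * sqrt(unit deviance). */
    static double[] devianceResiduals(double[] y, double[] mu) {
        double[] r = new double[y.length];
        for (int i = 0; i < y.length; i++) {
            double d = Math.max(0.0, unitDeviance(y[i], mu[i])); // guard against rounding below zero
            r[i] = Math.signum(y[i] - mu[i]) * Math.sqrt(d);
        }
        return r;
    }

    /** Null deviance: every observation shares the same mean, the average of y. */
    static double nullDeviance(double[] y) {
        double mean = 0.0;
        for (double v : y) mean += v;
        mean /= y.length;
        double[] mu = new double[y.length];
        java.util.Arrays.fill(mu, mean);
        return deviance(y, mu);
    }

    public static void main(String[] args) {
        double[] y = {1.0, 2.0, 3.0};
        double[] mu = {1.5, 2.0, 2.5};  // made-up fitted means for illustration
        System.out.println("deviance (saturated, mu = y) = " + deviance(y, y));
        System.out.println("deviance (proposed)          = " + deviance(y, mu));
        System.out.println("null deviance                = " + nullDeviance(y));
        System.out.println("deviance residuals           = "
                + java.util.Arrays.toString(devianceResiduals(y, mu)));
    }
}
```

Setting μ = y reproduces the saturated model and gives a deviance of zero, while the null model gives the largest deviance of the three.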
- Constructor Detail
  - GLM
```
public GLM(smile.data.formula.Formula formula,
           java.lang.String[] predictors,
           Model model,
           double[] beta,
           double loglikelihood,
           double deviance,
           double nullDeviance,
           double[] mu,
           double[] residuals,
           double[][] ztest)
```
    Constructor.
- Method Detail
  - coefficients
```
public double[] coefficients()
```
    Returns an array of size (p+1) containing the linear weights of the model, where p is the dimension of the feature vectors. The last element is the intercept (bias) weight.
  - ztest
```
public double[][] ztest()
```
    Returns the z-test of the coefficients (including the intercept). The first column holds the coefficients, the second their standard errors, the third the z-scores for testing the hypothesis that the coefficient is zero, and the fourth the p-values of that test. The last row corresponds to the intercept.
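To make the four columns concrete, the sketch below builds one such row from a coefficient and its standard error: z = coefficient / standard error, and the two-sided p-value is P(|Z| ≥ |z|) for a standard normal Z. It is plain Java with hypothetical names (`ZTestSketch`, `row`, `erfc`); the normal tail is approximated with Abramowitz and Stegun formula 7.1.26, which is a choice made for this illustration and not necessarily what the library does.

```java
/** Illustrative z-test row; hypothetical helper, not part of smile.glm. */
final class ZTestSketch {
    /**
     * Complementary error function for x >= 0, using the Abramowitz and Stegun
     * approximation 7.1.26 (absolute error of roughly 1.5e-7).
     */
    static double erfc(double x) {
        double t = 1.0 / (1.0 + 0.3275911 * x);
        double poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741
                + t * (-1.453152027 + t * 1.061405429))));
        return poly * Math.exp(-x * x);
    }

    /** Builds one row {coefficient, standard error, z-score, two-sided p-value}. */
    static double[] row(double coefficient, double standardError) {
        double z = coefficient / standardError;
        // For a standard normal Z, P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
        double p = erfc(Math.abs(z) / Math.sqrt(2.0));
        return new double[] {coefficient, standardError, z, p};
    }

    public static void main(String[] args) {
        double[] r = row(0.84, 0.42);  // made-up estimate and standard error
        System.out.println("coefficient = " + r[0]);
        System.out.println("std. error  = " + r[1]);
        System.out.println("z-score     = " + r[2]);
        System.out.println("p-value     = " + r[3]);
    }
}
```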
  - devianceResiduals
```
public double[] devianceResiduals()
```
    Returns the deviance residuals.
  - fittedValues
```
public double[] fittedValues()
```
    Returns the fitted mean values.
  - deviance
```
public double deviance()
```
    Returns the deviance of model.
  - loglikelihood
```
public double loglikelihood()
```
    Returns the log-likelihood of model.
  - AIC
```
public double AIC()
```
    Returns the AIC score.
  - BIC
```
public double BIC()
```
    Returns the BIC score.
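    For reference, the textbook definitions of the two scores are AIC = 2k - 2 log L and BIC = k ln(n) - 2 log L, where log L is the log-likelihood, k the number of estimated parameters, and n the number of observations. The plain-Java sketch below uses hypothetical names (`InformationCriteriaSketch`, `aic`, `bic`); that the library counts k as the number of coefficients including the intercept is an assumption.

```java
/** Illustrative information criteria; hypothetical helper, not part of smile.glm. */
final class InformationCriteriaSketch {
    /** Akaike information criterion: 2k - 2 log L. */
    static double aic(double logLikelihood, int k) {
        return 2.0 * (k - logLikelihood);
    }

    /** Bayesian information criterion: k ln(n) - 2 log L. */
    static double bic(double logLikelihood, int k, int n) {
        return k * Math.log(n) - 2.0 * logLikelihood;
    }

    public static void main(String[] args) {
        double logL = -100.0;  // made-up log-likelihood
        int k = 3;             // e.g. two slopes plus an intercept
        int n = 100;           // number of observations
        System.out.println("AIC = " + aic(logL, k));
        System.out.println("BIC = " + bic(logL, k, n));
    }
}
```

    Lower values indicate a better trade-off between fit and complexity; BIC penalizes extra parameters more heavily than AIC once n exceeds about 7, since ln(n) > 2.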
  - predict
```
public double predict(smile.data.Tuple x)
```
    Predicts the mean response.
  - predict
```
public double[] predict(smile.data.DataFrame df)
```
    Predicts the mean response.
  - toString
```
public java.lang.String toString()
```
    Overrides:
    
    toString in class java.lang.Object
  - fit
```
public static GLM fit(smile.data.formula.Formula formula,
                      smile.data.DataFrame data,
                      Model model)
```
    Fits the generalized linear model with IWLS (iteratively reweighted least squares).
    
    Parameters:
    
    formula - a symbolic description of the model to be fitted.
    
    data - the data frame of the explanatory and response variables.
    
    model - the model specifications (link function, deviance, etc.).
  - fit
```
public static GLM fit(smile.data.formula.Formula formula,
                      smile.data.DataFrame data,
                      Model model,
                      java.util.Properties prop)
```
    Fits the generalized linear model with IWLS (iteratively reweighted least squares).
    
    Parameters:
    
    formula - a symbolic description of the model to be fitted.
    
    data - the data frame of the explanatory and response variables.
    
    model - the model specifications (link function, deviance, etc.).
  - fit
```
public static GLM fit(smile.data.formula.Formula formula,
                      smile.data.DataFrame data,
                      Model model,
                      double tol,
                      int maxIter)
```
    Fits the generalized linear model with IWLS (iteratively reweighted least squares).
    
    Parameters:
    
    formula - a symbolic description of the model to be fitted.
    
    data - the data frame of the explanatory and response variables.
    
    model - the model specifications (link function, deviance, etc.).
    
    tol - the tolerance for stopping iterations.
    
    maxIter - the maximum number of iterations.
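
To show what IWLS does with `tol` and `maxIter`, here is a minimal plain-Java sketch for one specific case: a Poisson response with the log link and a single predictor plus intercept. Each iteration forms the working response z = η + (y - μ)/μ and the weight w = μ, solves the weighted least-squares problem for the two coefficients, and stops when the relative change in deviance falls below `tol`. The class `IwlsPoissonSketch`, the starting values, and the stopping rule are assumptions for illustration, not the library's implementation.

```java
/** Illustrative IWLS for Poisson regression with a log link; hypothetical, not part of smile.glm. */
final class IwlsPoissonSketch {
    /**
     * Fits log(mu) = b0 + b1 * x and returns {b0, b1}.
     * Assumes x contains at least two distinct values, so the 2x2 system is solvable.
     */
    static double[] fit(double[] x, double[] y, double tol, int maxIter) {
        int n = x.length;
        double[] mu = new double[n];
        double[] eta = new double[n];
        for (int i = 0; i < n; i++) {
            mu[i] = y[i] + 0.1;        // keeps the starting mean positive when y = 0
            eta[i] = Math.log(mu[i]);
        }

        double b0 = 0.0, b1 = 0.0;
        double oldDeviance = Double.POSITIVE_INFINITY;
        for (int iter = 0; iter < maxIter; iter++) {
            // Accumulate the weighted normal equations for z ~ b0 + b1 * x.
            double sw = 0, swx = 0, swxx = 0, swz = 0, swxz = 0;
            for (int i = 0; i < n; i++) {
                double w = mu[i];                            // 1 / (V(mu) g'(mu)^2) = mu for the log link
                double z = eta[i] + (y[i] - mu[i]) / mu[i];  // working response
                sw += w;
                swx += w * x[i];
                swxx += w * x[i] * x[i];
                swz += w * z;
                swxz += w * x[i] * z;
            }
            double det = sw * swxx - swx * swx;
            b1 = (sw * swxz - swx * swz) / det;
            b0 = (swz - b1 * swx) / sw;

            // Update the linear predictor and mean, then evaluate the Poisson deviance.
            double deviance = 0.0;
            for (int i = 0; i < n; i++) {
                eta[i] = b0 + b1 * x[i];
                mu[i] = Math.exp(eta[i]);
                double t = y[i] > 0 ? y[i] * Math.log(y[i] / mu[i]) : 0.0;
                deviance += 2.0 * (t - (y[i] - mu[i]));
            }
            if (Math.abs(deviance - oldDeviance) / (Math.abs(deviance) + 0.1) < tol) {
                break;  // converged: the deviance has stopped changing
            }
            oldDeviance = deviance;
        }
        return new double[] {b0, b1};
    }

    public static void main(String[] args) {
        double[] x = {0, 1, 2, 3, 4};
        double[] y = {1, 3, 2, 6, 9};  // made-up counts
        double[] beta = fit(x, y, 1e-8, 50);
        System.out.println("intercept = " + beta[0] + ", slope = " + beta[1]);
    }
}
```

A smaller `tol` demands a more stable deviance before stopping, and `maxIter` bounds the work when the iterations fail to settle.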
