Elastic net penalty: R(w) = alpha * L1(w) + (1 - alpha) * L2(w); alpha = 0 -> ridge, alpha = 1 -> lasso
Regularization term added to the loss: lambda * R(w); lambda = 0 -> no regularization
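A minimal sketch of the penalty above, to make the roles of alpha and lambda concrete. The function name `elastic_net_penalty` is hypothetical, and taking L2 as half the sum of squares is an assumed convention; the notes only say "L2".

```python
def elastic_net_penalty(w, alpha, lam):
    """Return lam * (alpha * L1(w) + (1 - alpha) * L2(w)) for a list of weights."""
    l1 = sum(abs(x) for x in w)        # L1 term: sum of absolute values
    l2 = 0.5 * sum(x * x for x in w)   # L2 term: ASSUMED to be half the sum of squares
    return lam * (alpha * l1 + (1 - alpha) * l2)


w = [1.0, -2.0, 3.0]
lasso_only = elastic_net_penalty(w, alpha=1.0, lam=1.0)  # only the L1 term contributes
ridge_only = elastic_net_penalty(w, alpha=0.0, lam=1.0)  # only the L2 term contributes
no_penalty = elastic_net_penalty(w, alpha=0.5, lam=0.0)  # lambda = 0 switches the penalty off
```

Values of alpha strictly between 0 and 1 blend the two terms; lambda scales the whole penalty.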
Shared model parameters (all models): early stopping
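Early stopping can be illustrated with a simple patience rule: stop once the validation loss has not improved for a set number of consecutive rounds. This is a sketch under that assumption, not any particular library's stopping criterion; `early_stopping_round`, `patience`, and `min_delta` are hypothetical names.

```python
def early_stopping_round(scores, patience, min_delta=0.0):
    """Return the index of the round at which training would stop, or None.

    scores: validation losses per round (lower is better).
    patience: consecutive non-improving rounds tolerated before stopping.
    min_delta: minimum decrease in loss that counts as an improvement.
    """
    best = float("inf")
    rounds_without_improvement = 0
    for i, score in enumerate(scores):
        if score < best - min_delta:
            best = score
            rounds_without_improvement = 0
        else:
            rounds_without_improvement += 1
            if rounds_without_improvement >= patience:
                return i
    return None


losses = [0.9, 0.7, 0.6, 0.61, 0.62, 0.63, 0.5]
stop_at = early_stopping_round(losses, patience=3)
```

A larger `patience` tolerates longer plateaus at the cost of extra training rounds.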
Then identify model-specific hyperparameters: Laplace smoothing (laplace) for Naive Bayes (NB), number of trees for distributed random forest
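As an illustration of what the Laplace hyperparameter controls in Naive Bayes, here is the standard additive-smoothing estimate of a conditional probability. The function and argument names are hypothetical; how a given library applies the parameter internally is not covered by these notes.

```python
def smoothed_probability(count, total, num_categories, laplace=1.0):
    """Additive (Laplace) smoothing: (count + laplace) / (total + laplace * num_categories).

    count: times the feature value was seen with the class.
    total: number of training rows in the class.
    num_categories: number of distinct values the feature can take.
    """
    return (count + laplace) / (total + laplace * num_categories)


unsmoothed = smoothed_probability(0, 10, 2, laplace=0.0)  # unseen value gets probability 0
smoothed = smoothed_probability(0, 10, 2, laplace=1.0)    # unseen value gets a small nonzero probability
```

With laplace = 0, a feature value never seen with a class zeroes out the whole product of probabilities; a positive value avoids that.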