QuantileDiscretizer

Transform a column of continuous features to n columns of binned categorical features. The
number of bins is set by the numBuckets parameter.
The bin ranges are chosen using Algebird's QTree approximate data structure. The precision
of the approximation can be controlled with the k parameter.
Missing values are transformed to zero vectors.
When using an aggregated feature summary from a previous session, values outside the previously
seen [min, max] range are binned into the first or last bucket, and FeatureRejection.OutOfBound
rejections are reported.
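
The behavior described above can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: exact quantiles computed by sorting stand in for the approximate Algebird QTree (so there is no k parameter here), the boundary-inclusion rule is an arbitrary choice, and every name in it (QuantileBinningSketch, Summary, fit, transform) is hypothetical. It only shows the general idea of one-hot binning over numBuckets columns, zero vectors for missing values, and flagging values that fall outside a previously seen [min, max].

```scala
// Illustrative sketch only. Exact quantiles via sorting stand in for the
// approximate Algebird QTree; all names here are hypothetical and are not
// part of the documented API.
object QuantileBinningSketch {

  /** Summary from a "previous session": interior boundaries plus the seen range. */
  final case class Summary(boundaries: Vector[Double], min: Double, max: Double) {
    def numBuckets: Int = boundaries.length + 1
  }

  /** Compute numBuckets - 1 interior boundaries at evenly spaced quantiles. */
  def fit(data: Seq[Double], numBuckets: Int): Summary = {
    require(numBuckets >= 2, "numBuckets must be greater than or equal to 2")
    require(data.nonEmpty, "data must be non-empty")
    val sorted = data.sorted.toVector
    val boundaries = (1 until numBuckets).map { i =>
      // Simple rank-based lookup of the i / numBuckets quantile.
      val idx = math.min(sorted.length - 1, i * sorted.length / numBuckets)
      sorted(idx)
    }.toVector
    Summary(boundaries, sorted.head, sorted.last)
  }

  /** Returns the one-hot row and whether the value fell outside [min, max]. */
  def transform(s: Summary, value: Option[Double]): (Vector[Double], Boolean) =
    value match {
      case None =>
        // Missing value: zero vector, nothing to flag.
        (Vector.fill(s.numBuckets)(0.0), false)
      case Some(x) =>
        val outOfBound = x < s.min || x > s.max
        // Bucket index = number of boundaries <= x, so values below min land
        // in the first bucket and values above max land in the last bucket.
        val bucket = s.boundaries.count(_ <= x)
        val row = Vector.tabulate(s.numBuckets)(i => if (i == bucket) 1.0 else 0.0)
        (row, outOfBound)
    }
}
```

With the values 1.0 through 8.0 and numBuckets = 4, this sketch places the interior boundaries at 3.0, 5.0, and 7.0, so 4.0 falls in the second bucket, while 100.0 is placed in the last bucket and flagged as out of bound.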
class Object
trait Matchable
class Any

Value members

Methods

def apply(name: String, numBuckets: Int, k: Int): Transformer[Double, B, C]
Create a new QuantileDiscretizer instance.
Value Params
k
precision of the underlying Algebird QTree approximation
numBuckets
number of buckets (quantiles, or categories) into which data points are
grouped; must be greater than or equal to 2
def fromSettings(setting: Settings): Transformer[Double, B, C]
Create a new QuantileDiscretizer from a settings object.
Value Params
setting
Settings object