QuantileDiscretizer
Transform a column of continuous features to n columns of binned categorical features. The
number of bins is set by the numBuckets parameter.

The bin ranges are chosen using Algebird's QTree approximate data structure. The precision
of the approximation can be controlled with the k parameter.

Missing values are transformed to zero vectors.
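
The binning described above can be illustrated with a small, self-contained sketch. It is not the library's implementation: exact quantiles computed from sorted data stand in for the approximate QTree, and the names QuantileBinningSketch, splits and encode are hypothetical. The choice to place a value equal to a split point in the upper bucket is also an assumption.

```scala
// Illustrative sketch only: exact quantiles stand in for the approximate QTree.
object QuantileBinningSketch {
  // Compute numBuckets - 1 interior split points from the sorted data.
  def splits(data: Seq[Double], numBuckets: Int): Vector[Double] = {
    require(numBuckets >= 2, "numBuckets must be >= 2")
    require(data.nonEmpty, "data must not be empty")
    val sorted = data.sorted.toVector
    (1 until numBuckets).map { i =>
      val idx = math.min(sorted.length - 1, i * sorted.length / numBuckets)
      sorted(idx)
    }.toVector
  }

  // One-hot encode a value into splits.length + 1 columns.
  // A missing value (None) becomes an all-zero vector.
  def encode(value: Option[Double], splits: Vector[Double]): Vector[Double] = {
    val n = splits.length + 1
    value match {
      case None => Vector.fill(n)(0.0)
      case Some(x) =>
        // Bucket index = number of split points at or below x (assumed convention).
        val bucket = splits.count(_ <= x)
        Vector.tabulate(n)(i => if (i == bucket) 1.0 else 0.0)
    }
  }
}
```

With numBuckets = 4, each input value maps to a vector of four columns with a single 1.0 marking its bucket, while a missing value maps to four zeros.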

When using an aggregated feature summary from a previous session, values outside of the previously
seen [min, max] range are binned into the first or last bucket and FeatureRejection.OutOfBound
rejections are reported.
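
The out-of-bound handling can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: the Rejection trait and OutOfBound object below are hypothetical stand-ins for FeatureRejection.OutOfBound, and bucketOf, min, max and splits are illustrative names for the previously seen range and bucket boundaries.

```scala
object OutOfBoundSketch {
  // Hypothetical stand-in for a rejection record; not the library's FeatureRejection type.
  sealed trait Rejection
  case object OutOfBound extends Rejection

  // Return the bucket index for x together with an optional rejection.
  // Values below min go to the first bucket, values above max go to the last bucket,
  // and both cases are flagged as out of bound.
  def bucketOf(
      x: Double,
      min: Double,
      max: Double,
      splits: Vector[Double]
  ): (Int, Option[Rejection]) =
    if (x < min) (0, Some(OutOfBound))
    else if (x > max) (splits.length, Some(OutOfBound))
    else (splits.count(_ <= x), None)
}
```

The value is still encoded rather than dropped; the rejection only records that it fell outside the range seen when the summary was built.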
Value members
Methods
Create a new QuantileDiscretizer instance.
- Value Params
  - k: precision of the underlying Algebird QTree approximation
  - numBuckets: number of buckets (quantiles, or categories) into which data points are
    grouped; must be greater than or equal to 2