Transform a column of continuous features to n columns of feature buckets.
With n+1 splits, there are n buckets. A bucket defined by splits x,y holds values in the range
[x,y) except the last bucket, which also includes y. Splits should be strictly increasing.
Values at -inf, inf must be explicitly provided to cover all double values; otherwise, a
FeatureRejection.OutOfBound rejection will be reported for values outside the splits
specified. Two examples of splits are
Array(Double.NegativeInfinity, 0.0, 1.0, Double.PositiveInfinity) and Array(0.0, 1.0, 2.0).
Note that if you do not know the upper and lower bounds of the targeted column, you should
add Double.NegativeInfinity and Double.PositiveInfinity as the bounds of your splits to
prevent a potential FeatureRejection.OutOfBound rejection.
Note also that the splits you provide must be in strictly increasing order, i.e.
s0 < s1 < s2 < ... < sn.
Missing values are transformed to zero vectors.
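
The bucket rules described here can be illustrated with a small standalone sketch. It is not the library's implementation: `BucketizerSketch`, `bucketIndex`, and `oneHot` are hypothetical names, and an out-of-range value is modelled as `None` rather than as a FeatureRejection.OutOfBound rejection.

```scala
// Illustrative sketch only; names and the Option-based result are assumptions.
object BucketizerSketch {

  /** Index of the bucket holding `value`, or None if it lies outside the splits. */
  def bucketIndex(splits: Array[Double], value: Double): Option[Int] = {
    require(splits.length >= 2, "need at least two splits to form one bucket")
    require(
      splits.sliding(2).forall(p => p(0) < p(1)),
      "splits must be strictly increasing"
    )
    val n = splits.length - 1 // n + 1 splits define n buckets
    if (value.isNaN || value < splits.head || value > splits.last) None
    else if (value == splits.last) Some(n - 1) // the last bucket also includes its upper split
    else Some(splits.lastIndexWhere(_ <= value)) // every other bucket is [x, y)
  }

  /** One-hot vector with n entries for the bucket of `value`, if it is in range. */
  def oneHot(splits: Array[Double], value: Double): Option[Array[Double]] =
    bucketIndex(splits, value).map { i =>
      Array.tabulate(splits.length - 1)(j => if (j == i) 1.0 else 0.0)
    }
}
```

With `Array(0.0, 1.0, 2.0)`, the value 1.0 falls into the second bucket and 2.0 is still accepted because the last bucket is closed on the right, while 2.5 is out of range. Adding Double.NegativeInfinity and Double.PositiveInfinity as the outer splits makes every non-NaN double fall into some bucket.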