features on which the hash is based. Note that the function output type is Any.
a distribution used for selecting values.
whether it is OK to hash on missing data. Keep in mind that if set to true, there is no guarantee about what value will be selected. (Missing data in this context means None. There are no explicit null checks; just None checks.)
"Randomly" but idempotently pick an index of a sub-tree down which to branch. Compute a uniformly distributed hash code from the features specified to the constructor, then use it to drive alias method sampling, which selects a sub-tree branch with the desired probability. Because the uniform variates come from hashing rather than from a random number generator, the selection is idempotent: the same feature values always select the same branch.
Note: This uses the MurmurHash3 singleton in Scala rather than Guava's com.google.common.hash.Hashing.murmur3_32 implementation, for speed and compatibility with Scala. The two implementations should therefore not be expected to produce the same hash codes for the same data.
input from which features are extracted. These features are then hashed to produce a value.
a positive value i if node i should be selected. May return a negative value in which case processErrorAt should be called with the value returned.
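The hash-driven selection described for apply can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: HashSamplingSketch, toUnit, pick, and select are hypothetical names, and a simple cumulative-probability scan stands in for the alias method. Both approaches select index i with probability proportional to probs(i); the alias method just does so in constant time per draw. Only scala.util.hashing.MurmurHash3 comes from the Scala standard library.

```scala
import scala.util.hashing.MurmurHash3

object HashSamplingSketch {

  /** Map a 32-bit hash to a variate in [0, 1) by reading it as an unsigned integer. */
  def toUnit(hash: Int): Double =
    (hash & 0xffffffffL).toDouble / 4294967296.0 // divide by 2^32

  /** Pick an index with probability proportional to probs(i), driven by u in [0, 1).
    * Uses a linear scan over cumulative weights (NOT the alias method).
    */
  def pick(probs: IndexedSeq[Double], u: Double): Int = {
    require(probs.nonEmpty && probs.forall(_ >= 0.0) && probs.sum > 0.0)
    val cdf = probs.scanLeft(0.0)(_ + _).tail // running totals
    val target = u * cdf.last                 // scale u to the total weight
    val idx = cdf.indexWhere(target < _)
    if (idx >= 0) idx else probs.length - 1   // guard against rounding at the top end
  }

  /** Hash the feature values, then use the hash as the "random" variate. */
  def select(features: Seq[Any], probs: IndexedSeq[Double]): Int =
    pick(probs, toUnit(MurmurHash3.orderedHash(features)))
}
```

Because select depends only on the feature values and the distribution, calling it twice with the same arguments yields the same index, which is the idempotence property described above.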
a distribution used for selecting values.
features on which the hash is based. Note that the function output type is Any.
whether it is OK to hash on missing data. Keep in mind that if set to true, there is no guarantee about what value will be selected. (Missing data in this context means None. There are no explicit null checks; just None checks.)
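The None-only notion of missing data can be illustrated with a short sketch. MissingCheckSketch and firstMissing are hypothetical names introduced here, and the representation of features as functions from the input type to Any is an assumption based on the description of the features parameter; the actual check in the library may differ.

```scala
object MissingCheckSketch {

  /** Index of the first feature whose value is None, or -1 if no feature is missing.
    * A null value is deliberately NOT treated as missing, mirroring the
    * "just None checks" wording of the documentation.
    */
  def firstMissing[A](features: Seq[A => Any], input: A): Int =
    features.indexWhere(f => f(input) == None)
}
```

For example, with features that look up keys "a" and "b" in a Map, an input containing only "a" reports index 1 as missing, while a feature that returns null is not flagged at all.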
Process an error.
input to the decision tree model
the value returned by apply. This should be a negative number whose absolute value represents the index of the feature that contained missing data.
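The calling convention between apply and processErrorAt might be used as in the sketch below. Everything here is an assumption made for illustration: NodeSelectorLike is a hypothetical stand-in for the real selector type, the String result of processErrorAt is invented (the documentation does not state its return type), and a result of zero is treated as a valid index even though the documentation only distinguishes positive from negative values.

```scala
object SelectorDispatchSketch {

  /** Hypothetical stand-in for the selector interface described above. */
  trait NodeSelectorLike[A] {
    def apply(input: A): Int
    def processErrorAt(input: A, value: Int): String // return type is assumed
  }

  /** Right(index) when a child node was selected; Left(error) when apply
    * signalled missing data by returning a negative value.
    */
  def choose[A](selector: NodeSelectorLike[A], input: A): Either[String, Int] = {
    val result = selector(input)
    if (result < 0) Left(selector.processErrorAt(input, result))
    else Right(result)
  }
}
```

Wrapping the result in Either keeps the negative error codes from leaking into code that expects a child index.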
A selector that randomly selects a child node.
the input type from which features are extracted.