com.salesforce.op.stages.impl.feature
unique name of the operation this stage performs
uid for instance
Count unique values of each sequence and map-key component in the dataset using HyperLogLog (HLL)
value type
dataset in which to count unique values
size of each sequence component
number of bits for HyperLogLog (HLL)
Kryo serializer used to serialize a value of type V into an array of bytes
class tag of V (needed by Kryo)
HyperLogLog (HLL) estimate of the unique value count for each sequence component, together with the total row count
Count unique values of each sequence component in the dataset using HyperLogLog (HLL)
value type
dataset in which to count unique values
size of each sequence component
number of bits for HyperLogLog (HLL)
Kryo serializer used to serialize a value of type V into an array of bytes
class tag of V (needed by Kryo)
HyperLogLog (HLL) estimate of the unique value count for each sequence component, together with the total row count
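The per-component counting described above can be illustrated with a minimal sketch. It is not the library's implementation: `HllSketch`, `UniqueCountSketch`, and `toBytes` are hypothetical names. A plain `Seq[Seq[V]]` stands in for the dataset, a plain function stands in for the Kryo serializer, and the HyperLogLog is a simplified version with only the small-range correction.

```scala
import scala.util.hashing.MurmurHash3

// Hypothetical, simplified HyperLogLog (illustration only, no large-range correction).
final class HllSketch(val bits: Int) {
  require(bits >= 4 && bits <= 16, "bits must be in [4, 16]")
  private val m = 1 << bits                    // number of registers
  private val registers = new Array[Byte](m)

  def add(bytes: Array[Byte]): Unit = {
    val h = MurmurHash3.bytesHash(bytes)       // 32-bit hash of the serialized value
    val idx = h >>> (32 - bits)                // top `bits` bits select the register
    val rest = h << bits                       // remaining bits, left-aligned
    // rank = 1-based position of the first 1-bit among the remaining bits
    val rank = math.min(Integer.numberOfLeadingZeros(rest), 32 - bits) + 1
    if (rank > registers(idx)) registers(idx) = rank.toByte
  }

  def estimate: Double = {
    val alpha = m match {
      case 16 => 0.673
      case 32 => 0.697
      case 64 => 0.709
      case _  => 0.7213 / (1.0 + 1.079 / m)
    }
    val raw = alpha * m * m / registers.map(r => math.pow(2.0, -r.toDouble)).sum
    val zeros = registers.count(_ == 0)
    // Small-range correction: fall back to linear counting while registers are mostly empty
    if (raw <= 2.5 * m && zeros > 0) m * math.log(m.toDouble / zeros) else raw
  }
}

object UniqueCountSketch {
  // One sketch per sequence component, plus the total number of rows seen.
  def countUniques[V](data: Seq[Seq[V]], size: Int, bits: Int)
                     (toBytes: V => Array[Byte]): (Seq[HllSketch], Long) = {
    val sketches = Seq.fill(size)(new HllSketch(bits))
    var rows = 0L
    data.foreach { row =>
      row.zip(sketches).foreach { case (v, s) => s.add(toBytes(v)) }
      rows += 1
    }
    (sketches, rows)
  }
}
```

A call such as `UniqueCountSketch.countUniques(rows, size = 2, bits = 12)(_.getBytes("UTF-8"))` returns one sketch per component and the row count. A larger `bits` value uses more registers and lowers the estimation error at the cost of memory.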
unique name of the operation this stage performs
Option to keep track of values that were missing
uid for instance
Converts a sequence of features into a vector, keeping the top K most common occurrences of each feature (i.e. the final vector has length K * number of inputs), plus an additional column for "other" values, which captures values that did not make the cut or were not seen in training, and an additional column for empty values unless null tracking is disabled.
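The top-K pivot described above can be sketched for a single text feature as follows. This is a simplified illustration under stated assumptions, not the stage's actual code: `TopKPivotSketch`, `fitTopK`, `encode`, and `trackNulls` are hypothetical names, ties are broken alphabetically, and an empty value with null tracking disabled is assumed to produce an all-zero slice.

```scala
object TopKPivotSketch {

  /** Learn the top K most frequent non-empty values of one feature. */
  def fitTopK(values: Seq[Option[String]], topK: Int): Seq[String] =
    values.flatten
      .groupBy(identity)
      .map { case (v, occurrences) => v -> occurrences.size }
      .toSeq
      .sortBy { case (v, count) => (-count, v) } // most frequent first, ties alphabetical
      .take(topK)
      .map(_._1)

  /** Encode one value as top-K columns, an "other" column, and an optional null column. */
  def encode(value: Option[String], top: Seq[String], trackNulls: Boolean): Seq[Double] = {
    val width = top.length + 1 + (if (trackNulls) 1 else 0)
    val out = Array.fill(width)(0.0)
    value match {
      case Some(v) =>
        val i = top.indexOf(v)
        // Values outside the top K, or unseen during fitting, land in the "other" column
        out(if (i >= 0) i else top.length) = 1.0
      case None =>
        // Assumption: without null tracking an empty value leaves the slice all zeros
        if (trackNulls) out(top.length + 1) = 1.0
    }
    out.toSeq
  }
}
```

For several input features, the per-feature slices would be concatenated, so each input contributes its top-K columns, one "other" column, and, when null tracking is enabled, one column for empty values.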