Indicates whether to attempt language detection.
Language detection threshold. If no detected language has a confidence greater than the threshold, defaultLanguage is used.
Default language to assume when autoDetectLanguage is disabled or fails to make a sufficiently confident prediction.
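The interaction between autoDetectLanguage, the detection threshold, and defaultLanguage described above can be illustrated with a minimal, language-neutral sketch. The function name `pick_language` and the `scores` dictionary (language code mapped to detector confidence) are hypothetical and are not part of the documented API; the sketch only assumes the fallback rule stated in the parameter descriptions.

```python
from typing import Dict


def pick_language(
    scores: Dict[str, float],
    auto_detect: bool,
    threshold: float,
    default_language: str,
) -> str:
    """Return the detected language, or the default when detection is off or unsure.

    `scores` is a hypothetical detector output: language code -> confidence.
    """
    # Detection disabled, or the detector produced nothing: use the default.
    if not auto_detect or not scores:
        return default_language
    # Take the most confident candidate.
    best_language, best_confidence = max(scores.items(), key=lambda kv: kv[1])
    # Accept it only if its confidence is strictly greater than the threshold.
    if best_confidence > threshold:
        return best_language
    return default_language
```

For example, with a threshold of 0.9 a detector output of `{"en": 0.95, "fr": 0.03}` would yield `"en"`, while `{"en": 0.6, "fr": 0.3}` would fall back to the default language.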
Minimum token length, >= 1.
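As a rough illustration of the minimum token length setting, the sketch below keeps only tokens whose length is at least the configured minimum. The helper name `filter_tokens` is hypothetical, and the assumption that shorter tokens are dropped (rather than, say, padded or flagged) is an interpretation of the parameter description, not a confirmed behavior.

```python
from typing import List


def filter_tokens(tokens: List[str], min_token_length: int = 1) -> List[str]:
    """Keep tokens with at least `min_token_length` characters (assumed semantics)."""
    # The parameter is documented as >= 1, so reject smaller values.
    if min_token_length < 1:
        raise ValueError("min_token_length must be >= 1")
    return [token for token in tokens if len(token) >= min_token_length]
```

With a minimum of 2, the tokens `["a", "of", "the"]` would reduce to `["of", "the"]`.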
Get the metadata describing the output vector
This does not trigger onGetMetadata()
Metadata of output vector
Indicates whether to convert all characters to lowercase before string operations.
Compute the output vector metadata only from the input features. Vectorizers use this to derive the full vector, including pivot columns or indicator features.
Vector metadata from input features
Get the name of the output vector
Output vector name as a string
Sequence transformer that generates a sequence of text lengths from a sequence of TextList values (e.g., tokenized raw text).
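A minimal sketch of the idea, assuming that the "text length" of one TextList is the total number of characters across its tokens. That definition and the function name `text_lengths` are assumptions made for illustration; the actual transformer may measure length differently.

```python
from typing import List, Sequence


def text_lengths(text_lists: Sequence[Sequence[str]]) -> List[int]:
    """Map each token list to the summed character count of its tokens (assumed definition)."""
    return [sum(len(token) for token in tokens) for tokens in text_lists]
```

Under this assumption, an empty token list contributes a length of 0, so the output sequence always has the same number of entries as the input sequence.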