Package sentencepiece
Interface SentencepieceModel.NormalizerSpecOrBuilder
-
- All Superinterfaces:
com.google.protobuf.GeneratedMessageV3.ExtendableMessageOrBuilder<SentencepieceModel.NormalizerSpec>, com.google.protobuf.MessageLiteOrBuilder, com.google.protobuf.MessageOrBuilder
- All Known Implementing Classes:
SentencepieceModel.NormalizerSpec, SentencepieceModel.NormalizerSpec.Builder
- Enclosing class:
SentencepieceModel
public static interface SentencepieceModel.NormalizerSpecOrBuilder extends com.google.protobuf.GeneratedMessageV3.ExtendableMessageOrBuilder<SentencepieceModel.NormalizerSpec>
-
Method Summary
Modifier and Type               Method                          Description
boolean                         getAddDummyPrefix()             Adds a dummy whitespace at the beginning of the text so that "world" is treated the same way in "world" and in "hello world".
boolean                         getEscapeWhitespaces()          Replaces whitespace with the meta symbol.
java.lang.String                getName()                       Name of the normalization rule.
com.google.protobuf.ByteString  getNameBytes()                  Name of the normalization rule.
java.lang.String                getNormalizationRuleTsv()       Custom normalization rule file in TSV format.
com.google.protobuf.ByteString  getNormalizationRuleTsvBytes()  Custom normalization rule file in TSV format.
com.google.protobuf.ByteString  getPrecompiledCharsmap()        Pre-compiled normalization rule created by the Builder::GetPrecompiledCharsMap() or Builder::CompileCharsMap() method.
boolean                         getRemoveExtraWhitespaces()     Removes leading, trailing, and duplicate internal whitespace.
boolean                         hasAddDummyPrefix()             Adds a dummy whitespace at the beginning of the text so that "world" is treated the same way in "world" and in "hello world".
boolean                         hasEscapeWhitespaces()          Replaces whitespace with the meta symbol.
boolean                         hasName()                       Name of the normalization rule.
boolean                         hasNormalizationRuleTsv()       Custom normalization rule file in TSV format.
boolean                         hasPrecompiledCharsmap()        Pre-compiled normalization rule created by the Builder::GetPrecompiledCharsMap() or Builder::CompileCharsMap() method.
boolean                         hasRemoveExtraWhitespaces()     Removes leading, trailing, and duplicate internal whitespace.
-
Methods inherited from interface com.google.protobuf.GeneratedMessageV3.ExtendableMessageOrBuilder
getDefaultInstanceForType, getExtension, getExtension, getExtension, getExtension, getExtension, getExtension, getExtensionCount, getExtensionCount, getExtensionCount, hasExtension, hasExtension, hasExtension
-
Method Detail
-
hasName
boolean hasName()
Name of the normalization rule.
optional string name = 1;
- Returns:
- Whether the name field is set.
-
getName
java.lang.String getName()
Name of the normalization rule.
optional string name = 1;
- Returns:
- The name.
-
getNameBytes
com.google.protobuf.ByteString getNameBytes()
Name of the normalization rule.
optional string name = 1;
- Returns:
- The bytes for name.
-
hasPrecompiledCharsmap
boolean hasPrecompiledCharsmap()
Pre-compiled normalization rule created by the Builder::GetPrecompiledCharsMap() or Builder::CompileCharsMap() method. This field is usually set by the Builder::GetNormalizerSpec() method.
optional bytes precompiled_charsmap = 2;
- Returns:
- Whether the precompiledCharsmap field is set.
-
getPrecompiledCharsmap
com.google.protobuf.ByteString getPrecompiledCharsmap()
Pre-compiled normalization rule created by the Builder::GetPrecompiledCharsMap() or Builder::CompileCharsMap() method. This field is usually set by the Builder::GetNormalizerSpec() method.
optional bytes precompiled_charsmap = 2;
- Returns:
- The precompiledCharsmap.
-
hasAddDummyPrefix
boolean hasAddDummyPrefix()
Adds a dummy whitespace at the beginning of the text so that "world" is treated the same way in "world" and in "hello world".
optional bool add_dummy_prefix = 3 [default = true];
- Returns:
- Whether the addDummyPrefix field is set.
-
getAddDummyPrefix
boolean getAddDummyPrefix()
Adds a dummy whitespace at the beginning of the text so that "world" is treated the same way in "world" and in "hello world".
optional bool add_dummy_prefix = 3 [default = true];
- Returns:
- The addDummyPrefix.
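The has/get pair for a proto2 optional field with a declared default can be confusing: when the field is unset, the has method returns false while the getter still returns the default. The sketch below illustrates this for add_dummy_prefix. DummyPrefixFieldSketch is a hypothetical stand-in written for this note, not the generated SentencepieceModel.NormalizerSpec class, and it models "unset" with a null Boolean.

```java
// Hypothetical stand-in, NOT the generated protobuf class. It only mimics
// proto2 presence and default behavior for add_dummy_prefix.
final class DummyPrefixFieldSketch {
    // null models "the field was never set on the message".
    private final Boolean addDummyPrefix;

    DummyPrefixFieldSketch(Boolean addDummyPrefix) {
        this.addDummyPrefix = addDummyPrefix;
    }

    // Mirrors hasAddDummyPrefix(): true only when a value was explicitly set.
    boolean hasAddDummyPrefix() {
        return addDummyPrefix != null;
    }

    // Mirrors getAddDummyPrefix(): falls back to the declared default (true).
    boolean getAddDummyPrefix() {
        return addDummyPrefix == null || addDummyPrefix;
    }

    public static void main(String[] args) {
        DummyPrefixFieldSketch unset = new DummyPrefixFieldSketch(null);
        DummyPrefixFieldSketch disabled = new DummyPrefixFieldSketch(false);
        // Unset field: not present, but the getter reports the default.
        System.out.println(unset.hasAddDummyPrefix() + " " + unset.getAddDummyPrefix());
        // Explicitly disabled field: present, and the getter reports false.
        System.out.println(disabled.hasAddDummyPrefix() + " " + disabled.getAddDummyPrefix());
    }
}
```

Callers that need to distinguish "left at the default" from "explicitly set to true" should check the has method first; the getter alone cannot tell the two apart.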
-
hasRemoveExtraWhitespaces
boolean hasRemoveExtraWhitespaces()
Removes leading, trailing, and duplicate internal whitespace.
optional bool remove_extra_whitespaces = 4 [default = true];
- Returns:
- Whether the removeExtraWhitespaces field is set.
-
getRemoveExtraWhitespaces
boolean getRemoveExtraWhitespaces()
Removes leading, trailing, and duplicate internal whitespace.
optional bool remove_extra_whitespaces = 4 [default = true];
- Returns:
- The removeExtraWhitespaces.
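As a rough illustration of what remove_extra_whitespaces describes, the sketch below strips leading and trailing spaces and collapses internal runs of spaces. It is an approximation written for this note, not the SentencePiece normalizer: the class and method names are hypothetical, and only the ASCII space (U+0020) is treated as whitespace.

```java
// Hypothetical helper, not part of the SentencePiece API.
final class ExtraWhitespaceSketch {
    // Collapses runs of spaces into one space, then drops a single
    // leading and a single trailing space if they remain.
    static String removeExtraWhitespaces(String text) {
        String collapsed = text.replaceAll(" +", " ");
        int start = collapsed.startsWith(" ") ? 1 : 0;
        int end = collapsed.length();
        if (end > start && collapsed.endsWith(" ")) {
            end--;
        }
        return collapsed.substring(start, end);
    }

    public static void main(String[] args) {
        // prints [hello world]
        System.out.println("[" + removeExtraWhitespaces("  hello   world  ") + "]");
    }
}
```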
-
hasEscapeWhitespaces
boolean hasEscapeWhitespaces()
Replaces whitespace with the meta symbol (U+2581). This field must be true to train a SentencePiece model.
optional bool escape_whitespaces = 5 [default = true];
- Returns:
- Whether the escapeWhitespaces field is set.
-
getEscapeWhitespaces
boolean getEscapeWhitespaces()
Replaces whitespace with the meta symbol (U+2581). This field must be true to train a SentencePiece model.
optional bool escape_whitespaces = 5 [default = true];
- Returns:
- The escapeWhitespaces.
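The interaction between add_dummy_prefix and escape_whitespaces can be sketched in a few lines. The code below is a simplified illustration, not the SentencePiece normalizer: the class, method, and constant names are hypothetical, and it assumes the meta symbol is U+2581 and that only the ASCII space is escaped.

```java
// Hypothetical helper, not part of the SentencePiece API.
final class WhitespaceEscapeSketch {
    // Assumed meta symbol: U+2581 (LOWER ONE EIGHTH BLOCK).
    static final char META_SYMBOL = '\u2581';

    // Optionally prepends a dummy space, then optionally replaces every
    // space with the meta symbol.
    static String normalize(String text, boolean addDummyPrefix, boolean escapeWhitespaces) {
        String result = addDummyPrefix ? " " + text : text;
        return escapeWhitespaces ? result.replace(' ', META_SYMBOL) : result;
    }

    public static void main(String[] args) {
        String single = normalize("world", true, true);
        String pair = normalize("hello world", true, true);
        // With the dummy prefix, the standalone word carries the same
        // meta-symbol prefix as the word at the end of the phrase.
        System.out.println(pair.endsWith(single));
    }
}
```

Without the dummy prefix, the standalone input "world" would have no leading meta symbol, so it would not match the "world" that follows a space in "hello world".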
-
hasNormalizationRuleTsv
boolean hasNormalizationRuleTsv()
Custom normalization rule file in TSV format. See https://github.com/google/sentencepiece/blob/master/doc/normalization.md for details. This field is only used by the SentencePieceTrainer::Train() method, which compiles the rules into the binary rule stored in `precompiled_charsmap`.
optional string normalization_rule_tsv = 6;
- Returns:
- Whether the normalizationRuleTsv field is set.
-
getNormalizationRuleTsv
java.lang.String getNormalizationRuleTsv()
Custom normalization rule file in TSV format. See https://github.com/google/sentencepiece/blob/master/doc/normalization.md for details. This field is only used by the SentencePieceTrainer::Train() method, which compiles the rules into the binary rule stored in `precompiled_charsmap`.
optional string normalization_rule_tsv = 6;
- Returns:
- The normalizationRuleTsv.
-
getNormalizationRuleTsvBytes
com.google.protobuf.ByteString getNormalizationRuleTsvBytes()
Custom normalization rule file in TSV format. See https://github.com/google/sentencepiece/blob/master/doc/normalization.md for details. This field is only used by the SentencePieceTrainer::Train() method, which compiles the rules into the binary rule stored in `precompiled_charsmap`.
optional string normalization_rule_tsv = 6;
- Returns:
- The bytes for normalizationRuleTsv.
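To give a feel for what a TSV rule carries, the sketch below decodes a single rule line into a source and a target string. It assumes each line holds two tab-separated columns of space-separated hexadecimal Unicode code points, which is how the linked normalization document is understood here; check that document before relying on the format. The class and method names and the sample rule are illustrative, and this is not how SentencePieceTrainer::Train() compiles rules.

```java
// Hypothetical helper, not part of the SentencePiece API.
final class NormalizationRuleTsvSketch {
    // Decodes space-separated hexadecimal code points into a String.
    // An empty column decodes to the empty string.
    static String decodeCodePoints(String column) {
        String trimmed = column.trim();
        if (trimmed.isEmpty()) {
            return "";
        }
        StringBuilder out = new StringBuilder();
        for (String hex : trimmed.split(" +")) {
            out.appendCodePoint(Integer.parseInt(hex, 16));
        }
        return out.toString();
    }

    // Splits one rule line into { source, target }.
    static String[] parseRule(String line) {
        String[] columns = line.split("\t", -1);
        if (columns.length < 2) {
            throw new IllegalArgumentException("expected source<TAB>target: " + line);
        }
        return new String[] { decodeCodePoints(columns[0]), decodeCodePoints(columns[1]) };
    }

    public static void main(String[] args) {
        // Illustrative rule: three source code points map to one target code point.
        String[] rule = parseRule("41 302 300\t1EA6");
        System.out.println(rule[0].codePointCount(0, rule[0].length())
                + " -> " + rule[1].codePointCount(0, rule[1].length()));
    }
}
```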
-