Package sentencepiece
Class SentencepieceModel.NormalizerSpec
java.lang.Object
  com.google.protobuf.AbstractMessageLite
    com.google.protobuf.AbstractMessage
      com.google.protobuf.GeneratedMessageV3
        com.google.protobuf.GeneratedMessageV3.ExtendableMessage<SentencepieceModel.NormalizerSpec>
          sentencepiece.SentencepieceModel.NormalizerSpec
- All Implemented Interfaces:
com.google.protobuf.GeneratedMessageV3.ExtendableMessageOrBuilder<SentencepieceModel.NormalizerSpec>, com.google.protobuf.Message, com.google.protobuf.MessageLite, com.google.protobuf.MessageLiteOrBuilder, com.google.protobuf.MessageOrBuilder, java.io.Serializable, SentencepieceModel.NormalizerSpecOrBuilder
- Enclosing class:
- SentencepieceModel
public static final class SentencepieceModel.NormalizerSpec extends com.google.protobuf.GeneratedMessageV3.ExtendableMessage<SentencepieceModel.NormalizerSpec> implements SentencepieceModel.NormalizerSpecOrBuilder
NormalizerSpec encodes various parameters for string normalization.
Protobuf type sentencepiece.NormalizerSpec
- See Also:
- Serialized Form
-
-
Nested Class Summary
Nested Classes (modifier and type, class, description):
static class SentencepieceModel.NormalizerSpec.Builder
    NormalizerSpec encodes various parameters for string normalization.
-
Nested classes/interfaces inherited from class com.google.protobuf.GeneratedMessageV3.ExtendableMessage
com.google.protobuf.GeneratedMessageV3.ExtendableMessage.ExtensionWriter
-
Nested classes/interfaces inherited from class com.google.protobuf.GeneratedMessageV3
com.google.protobuf.GeneratedMessageV3.BuilderParent, com.google.protobuf.GeneratedMessageV3.ExtendableBuilder<MessageType extends com.google.protobuf.GeneratedMessageV3.ExtendableMessage,BuilderType extends com.google.protobuf.GeneratedMessageV3.ExtendableBuilder<MessageType,BuilderType>>, com.google.protobuf.GeneratedMessageV3.ExtendableMessage<MessageType extends com.google.protobuf.GeneratedMessageV3.ExtendableMessage>, com.google.protobuf.GeneratedMessageV3.ExtendableMessageOrBuilder<MessageType extends com.google.protobuf.GeneratedMessageV3.ExtendableMessage>, com.google.protobuf.GeneratedMessageV3.FieldAccessorTable, com.google.protobuf.GeneratedMessageV3.UnusedPrivateParameter
-
-
Field Summary
Fields (modifier and type, field, description):
static int ADD_DUMMY_PREFIX_FIELD_NUMBER
static int ESCAPE_WHITESPACES_FIELD_NUMBER
static int NAME_FIELD_NUMBER
static int NORMALIZATION_RULE_TSV_FIELD_NUMBER
static com.google.protobuf.Parser<SentencepieceModel.NormalizerSpec> PARSER
    Deprecated.
static int PRECOMPILED_CHARSMAP_FIELD_NUMBER
static int REMOVE_EXTRA_WHITESPACES_FIELD_NUMBER
-
Method Summary
Methods (modifier and type, method, description):
boolean equals(java.lang.Object obj)
boolean getAddDummyPrefix()
    Adds dummy whitespace at the beginning of text in order to treat "world" in "world" and "hello world" in the same way.
static SentencepieceModel.NormalizerSpec getDefaultInstance()
SentencepieceModel.NormalizerSpec getDefaultInstanceForType()
static com.google.protobuf.Descriptors.Descriptor getDescriptor()
boolean getEscapeWhitespaces()
    Replaces whitespace with meta symbol.
java.lang.String getName()
    name of normalization rule.
com.google.protobuf.ByteString getNameBytes()
    name of normalization rule.
java.lang.String getNormalizationRuleTsv()
    Custom normalization rule file in TSV format.
com.google.protobuf.ByteString getNormalizationRuleTsvBytes()
    Custom normalization rule file in TSV format.
com.google.protobuf.Parser<SentencepieceModel.NormalizerSpec> getParserForType()
com.google.protobuf.ByteString getPrecompiledCharsmap()
    Pre-compiled normalization rule created by Builder::GetPrecompiledCharsMap() or Builder::CompileCharsMap() method.
boolean getRemoveExtraWhitespaces()
    Removes leading, trailing, and duplicate internal whitespace.
int getSerializedSize()
com.google.protobuf.UnknownFieldSet getUnknownFields()
boolean hasAddDummyPrefix()
    Adds dummy whitespace at the beginning of text in order to treat "world" in "world" and "hello world" in the same way.
boolean hasEscapeWhitespaces()
    Replaces whitespace with meta symbol.
int hashCode()
boolean hasName()
    name of normalization rule.
boolean hasNormalizationRuleTsv()
    Custom normalization rule file in TSV format.
boolean hasPrecompiledCharsmap()
    Pre-compiled normalization rule created by Builder::GetPrecompiledCharsMap() or Builder::CompileCharsMap() method.
boolean hasRemoveExtraWhitespaces()
    Removes leading, trailing, and duplicate internal whitespace.
protected com.google.protobuf.GeneratedMessageV3.FieldAccessorTable internalGetFieldAccessorTable()
boolean isInitialized()
static SentencepieceModel.NormalizerSpec.Builder newBuilder()
static SentencepieceModel.NormalizerSpec.Builder newBuilder(SentencepieceModel.NormalizerSpec prototype)
SentencepieceModel.NormalizerSpec.Builder newBuilderForType()
protected SentencepieceModel.NormalizerSpec.Builder newBuilderForType(com.google.protobuf.GeneratedMessageV3.BuilderParent parent)
protected java.lang.Object newInstance(com.google.protobuf.GeneratedMessageV3.UnusedPrivateParameter unused)
static SentencepieceModel.NormalizerSpec parseDelimitedFrom(java.io.InputStream input)
static SentencepieceModel.NormalizerSpec parseDelimitedFrom(java.io.InputStream input, com.google.protobuf.ExtensionRegistryLite extensionRegistry)
static SentencepieceModel.NormalizerSpec parseFrom(byte[] data)
static SentencepieceModel.NormalizerSpec parseFrom(byte[] data, com.google.protobuf.ExtensionRegistryLite extensionRegistry)
static SentencepieceModel.NormalizerSpec parseFrom(com.google.protobuf.ByteString data)
static SentencepieceModel.NormalizerSpec parseFrom(com.google.protobuf.ByteString data, com.google.protobuf.ExtensionRegistryLite extensionRegistry)
static SentencepieceModel.NormalizerSpec parseFrom(com.google.protobuf.CodedInputStream input)
static SentencepieceModel.NormalizerSpec parseFrom(com.google.protobuf.CodedInputStream input, com.google.protobuf.ExtensionRegistryLite extensionRegistry)
static SentencepieceModel.NormalizerSpec parseFrom(java.io.InputStream input)
static SentencepieceModel.NormalizerSpec parseFrom(java.io.InputStream input, com.google.protobuf.ExtensionRegistryLite extensionRegistry)
static SentencepieceModel.NormalizerSpec parseFrom(java.nio.ByteBuffer data)
static SentencepieceModel.NormalizerSpec parseFrom(java.nio.ByteBuffer data, com.google.protobuf.ExtensionRegistryLite extensionRegistry)
static com.google.protobuf.Parser<SentencepieceModel.NormalizerSpec> parser()
SentencepieceModel.NormalizerSpec.Builder toBuilder()
void writeTo(com.google.protobuf.CodedOutputStream output)
-
Methods inherited from class com.google.protobuf.GeneratedMessageV3.ExtendableMessage
extensionsAreInitialized, extensionsSerializedSize, extensionsSerializedSizeAsMessageSet, getAllFields, getAllFieldsRaw, getExtension, getExtension, getExtension, getExtension, getExtension, getExtension, getExtensionCount, getExtensionCount, getExtensionCount, getExtensionFields, getField, getRepeatedField, getRepeatedFieldCount, hasExtension, hasExtension, hasExtension, hasField, makeExtensionsImmutable, newExtensionWriter, newMessageSetExtensionWriter, parseUnknownField, parseUnknownFieldProto3
-
Methods inherited from class com.google.protobuf.GeneratedMessageV3
canUseUnsafe, computeStringSize, computeStringSizeNoTag, emptyBooleanList, emptyDoubleList, emptyFloatList, emptyIntList, emptyLongList, getDescriptorForType, getOneofFieldDescriptor, hasOneof, internalGetMapField, mergeFromAndMakeImmutableInternal, mutableCopy, mutableCopy, mutableCopy, mutableCopy, mutableCopy, newBooleanList, newBuilderForType, newDoubleList, newFloatList, newIntList, newLongList, parseDelimitedWithIOException, parseDelimitedWithIOException, parseWithIOException, parseWithIOException, parseWithIOException, parseWithIOException, serializeBooleanMapTo, serializeIntegerMapTo, serializeLongMapTo, serializeStringMapTo, writeReplace, writeString, writeStringNoTag
-
Methods inherited from class com.google.protobuf.AbstractMessage
findInitializationErrors, getInitializationErrorString, hashBoolean, hashEnum, hashEnumList, hashFields, hashLong, toString
-
Methods inherited from class com.google.protobuf.AbstractMessageLite
addAll, addAll, checkByteStringIsUtf8, toByteArray, toByteString, writeDelimitedTo, writeTo
-
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface com.google.protobuf.GeneratedMessageV3.ExtendableMessageOrBuilder
getExtension, getExtension, getExtension, getExtension, getExtension, getExtension, getExtensionCount, getExtensionCount, getExtensionCount, hasExtension, hasExtension, hasExtension
-
-
-
-
Field Detail
-
NAME_FIELD_NUMBER
public static final int NAME_FIELD_NUMBER
- See Also:
- Constant Field Values
-
PRECOMPILED_CHARSMAP_FIELD_NUMBER
public static final int PRECOMPILED_CHARSMAP_FIELD_NUMBER
- See Also:
- Constant Field Values
-
ADD_DUMMY_PREFIX_FIELD_NUMBER
public static final int ADD_DUMMY_PREFIX_FIELD_NUMBER
- See Also:
- Constant Field Values
-
REMOVE_EXTRA_WHITESPACES_FIELD_NUMBER
public static final int REMOVE_EXTRA_WHITESPACES_FIELD_NUMBER
- See Also:
- Constant Field Values
-
ESCAPE_WHITESPACES_FIELD_NUMBER
public static final int ESCAPE_WHITESPACES_FIELD_NUMBER
- See Also:
- Constant Field Values
-
NORMALIZATION_RULE_TSV_FIELD_NUMBER
public static final int NORMALIZATION_RULE_TSV_FIELD_NUMBER
- See Also:
- Constant Field Values
-
PARSER
@Deprecated public static final com.google.protobuf.Parser<SentencepieceModel.NormalizerSpec> PARSER
Deprecated.
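The *_FIELD_NUMBER constants above carry the field numbers from the proto declarations quoted under Method Detail (name = 1 through normalization_rule_tsv = 6). As a minimal sketch of what those numbers are used for, the code below computes the tag that precedes each field on the wire, using the standard protobuf rule tag = (field_number << 3) | wire_type. WireTagSketch and makeTag are hypothetical names used for illustration only; they are not part of the generated class.

```java
// Hypothetical helper, not part of SentencepieceModel: shows how a field
// number becomes the tag that precedes the field in the serialized form.
final class WireTagSketch {
    // Protobuf wire types relevant to the NormalizerSpec fields.
    static final int WIRETYPE_VARINT = 0;            // bool fields
    static final int WIRETYPE_LENGTH_DELIMITED = 2;  // string and bytes fields

    // Standard protobuf rule: tag = (field_number << 3) | wire_type.
    static int makeTag(int fieldNumber, int wireType) {
        return (fieldNumber << 3) | wireType;
    }

    public static void main(String[] args) {
        // Field numbers copied from the proto declarations on this page.
        System.out.printf("name (1, string): 0x%02X%n", makeTag(1, WIRETYPE_LENGTH_DELIMITED));
        System.out.printf("precompiled_charsmap (2, bytes): 0x%02X%n", makeTag(2, WIRETYPE_LENGTH_DELIMITED));
        System.out.printf("add_dummy_prefix (3, bool): 0x%02X%n", makeTag(3, WIRETYPE_VARINT));
        System.out.printf("remove_extra_whitespaces (4, bool): 0x%02X%n", makeTag(4, WIRETYPE_VARINT));
        System.out.printf("escape_whitespaces (5, bool): 0x%02X%n", makeTag(5, WIRETYPE_VARINT));
        System.out.printf("normalization_rule_tsv (6, string): 0x%02X%n", makeTag(6, WIRETYPE_LENGTH_DELIMITED));
    }
}
```

Because the tag depends only on the field number and wire type, renumbering a field in the .proto file would change the serialized form even if the field name stayed the same.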
-
-
Method Detail
-
newInstance
protected java.lang.Object newInstance(com.google.protobuf.GeneratedMessageV3.UnusedPrivateParameter unused)
- Overrides:
newInstance
in class com.google.protobuf.GeneratedMessageV3
-
getUnknownFields
public final com.google.protobuf.UnknownFieldSet getUnknownFields()
- Specified by:
getUnknownFields
in interface com.google.protobuf.MessageOrBuilder
- Overrides:
getUnknownFields
in class com.google.protobuf.GeneratedMessageV3
-
getDescriptor
public static final com.google.protobuf.Descriptors.Descriptor getDescriptor()
-
internalGetFieldAccessorTable
protected com.google.protobuf.GeneratedMessageV3.FieldAccessorTable internalGetFieldAccessorTable()
- Specified by:
internalGetFieldAccessorTable
in class com.google.protobuf.GeneratedMessageV3
-
hasName
public boolean hasName()
Name of the normalization rule.
optional string name = 1;
- Specified by:
hasName
in interface SentencepieceModel.NormalizerSpecOrBuilder
- Returns:
- Whether the name field is set.
-
getName
public java.lang.String getName()
Name of the normalization rule.
optional string name = 1;
- Specified by:
getName
in interface SentencepieceModel.NormalizerSpecOrBuilder
- Returns:
- The name.
-
getNameBytes
public com.google.protobuf.ByteString getNameBytes()
Name of the normalization rule.
optional string name = 1;
- Specified by:
getNameBytes
in interface SentencepieceModel.NormalizerSpecOrBuilder
- Returns:
- The bytes for name.
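getName() and getNameBytes() expose the same field in two forms. Protobuf string fields are stored as UTF-8, so the bytes are the UTF-8 encoding of the string. The sketch below shows that relationship with plain JDK calls; NameBytesSketch is a hypothetical name and the sample value "nmt_nfkc" is only an assumed example of a rule name.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of how a protobuf string field relates to its bytes.
final class NameBytesSketch {
    // Encodes a rule name the way a protobuf string field is stored (UTF-8).
    static byte[] toUtf8(String name) {
        return name.getBytes(StandardCharsets.UTF_8);
    }

    // Decodes UTF-8 bytes back into the string form.
    static String fromUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String name = "nmt_nfkc"; // assumed sample value
        byte[] bytes = toUtf8(name);
        System.out.println(bytes.length + " bytes, decoded: " + fromUtf8(bytes));
    }
}
```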
-
hasPrecompiledCharsmap
public boolean hasPrecompiledCharsmap()
Precompiled normalization rule created by the Builder::GetPrecompiledCharsMap() or Builder::CompileCharsMap() method. This field is usually set by the Builder::GetNormalizerSpec() method. The Builder:: names use C++ syntax and refer to the SentencePiece normalizer builder, not to SentencepieceModel.NormalizerSpec.Builder.
optional bytes precompiled_charsmap = 2;
- Specified by:
hasPrecompiledCharsmap
in interface SentencepieceModel.NormalizerSpecOrBuilder
- Returns:
- Whether the precompiledCharsmap field is set.
-
getPrecompiledCharsmap
public com.google.protobuf.ByteString getPrecompiledCharsmap()
Precompiled normalization rule created by the Builder::GetPrecompiledCharsMap() or Builder::CompileCharsMap() method. This field is usually set by the Builder::GetNormalizerSpec() method. The Builder:: names use C++ syntax and refer to the SentencePiece normalizer builder, not to SentencepieceModel.NormalizerSpec.Builder.
optional bytes precompiled_charsmap = 2;
- Specified by:
getPrecompiledCharsmap
in interface SentencepieceModel.NormalizerSpecOrBuilder
- Returns:
- The precompiledCharsmap.
-
hasAddDummyPrefix
public boolean hasAddDummyPrefix()
Adds a dummy whitespace at the beginning of the text so that "world" is treated the same way in "world" and in "hello world".
optional bool add_dummy_prefix = 3 [default = true];
- Specified by:
hasAddDummyPrefix
in interface SentencepieceModel.NormalizerSpecOrBuilder
- Returns:
- Whether the addDummyPrefix field is set.
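Because the field is declared optional with [default = true], the has and get accessors answer different questions: hasAddDummyPrefix() reports whether a value was explicitly set, while getAddDummyPrefix() falls back to the declared default when it was not. The class below is a hypothetical stand-in that models this proto2 behaviour with a nullable Boolean; it is a sketch, not the generated implementation.

```java
// Hypothetical stand-in for a proto2 "optional bool ... [default = true]" field.
final class OptionalBoolSketch {
    private final Boolean value; // null models "field not set"

    OptionalBoolSketch(Boolean value) {
        this.value = value;
    }

    // Models hasAddDummyPrefix(): true only when a value was explicitly set.
    boolean has() {
        return value != null;
    }

    // Models getAddDummyPrefix(): returns the declared default (true) when unset.
    boolean get() {
        return value != null ? value : true;
    }

    public static void main(String[] args) {
        OptionalBoolSketch unset = new OptionalBoolSketch(null);
        OptionalBoolSketch disabled = new OptionalBoolSketch(false);
        System.out.println("unset: has=" + unset.has() + " get=" + unset.get());
        System.out.println("disabled: has=" + disabled.has() + " get=" + disabled.get());
    }
}
```

The same pattern applies to remove_extra_whitespaces and escape_whitespaces, which are declared with the same default.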
-
getAddDummyPrefix
public boolean getAddDummyPrefix()
Adds a dummy whitespace at the beginning of the text so that "world" is treated the same way in "world" and in "hello world".
optional bool add_dummy_prefix = 3 [default = true];
- Specified by:
getAddDummyPrefix
in interface SentencepieceModel.NormalizerSpecOrBuilder
- Returns:
- The addDummyPrefix.
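To see why the dummy prefix makes "world" look the same in both inputs, the sketch below prepends a space and then splits the text into pieces that keep their leading space. Both helper methods are hypothetical simplifications written for this page; the real normalizer and tokenizer are considerably more involved.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the effect of add_dummy_prefix.
final class DummyPrefixSketch {
    // Prepends the dummy whitespace.
    static String addDummyPrefix(String text) {
        return " " + text;
    }

    // Splits text into words, keeping the space that precedes each word attached.
    static List<String> spacePrefixedWords(String text) {
        List<String> words = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c == ' ' && current.length() > 0) {
                words.add(current.toString());
                current.setLength(0);
            }
            current.append(c);
        }
        if (current.length() > 0) {
            words.add(current.toString());
        }
        return words;
    }

    public static void main(String[] args) {
        // Without the prefix, a sentence-initial "world" lacks the leading space.
        System.out.println(spacePrefixedWords("world"));
        System.out.println(spacePrefixedWords("hello world"));
        // With the prefix, "world" carries a leading space in both inputs.
        System.out.println(spacePrefixedWords(addDummyPrefix("world")));
        System.out.println(spacePrefixedWords(addDummyPrefix("hello world")));
    }
}
```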
-
hasRemoveExtraWhitespaces
public boolean hasRemoveExtraWhitespaces()
Removes leading, trailing, and duplicate internal whitespace.
optional bool remove_extra_whitespaces = 4 [default = true];
- Specified by:
hasRemoveExtraWhitespaces
in interface SentencepieceModel.NormalizerSpecOrBuilder
- Returns:
- Whether the removeExtraWhitespaces field is set.
-
getRemoveExtraWhitespaces
public boolean getRemoveExtraWhitespaces()
Removes leading, trailing, and duplicate internal whitespace.
optional bool remove_extra_whitespaces = 4 [default = true];
- Specified by:
getRemoveExtraWhitespaces
in interface SentencepieceModel.NormalizerSpecOrBuilder
- Returns:
- The removeExtraWhitespaces.
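The description above names three effects: leading whitespace is dropped, trailing whitespace is dropped, and runs of internal whitespace collapse to one. A minimal sketch of that behaviour follows. It is a hypothetical helper that only handles the ASCII space character, which is an assumption made to keep the example short; the real normalizer may treat other whitespace characters as well.

```java
// Hypothetical illustration of remove_extra_whitespaces (ASCII space only).
final class ExtraWhitespaceSketch {
    static String removeExtraWhitespaces(String text) {
        StringBuilder out = new StringBuilder();
        boolean pendingSpace = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == ' ') {
                // Remember a single space, but never before the first word.
                pendingSpace = out.length() > 0;
            } else {
                if (pendingSpace) {
                    out.append(' ');
                }
                out.append(c);
                pendingSpace = false;
            }
        }
        // A pending space at the end is trailing whitespace and is dropped.
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println("[" + removeExtraWhitespaces("  hello   world  ") + "]");
    }
}
```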
-
hasEscapeWhitespaces
public boolean hasEscapeWhitespaces()
Replaces whitespace with the meta symbol. This field must be true to train a SentencePiece model.
optional bool escape_whitespaces = 5 [default = true];
- Specified by:
hasEscapeWhitespaces
in interface SentencepieceModel.NormalizerSpecOrBuilder
- Returns:
- Whether the escapeWhitespaces field is set.
-
getEscapeWhitespaces
public boolean getEscapeWhitespaces()
Replaces whitespace with the meta symbol. This field must be true to train a SentencePiece model.
optional bool escape_whitespaces = 5 [default = true];
- Specified by:
getEscapeWhitespaces
in interface SentencepieceModel.NormalizerSpecOrBuilder
- Returns:
- The escapeWhitespaces.
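The meta symbol used by SentencePiece is U+2581 (LOWER ONE EIGHTH BLOCK). The sketch below shows the substitution on a plain string. EscapeWhitespaceSketch is a hypothetical helper that only replaces the ASCII space, which is a simplifying assumption; it is not the library's implementation.

```java
// Hypothetical illustration of escape_whitespaces (ASCII space only).
final class EscapeWhitespaceSketch {
    // U+2581, the meta symbol that stands in for whitespace.
    static final char META_SYMBOL = '\u2581';

    static String escapeWhitespaces(String text) {
        return text.replace(' ', META_SYMBOL);
    }

    public static void main(String[] args) {
        // The leading space models the dummy prefix added by add_dummy_prefix.
        System.out.println(escapeWhitespaces(" hello world"));
    }
}
```

Making whitespace an ordinary symbol is what allows the original text, including its spacing, to be recovered from the pieces.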
-
hasNormalizationRuleTsv
public boolean hasNormalizationRuleTsv()
Custom normalization rule file in TSV format. See https://github.com/google/sentencepiece/blob/master/doc/normalization.md. This field is only used by the SentencePieceTrainer::Train() method, which compiles the rules into the binary form stored in `precompiled_charsmap`.
optional string normalization_rule_tsv = 6;
- Specified by:
hasNormalizationRuleTsv
in interface SentencepieceModel.NormalizerSpecOrBuilder
- Returns:
- Whether the normalizationRuleTsv field is set.
-
getNormalizationRuleTsv
public java.lang.String getNormalizationRuleTsv()
Custom normalization rule file in TSV format. See https://github.com/google/sentencepiece/blob/master/doc/normalization.md. This field is only used by the SentencePieceTrainer::Train() method, which compiles the rules into the binary form stored in `precompiled_charsmap`.
optional string normalization_rule_tsv = 6;
- Specified by:
getNormalizationRuleTsv
in interface SentencepieceModel.NormalizerSpecOrBuilder
- Returns:
- The normalizationRuleTsv.
-
getNormalizationRuleTsvBytes
public com.google.protobuf.ByteString getNormalizationRuleTsvBytes()
Custom normalization rule file in TSV format. See https://github.com/google/sentencepiece/blob/master/doc/normalization.md. This field is only used by the SentencePieceTrainer::Train() method, which compiles the rules into the binary form stored in `precompiled_charsmap`.
optional string normalization_rule_tsv = 6;
- Specified by:
getNormalizationRuleTsvBytes
in interface SentencepieceModel.NormalizerSpecOrBuilder
- Returns:
- The bytes for normalizationRuleTsv.
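As a rough illustration of what such a TSV rule contains, the sketch below parses one line under the assumption that each line holds the source code points and the target code points as space-separated hexadecimal values, with a tab between the two columns. That layout is my reading of the linked normalization.md and should be checked against it. NormalizationRuleTsvSketch and its methods are hypothetical, and compiling rules into `precompiled_charsmap` is not attempted here.

```java
// Hypothetical parser for one line of a normalization rule TSV file.
final class NormalizationRuleTsvSketch {
    // Converts a list such as "41 302 300" into the string made of those code points.
    static String decodeCodePoints(String hexList) {
        StringBuilder sb = new StringBuilder();
        for (String hex : hexList.trim().split(" +")) {
            if (!hex.isEmpty()) {
                sb.appendCodePoint(Integer.parseInt(hex, 16));
            }
        }
        return sb.toString();
    }

    // Parses "source<TAB>target" into a two-element array {source, target}.
    static String[] parseRule(String line) {
        String[] columns = line.split("\t", -1);
        if (columns.length < 2) {
            throw new IllegalArgumentException("expected source<TAB>target: " + line);
        }
        return new String[] {decodeCodePoints(columns[0]), decodeCodePoints(columns[1])};
    }

    public static void main(String[] args) {
        // Assumed sample rule: "A" plus two combining marks maps to U+1EA6.
        String[] rule = parseRule("41 302 300\t1EA6");
        System.out.println(rule[0].length() + " source chars -> " + rule[1].length() + " target char");
    }
}
```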
-
isInitialized
public final boolean isInitialized()
- Specified by:
isInitialized
in interface com.google.protobuf.MessageLiteOrBuilder
- Overrides:
isInitialized
in class com.google.protobuf.GeneratedMessageV3.ExtendableMessage<SentencepieceModel.NormalizerSpec>
-
writeTo
public void writeTo(com.google.protobuf.CodedOutputStream output) throws java.io.IOException
- Specified by:
writeTo
in interface com.google.protobuf.MessageLite
- Overrides:
writeTo
in class com.google.protobuf.GeneratedMessageV3
- Throws:
java.io.IOException
-
getSerializedSize
public int getSerializedSize()
- Specified by:
getSerializedSize
in interface com.google.protobuf.MessageLite
- Overrides:
getSerializedSize
in class com.google.protobuf.GeneratedMessageV3
-
equals
public boolean equals(java.lang.Object obj)
- Specified by:
equals
in interface com.google.protobuf.Message
- Overrides:
equals
in class com.google.protobuf.AbstractMessage
-
hashCode
public int hashCode()
- Specified by:
hashCode
in interface com.google.protobuf.Message
- Overrides:
hashCode
in class com.google.protobuf.AbstractMessage
-
parseFrom
public static SentencepieceModel.NormalizerSpec parseFrom(java.nio.ByteBuffer data) throws com.google.protobuf.InvalidProtocolBufferException
- Throws:
com.google.protobuf.InvalidProtocolBufferException
-
parseFrom
public static SentencepieceModel.NormalizerSpec parseFrom(java.nio.ByteBuffer data, com.google.protobuf.ExtensionRegistryLite extensionRegistry) throws com.google.protobuf.InvalidProtocolBufferException
- Throws:
com.google.protobuf.InvalidProtocolBufferException
-
parseFrom
public static SentencepieceModel.NormalizerSpec parseFrom(com.google.protobuf.ByteString data) throws com.google.protobuf.InvalidProtocolBufferException
- Throws:
com.google.protobuf.InvalidProtocolBufferException
-
parseFrom
public static SentencepieceModel.NormalizerSpec parseFrom(com.google.protobuf.ByteString data, com.google.protobuf.ExtensionRegistryLite extensionRegistry) throws com.google.protobuf.InvalidProtocolBufferException
- Throws:
com.google.protobuf.InvalidProtocolBufferException
-
parseFrom
public static SentencepieceModel.NormalizerSpec parseFrom(byte[] data) throws com.google.protobuf.InvalidProtocolBufferException
- Throws:
com.google.protobuf.InvalidProtocolBufferException
-
parseFrom
public static SentencepieceModel.NormalizerSpec parseFrom(byte[] data, com.google.protobuf.ExtensionRegistryLite extensionRegistry) throws com.google.protobuf.InvalidProtocolBufferException
- Throws:
com.google.protobuf.InvalidProtocolBufferException
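The parseFrom overloads above all turn serialized bytes back into a NormalizerSpec. To give a feel for what that involves, the following sketch walks a byte array and decodes the two wire types that the NormalizerSpec fields use (varint for the bool fields, length-delimited for the string and bytes fields). WireDecodeSketch is a hypothetical, heavily reduced stand-in: it ignores the other wire types, extensions, and unknown-field handling that the real parser supports.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, reduced decoder for the wire types used by NormalizerSpec fields.
final class WireDecodeSketch {
    // Returns field number -> value (Long for varints, byte[] for length-delimited).
    static Map<Integer, Object> decode(byte[] data) {
        Map<Integer, Object> fields = new LinkedHashMap<>();
        int[] pos = {0}; // one-element array so readVarint can advance the cursor
        while (pos[0] < data.length) {
            long tag = readVarint(data, pos);
            int fieldNumber = (int) (tag >>> 3);
            int wireType = (int) (tag & 7);
            if (wireType == 0) {
                fields.put(fieldNumber, readVarint(data, pos));
            } else if (wireType == 2) {
                int length = (int) readVarint(data, pos);
                if (length < 0 || length > data.length - pos[0]) {
                    throw new IllegalArgumentException("truncated field " + fieldNumber);
                }
                fields.put(fieldNumber, Arrays.copyOfRange(data, pos[0], pos[0] + length));
                pos[0] += length;
            } else {
                throw new IllegalArgumentException("unsupported wire type " + wireType);
            }
        }
        return fields;
    }

    // Reads a base-128 varint starting at pos[0] and advances pos[0] past it.
    static long readVarint(byte[] data, int[] pos) {
        long result = 0;
        for (int shift = 0; shift < 64; shift += 7) {
            if (pos[0] >= data.length) {
                throw new IllegalArgumentException("truncated varint");
            }
            byte b = data[pos[0]++];
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
        }
        throw new IllegalArgumentException("varint too long");
    }

    public static void main(String[] args) {
        // Assumed sample message: name = "nfkc" (field 1), add_dummy_prefix = false (field 3).
        byte[] message = {0x0A, 0x04, 'n', 'f', 'k', 'c', 0x18, 0x00};
        Map<Integer, Object> fields = decode(message);
        System.out.println("fields present: " + fields.keySet());
    }
}
```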
-
parseFrom
public static SentencepieceModel.NormalizerSpec parseFrom(java.io.InputStream input) throws java.io.IOException
- Throws:
java.io.IOException
-
parseFrom
public static SentencepieceModel.NormalizerSpec parseFrom(java.io.InputStream input, com.google.protobuf.ExtensionRegistryLite extensionRegistry) throws java.io.IOException
- Throws:
java.io.IOException
-
parseDelimitedFrom
public static SentencepieceModel.NormalizerSpec parseDelimitedFrom(java.io.InputStream input) throws java.io.IOException
- Throws:
java.io.IOException
-
parseDelimitedFrom
public static SentencepieceModel.NormalizerSpec parseDelimitedFrom(java.io.InputStream input, com.google.protobuf.ExtensionRegistryLite extensionRegistry) throws java.io.IOException
- Throws:
java.io.IOException
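Unlike parseFrom(java.io.InputStream), the delimited variants expect each message to be preceded by its length encoded as a varint, which is the framing produced by the inherited writeDelimitedTo method. This lets several messages share one stream. The sketch below reads such frames using only the JDK. DelimitedFramingSketch is a hypothetical helper that returns the raw payload bytes and does not parse them into a NormalizerSpec.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical reader for varint-length-prefixed frames.
final class DelimitedFramingSketch {
    // Reads one frame; returns null when the stream is already at its end.
    static byte[] readFrame(InputStream in) throws IOException {
        int b = in.read();
        if (b == -1) {
            return null;
        }
        // Decode the varint length prefix, 7 bits per byte, low bits first.
        int length = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            if (shift >= 32) {
                throw new IOException("length prefix too long");
            }
            b = in.read();
            if (b == -1) {
                throw new EOFException("truncated length prefix");
            }
            length |= (b & 0x7F) << shift;
        }
        if (length < 0) {
            throw new IOException("negative frame length");
        }
        // Read exactly `length` payload bytes.
        byte[] payload = new byte[length];
        int offset = 0;
        while (offset < length) {
            int n = in.read(payload, offset, length - offset);
            if (n == -1) {
                throw new EOFException("truncated frame");
            }
            offset += n;
        }
        return payload;
    }

    public static void main(String[] args) throws IOException {
        // Two frames: a 2-byte payload (0x18 0x00), then an empty payload.
        InputStream in = new ByteArrayInputStream(new byte[] {2, 0x18, 0x00, 0});
        byte[] frame;
        while ((frame = readFrame(in)) != null) {
            System.out.println("frame of " + frame.length + " byte(s)");
        }
    }
}
```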
-
parseFrom
public static SentencepieceModel.NormalizerSpec parseFrom(com.google.protobuf.CodedInputStream input) throws java.io.IOException
- Throws:
java.io.IOException
-
parseFrom
public static SentencepieceModel.NormalizerSpec parseFrom(com.google.protobuf.CodedInputStream input, com.google.protobuf.ExtensionRegistryLite extensionRegistry) throws java.io.IOException
- Throws:
java.io.IOException
-
newBuilderForType
public SentencepieceModel.NormalizerSpec.Builder newBuilderForType()
- Specified by:
newBuilderForType
in interface com.google.protobuf.Message
- Specified by:
newBuilderForType
in interface com.google.protobuf.MessageLite
-
newBuilder
public static SentencepieceModel.NormalizerSpec.Builder newBuilder()
-
newBuilder
public static SentencepieceModel.NormalizerSpec.Builder newBuilder(SentencepieceModel.NormalizerSpec prototype)
-
toBuilder
public SentencepieceModel.NormalizerSpec.Builder toBuilder()
- Specified by:
toBuilder
in interface com.google.protobuf.Message
- Specified by:
toBuilder
in interface com.google.protobuf.MessageLite
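newBuilder(), newBuilder(prototype), and toBuilder() all follow the usual protobuf pattern in which the message itself is immutable and changes are made on a separate Builder. The miniature class below is a hypothetical model of that pattern with two fields. It is not the generated code; its setter names are assumptions chosen to resemble what protobuf code generation normally produces.

```java
// Hypothetical miniature of the immutable-message-plus-builder pattern.
final class MiniSpec {
    private final String name;
    private final boolean addDummyPrefix;

    private MiniSpec(Builder builder) {
        this.name = builder.name;
        this.addDummyPrefix = builder.addDummyPrefix;
    }

    // Models newBuilder(): starts from the default values.
    static Builder newBuilder() {
        return new Builder();
    }

    // Models toBuilder(): starts from a copy of this message's values.
    Builder toBuilder() {
        Builder builder = new Builder();
        builder.name = name;
        builder.addDummyPrefix = addDummyPrefix;
        return builder;
    }

    String getName() {
        return name;
    }

    boolean getAddDummyPrefix() {
        return addDummyPrefix;
    }

    static final class Builder {
        private String name = "";
        private boolean addDummyPrefix = true; // mirrors [default = true]

        Builder setName(String value) {
            name = value;
            return this;
        }

        Builder setAddDummyPrefix(boolean value) {
            addDummyPrefix = value;
            return this;
        }

        MiniSpec build() {
            return new MiniSpec(this);
        }
    }

    public static void main(String[] args) {
        MiniSpec original = MiniSpec.newBuilder().setName("identity").build();
        MiniSpec changed = original.toBuilder().setAddDummyPrefix(false).build();
        // The original message is unaffected by edits made through its builder.
        System.out.println(original.getAddDummyPrefix() + " / " + changed.getAddDummyPrefix());
    }
}
```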
-
newBuilderForType
protected SentencepieceModel.NormalizerSpec.Builder newBuilderForType(com.google.protobuf.GeneratedMessageV3.BuilderParent parent)
- Specified by:
newBuilderForType
in class com.google.protobuf.GeneratedMessageV3
-
getDefaultInstance
public static SentencepieceModel.NormalizerSpec getDefaultInstance()
-
parser
public static com.google.protobuf.Parser<SentencepieceModel.NormalizerSpec> parser()
-
getParserForType
public com.google.protobuf.Parser<SentencepieceModel.NormalizerSpec> getParserForType()
- Specified by:
getParserForType
in interface com.google.protobuf.Message
- Specified by:
getParserForType
in interface com.google.protobuf.MessageLite
- Overrides:
getParserForType
in class com.google.protobuf.GeneratedMessageV3
-
getDefaultInstanceForType
public SentencepieceModel.NormalizerSpec getDefaultInstanceForType()
- Specified by:
getDefaultInstanceForType
in interface com.google.protobuf.GeneratedMessageV3.ExtendableMessageOrBuilder<SentencepieceModel.NormalizerSpec>
- Specified by:
getDefaultInstanceForType
in interface com.google.protobuf.MessageLiteOrBuilder
- Specified by:
getDefaultInstanceForType
in interface com.google.protobuf.MessageOrBuilder
-
-