Package com.basistech.rosette.dm
Class ArabicMorphoAnalysis
java.lang.Object
com.basistech.rosette.dm.BaseAttribute
com.basistech.rosette.dm.MorphoAnalysis
com.basistech.rosette.dm.ArabicMorphoAnalysis
All Implemented Interfaces: Serializable
Arabic morphological analysis. An Arabic token is analyzed into a prefix,
a stem, and a suffix, where any of these components could be empty. This
class stores the prefix length and the stem length. The suffix length can
be deduced from these and the length of the original token.
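The arithmetic described above can be sketched in a few lines of plain Java. This is only an illustration: the class `AffixSplitSketch` and its methods are hypothetical and not part of `com.basistech.rosette.dm`, and it assumes that both lengths count Java `char` units and that the prefix starts at offset 0 and is immediately followed by the stem.

```java
public class AffixSplitSketch {

    // Suffix length is whatever remains of the token after prefix and stem.
    // Assumption: lengths are counted in Java char units.
    public static int suffixLength(String token, int prefixLength, int stemLength) {
        int suffixLength = token.length() - prefixLength - stemLength;
        if (prefixLength < 0 || stemLength < 0 || suffixLength < 0) {
            throw new IllegalArgumentException("lengths do not fit the token");
        }
        return suffixLength;
    }

    // Returns {prefix, stem, suffix}; any of the three may be empty.
    public static String[] split(String token, int prefixLength, int stemLength) {
        suffixLength(token, prefixLength, stemLength); // validates the lengths
        int stemEnd = prefixLength + stemLength;
        return new String[] {
            token.substring(0, prefixLength),
            token.substring(prefixLength, stemEnd),
            token.substring(stemEnd)
        };
    }

    public static void main(String[] args) {
        // Illustrative values: a six-character token with a three-character
        // prefix and a three-character stem, so the suffix is empty.
        String token = "wAlktb";
        String[] parts = split(token, 3, 3);
        System.out.println("prefix=" + parts[0] + " stem=" + parts[1] + " suffix=" + parts[2]);
        System.out.println("suffixLength=" + suffixLength(token, 3, 3));
    }
}
```

With a real analysis object, the two lengths would come from `getPrefixLength()` and `getStemLength()`, and the token text from wherever the caller holds the original token.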
The component parts themselves can be subdivided into sub-components. Each sub-component has an associated tag. For example, one of the possible analyses for "wAlktb" (Buckwalter transliteration for "and the books") looks like:
prefix: wAl
stem: ktb
suffix:
part-of-speech: NOUN
prefix: [w, Al]
prefixTags: [CONJ, DET]
stems: [ktb]
stemTags: [NOUN]
suffixes: []
suffixTags: [NO_FUNC]
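One way to consume the parallel component and tag lists is to pair them by index. The sketch below uses plain lists holding the values from the analysis above; the class `ComponentTagSketch` and the `pair` helper are hypothetical, not part of the library. Because the analysis shows an empty suffix list alongside a one-element tag list ([NO_FUNC]), the helper assumes the two lists may differ in length and pairs only up to the shorter one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ComponentTagSketch {

    // Pairs each sub-component with the tag at the same index ("w/CONJ").
    // Stops at the shorter list, since tags may outnumber components.
    public static List<String> pair(List<String> components, List<String> tags) {
        int n = Math.min(components.size(), tags.size());
        List<String> out = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            out.add(components.get(i) + "/" + tags.get(i));
        }
        return out;
    }

    public static void main(String[] args) {
        // Values copied from the "wAlktb" analysis above.
        List<String> prefixes = Arrays.asList("w", "Al");
        List<String> prefixTags = Arrays.asList("CONJ", "DET");
        List<String> stems = Arrays.asList("ktb");
        List<String> stemTags = Arrays.asList("NOUN");
        List<String> suffixes = Collections.emptyList();
        List<String> suffixTags = Arrays.asList("NO_FUNC");

        System.out.println(pair(prefixes, prefixTags)); // [w/CONJ, Al/DET]
        System.out.println(pair(stems, stemTags));      // [ktb/NOUN]
        System.out.println(pair(suffixes, suffixTags)); // []

        // In this analysis, joining the prefix sub-components gives the prefix "wAl".
        System.out.println(String.join("", prefixes));  // wAl
    }
}
```

With a real analysis object, the lists would come from `getPrefixes()`, `getPrefixTags()`, and the corresponding stem and suffix getters.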
Field Summary
Fields inherited from class com.basistech.rosette.dm.BaseAttribute
extendedProperties
-
Constructor Summary
Modifier, Constructor
protected ArabicMorphoAnalysis(String partOfSpeech, String lemma, List<Token> components, String raw, int prefixLength, int stemLength, String root, boolean definiteArticle, boolean strippablePrefix, List<String> prefixes, List<String> stems, List<String> suffixes, List<String> prefixTags, List<String> stemTags, List<String> suffixTags, TagSet tagSet, Map<String, Object> extendedProperties)
protected ArabicMorphoAnalysis(String partOfSpeech, String lemma, List<Token> components, String raw, int prefixLength, int stemLength, String root, boolean definiteArticle, boolean strippablePrefix, List<String> prefixes, List<String> stems, List<String> suffixes, List<String> prefixTags, List<String> stemTags, List<String> suffixTags, Map<String, Object> extendedProperties)
-
Method Summary
Modifier and Type, Method: Description
getPrefixes(): Returns the components of the prefix, if any.
int getPrefixLength(): Returns the number of characters in the prefix.
getPrefixTags(): Returns the part-of-speech tags for the prefix components.
getRoot(): Returns the root, according to Semitic linguistics.
int getStemLength(): Returns the number of characters in the stem.
getStems(): Returns the components of the stem.
getStemTags(): Returns the part-of-speech tags for stem components.
getSuffixes(): Returns the components of the suffix, if any.
getSuffixTags(): Returns the part-of-speech tags for suffix components.
boolean isDefiniteArticle(): Returns true if this word has an attached definite article.
boolean isStrippablePrefix(): Returns true if the prefixes of this word can be stripped (e.g. prepositions).
protected com.google.common.base.MoreObjects.ToStringHelper toStringHelper()
Methods inherited from class com.basistech.rosette.dm.MorphoAnalysis
getComponents, getLemma, getPartOfSpeech, getRaw, getTagSet, toString
Methods inherited from class com.basistech.rosette.dm.BaseAttribute
getExtendedProperties, listOrNull, setExtendedProperty
-
Constructor Details
-
ArabicMorphoAnalysis
protected ArabicMorphoAnalysis(String partOfSpeech, String lemma, List<Token> components, String raw, int prefixLength, int stemLength, String root, boolean definiteArticle, boolean strippablePrefix, List<String> prefixes, List<String> stems, List<String> suffixes, List<String> prefixTags, List<String> stemTags, List<String> suffixTags, TagSet tagSet, Map<String, Object> extendedProperties)
-
ArabicMorphoAnalysis
protected ArabicMorphoAnalysis(String partOfSpeech, String lemma, List<Token> components, String raw, int prefixLength, int stemLength, String root, boolean definiteArticle, boolean strippablePrefix, List<String> prefixes, List<String> stems, List<String> suffixes, List<String> prefixTags, List<String> stemTags, List<String> suffixTags, Map<String, Object> extendedProperties)
-
-
Method Details
-
getPrefixLength
public int getPrefixLength()
Returns the number of characters in the prefix.
Returns: the number of characters in the prefix
-
getStemLength
public int getStemLength()
Returns the number of characters in the stem.
Returns: the number of characters in the stem
-
getRoot
Returns the root, according to Semitic linguistics.
Returns: the root, according to Semitic linguistics
-
getPrefixes
Returns the components of the prefix, if any.
Returns: the components of the prefix, if any
-
getStems
Returns the components of the stem.
Returns: the components of the stem
-
getSuffixes
Returns the components of the suffix, if any.
Returns: the components of the suffix, if any
-
getPrefixTags
Returns the part-of-speech tags for the prefix components.
Returns: the part-of-speech tags for the prefix components
-
getStemTags
Returns the part-of-speech tags for stem components.
Returns: the part-of-speech tags for stem components
-
getSuffixTags
Returns the part-of-speech tags for suffix components.
Returns: the part-of-speech tags for suffix components
-
isDefiniteArticle
public boolean isDefiniteArticle()
Returns true if this word has an attached definite article.
Returns: true if this word has an attached definite article
-
isStrippablePrefix
public boolean isStrippablePrefix()
Returns true if the prefixes of this word can be stripped (e.g. prepositions).
Returns: true if the prefixes of this word can be stripped (e.g. prepositions)
-
toStringHelper
protected com.google.common.base.MoreObjects.ToStringHelper toStringHelper()
Overrides: toStringHelper in class MorphoAnalysis
-