gate.creole.morph
Class Morph

java.lang.Object
  extended by gate.util.AbstractFeatureBearer
      extended by gate.creole.AbstractResource
          extended by gate.creole.AbstractProcessingResource
              extended by gate.creole.AbstractLanguageAnalyser
                  extended by gate.creole.morph.Morph
All Implemented Interfaces:
ANNIEConstants, CustomDuplication, Executable, LanguageAnalyser, ProcessingResource, Resource, FeatureBearer, NameBearer, Serializable

@CreoleResource(name="GATE Morphological analyser",
                helpURL="http://gate.ac.uk/userguide/sec:parsers:morpher",
                comment="Morphological Analyzer for the English Language")
public class Morph
extends AbstractLanguageAnalyser
implements ProcessingResource, CustomDuplication

Description: This class is a wrapper for Interpret, the Morphological Analyzer.

See Also:
Serialized Form

Nested Class Summary
 
Nested classes/interfaces inherited from class gate.creole.AbstractProcessingResource
AbstractProcessingResource.InternalStatusListener, AbstractProcessingResource.IntervalProgressListener
 
Field Summary
protected  String affixFeatureName
          Feature Name that should be displayed for the affix
protected  String annotationSetName
          The name of the annotation set used for input
protected  Boolean caseSensitive
          Whether the parser runs in case-sensitive mode
protected  Boolean considerPOSTag
           
protected  Interpret existingInterpret
          If this Morph PR is a duplicate of an existing PR, this property will hold a reference to the original PR's Interpret instance.
protected  Boolean failOnMissingInputAnnotations
           
protected  Interpret interpret
          The Interpret instance (the English morpher) wrapped by this resource
protected  org.apache.log4j.Logger logger
           
protected  String rootFeatureName
          Feature Name that should be displayed for the root word
protected  URL rulesFile
          File which contains rules to be processed
 
Fields inherited from class gate.creole.AbstractLanguageAnalyser
corpus, document
 
Fields inherited from class gate.creole.AbstractProcessingResource
interrupted
 
Fields inherited from class gate.creole.AbstractResource
name
 
Fields inherited from class gate.util.AbstractFeatureBearer
features
 
Fields inherited from interface gate.creole.ANNIEConstants
ANNOTATION_COREF_FEATURE_NAME, DATE_ANNOTATION_TYPE, DATE_POSTED_ANNOTATION_TYPE, DEFAULT_FILE, DOCUMENT_COREF_FEATURE_NAME, JOB_ID_ANNOTATION_TYPE, LOCATION_ANNOTATION_TYPE, LOOKUP_ANNOTATION_TYPE, LOOKUP_CLASS_FEATURE_NAME, LOOKUP_INSTANCE_FEATURE_NAME, LOOKUP_LANGUAGE_FEATURE_NAME, LOOKUP_MAJOR_TYPE_FEATURE_NAME, LOOKUP_MINOR_TYPE_FEATURE_NAME, LOOKUP_ONTOLOGY_FEATURE_NAME, MONEY_ANNOTATION_TYPE, ORGANIZATION_ANNOTATION_TYPE, PERSON_ANNOTATION_TYPE, PERSON_GENDER_FEATURE_NAME, PLUGIN_DIR, PR_NAMES, SENTENCE_ANNOTATION_TYPE, SPACE_TOKEN_ANNOTATION_TYPE, TOKEN_ANNOTATION_TYPE, TOKEN_CATEGORY_FEATURE_NAME, TOKEN_KIND_FEATURE_NAME, TOKEN_LENGTH_FEATURE_NAME, TOKEN_ORTH_FEATURE_NAME, TOKEN_STRING_FEATURE_NAME
 
Constructor Summary
Morph()
          Default Constructor
 
Method Summary
 Resource duplicate(Factory.DuplicationContext ctx)
          Duplicate this morpher, sharing the compiled regular expression patterns and finite state machine with the duplicate.
 void execute()
          Runs the morphological analysis; must be called only after init() has finished.
 String findAffix(String word, String cat)
          Returns the root of the given word; should only be called after init()
 String findBaseWord(String word, String cat)
          Returns the affix of the given word; should only be called after init()
 String getAffixFeatureName()
          Returns the feature name currently used for the affix
 String getAnnotationSetName()
          Returns the name of the annotation set used for input
 Boolean getCaseSensitive()
          Returns whether the parser is in case-sensitive mode
 Boolean getConsiderPOSTag()
           
 Boolean getFailOnMissingInputAnnotations()
           
 String getRootFeatureName()
          Returns the feature name currently used for the root word
 URL getRulesFile()
          Returns the URL of the rules file
 Resource init()
          Creates the Interpret instance (the English morpher) and returns this resource with that instance wrapped inside it.
 void setAffixFeatureName(String affixFeatureName)
          Sets the feature name that should be displayed for the affix
 void setAnnotationSetName(String annotationSetName)
          Sets the name of the annotation set used for input
 void setCaseSensitive(Boolean value)
          Sets whether the parser is case-sensitive; when false, text is converted to lowercase before parsing
 void setConsiderPOSTag(Boolean value)
           
 void setExistingInterpret(Interpret existingInterpret)
          Only for use by the duplication mechanism.
 void setFailOnMissingInputAnnotations(Boolean fail)
           
 void setRootFeatureName(String rootFeatureName)
          Sets the feature name that should be displayed for the root word
 void setRulesFile(URL rulesFile)
          Sets the rule file to be processed
 
Methods inherited from class gate.creole.AbstractLanguageAnalyser
getCorpus, getDocument, setCorpus, setDocument
 
Methods inherited from class gate.creole.AbstractProcessingResource
addProgressListener, addStatusListener, cleanup, fireProcessFinished, fireProgressChanged, fireStatusChanged, getRuntimeParameterValues, getRuntimeParameterValues, interrupt, isInterrupted, reInit, removeProgressListener, removeStatusListener
 
Methods inherited from class gate.creole.AbstractResource
checkParameterValues, getBeanInfo, getInitParameterValues, getInitParameterValues, getName, getParameterValue, getParameterValue, getParameterValues, removeResourceListeners, setName, setParameterValue, setParameterValue, setParameterValues, setParameterValues, setResourceListeners
 
Methods inherited from class gate.util.AbstractFeatureBearer
getFeatures, setFeatures
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface gate.ProcessingResource
reInit
 
Methods inherited from interface gate.Resource
cleanup, getParameterValue, setParameterValue, setParameterValues
 
Methods inherited from interface gate.util.FeatureBearer
getFeatures, setFeatures
 
Methods inherited from interface gate.util.NameBearer
getName, setName
 
Methods inherited from interface gate.Executable
interrupt, isInterrupted
 

Field Detail

rulesFile

protected URL rulesFile
File which contains rules to be processed


interpret

protected Interpret interpret
The Interpret instance (the English morpher) wrapped by this resource


rootFeatureName

protected String rootFeatureName
Feature Name that should be displayed for the root word


affixFeatureName

protected String affixFeatureName
Feature Name that should be displayed for the affix


annotationSetName

protected String annotationSetName
The name of the annotation set used for input


caseSensitive

protected Boolean caseSensitive
Whether the parser runs in case-sensitive mode


considerPOSTag

protected Boolean considerPOSTag

existingInterpret

protected Interpret existingInterpret
If this Morph PR is a duplicate of an existing PR, this property will hold a reference to the original PR's Interpret instance.


failOnMissingInputAnnotations

protected Boolean failOnMissingInputAnnotations

logger

protected org.apache.log4j.Logger logger

Constructor Detail

Morph

public Morph()
Default Constructor

Method Detail

setFailOnMissingInputAnnotations

@RunTime
@Optional
@CreoleParameter(comment="Throw and exception when there are none of the required input annotations",
                 defaultValue="true")
public void setFailOnMissingInputAnnotations(Boolean fail)
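
The effect of this parameter can be pictured with a small standalone sketch. Everything in it is hypothetical: the class MissingInputSketch, its process method, and the use of IllegalStateException as a stand-in for ExecutionException (which would require the GATE library). It only illustrates the choice between failing and carrying on when no input annotations are present, and is not taken from the Morph source.

```java
import java.util.Collections;
import java.util.List;

// Toy illustration of the flag's effect; not the GATE implementation.
class MissingInputSketch {

    // Returns how many tokens were processed. The exception type is a
    // stand-in for ExecutionException, which needs the GATE library.
    static int process(List<String> tokens, boolean failOnMissingInputAnnotations) {
        if (tokens.isEmpty()) {
            if (failOnMissingInputAnnotations) {
                throw new IllegalStateException("no input annotations to process");
            }
            return 0; // lenient mode: nothing to do, carry on
        }
        return tokens.size(); // placeholder for the per-token analysis
    }

    public static void main(String[] args) {
        List<String> empty = Collections.emptyList();
        System.out.println(process(empty, false)); // prints "0"
        try {
            process(empty, true);
        } catch (IllegalStateException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

With the default value of "true", a document without the required input annotations would therefore be reported as an error rather than skipped.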

getFailOnMissingInputAnnotations

public Boolean getFailOnMissingInputAnnotations()

init

public Resource init()
              throws ResourceInstantiationException
Creates the Interpret instance (the English morpher) and returns this resource with that instance wrapped inside it.

Specified by:
init in interface Resource
Overrides:
init in class AbstractProcessingResource
Returns:
Resource
Throws:
ResourceInstantiationException

execute

public void execute()
             throws ExecutionException
Runs the morphological analysis; must be called only after init() has finished.
The method performs the following operations:
  1. creates the annotationSet
  2. fetches word tokens from the document, one at a time
  3. runs the morpher on each individual word token
  4. finds the root and the affix for that word
  5. adds them as features to the current token

Specified by:
execute in interface Executable
Overrides:
execute in class AbstractProcessingResource
Throws:
ExecutionException
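
The steps listed for execute() can be sketched without the GATE library as follows. This is a toy model under stated assumptions: the Token class, the analyse method and its three hard-coded suffixes are all hypothetical stand-ins (the real analyser is driven by the rules file and works on GATE annotations), and skipping the affix feature when no affix is found is a choice made for this sketch only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy stand-in for the execute() steps; not the GATE implementation.
class MorphExecuteSketch {

    // Hypothetical token: a string plus a mutable feature map.
    static final class Token {
        final String text;
        final Map<String, String> features = new HashMap<>();
        Token(String text) { this.text = text; }
    }

    // Hypothetical rule set: strip one of a few suffixes.
    // Returns {root, affix}; affix is "" when nothing was stripped.
    static String[] analyse(String word) {
        for (String suffix : new String[] {"ing", "ed", "s"}) {
            if (word.length() > suffix.length() + 2 && word.endsWith(suffix)) {
                String root = word.substring(0, word.length() - suffix.length());
                return new String[] {root, suffix};
            }
        }
        return new String[] {word, ""};
    }

    // Steps 2 to 5: visit each token, analyse it, store root and affix as features.
    static void annotate(List<Token> tokens, String rootFeatureName, String affixFeatureName) {
        for (Token t : tokens) {
            String[] result = analyse(t.text);
            t.features.put(rootFeatureName, result[0]);
            if (!result[1].isEmpty()) {
                t.features.put(affixFeatureName, result[1]);
            }
        }
    }

    public static void main(String[] args) {
        List<Token> tokens = new ArrayList<>();
        tokens.add(new Token("walked"));
        tokens.add(new Token("cats"));
        annotate(tokens, "root", "affix");
        for (Token t : tokens) {
            System.out.println(t.text + " -> " + t.features);
        }
    }
}
```

The feature names passed to annotate play the role of rootFeatureName and affixFeatureName; the values "root" and "affix" used in main are illustrative.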

findBaseWord

public String findBaseWord(String word,
                           String cat)
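Returns the root of the given word. This method should only be called after init().

Parameters:
word - the word to analyse
cat - the part-of-speech category of the word
Returns:
the root word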

findAffix

public String findAffix(String word,
                        String cat)
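Returns the affix of the given word. This method should only be called after init().

Parameters:
word - the word to analyse
cat - the part-of-speech category of the word
Returns:
the affix of the root word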

setRulesFile

public void setRulesFile(URL rulesFile)
Sets the rule file to be processed

Parameters:
rulesFile - URL of the rules file to be processed

getRulesFile

public URL getRulesFile()
Returns the URL of the rules file


getRootFeatureName

public String getRootFeatureName()
Returns the feature name currently used for the root word


setRootFeatureName

public void setRootFeatureName(String rootFeatureName)
Sets the feature name that should be displayed for the root word

Parameters:
rootFeatureName - the feature name to use for the root word

getAffixFeatureName

public String getAffixFeatureName()
Returns the feature name currently used for the affix


setAffixFeatureName

public void setAffixFeatureName(String affixFeatureName)
Sets the feature name that should be displayed for the affix

Parameters:
affixFeatureName - the feature name to use for the affix

getAnnotationSetName

public String getAnnotationSetName()
Returns the name of the annotation set used for input


setAnnotationSetName

public void setAnnotationSetName(String annotationSetName)
Sets the name of the annotation set used for input

Parameters:
annotationSetName - the name of the input annotation set

getCaseSensitive

public Boolean getCaseSensitive()
Returns whether the parser is in case-sensitive mode

Returns:
a Boolean value.

setCaseSensitive

public void setCaseSensitive(Boolean value)
Sets whether the parser is case-sensitive; when false, text is converted to lowercase before parsing
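
As a rough illustration of what this flag controls, the sketch below shows the string that would be handed to the rules in each mode. The class CaseSensitivitySketch and its prepare method are hypothetical helpers written for this page, not part of the Morph API, and the choice of Locale.ENGLISH is an assumption.

```java
import java.util.Locale;

// Toy illustration of the caseSensitive flag; not the GATE implementation.
class CaseSensitivitySketch {

    // Hypothetical helper: the string the rules would see for a token.
    static String prepare(String word, boolean caseSensitive) {
        return caseSensitive ? word : word.toLowerCase(Locale.ENGLISH);
    }

    public static void main(String[] args) {
        System.out.println(prepare("Running", false)); // prints "running"
        System.out.println(prepare("Running", true));  // prints "Running"
    }
}
```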


getConsiderPOSTag

public Boolean getConsiderPOSTag()

setConsiderPOSTag

public void setConsiderPOSTag(Boolean value)

setExistingInterpret

public void setExistingInterpret(Interpret existingInterpret)
Only for use by the duplication mechanism.


duplicate

public Resource duplicate(Factory.DuplicationContext ctx)
                   throws ResourceInstantiationException
Duplicate this morpher, sharing the compiled regular expression patterns and finite state machine with the duplicate.

Specified by:
duplicate in interface CustomDuplication
Parameters:
ctx - the current duplication context. If an implementation of this method needs to duplicate any other resources as part of the custom duplication process it should pass this context back to the two-argument form of Factory.duplicate rather than using the single-argument form.
Returns:
an independent copy of this resource.
Throws:
ResourceInstantiationException
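
The sharing described for duplicate(), together with the existingInterpret property, suggests a pattern along the following lines. This is a minimal standalone sketch with hypothetical classes (Engine stands in for Interpret, Analyser for Morph); whether the real init() and duplicate() are structured exactly like this is an assumption.

```java
// Hypothetical stand-ins; names echo the fields above but this is not GATE code.
class SharedEngineSketch {

    // Stands in for Interpret: costly to build, read-only once built.
    static final class Engine {
        static int builds = 0; // counts how many engines were constructed
        Engine() { builds++; }
    }

    static final class Analyser {
        Engine existingEngine; // set only by duplicate()
        Engine engine;

        Analyser init() {
            // Reuse the original's engine when one was handed over, otherwise build one.
            engine = (existingEngine != null) ? existingEngine : new Engine();
            return this;
        }

        Analyser duplicate() {
            Analyser copy = new Analyser();
            copy.existingEngine = this.engine;
            return copy.init();
        }
    }

    public static void main(String[] args) {
        Analyser original = new Analyser().init();
        Analyser copy = original.duplicate();
        System.out.println(copy != original);               // prints "true"
        System.out.println(copy.engine == original.engine); // prints "true"
    }
}
```

The copy is a separate object with its own parameters, but the expensive compiled state is built once and referenced by both, which is why setExistingInterpret is reserved for the duplication mechanism.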