gate.creole.gazetteer
Class AbstractGazetteer

java.lang.Object
  extended by gate.util.AbstractFeatureBearer
      extended by gate.creole.AbstractResource
          extended by gate.creole.AbstractProcessingResource
              extended by gate.creole.AbstractLanguageAnalyser
                  extended by gate.creole.gazetteer.AbstractGazetteer
All Implemented Interfaces:
ANNIEConstants, Gazetteer, Executable, LanguageAnalyser, ProcessingResource, Resource, FeatureBearer, NameBearer, Serializable
Direct Known Subclasses:
AbstractOntoGazetteer, DefaultGazetteer, HashGazetteer

public abstract class AbstractGazetteer
extends AbstractLanguageAnalyser
implements Gazetteer

Implements the methods of the Gazetteer interface that are common to all gazetteer implementations.

See Also:
Serialized Form

Nested Class Summary
 
Nested classes/interfaces inherited from class gate.creole.AbstractProcessingResource
AbstractProcessingResource.InternalStatusListener, AbstractProcessingResource.IntervalProgressListener
 
Field Summary
protected  String annotationSetName
          Used to store the annotation set currently being used for the newly generated annotations
protected  Boolean caseSensitive
          Whether this gazetteer is case sensitive.
protected  LinearDefinition definition
          the linear definition of the gazetteer
protected  String encoding
          the encoding of the gazetteer
protected  Set listeners
          the set of gazetteer listeners
protected  URL listsURL
          The value of this property is the URL that will be used for reading the lists that define this Gazetteer
protected  Boolean longestMatchOnly
          Should this gazetteer only match the longest string starting from any offset?
protected  MappingDefinition mappingDefinition
          Reference to the mapping definition, which allows Lookup.ontologyClass to be filled in according to a list
protected  Boolean wholeWordsOnly
          Whether this gazetteer matches whole words only.
 
Fields inherited from class gate.creole.AbstractLanguageAnalyser
corpus, document
 
Fields inherited from class gate.creole.AbstractProcessingResource
interrupted
 
Fields inherited from class gate.creole.AbstractResource
name
 
Fields inherited from class gate.util.AbstractFeatureBearer
features
 
Fields inherited from interface gate.creole.ANNIEConstants
ANNOTATION_COREF_FEATURE_NAME, DATE_ANNOTATION_TYPE, DATE_POSTED_ANNOTATION_TYPE, DEFAULT_FILE, DOCUMENT_COREF_FEATURE_NAME, JOB_ID_ANNOTATION_TYPE, LOCATION_ANNOTATION_TYPE, LOOKUP_ANNOTATION_TYPE, LOOKUP_CLASS_FEATURE_NAME, LOOKUP_INSTANCE_FEATURE_NAME, LOOKUP_LANGUAGE_FEATURE_NAME, LOOKUP_MAJOR_TYPE_FEATURE_NAME, LOOKUP_MINOR_TYPE_FEATURE_NAME, LOOKUP_ONTOLOGY_FEATURE_NAME, MONEY_ANNOTATION_TYPE, ORGANIZATION_ANNOTATION_TYPE, PERSON_ANNOTATION_TYPE, PERSON_GENDER_FEATURE_NAME, PLUGIN_DIR, PR_NAMES, SENTENCE_ANNOTATION_TYPE, SPACE_TOKEN_ANNOTATION_TYPE, TOKEN_ANNOTATION_TYPE, TOKEN_CATEGORY_FEATURE_NAME, TOKEN_KIND_FEATURE_NAME, TOKEN_LENGTH_FEATURE_NAME, TOKEN_ORTH_FEATURE_NAME, TOKEN_STRING_FEATURE_NAME
 
Constructor Summary
AbstractGazetteer()
           
 
Method Summary
 void addGazetteerListener(GazetteerListener gl)
          Registers a Gazetteer Listener
 void fireGazetteerEvent(GazetteerEvent ge)
          fires a Gazetteer Event
 String getAnnotationSetName()
          Gets the name of the annotation set that will be used at the next run for the newly produced annotations.
 Boolean getCaseSensitive()
          Gets the current case sensitivity
 String getEncoding()
          Gets the encoding used for reading the definitions.
 LinearDefinition getLinearDefinition()
          Gets the linear definition of this gazetteer. There is no corresponding set method because the definition is loaded through the listsURL on init().
 URL getListsURL()
          Gets the URL of the lists.def file
 Boolean getLongestMatchOnly()
          Gets the value of the longestMatchOnly parameter.
 MappingDefinition getMappingDefinition()
          Gets the mapping definition of this gazetteer, if there is one
 Boolean getWholeWordsOnly()
          Gets the value for the wholeWordsOnly parameter.
 void reInit()
          Reinitialises the processing resource.
 void setAnnotationSetName(String newAnnotationSetName)
          Sets the name of the annotation set that will be used at the next run for the newly produced annotations.
 void setCaseSensitive(Boolean newCaseSensitive)
          Turns case sensitivity on or off
 void setEncoding(String newEncoding)
          Sets the encoding used for reading the definitions.
 void setListsURL(URL newListsURL)
          Sets the URL of the lists.def file
 void setLongestMatchOnly(Boolean longestMatchOnly)
          Sets the value of the longestMatchOnly parameter.
 void setMappingDefinition(MappingDefinition mapping)
          Sets the mapping definition for this gazetteer, if there is one
 void setWholeWordsOnly(Boolean wholeWordsOnly)
          Sets the value for the wholeWordsOnly parameter.
 
Methods inherited from class gate.creole.AbstractLanguageAnalyser
getCorpus, getDocument, setCorpus, setDocument
 
Methods inherited from class gate.creole.AbstractProcessingResource
addProgressListener, addStatusListener, cleanup, execute, fireProcessFinished, fireProgressChanged, fireStatusChanged, getRuntimeParameterValues, getRuntimeParameterValues, init, interrupt, isInterrupted, removeProgressListener, removeStatusListener
 
Methods inherited from class gate.creole.AbstractResource
checkParameterValues, flushBeanInfoCache, getBeanInfo, getInitParameterValues, getInitParameterValues, getName, getParameterValue, getParameterValue, getParameterValues, removeResourceListeners, setName, setParameterValue, setParameterValue, setParameterValues, setParameterValues, setResourceListeners
 
Methods inherited from class gate.util.AbstractFeatureBearer
getFeatures, setFeatures
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface gate.creole.gazetteer.Gazetteer
add, lookup, remove
 
Methods inherited from interface gate.LanguageAnalyser
getCorpus, getDocument, setCorpus, setDocument
 
Methods inherited from interface gate.Resource
cleanup, getParameterValue, init, setParameterValue, setParameterValues
 
Methods inherited from interface gate.util.FeatureBearer
getFeatures, setFeatures
 
Methods inherited from interface gate.util.NameBearer
getName, setName
 
Methods inherited from interface gate.Executable
execute, interrupt, isInterrupted
 

Field Detail

listeners

protected Set listeners
the set of gazetteer listeners


annotationSetName

protected String annotationSetName
Used to store the annotation set currently being used for the newly generated annotations


encoding

protected String encoding
the encoding of the gazetteer


listsURL

protected URL listsURL
The value of this property is the URL that will be used for reading the lists that define this Gazetteer


caseSensitive

protected Boolean caseSensitive
Whether this gazetteer is case sensitive. The default value is true.


wholeWordsOnly

protected Boolean wholeWordsOnly
Whether this gazetteer matches whole words only. The default value is true.
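
To make the effect of wholeWordsOnly concrete, here is a minimal sketch. It is not GATE's matching code: the class WholeWordSketch and its matchesAt method are hypothetical, and the boundary test (a neighbouring character that is not a letter or digit) is an assumption that may differ from the rules GATE applies.

```java
// Hypothetical illustration only; GATE's own word-boundary rules may differ.
class WholeWordSketch {

    /**
     * Returns true if entry occurs in text at offset. When wholeWordsOnly is
     * set, the characters on either side of the match must not be letters
     * or digits (assumed boundary rule).
     */
    static boolean matchesAt(String text, int offset, String entry,
                             boolean wholeWordsOnly) {
        if (!text.startsWith(entry, offset)) {
            return false;
        }
        if (!wholeWordsOnly) {
            return true;
        }
        int end = offset + entry.length();
        boolean startOk = offset == 0
                || !Character.isLetterOrDigit(text.charAt(offset - 1));
        boolean endOk = end == text.length()
                || !Character.isLetterOrDigit(text.charAt(end));
        return startOk && endOk;
    }

    public static void main(String[] args) {
        String text = "Parisian cafes in Paris";
        System.out.println(matchesAt(text, 0, "Paris", true));   // false: part of "Parisian"
        System.out.println(matchesAt(text, 0, "Paris", false));  // true
        System.out.println(matchesAt(text, 18, "Paris", true));  // true: a whole word
    }
}
```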


longestMatchOnly

protected Boolean longestMatchOnly
Should this gazetteer only match the longest string starting from any offset? This parameter is only relevant when the list of lookups contains proper prefixes of other entries (e.g. when both "Dell" and "Dell Europe" are in the lists). The default behaviour (when this parameter is set to true) is to match only the longest entry, "Dell Europe" in this example. This has been the default GATE gazetteer behaviour since version 2.0. Setting this parameter to false causes the gazetteer to match all possible prefixes.
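
The "Dell" / "Dell Europe" case can be sketched as follows. This is an illustration of the described behaviour under simplifying assumptions, not the algorithm GATE uses: LongestMatchSketch and matchesAt are hypothetical names, and the sketch simply tests every entry against the text at one offset.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not GATE's matching algorithm.
class LongestMatchSketch {

    /** Returns the entries that match text starting exactly at offset. */
    static List<String> matchesAt(String text, int offset,
                                  Collection<String> entries,
                                  boolean longestMatchOnly) {
        List<String> found = new ArrayList<>();
        for (String entry : entries) {
            if (text.startsWith(entry, offset)) {
                found.add(entry);
            }
        }
        // Order by length so the longest match is last.
        found.sort((a, b) -> Integer.compare(a.length(), b.length()));
        if (longestMatchOnly && !found.isEmpty()) {
            return Collections.singletonList(found.get(found.size() - 1));
        }
        return found;
    }

    public static void main(String[] args) {
        List<String> entries = Arrays.asList("Dell", "Dell Europe");
        String text = "Dell Europe opened an office.";
        System.out.println(matchesAt(text, 0, entries, true));   // [Dell Europe]
        System.out.println(matchesAt(text, 0, entries, false));  // [Dell, Dell Europe]
    }
}
```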


definition

protected LinearDefinition definition
the linear definition of the gazetteer


mappingDefinition

protected MappingDefinition mappingDefinition
Reference to the mapping definition, which allows Lookup.ontologyClass to be filled in according to a list

Constructor Detail

AbstractGazetteer

public AbstractGazetteer()
Method Detail

setAnnotationSetName

@RunTime
@Optional
@CreoleParameter(comment="The annotation set to be used for the generated annotations")
public void setAnnotationSetName(String newAnnotationSetName)
Sets the name of the annotation set that will be used at the next run for the newly produced annotations.

Specified by:
setAnnotationSetName in interface Gazetteer
Parameters:
newAnnotationSetName - the annotation set name for the annotations that are going to be produced

getAnnotationSetName

public String getAnnotationSetName()
Gets the name of the annotation set that will be used at the next run for the newly produced annotations.

Specified by:
getAnnotationSetName in interface Gazetteer
Returns:
the current AnnotationSet name

setEncoding

@CreoleParameter(comment="The encoding used for reading the definitions",
                 defaultValue="UTF-8")
public void setEncoding(String newEncoding)
Specified by:
setEncoding in interface Gazetteer

getEncoding

public String getEncoding()
Specified by:
getEncoding in interface Gazetteer

getListsURL

public URL getListsURL()
Description copied from interface: Gazetteer
Gets the URL of the lists.def file

Specified by:
getListsURL in interface Gazetteer
Returns:
the URL of the lists.def file

setListsURL

@CreoleParameter(comment="The URL to the file with list of lists",
                 suffixes="def",
                 defaultValue="resources/gazetteer/lists.def")
public void setListsURL(URL newListsURL)
Description copied from interface: Gazetteer
Sets the URL of the lists.def file

Specified by:
setListsURL in interface Gazetteer
Parameters:
newListsURL - the URL of the lists.def file to be set
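
As a rough sketch of how a URL and an encoding name, like the listsURL and encoding properties, can be combined to read a text file, consider the following. ListsReaderSketch and its methods are hypothetical and use only the Java standard library; the sample file content is a placeholder and says nothing about the actual lists.def format or about how GATE loads it.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of GATE.
class ListsReaderSketch {

    /** Reads the non-blank lines found at listsURL, decoded with encoding. */
    static List<String> readLines(URL listsURL, String encoding) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(listsURL.openStream(), encoding))) {
            String line;
            while ((line = in.readLine()) != null) {
                if (!line.trim().isEmpty()) {
                    lines.add(line);
                }
            }
        }
        return lines;
    }

    /** Writes a placeholder file, reads it back through a file: URL, then deletes it. */
    static List<String> demo() {
        try {
            Path tmp = Files.createTempFile("lists", ".def");
            try {
                Files.write(tmp, "first entry\n\nsecond entry\n".getBytes("UTF-8"));
                return readLines(tmp.toUri().toURL(), "UTF-8");
            } finally {
                Files.delete(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());  // [first entry, second entry]
    }
}
```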

setCaseSensitive

@CreoleParameter(comment="Should this gazetteer diferentiate on case?",
                 defaultValue="true")
public void setCaseSensitive(Boolean newCaseSensitive)
Description copied from interface: Gazetteer
Turns case sensitivity on or off

Specified by:
setCaseSensitive in interface Gazetteer
Parameters:
newCaseSensitive - turn on or off case sensitivity

getCaseSensitive

public Boolean getCaseSensitive()
Description copied from interface: Gazetteer
Gets the current case sensitivity

Specified by:
getCaseSensitive in interface Gazetteer
Returns:
the current case sensitivity

setMappingDefinition

public void setMappingDefinition(MappingDefinition mapping)
Description copied from interface: Gazetteer
Sets the mapping definition for this gazetteer, if there is one

Specified by:
setMappingDefinition in interface Gazetteer
Parameters:
mapping - a mapping definition

getMappingDefinition

public MappingDefinition getMappingDefinition()
Description copied from interface: Gazetteer
Gets the mapping definition of this gazetteer, if there is one

Specified by:
getMappingDefinition in interface Gazetteer
Returns:
the mapping definition of this gazetteer if there is one, otherwise null

getLongestMatchOnly

public Boolean getLongestMatchOnly()
Returns:
the longestMatchOnly

setLongestMatchOnly

@RunTime
@CreoleParameter(comment="Should this gazetteer only match the longest string starting from any offset?",
                 defaultValue="true")
public void setLongestMatchOnly(Boolean longestMatchOnly)
Parameters:
longestMatchOnly - the longestMatchOnly to set

getLinearDefinition

public LinearDefinition getLinearDefinition()
Gets the linear definition of this gazetteer. There is no corresponding set method because the definition is loaded through the listsURL on init().

Specified by:
getLinearDefinition in interface Gazetteer
Returns:
the linear definition of the gazetteer

reInit

public void reInit()
            throws ResourceInstantiationException
Description copied from class: AbstractProcessingResource
Reinitialises the processing resource. After calling this method the resource should be in the state it is in after calling init. If the resource depends on external resources (such as rules files) then the resource will re-read those resources. If the data used to create the resource has changed since the resource was created then the resource will change too after calling reInit(). The implementation in this class simply calls AbstractProcessingResource.init(). This functionality must be overridden by derived classes as necessary.

Specified by:
reInit in interface ProcessingResource
Overrides:
reInit in class AbstractProcessingResource
Throws:
ResourceInstantiationException
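
The reInit() contract described above (re-read external data so the resource reflects any changes) can be pictured with a small stand-in. ReInitSketch and ListResource are hypothetical classes, not GATE types, and an in-memory list stands in for the external file.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins; these are not GATE classes.
class ReInitSketch {

    /** A resource that copies its entries from an external source on init(). */
    static class ListResource {
        private final List<String> source;               // stands in for an external file
        private List<String> loaded = new ArrayList<>();

        ListResource(List<String> source) {
            this.source = source;
        }

        void init() {
            loaded = new ArrayList<>(source);             // snapshot of the external data
        }

        void reInit() {
            init();                                       // simply re-read the source
        }

        List<String> entries() {
            return loaded;
        }
    }

    public static void main(String[] args) {
        List<String> file = new ArrayList<>();
        file.add("alpha");
        ListResource resource = new ListResource(file);
        resource.init();

        file.add("beta");                                 // external data changes
        System.out.println(resource.entries());           // [alpha]

        resource.reInit();
        System.out.println(resource.entries());           // [alpha, beta]
    }
}
```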

fireGazetteerEvent

public void fireGazetteerEvent(GazetteerEvent ge)
Fires a Gazetteer Event

Specified by:
fireGazetteerEvent in interface Gazetteer
Parameters:
ge - Gazetteer Event to be fired

addGazetteerListener

public void addGazetteerListener(GazetteerListener gl)
Registers a Gazetteer Listener

Specified by:
addGazetteerListener in interface Gazetteer
Parameters:
gl - Gazetteer Listener to be registered
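
The register-then-fire relationship between addGazetteerListener and fireGazetteerEvent follows the usual listener pattern, sketched below. The types here are hypothetical simplifications: Listener stands in for GazetteerListener, a plain String stands in for GazetteerEvent, and the method names are invented for this example. The set-based storage mirrors the protected Set listeners field.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins for the gazetteer listener types; not GATE's API.
class ListenerSketch {

    /** Simplified listener; the event is just a String in this sketch. */
    interface Listener {
        void processEvent(String event);
    }

    private final Set<Listener> listeners = new HashSet<>();

    /** Registers a listener; adding the same instance twice has no effect. */
    void addListener(Listener listener) {
        listeners.add(listener);
    }

    /** Delivers the event to every registered listener. */
    void fireEvent(String event) {
        for (Listener listener : listeners) {
            listener.processEvent(event);
        }
    }

    public static void main(String[] args) {
        ListenerSketch gazetteer = new ListenerSketch();
        List<String> received = new ArrayList<>();
        gazetteer.addListener(event -> received.add(event));

        gazetteer.fireEvent("reinitialised");
        System.out.println(received);  // [reinitialised]
    }
}
```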

getWholeWordsOnly

public Boolean getWholeWordsOnly()
Gets the value for the wholeWordsOnly parameter.

Returns:
a Boolean value.

setWholeWordsOnly

@RunTime
@CreoleParameter(comment="Should this gazetteer only match whole words?",
                 defaultValue="true")
public void setWholeWordsOnly(Boolean wholeWordsOnly)
Sets the value for the wholeWordsOnly parameter.

Parameters:
wholeWordsOnly - a Boolean value.