org.mitre.jcarafe.crf

FeatureManagerBuilder

abstract class FeatureManagerBuilder[Obs] extends Serializable

A FeatureManager includes a set of common feature function definitions. It also holds a list of actual feature function objects that are applied to a sequence of observations. Sequence labeling applications need to create a concrete subclass of FeatureManager that specifies exactly which feature functions will be applied. This class defines a simple DSL (Domain-Specific Language) that allows the set of feature functions for a particular application to be clearly specified.

There are also higher-order feature functions that take other feature functions as arguments, so that more complicated feature extraction can be specified easily and compactly. The FeatureManager is type-parameterized by Obs, which represents the observation type, and Info, which denotes the type of the auxiliary information (if any) associated with each observation.

An application-specific FeatureManager should subclass this class and specify, within the body of the class definition, a set of feature functions, where each function is described as a single expression that returns an instance of FeatureReturn. Below is an example:

object MyFeatureManager extends FeatureManager[String,Map[String,String]] {
  "wdFn"      as wdFn
  "capRegFn"  as regexpFn("Capitalized", "[A-Z].*".r)
  "wdNgrm1"    as wdFn ngram (-2 to 0)
  "wdNgrm2"    as wdFn ngram (-1,0,1)
  "cross1"    as wdFn ngram (-1,0) cross (regexpFn("EndIn-ed",".*ed$".r) over (-2 to 2))
}

Each top-level function consists of a String followed by the method name "as" (used like a keyword), which is then followed by a feature function. That feature function may be either 1) a simple feature function such as wdFn or 2) a complex feature function created by composing other feature functions. For example, the feature function named "wdNgrm1" creates an n-gram consisting of the concatenation of the features that result from applying the wdFn feature function at the positions -2, -1 and 0 relative to the current position. The "cross1" feature function is a more complicated instance: it takes the n-gram computed from the words at -1 and 0 and conjoins that feature with all the features that result from applying the regular expression function, which returns the feature name "EndIn-ed" when its pattern is matched, over the relative positions -2, -1, 0, 1 and 2.
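
The composition described above can be sketched with a small stand-alone model. This is not the jcarafe implementation: the simplified Fn type (a sequence and a position mapped to a set of feature names), the FnOps wrapper, the feature-name formats, and the choice to drop an n-gram when any position falls outside the sequence are all assumptions made for illustration. Only the names wdFn, regexpFn, ngram, over and cross come from the example.

```scala
import scala.util.matching.Regex

object FeatureDslSketch {
  // Assumption: a feature function maps (sequence, position) to feature names.
  type Fn = (Vector[String], Int) => Set[String]

  private def inBounds(seq: Vector[String], pos: Int): Boolean =
    pos >= 0 && pos < seq.length

  // Word identity feature; empty when the position is outside the sequence.
  val wdFn: Fn = (seq, pos) =>
    if (inBounds(seq, pos)) Set("wd=" + seq(pos)) else Set.empty

  // Emits `name` when the whole observation matches the regular expression.
  def regexpFn(name: String, re: Regex): Fn = (seq, pos) =>
    if (inBounds(seq, pos) && re.pattern.matcher(seq(pos)).matches()) Set(name)
    else Set.empty

  // Hypothetical combinators that mirror the infix style of the example.
  implicit class FnOps(f: Fn) {
    // Concatenates the features found at each relative offset into one feature.
    def ngram(offsets: Seq[Int]): Fn = (seq, pos) => {
      val parts = offsets.map(o => f(seq, pos + o))
      if (parts.exists(_.isEmpty)) Set.empty
      else Set(parts.map(_.toList.sorted.mkString("|")).mkString("_"))
    }

    // Applies f at each relative offset and tags every feature with its offset.
    def over(offsets: Seq[Int]): Fn = (seq, pos) =>
      offsets.flatMap(o => f(seq, pos + o).map(n => s"$n@$o")).toSet

    // Conjoins every feature of f with every feature of g.
    def cross(g: Fn): Fn = (seq, pos) =>
      for (a <- f(seq, pos); b <- g(seq, pos)) yield s"$a&$b"
  }

  val wdNgrm1: Fn = wdFn ngram (-2 to 0)
  val cross1: Fn =
    wdFn ngram (-1 to 0) cross (regexpFn("EndIn-ed", ".*ed$".r) over (-2 to 2))
}
```

In this model, applying cross1 to Vector("She", "walked", "home") at position 1 yields the single feature "wd=She_wd=walked&EndIn-ed@0": the bigram over positions 0 and 1 conjoined with the one regular expression match in the window.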

Linear Supertypes
Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new FeatureManagerBuilder(lex: Option[BloomLexicon], wdProps: Option[WordProperties], wdScores: Option[WordScores], inducedFeatureMap: Option[InducedFeatureMap], iString: String, inducingNewMap: Boolean)

Type Members

  1. type Fn = (Int, SourceSequence[Obs], Int) ⇒ FeatureReturn

  2. type FnList = List[Fn]
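
    As a rough illustration of the Fn shape, the sketch below defines two functions with the same three arguments (segment length, sequence, position) and applies a list of them at one position. SourceSequence and FeatureReturn here are hypothetical stand-ins with far less state than the real classes, and wordFn, lengthFn and applyAll are invented names, not members of this class.

```scala
object FnSketch {
  // Hypothetical stand-ins for the library types.
  final case class SourceSequence[Obs](obs: Vector[Obs])
  final case class FeatureReturn(features: List[String])

  // Same shape as Fn: (segment length, sequence, position) => features.
  type Fn[Obs] = (Int, SourceSequence[Obs], Int) => FeatureReturn
  type FnList[Obs] = List[Fn[Obs]]

  // Emits the observation at the current position as a feature.
  val wordFn: Fn[String] = (_, sarr, pos) =>
    FeatureReturn(List("wd=" + sarr.obs(pos)))

  // Emits the character length of the current observation as a feature.
  val lengthFn: Fn[String] = (_, sarr, pos) =>
    FeatureReturn(List("len=" + sarr.obs(pos).length))

  // Applies every function in the list at one position (segment length 1)
  // and pools the resulting features in list order.
  def applyAll[Obs](fns: FnList[Obs], sarr: SourceSequence[Obs], pos: Int): List[String] =
    fns.flatMap(fn => fn(1, sarr, pos).features)
}
```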

Abstract Value Members

  1. abstract def buildFeatureFns(s: String = "default"): List[FeatureFn[Obs]]

Concrete Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. def _caselessWdFn(s: Int, sarr: SourceSequence[Obs], pos: Int): FeatureReturn

    Computes a feature as the current observation, ignoring case.

    s: Segment length
    sarr: SourceSequence of ObsSource[Obs] objects
    pos: Current position within the sequence
    returns: A FeatureReturn with the observation feature

  5. def _edgeFn(s: Int, sarr: SourceSequence[Obs], pos: Int): FeatureReturn

    Computes a feature with value ":E:"

    s: Segment length
    sarr: SourceSequence of ObsSource[Obs] objects
    pos: Current position within the sequence
    returns: A FeatureReturn with the unknown word feature :E: as an edge feature

  6. def _edgeFnSemi(s: Int, sarr: SourceSequence[Obs], pos: Int): FeatureReturn

    Computes a feature with value ":E::x" where x is the size of the current segment

    s: Segment length
    sarr: SourceSequence of ObsSource[Obs] objects
    pos: Current position within the sequence
    returns: A FeatureReturn with the unknown word feature :E: as an edge feature

  7. def _filteredLexFn(down: Boolean, filter: Set[String])(s: Int, sarr: SourceSequence[Obs], pos: Int): FeatureReturn


    Adds lexicon features, but only for entries that match an allowed set of named lexicons. Useful for displacing features or other more advanced uses of lexicons.

  8. def _lexiconFn(down: Boolean)(s: Int, sarr: SourceSequence[Obs], pos: Int): FeatureReturn

  9. def _nodeFn(s: Int, sarr: SourceSequence[Obs], pos: Int): FeatureReturn

    Computes a feature with value ":U:"

    s: Segment length
    sarr: SourceSequence of ObsSource[Obs] objects
    pos: Current position within the sequence
    returns: A FeatureReturn with the unknown word feature :U:

  10. def _nodeFnSemi(s: Int, sarr: SourceSequence[Obs], pos: Int): FeatureReturn

    Computes a feature with value ":U::x" where x is the size of the current segment

    s: Segment length
    sarr: SourceSequence of ObsSource[Obs] objects
    pos: Current position within the sequence
    returns: A FeatureReturn with the unknown word feature :U:

  11. def _phraseAttributeFn(att: String)(s: Int, sarr: SourceSequence[Obs], pos: Int): FeatureReturn

  12. def _phraseLexFn(aph: Boolean)(s: Int, sarr: SourceSequence[Obs], pos: Int): FeatureReturn

  13. def _preLabFn(s: Int, sarr: SourceSequence[Obs], pos: Int): FeatureReturn

    Computes a feature as the hashed conjunction of ALL labels produced from pre-models

    s: Segment length
    sarr: SourceSequence of ObsSource[Obs] objects
    pos: Current position within the sequence
    returns: A FeatureReturn with the pre-model labels hash-conjoined

  14. def _regexpFn(fname: String, regexp: Regex)(s: Int, sarr: SourceSequence[Obs], pos: Int): FeatureReturn

    Computes a FeatureReturn with a single feature value fname if the observation at the current position matches the specified regular expression.

    fname: The name of the feature
    regexp: A regular expression applied to the observation
    s: Segment length
    sarr: SourceSequence of ObsSource[Obs] objects
    pos: Current position within the sequence
    returns: Single feature if regexp matches current observation

  15. def _wdFn(s: Int, sarr: SourceSequence[Obs], pos: Int): FeatureReturn

    Computes a feature as the current observation

    s: Segment length
    sarr: SourceSequence of ObsSource[Obs] objects
    pos: Current position within the sequence
    returns: A FeatureReturn with the observation feature as a hashcode

  16. def allTagFn(s: Int, sarr: SourceSequence[Obs], pos: Int): FeatureReturn

  17. def antiPrefixFn(size: Int)(s: Int, sarr: SourceSequence[Obs], pos: Int): FeatureReturn

  18. def antiSuffixFn(size: Int)(s: Int, sarr: SourceSequence[Obs], pos: Int): FeatureReturn

  19. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  20. def attributeFn(att: String)(s: Int, sarr: SourceSequence[Obs], pos: Int): FeatureReturn

  21. val caselessWdFn: Fn

  22. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  23. val denseFeatureWt: Double

  24. def distanceToLeft(att: String, vl: String)(s: Int, sarr: SourceSequence[Obs], pos: Int): FeatureReturn

  25. def distanceToRight(att: String, vl: String)(s: Int, sarr: SourceSequence[Obs], pos: Int): FeatureReturn

  26. val downLexFn: Fn

  27. val edgeFn: (Int, SourceSequence[Obs], Int) ⇒ FeatureReturn

  28. val edgeFr: FeatureReturn

  29. val edgePrFr: FeatureReturn

  30. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  31. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  32. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  33. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  34. def getFeatureManager: FeatureManager[Obs]

  35. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  36. val iString: String

  37. val inducedFeatureMap: Option[InducedFeatureMap]

  38. val inducingNewMap: Boolean

  39. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  40. val lex: Option[BloomLexicon]

  41. val lexFn: Fn

  42. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  43. def nearCorpAbbrev(s: Int, sarr: SourceSequence[Obs], pos: Int): FeatureReturn

  44. val nearCorpCode: Long

  45. def nearestLeft(att: String, vl: String, cp: Int, sarr: SourceSequence[Obs]): Int

  46. def nearestRight(att: String, vl: String, cp: Int, sarr: SourceSequence[Obs]): Int

  47. val nodeFn: (Int, SourceSequence[Obs], Int) ⇒ FeatureReturn

  48. val nodeFr: FeatureReturn

  49. val nodePrFr: FeatureReturn

  50. final def notify(): Unit

    Definition Classes
    AnyRef
  51. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  52. val numCode: Long

  53. val numRegex: Regex

  54. def phraseFn(s: Int, sarr: SourceSequence[Obs], pos: Int): FeatureReturn

  55. def phraseWds(s: Int, sarr: SourceSequence[Obs], pos: Int): FeatureReturn

  56. def predicateFn(name: String, fns: List[FeatureFn[Obs]])(s: Int, sarr: SourceSequence[Obs], pos: Int): FeatureReturn

  57. def prefNgrams(size: Int, range: Int)(s: Int, sarr: SourceSequence[Obs], pos: Int): FeatureReturn

  58. def prefixFn(size: Int)(s: Int, sarr: SourceSequence[Obs], pos: Int): FeatureReturn

  59. val regexpFn: (String, Regex) ⇒ Fn

  60. def reset: Unit

  61. val selfWdCode: Long

  62. def sentPosition(s: Int, sarr: SourceSequence[Obs], pos: Int): FeatureReturn

  63. def sufNgrams(size: Int, range: Int)(s: Int, sarr: SourceSequence[Obs], pos: Int): FeatureReturn

  64. def suffixFn(size: Int)(s: Int, sarr: SourceSequence[Obs], pos: Int): FeatureReturn

  65. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  66. implicit def toBuiltFeature(s: String): BuiltFeature

  67. def toString(): String

    Definition Classes
    AnyRef → Any
  68. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  69. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  70. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  71. val wdFn: (Int, SourceSequence[Obs], Int) ⇒ FeatureReturn

    Value for _wdFn _

    See also: #_wdFn(Int,SourceSequence[Obs],Int)

  72. def wdFnNorm(s: Int, sarr: SourceSequence[Obs], pos: Int): FeatureReturn

  73. def wdLen(s: Int, sarr: SourceSequence[Obs], pos: Int): FeatureReturn

  74. val wdProps: Option[WordProperties]

  75. val wdScoreCode: Long

  76. val wdScores: Option[WordScores]

  77. def wdShape(s: Int, sarr: SourceSequence[Obs], pos: Int): FeatureReturn

  78. val wdShapeCode: Long

  79. def weightedAttributes(s: Int, sarr: SourceSequence[Obs], pos: Int): FeatureReturn

  80. def wordPropertiesFn(down: Boolean)(s: Int, sarr: SourceSequence[Obs], pos: Int): FeatureReturn

  81. def wordPropertiesPrefixesFn(interval: Int, down: Boolean)(s: Int, sarr: SourceSequence[Obs], pos: Int): FeatureReturn

  82. def wordScoresFn(s: Int, sarr: SourceSequence[Obs], pos: Int): FeatureReturn
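
Many members in the list above take a first parameter list of configuration values followed by the standard (s, sarr, pos) arguments, while values such as regexpFn and wdFn expose the same logic as plain functions. The sketch below shows how supplying only the first parameter list can yield a value with the Fn shape. The stand-in types and the function bodies are assumptions for illustration; in particular, the real prefixFn may compute something different, and prefix3 and capFn are invented names.

```scala
import scala.util.matching.Regex

object CurriedFnSketch {
  // Hypothetical stand-ins for the library types.
  final case class SourceSequence[Obs](obs: Vector[Obs])
  final case class FeatureReturn(features: List[String])
  type Fn = (Int, SourceSequence[String], Int) => FeatureReturn

  // Configuration first, then the standard feature function arguments.
  // Assumed behavior: emit the first `size` characters of the observation.
  def prefixFn(size: Int)(s: Int, sarr: SourceSequence[String], pos: Int): FeatureReturn =
    FeatureReturn(List("pre=" + sarr.obs(pos).take(size)))

  // Emits `fname` when the whole observation matches `regexp`.
  def _regexpFn(fname: String, regexp: Regex)(s: Int, sarr: SourceSequence[String], pos: Int): FeatureReturn =
    if (regexp.pattern.matcher(sarr.obs(pos)).matches()) FeatureReturn(List(fname))
    else FeatureReturn(Nil)

  // With an expected type of Fn, the partially applied method is turned into
  // a function value; `prefixFn(3) _` is the explicit form of the same step.
  val prefix3: Fn = prefixFn(3)

  // Same signature as the regexpFn value in the member list.
  val regexpFn: (String, Regex) => Fn =
    (fname, regexp) => _regexpFn(fname, regexp)

  val capFn: Fn = regexpFn("Capitalized", "[A-Z].*".r)
}
```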
