Class NameFinderME

  • All Implemented Interfaces:
    TokenNameFinder

    public class NameFinderME
    extends java.lang.Object
    implements TokenNameFinder
    Class for creating a maximum-entropy-based name finder.
    • Constructor Detail

      • NameFinderME

        @Deprecated
        public NameFinderME(TokenNameFinderModel model,
                            AdaptiveFeatureGenerator generator,
                            int beamSize,
                            SequenceValidator<java.lang.String> sequenceValidator)
        Deprecated.
        The beam size is now configured at training time in the trainer parameter file via beamSearch.beamSize.
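        Initializes the name finder with the specified model.
        Parameters:
        model - the model the name finder is initialized with
        beamSize - the beam size (deprecated; now configured at training time via beamSearch.beamSize)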
      • NameFinderME

        @Deprecated
        public NameFinderME(TokenNameFinderModel model,
                            AdaptiveFeatureGenerator generator,
                            int beamSize)
        Deprecated.
        The beam size is now configured at training time in the trainer parameter file via beamSearch.beamSize.
      • NameFinderME

        @Deprecated
        public NameFinderME(TokenNameFinderModel model,
                            int beamSize)
        Deprecated.
        The beam size is now configured at training time in the trainer parameter file via beamSearch.beamSize.
    • Method Detail

      • find

        public Span[] find(java.lang.String[] tokens)
        Description copied from interface: TokenNameFinder
        Generates name tags for the given sequence, typically a sentence, returning token spans for any identified names.
        Specified by:
        find in interface TokenNameFinder
        Parameters:
        tokens - an array of the tokens or words of the sequence, typically a sentence.
        Returns:
        an array of spans for each of the names identified.
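
        Because the returned spans index into the token array rather than into the raw text, a caller usually maps them back to tokens. The following is a minimal, self-contained sketch of that mapping. TokenSpan is a hypothetical stand-in for the library's Span class, and the sketch assumes the start index is inclusive and the end index is exclusive.

```java
import java.util.Arrays;

final class SpanTextSketch {

    // Hypothetical stand-in for a token span (not the library's Span class).
    // Assumption: start is inclusive, end is exclusive, both are token indices.
    static final class TokenSpan {
        final int start;
        final int end;
        final String type;

        TokenSpan(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }
    }

    // Joins the tokens covered by the span with single spaces.
    static String coveredText(String[] tokens, TokenSpan span) {
        return String.join(" ", Arrays.copyOfRange(tokens, span.start, span.end));
    }

    public static void main(String[] args) {
        String[] tokens = {"Pierre", "Vinken", "joined", "the", "board", "."};
        TokenSpan name = new TokenSpan(0, 2, "person");
        // prints "person: Pierre Vinken"
        System.out.println(name.type + ": " + coveredText(tokens, name));
    }
}
```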
      • find

        public Span[] find(java.lang.String[] tokens,
                           java.lang.String[][] additionalContext)
        Generates name tags for the given sequence, typically a sentence, returning token spans for any identified names.
        Parameters:
        tokens - an array of the tokens or words of the sequence, typically a sentence.
        additionalContext - features which are based on context outside of the sentence but which should also be used.
        Returns:
        an array of spans for each of the names identified.
      • clearAdaptiveData

        public void clearAdaptiveData()
        Forgets all adaptive data which was collected during previous calls to one of the find methods. This method is typically called at the end of a document.
        Specified by:
        clearAdaptiveData in interface TokenNameFinder
      • probs

        public void probs(double[] probs)
        Populates the specified array with the probabilities of the last decoded sequence. The sequence was determined by the previous call to find. The specified array should be at least as large as the number of tokens in the previous call to find.
        Parameters:
        probs - An array used to hold the probabilities of the last decoded sequence.
      • probs

        public double[] probs()
        Returns an array with the probabilities of the last decoded sequence. The sequence was determined by the previous call to find.
        Returns:
        An array with the same number of probabilities as tokens were sent to find when it was last called.
      • probs

        public double[] probs(Span[] spans)
        Returns an array of probabilities, one for each of the specified spans. Each probability is the arithmetic mean of the probabilities of the outcomes which make up the span.
        Parameters:
        spans - The spans of the names for which probabilities are desired.
        Returns:
        an array of probabilities for each of the specified spans.
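
        The averaging described above can be sketched as follows. This is an illustration of the arithmetic only, not the library's implementation: spans are represented as hypothetical {start, end} index pairs with an exclusive end, and tokenProbs stands for the per-token probabilities of the last decoded sequence.

```java
final class SpanProbSketch {

    // For each span, returns the arithmetic mean of tokenProbs[start..end).
    // Assumption: each span is a {start, end} pair with an exclusive end index.
    static double[] spanProbs(double[] tokenProbs, int[][] spans) {
        double[] result = new double[spans.length];
        for (int i = 0; i < spans.length; i++) {
            int start = spans[i][0];
            int end = spans[i][1];
            double sum = 0.0;
            for (int t = start; t < end; t++) {
                sum += tokenProbs[t];
            }
            // An empty span (start == end) yields NaN in this sketch.
            result[i] = sum / (end - start);
        }
        return result;
    }

    public static void main(String[] args) {
        double[] tokenProbs = {0.9, 0.7, 0.2, 0.5};
        int[][] spans = {{0, 2}, {3, 4}};
        for (double p : spanProbs(tokenProbs, spans)) {
            System.out.println(p);
        }
    }
}
```

        A span covering a single token simply gets that token's probability; longer spans are pulled toward their least confident token.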
      • train

        @Deprecated
        public static TokenNameFinderModel train(java.lang.String languageCode,
                                                 java.lang.String type,
                                                 ObjectStream<NameSample> samples,
                                                 TrainingParameters trainParams,
                                                 AdaptiveFeatureGenerator generator,
                                                 java.util.Map<java.lang.String,java.lang.Object> resources)
                                          throws java.io.IOException
        Trains a name finder model.
        Parameters:
        languageCode - the language of the training data
        type - null or an override type for all types in the training data
        samples - the training data
        trainParams - machine learning train parameters
        generator - null or the feature generator
        resources - the resources for the name finder or null if none
        Returns:
        the newly trained model
        Throws:
        java.io.IOException
      • train

        @Deprecated
        public static TokenNameFinderModel train(java.lang.String languageCode,
                                                 java.lang.String type,
                                                 ObjectStream<NameSample> samples,
                                                 TrainingParameters trainParams,
                                                 byte[] featureGeneratorBytes,
                                                 java.util.Map<java.lang.String,java.lang.Object> resources)
                                          throws java.io.IOException
        Trains a name finder model.
        Parameters:
        languageCode - the language of the training data
        type - null or an override type for all types in the training data
        samples - the training data
        trainParams - machine learning train parameters
        featureGeneratorBytes - descriptor to configure the feature generation or null
        resources - the resources for the name finder or null if none
        Returns:
        the newly trained model
        Throws:
        java.io.IOException
      • dropOverlappingSpans

        public static Span[] dropOverlappingSpans(Span[] spans)
        Removes spans which are intersecting or crossing in any way.

        The following rules are used to remove the spans:
        Identical spans: The first span in the array after sorting remains
        Intersecting spans: The first span after sorting remains
        Contained spans: All spans which are contained by another are removed

        Parameters:
        spans - the spans to filter
        Returns:
        non-overlapping spans
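
        The three rules can be illustrated with a short, self-contained sketch. It is not the library's code: spans are hypothetical {start, end} pairs with an exclusive end, and the sort order (start ascending, then longer span first) is an assumption chosen so that a containing span precedes the spans it contains.

```java
import java.util.ArrayList;
import java.util.List;

final class DropOverlapSketch {

    // Keeps the first span of every overlapping group after sorting.
    // Assumption: spans are {start, end} pairs with an exclusive end index.
    static List<int[]> dropOverlapping(List<int[]> spans) {
        List<int[]> sorted = new ArrayList<>(spans);
        // Assumed order: start ascending; for equal starts, the longer span first.
        sorted.sort((a, b) -> a[0] != b[0]
                ? Integer.compare(a[0], b[0])
                : Integer.compare(b[1], a[1]));

        List<int[]> kept = new ArrayList<>();
        int[] last = null; // most recently kept span
        for (int[] span : sorted) {
            boolean overlaps = last != null && last[0] < span[1] && span[0] < last[1];
            if (overlaps) {
                continue; // identical, intersecting, or contained: drop it
            }
            kept.add(span);
            last = span;
        }
        return kept;
    }

    public static void main(String[] args) {
        List<int[]> input = List.of(
                new int[] {2, 4}, new int[] {0, 3}, new int[] {0, 3},
                new int[] {1, 2}, new int[] {5, 6});
        for (int[] s : dropOverlapping(input)) {
            System.out.println("[" + s[0] + ".." + s[1] + ")");
        }
    }
}
```

        For the input in main, the sketch keeps [0..3) and [5..6): the duplicate [0..3), the contained [1..2), and the intersecting [2..4) are all dropped.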