Package opennlp.tools.ml.model
Class TwoPassDataIndexer
- java.lang.Object
- opennlp.tools.ml.model.AbstractDataIndexer
- opennlp.tools.ml.model.TwoPassDataIndexer
All Implemented Interfaces: DataIndexer
public class TwoPassDataIndexer extends AbstractDataIndexer
Collects event and context counts by making two passes over the events. The first pass determines which contexts will be used by the model and writes the events to a temporary file. The second pass reads that file and creates the events in memory, keeping only the contexts which will be used. This greatly reduces the amount of memory required for storing the events.
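The two-pass scheme described above can be sketched in plain Java. This is an illustration only, not the OpenNLP implementation: the class TwoPassSketch, its nested Indexed and Result types, the index method, and the space-separated "outcome predicate predicate ..." line format are all hypothetical names and conventions chosen for the example. The sketch also assumes that a predicate is kept when its count is at least the cutoff, and that events left with no predicates are dropped.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a plain-Java sketch of two-pass indexing, not OpenNLP code.
class TwoPassSketch {

    /** Hypothetical indexed event: an outcome label plus the ids of its kept predicates. */
    static final class Indexed {
        final String outcome;
        final int[] predIds;
        Indexed(String outcome, int[] predIds) {
            this.outcome = outcome;
            this.predIds = predIds;
        }
    }

    /** Hypothetical result holder: predicate-to-id map and the events held in memory. */
    static final class Result {
        final Map<String, Integer> predIndex;
        final List<Indexed> events;
        Result(Map<String, Integer> predIndex, List<Indexed> events) {
            this.predIndex = predIndex;
            this.events = events;
        }
    }

    /** Each input line is assumed to be "outcome pred1 pred2 ...", separated by single spaces. */
    static Result index(Iterator<String> events, int cutoff) {
        try {
            Path tmp = Files.createTempFile("events", ".tmp");
            try {
                // Pass 1: count predicates and spool every event to the temporary file.
                Map<String, Integer> counts = new LinkedHashMap<>();
                try (BufferedWriter out = Files.newBufferedWriter(tmp, StandardCharsets.UTF_8)) {
                    while (events.hasNext()) {
                        String line = events.next();
                        String[] tokens = line.split(" ");
                        for (int i = 1; i < tokens.length; i++) {
                            counts.merge(tokens[i], 1, Integer::sum);
                        }
                        out.write(line);
                        out.newLine();
                    }
                }

                // Assign ids only to predicates seen at least `cutoff` times.
                Map<String, Integer> predIndex = new LinkedHashMap<>();
                for (Map.Entry<String, Integer> e : counts.entrySet()) {
                    if (e.getValue() >= cutoff) {
                        predIndex.put(e.getKey(), predIndex.size());
                    }
                }

                // Pass 2: re-read the file and keep only the surviving predicates in memory.
                List<Indexed> indexed = new ArrayList<>();
                try (BufferedReader in = Files.newBufferedReader(tmp, StandardCharsets.UTF_8)) {
                    String line;
                    while ((line = in.readLine()) != null) {
                        String[] tokens = line.split(" ");
                        int[] ids = Arrays.stream(tokens, 1, tokens.length)
                                .filter(predIndex::containsKey)
                                .mapToInt(predIndex::get)
                                .toArray();
                        if (ids.length > 0) { // assumption of this sketch: drop empty events
                            indexed.add(new Indexed(tokens[0], ids));
                        }
                    }
                }
                return new Result(predIndex, indexed);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Result r = index(Arrays.asList("A x y", "B x z", "A x y").iterator(), 2);
        System.out.println(r.predIndex.keySet() + " kept, " + r.events.size() + " events");
    }
}
```

The input is taken as an Iterator so that, like a stream, it can be consumed only once; the temporary file is what makes the second pass possible without holding the unfiltered events in memory.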
Constructor Summary
Constructors
TwoPassDataIndexer(ObjectStream<Event> eventStream)
    One-argument constructor for DataIndexer which calls the two-argument constructor assuming no cutoff.
TwoPassDataIndexer(ObjectStream<Event> eventStream, int cutoff)
TwoPassDataIndexer(ObjectStream<Event> eventStream, int cutoff, boolean sort)
    Constructor for DataIndexer taking an event stream, a cutoff, and a sort flag.
Method Summary
Methods inherited from class opennlp.tools.ml.model.AbstractDataIndexer
getContexts, getNumEvents, getNumTimesEventsSeen, getOutcomeLabels, getOutcomeList, getPredCounts, getPredLabels, getValues
Constructor Detail
TwoPassDataIndexer
public TwoPassDataIndexer(ObjectStream<Event> eventStream) throws java.io.IOException
One-argument constructor for DataIndexer which calls the two-argument constructor assuming no cutoff.
Parameters:
eventStream - An ObjectStream of all the Events seen in the training data.
Throws:
java.io.IOException

TwoPassDataIndexer
public TwoPassDataIndexer(ObjectStream<Event> eventStream, int cutoff) throws java.io.IOException
Throws:
java.io.IOException

TwoPassDataIndexer
public TwoPassDataIndexer(ObjectStream<Event> eventStream, int cutoff, boolean sort) throws java.io.IOException
Constructor for DataIndexer taking an event stream, a cutoff, and a sort flag.
Parameters:
eventStream - An ObjectStream of all the Events seen in the training data.
cutoff - The minimum number of times a predicate must have been observed in order to be included in the model.
Throws:
java.io.IOException