Class DocumentProcessor
- All Implemented Interfaces:
com.yahoo.component.Component, com.yahoo.component.Deconstructable, Comparable<com.yahoo.component.Component>
- Direct Known Subclasses:
SimpleDocumentProcessor
A document processor is a component that performs some operation on a document or document update. Document processors are asynchronous: they may request some data and then return. The processing framework is responsible for calling processors again, at unspecified times, until they are done processing the document or document update.
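The call-again behavior described above can be illustrated with a small stand-alone sketch. It does not use the Vespa classes: `SketchProgress`, `SketchProcessing`, `LookupProcessor` and `AsyncSketch` are hypothetical stand-ins invented for this illustration, and the driver loop only imitates what the framework might do.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the outcome of a process call; not the Vespa type.
enum SketchProgress { DONE, LATER }

// Hypothetical stand-in for a processing that holds some documents.
final class SketchProcessing {
    final List<String> documents = new ArrayList<>();
    boolean lookupRequested;   // set once the processor has asked for data
    String lookupResult;       // stays null until the requested data arrives
}

// Requests data on the first call and returns; finishes once the data is present.
final class LookupProcessor {
    SketchProgress process(SketchProcessing processing) {
        if (!processing.lookupRequested) {
            processing.lookupRequested = true;   // a real processor would start an async request here
            return SketchProgress.LATER;
        }
        if (processing.lookupResult == null) {
            return SketchProgress.LATER;         // data not here yet, ask to be called again
        }
        processing.documents.replaceAll(d -> d + ":" + processing.lookupResult);
        return SketchProgress.DONE;
    }
}

// Minimal driver playing the role of the framework in this sketch.
final class AsyncSketch {
    static int runUntilDone(LookupProcessor processor, SketchProcessing processing) {
        int calls = 0;
        while (true) {
            calls++;
            if (processor.process(processing) == SketchProgress.DONE) {
                return calls;
            }
            if (calls == 2) {
                processing.lookupResult = "enriched";   // simulate the data arriving between calls
            }
        }
    }

    public static void main(String[] args) {
        SketchProcessing processing = new SketchProcessing();
        processing.documents.add("doc1");
        int calls = runUntilDone(new LookupProcessor(), processing);
        System.out.println(calls + " calls, documents = " + processing.documents);
    }
}
```

The processor keeps its per-processing state on the processing object rather than in its own fields, so the same instance can serve several processings at once.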
Document processor instances are chained together by the framework to realize a complete processing pipeline. The processing chain is represented by the processor instances themselves; see getNext/setNext. Document processors may optionally control the routing through the chain by setting the next processor on ongoing processings.
A processing may contain one or more documents or document updates. Document processors may optionally handle collections of documents or document updates in some other way than just processing each one in order.
A document processor must have an empty constructor. When instantiated from Vespa config (as opposed to being instantiated programmatically in a stand-alone Docproc system), the framework is responsible for configuring the processor using setConfig(). If a document processor wants to do some initial setup after configuration has been set, but before it has begun processing documents or document updates, it should override initialize().
Document processors must be thread safe. To ensure this, make sure that access to any mutable, thread-unsafe state held in a field by the processor is synchronized.
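As a minimal sketch of the synchronization rule above, the following stand-alone class guards a thread-unsafe `HashMap` field with a single lock. `CountingProcessorSketch` and `ThreadSafetySketch` are hypothetical names used only for this illustration; the class does not extend the real DocumentProcessor.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical processor-like class holding mutable, thread-unsafe state in a field.
final class CountingProcessorSketch {
    // HashMap is not thread safe, so every access goes through the same lock.
    private final Map<String, Integer> seenPerType = new HashMap<>();

    void record(String docType) {
        synchronized (seenPerType) {
            seenPerType.merge(docType, 1, Integer::sum);
        }
    }

    int count(String docType) {
        synchronized (seenPerType) {
            return seenPerType.getOrDefault(docType, 0);
        }
    }
}

// Calls record() from several threads at once and returns the final count.
final class ThreadSafetySketch {
    static int hammer(int threads, int perThread) {
        CountingProcessorSketch processor = new CountingProcessorSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    processor.record("music");
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return processor.count("music");
    }

    public static void main(String[] args) {
        System.out.println(hammer(4, 1000));
    }
}
```

Because both the write and the read take the same lock, no increment is lost however the threads interleave. A concurrent collection or an atomic counter would be a reasonable alternative when the state is that simple.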
- Author:
bratseth
Nested Class Summary
Modifier and Type | Class | Description
static final class | DocumentProcessor.LaterProgress |
static class | DocumentProcessor.Progress | An enumeration of possible results of calling a process method
Field Summary
Fields inherited from class com.yahoo.component.AbstractComponent
isDeconstructable
Constructor Summary
DocumentProcessor()
Method Summary
Modifier and Type | Method | Description
 | getDocMap |
 | getFieldMap | Schema map for field names (doctype,from)→to
abstract DocumentProcessor.Progress | process(Processing processing) | Processes a processing, which can contain zero or more document bases.
void | setFieldMap(Map<com.yahoo.collections.Pair<String, String>, String> fieldMap) | Sets the schema map for field names
String | toString() |
Methods inherited from class com.yahoo.component.chain.ChainedComponent
getAnnotatedDependencies, getDefaultAnnotatedDependencies, getDependencies, initDependencies
Methods inherited from class com.yahoo.component.AbstractComponent
clone, compareTo, deconstruct, getClassName, getId, getIdString, hasInitializedId, initId, isDeconstructable, setIsDeconstructable
Constructor Details
DocumentProcessor
public DocumentProcessor()
Method Details
process
abstract DocumentProcessor.Progress process(Processing processing)
Processes a processing, which can contain zero or more document bases. The implementing document processor is free to modify, replace or delete elements in the list inside processing.
- Parameters:
processing - the processing to process
- Returns:
the outcome of this processing
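To make the "modify, replace or delete" freedom concrete, here is a stand-alone sketch of a process-style method that edits the list in place. `OutcomeSketch`, `ProcessingSketch` and `FilteringProcessorSketch` are hypothetical stand-ins, plain strings take the place of document bases, and the `old:` prefix rule is invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ListIterator;

// Hypothetical stand-in for the outcome type; not the Vespa DocumentProcessor.Progress.
enum OutcomeSketch { DONE }

// Hypothetical stand-in for a processing; strings play the role of document bases.
final class ProcessingSketch {
    final List<String> documentBases = new ArrayList<>();
}

final class FilteringProcessorSketch {
    // Walks the list once, deleting empty entries and replacing entries marked "old:".
    OutcomeSketch process(ProcessingSketch processing) {
        ListIterator<String> it = processing.documentBases.listIterator();
        while (it.hasNext()) {
            String doc = it.next();
            if (doc.isEmpty()) {
                it.remove();                          // delete an element
            } else if (doc.startsWith("old:")) {
                it.set("new:" + doc.substring(4));    // replace an element
            }
        }
        return OutcomeSketch.DONE;                    // also reached when the list is empty
    }
}
```

Using a `ListIterator` allows removal and replacement during a single pass without a `ConcurrentModificationException`, and the method behaves sensibly for a processing with zero elements.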
setFieldMap
Sets the schema map for field names
getFieldMap
Schema map for field names (doctype,from)→to
getDocMap
toString
- Overrides:
toString in class com.yahoo.component.AbstractComponent