Type Parameters:
InputT - the type of the DoFn's (main) input elements
OutputT - the type of the DoFn's (main) output elements

public class DoFnTester<InputT,OutputT> extends Object

A harness for unit-testing a DoFn.
For example:
DoFn<InputT, OutputT> fn = ...;
DoFnTester<InputT, OutputT> fnTester = DoFnTester.of(fn);
// Set arguments shared across all batches:
fnTester.setSideInputs(...); // If fn takes side inputs.
fnTester.setSideOutputTags(...); // If fn writes to side outputs.
// Process a batch containing a single input element:
InputT testInput = ...;
List<OutputT> testOutputs = fnTester.processBatch(testInput);
Assert.assertThat(testOutputs, Matchers.hasItems(...));
// Process a bigger batch:
Assert.assertThat(fnTester.processBatch(i1, i2, ...), Matchers.hasItems(...));
| Modifier and Type | Class and Description |
|---|---|
| `static class` | `DoFnTester.CloningBehavior`: Whether or not a DoFnTester should clone the DoFn under test. |
| `static class` | `DoFnTester.OutputElementWithTimestamp<OutputT>`: Holder for an OutputElement along with its associated timestamp. |

| Modifier and Type | Method and Description |
|---|---|
| `void` | `clearOutputElements()`: Clears the record of the elements output so far to the main output. |
| `<T> void` | `clearSideOutputElements(TupleTag<T> tag)`: Clears the record of the elements output so far to the side output with the given tag. |
| `void` | `finishBundle()`: Calls `DoFn.finishBundle(com.google.cloud.dataflow.sdk.transforms.DoFn<InputT, OutputT>.Context)` of the DoFn under test. |
| `<AggregateT> AggregateT` | `getAggregatorValue(Aggregator<?,AggregateT> agg)`: Returns the value of the provided Aggregator. |
| `DoFnTester.CloningBehavior` | `getCloningBehavior()`: Indicates whether this DoFnTester will clone the DoFn under test. |
| `static <InputT,OutputT> DoFnTester<InputT,OutputT>` | `of(DoFn<InputT,OutputT> fn)`: Returns a DoFnTester supporting unit-testing of the given DoFn. |
| `static <InputT,OutputT> DoFnTester<InputT,OutputT>` | `of(DoFnWithContext<InputT,OutputT> fn)`: Returns a DoFnTester supporting unit-testing of the given DoFn. |
| `List<OutputT>` | `peekOutputElements()`: Returns the elements output so far to the main output. |
| `List<DoFnTester.OutputElementWithTimestamp<OutputT>>` | `peekOutputElementsWithTimestamp()`: Returns the elements output so far to the main output with associated timestamps. |
| `<T> List<T>` | `peekSideOutputElements(TupleTag<T> tag)`: Returns the elements output so far to the side output with the given tag. |
| `List<OutputT>` | `processBatch(InputT... inputElements)`: A convenience method for testing DoFns with bundles of elements. |
| `List<OutputT>` | `processBatch(Iterable<? extends InputT> inputElements)`: A convenience operation that first calls `startBundle()`, then calls `processElement(InputT)` on each of the input elements, then calls `finishBundle()`, then returns the result of `takeOutputElements()`. |
| `void` | `processElement(InputT element)`: Calls `DoFn.processElement(com.google.cloud.dataflow.sdk.transforms.DoFn<InputT, OutputT>.ProcessContext)` on the DoFn under test, in a context where `DoFn.ProcessContext.element()` returns the given element. |
| `void` | `setCloningBehavior(DoFnTester.CloningBehavior newValue)`: Instructs this DoFnTester whether or not to clone the DoFn under test. |
| `void` | `setSideInput(PCollectionView<?> sideInput, Iterable<com.google.cloud.dataflow.sdk.util.WindowedValue<?>> value)`: Registers the values of a side input PCollectionView to pass to the DoFn under test. |
| `void` | `setSideInputInGlobalWindow(PCollectionView<?> sideInput, Iterable<?> value)`: Registers the values for a side input PCollectionView to pass to the DoFn under test. |
| `void` | `setSideInputs(Map<PCollectionView<?>,Iterable<com.google.cloud.dataflow.sdk.util.WindowedValue<?>>> sideInputs)`: Registers the tuple of values of the side input PCollectionViews to pass to the DoFn under test. |
| `void` | `setSideOutputTags(TupleTagList sideOutputTags)`: Registers the list of TupleTags that can be used by the DoFn under test to output to side output PCollections. |
| `void` | `startBundle()`: Calls `DoFn.startBundle(com.google.cloud.dataflow.sdk.transforms.DoFn<InputT, OutputT>.Context)` on the DoFn under test. |
| `List<OutputT>` | `takeOutputElements()`: Returns the elements output so far to the main output. |
| `List<DoFnTester.OutputElementWithTimestamp<OutputT>>` | `takeOutputElementsWithTimestamp()`: Returns the elements output so far to the main output with associated timestamps. |
| `<T> List<T>` | `takeSideOutputElements(TupleTag<T> tag)`: Returns the elements output so far to the side output with the given tag. |

public static <InputT,OutputT> DoFnTester<InputT,OutputT> of(DoFn<InputT,OutputT> fn)
Returns a DoFnTester supporting unit-testing of the given DoFn.

public static <InputT,OutputT> DoFnTester<InputT,OutputT> of(DoFnWithContext<InputT,OutputT> fn)
Returns a DoFnTester supporting unit-testing of the given DoFn.

public void setSideInputs(Map<PCollectionView<?>,Iterable<com.google.cloud.dataflow.sdk.util.WindowedValue<?>>> sideInputs)
Registers the tuple of values of the side input PCollectionViews to pass to the DoFn under test.
If needed, first creates a fresh instance of the DoFn under test.
If this isn't called, DoFnTester assumes the DoFn takes no side inputs.

public void setSideInput(PCollectionView<?> sideInput, Iterable<com.google.cloud.dataflow.sdk.util.WindowedValue<?>> value)
Registers the values of a side input PCollectionView to pass to the DoFn under test.
If needed, first creates a fresh instance of the DoFn under test.
If this isn't called, DoFnTester assumes the DoFn takes no side inputs.

public void setSideInputInGlobalWindow(PCollectionView<?> sideInput, Iterable<?> value)
Registers the values for a side input PCollectionView to pass to the DoFn under test. All values are placed in the global window.

public void setSideOutputTags(TupleTagList sideOutputTags)
Registers the list of TupleTags that can be used by the DoFn under test to output to side output PCollections.
If needed, first creates a fresh instance of the DoFn under test.
If this isn't called, DoFnTester assumes the DoFn doesn't emit to any side outputs.

public void setCloningBehavior(DoFnTester.CloningBehavior newValue)
Instructs this DoFnTester whether or not to clone the DoFn under test.

public DoFnTester.CloningBehavior getCloningBehavior()
Indicates whether this DoFnTester will clone the DoFn under test.

public List<OutputT> processBatch(Iterable<? extends InputT> inputElements)
A convenience operation that first calls startBundle(), then calls processElement(InputT) on each of the input elements, then calls finishBundle(), then returns the result of takeOutputElements().

@SafeVarargs public final List<OutputT> processBatch(InputT... inputElements)
A convenience method for testing DoFns with bundles of elements.
Logic proceeds as follows:
1. Calls startBundle().
2. Calls processElement(InputT) on each of the arguments.
3. Calls finishBundle().
4. Returns the result of takeOutputElements().

public void startBundle()
Calls DoFn.startBundle(com.google.cloud.dataflow.sdk.transforms.DoFn<InputT, OutputT>.Context) on the DoFn under test.
If needed, first creates a fresh instance of the DoFn under test.
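
The four-step processBatch sequence described above can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: `BatchSketch` and its nested `SketchFn` interface are hypothetical names invented here, they do not use the SDK types, and they model nothing beyond the documented call order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the documented processBatch sequence; not SDK code.
final class BatchSketch {

    // Hypothetical stand-in for a DoFn: each hook may append to the output record.
    interface SketchFn<I, O> {
        void startBundle(List<O> out);
        void processElement(I element, List<O> out);
        void finishBundle(List<O> out);
    }

    static <I, O> List<O> processBatch(SketchFn<I, O> fn, List<I> inputs) {
        List<O> recorded = new ArrayList<>();
        fn.startBundle(recorded);                   // 1. start the bundle
        for (I input : inputs) {
            fn.processElement(input, recorded);     // 2. process each input in order
        }
        fn.finishBundle(recorded);                  // 3. finish the bundle
        List<O> taken = new ArrayList<>(recorded);  // 4. "take": copy the record...
        recorded.clear();                           //    ...and clear it
        return taken;
    }

    // Example function that marks bundle boundaries and tags each element.
    static List<String> demo() {
        SketchFn<String, String> fn = new SketchFn<String, String>() {
            @Override public void startBundle(List<String> out) { out.add("start"); }
            @Override public void processElement(String e, List<String> out) { out.add(e + "!"); }
            @Override public void finishBundle(List<String> out) { out.add("finish"); }
        };
        return processBatch(fn, Arrays.asList("a", "b"));
    }
}
```

In this model, `BatchSketch.demo()` returns the elements in the order the hooks ran, which is the property a test built on processBatch relies on: anything emitted in startBundle or finishBundle shows up alongside the per-element outputs.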
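
public void processElement(InputT element)
Calls DoFn.processElement(com.google.cloud.dataflow.sdk.transforms.DoFn<InputT, OutputT>.ProcessContext) on the DoFn under test, in a context where DoFn.ProcessContext.element() returns the given element.
Will call startBundle() automatically, if it hasn't already been called.
Throws:
IllegalStateException - if the DoFn under test has already been finished

public void finishBundle()
Calls DoFn.finishBundle(com.google.cloud.dataflow.sdk.transforms.DoFn<InputT, OutputT>.Context) of the DoFn under test.
Will call startBundle() automatically, if it hasn't already been called.
Throws:
IllegalStateException - if the DoFn under test has already been finished

public List<OutputT> peekOutputElements()
Returns the elements output so far to the main output. Does not clear them, so subsequent calls will continue to include these elements.
See Also:
takeOutputElements(), clearOutputElements()

@Experimental public List<DoFnTester.OutputElementWithTimestamp<OutputT>> peekOutputElementsWithTimestamp()
Returns the elements output so far to the main output with associated timestamps.

public void clearOutputElements()
Clears the record of the elements output so far to the main output.
See Also:
peekOutputElements()

public List<OutputT> takeOutputElements()
Returns the elements output so far to the main output. Clears the list so these elements don't appear in future calls.
See Also:
peekOutputElements()

@Experimental public List<DoFnTester.OutputElementWithTimestamp<OutputT>> takeOutputElementsWithTimestamp()
Returns the elements output so far to the main output with associated timestamps.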
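
The difference between the peek, clear, and take accessors, and the IllegalStateException raised once a bundle has been finished, can be modeled in a few lines. The class below is a hypothetical stand-in (`OutputRecordSketch` is a name made up for this illustration); it is not the SDK implementation and assumes that "take" behaves as a peek followed by a clear, as the method names and the See Also links suggest.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the tester's main-output record and bundle state; not SDK code.
final class OutputRecordSketch<T> {
    private final List<T> outputs = new ArrayList<>();
    private boolean finished = false;

    // Models processElement: here the element is simply recorded as output.
    void processElement(T element) {
        if (finished) {
            throw new IllegalStateException("bundle already finished");
        }
        outputs.add(element);
    }

    // Models finishBundle: a second call is rejected in the same way.
    void finishBundle() {
        if (finished) {
            throw new IllegalStateException("bundle already finished");
        }
        finished = true;
    }

    // peek: returns a copy and leaves the record untouched.
    List<T> peek() {
        return new ArrayList<>(outputs);
    }

    // clear: drops everything recorded so far.
    void clear() {
        outputs.clear();
    }

    // take: assumed here to be peek followed by clear.
    List<T> take() {
        List<T> result = peek();
        clear();
        return result;
    }
}
```

A test that asserts on outputs in several stages would typically call the take variant between stages so that each assertion sees only the newly emitted elements, and the peek variant when it wants a cumulative view.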
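
public <T> List<T> peekSideOutputElements(TupleTag<T> tag)
Returns the elements output so far to the side output with the given tag. Does not clear them, so subsequent calls will continue to include these elements.

public <T> void clearSideOutputElements(TupleTag<T> tag)
Clears the record of the elements output so far to the side output with the given tag.

public <T> List<T> takeSideOutputElements(TupleTag<T> tag)
Returns the elements output so far to the side output with the given tag. Clears the list so these elements don't appear in future calls.

public <AggregateT> AggregateT getAggregatorValue(Aggregator<?,AggregateT> agg)
Returns the value of the provided Aggregator.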
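
The side-output accessors above take a typed tag, which suggests one output record per tag. A minimal sketch of that idea follows; `SideOutputSketch` and its nested `Tag<T>` are hypothetical names (only loosely analogous to TupleTag<T>), and the per-tag map is an assumption made for illustration rather than a description of the SDK's internals.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of per-tag side-output records; not SDK code.
final class SideOutputSketch {

    // Hypothetical typed tag; object identity serves as the map key.
    static final class Tag<T> {
        final String id;
        Tag(String id) { this.id = id; }
    }

    private final Map<Tag<?>, List<Object>> records = new HashMap<>();

    // Records a value under the given tag.
    <T> void output(Tag<T> tag, T value) {
        records.computeIfAbsent(tag, k -> new ArrayList<>()).add(value);
    }

    // peek: returns a typed copy of the record for this tag (empty if none).
    <T> List<T> peek(Tag<T> tag) {
        List<T> copy = new ArrayList<>();
        List<Object> recorded = records.get(tag);
        if (recorded != null) {
            for (Object o : recorded) {
                @SuppressWarnings("unchecked") T typed = (T) o; // safe: only output(tag, T) writes here
                copy.add(typed);
            }
        }
        return copy;
    }

    // clear: drops the record for this tag only.
    <T> void clear(Tag<T> tag) {
        records.remove(tag);
    }

    // take: assumed to be peek followed by clear, for this tag only.
    <T> List<T> take(Tag<T> tag) {
        List<T> result = peek(tag);
        clear(tag);
        return result;
    }
}
```

Because each tag has its own record in this model, taking or clearing one side output leaves the others untouched, so a test can assert on each side output independently.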