public class TestPartitionLimitBatch extends BaseTestOpBatchEmitOutcome
PhysicalOpUnitTestBase.BatchIterator, PhysicalOpUnitTestBase.MockExecutorFragmentContext, PhysicalOpUnitTestBase.MockPhysicalOperator

emptyInputRowSet, inputContainer, inputContainerSv2, inputOutcomes, inputSchema, nonEmptyInputRowSet, outputRecordCount

classpathScan, dirTestWatcher, drillbitContext, drillConf, fragContext, opContext, opCreatorReg, operatorFixture, scanDecodeExecutor, scanExecutor

c, optionManager

| Constructor and Description |
|---|
| TestPartitionLimitBatch() |

| Modifier and Type | Method and Description |
|---|---|
| void | afterTestCleanup()<br>Cleanup method executed after each test. |
| static void | partitionLimitSetup() |
| void | testPartitionLimit_AcrossEmitOutcome()<br>Verifies that PartitionLimitRecordBatch refreshes its state after seeing the first EMIT outcome and treats the data batches that follow as a new set of incoming batches, applying the partition limit rule afresh to them. |
| void | testPartitionLimit_EmptyBatchEmitOutcome()<br>Verifies that an empty batch with both OK_NEW_SCHEMA and EMIT outcome is not ignored by PartitionLimitRecordBatch and is passed to the downstream operator. |
| void | testPartitionLimit_IgnoreOnePartitionIdWithOffset()<br>Verifies PartitionLimitRecordBatch works correctly in cases where the start offset is such that all the records of one partition id are ignored but records in another partition id are selected. |
| void | testPartitionLimit_LargeOffsetIgnoreAllRecords() |
| void | testPartitionLimit_Limit0()<br>Verifies PartitionLimitRecordBatch works correctly when the start and end offsets are the same. |
| void | testPartitionLimit_MultipleEmit_SingleMultipleBatch_WithSV2_FilteredRows()<br>Verifies PartitionLimitRecordBatch behaves correctly across EMIT boundaries with single or multiple batches (with sv2) within each EMIT boundary. |
| void | testPartitionLimit_MultipleEmit_SingleMultipleBatch_WithSV2()<br>Verifies PartitionLimitRecordBatch behaves correctly across EMIT boundaries with single or multiple batches (with sv2) within each EMIT boundary. |
| void | testPartitionLimit_MultipleEmit_SingleMultipleBatch()<br>Verifies PartitionLimitRecordBatch behaves correctly across EMIT boundaries with single or multiple batches within each EMIT boundary. |
| void | testPartitionLimit_NegativeOffset()<br>Verifies PartitionLimitRecordBatch handles a provided negative start offset correctly. |
| void | testPartitionLimit_NoLimit()<br>Verifies PartitionLimitRecordBatch works correctly for cases where no end offset is specified. |
| void | testPartitionLimit_NonEmptyBatchEmitOutcome()<br>Verifies PartitionLimitRecordBatch considers all the batches until it sees an EMIT outcome and returns an output batch with data that meets the PartitionLimitRecordBatch criteria. |
| void | testPartitionLimit_NonEmptyFirst_EmptyOKEmitOutcome()<br>Verifies that when the PartitionLimitRecordBatch number of records is found with the first incoming batch, the next empty incoming batch with OK outcome is ignored, but the empty EMIT outcome batch is not ignored. |
| void | testPartitionLimit_PartitionIdSelectedAcrossBatches()<br>Verifies PartitionLimitRecordBatch works correctly in cases where a partition id spans across batches and the limit condition is met by picking records from multiple batches for the same partition id. |
| void | testPartitionLimit_PartitionIdSpanningAcrossBatches_WithOffset() |
| void | testPartitionLimit_PartitionIdSpanningAcrossBatches()<br>Verifies that PartitionLimitRecordBatch considers the same partition id across batches, but within an EMIT boundary, to impose the limit condition. |
| void | testPartitionLimit_ResetsAfterFirstEmitOutcome()<br>Verifies that PartitionLimitRecordBatch operates on batches across an EMIT boundary with fresh configuration. |

afterTest, beforeTest, setUpBeforeClass

getJsonReadersFromBatchString, getJsonReadersFromInputFiles, getOpCreatorReg, getReaderListForJsonBatches, joinCond, legacyOpTestBuilder, mockFragmentContext, mockOpContext, opTestBuilder, ordering, parseExprs, setup, teardown

clear, getLocalFileSystem, mockDrillbitContext, mockUsDateFormatSymbols, mockUtcDateTimeZone, parseExpr, setupOptionManager

public static void partitionLimitSetup()

public void afterTestCleanup()
Cleanup method executed after each test.

public void testPartitionLimit_EmptyBatchEmitOutcome()
Verifies that an empty batch with both OK_NEW_SCHEMA and EMIT outcome is not ignored by PartitionLimitRecordBatch and is passed to the downstream operator.

public void testPartitionLimit_NonEmptyBatchEmitOutcome()
Verifies PartitionLimitRecordBatch considers all the batches until it sees an EMIT outcome and returns an output batch with data that meets the PartitionLimitRecordBatch criteria.

public void testPartitionLimit_ResetsAfterFirstEmitOutcome()
Verifies that PartitionLimitRecordBatch operates on batches across an EMIT boundary with fresh configuration. That is, it considers partition column data separately for batches on either side of an EMIT boundary.

public void testPartitionLimit_NonEmptyFirst_EmptyOKEmitOutcome()
Verifies that when the PartitionLimitRecordBatch number of records is found with the first incoming batch, the next empty incoming batch with OK outcome is ignored, but the empty EMIT outcome batch is not ignored. An empty incoming batch with EMIT outcome produces an empty output batch with EMIT outcome.

public void testPartitionLimit_AcrossEmitOutcome()
Verifies that PartitionLimitRecordBatch refreshes its state after seeing the first EMIT outcome and treats the data batches that follow as a new set of incoming batches, applying the partition limit rule afresh to them. For the first set of batches, with OK_NEW_SCHEMA and EMIT outcome, the total number of records received is less than the limit, yet it still produces an output with that many records for each partition key (in this case 1, even though the limit is 2 records). After seeing EMIT, it refreshes its state and operates on the next input batches to again return up to the limit number of records per partition id. So for the 3rd batch, with 6 records, 3 partition ids, and EMIT outcome, it produces an output batch with <=2 records for each partition id.

public void testPartitionLimit_PartitionIdSpanningAcrossBatches()
Verifies that PartitionLimitRecordBatch considers the same partition id across batches, but within an EMIT boundary, to impose the limit condition.

public void testPartitionLimit_PartitionIdSpanningAcrossBatches_WithOffset()
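
The EMIT-boundary behavior these tests describe (per-partition counts carried across batches, then cleared after an EMIT outcome) can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not Drill's implementation: the class `EmitBoundaryLimitSketch`, its `process` method, and the representation of a batch as an array of partition ids are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of a per-partition limit whose state resets at EMIT boundaries. */
public class EmitBoundaryLimitSketch {
    private final int limit;                                     // max rows kept per partition id
    private final Map<Integer, Integer> kept = new HashMap<>();  // rows kept so far, per partition id

    public EmitBoundaryLimitSketch(int limit) {
        this.limit = limit;
    }

    /**
     * Filters one incoming batch, given as an array of partition ids (one per row),
     * and returns the indexes of the rows that are kept. Counts carry over between
     * batches until a batch arrives with emit == true, after which they are cleared.
     */
    public List<Integer> process(int[] partitionIds, boolean emit) {
        List<Integer> selectedRows = new ArrayList<>();
        for (int row = 0; row < partitionIds.length; row++) {
            int count = kept.getOrDefault(partitionIds[row], 0);
            if (count < limit) {
                kept.put(partitionIds[row], count + 1);
                selectedRows.add(row);
            }
        }
        if (emit) {
            kept.clear(); // the next batch starts a fresh set of incoming data
        }
        return selectedRows;
    }

    public static void main(String[] args) {
        EmitBoundaryLimitSketch op = new EmitBoundaryLimitSketch(2);
        // Partition id 1 reaches the limit in the first batch ...
        System.out.println(op.process(new int[] {1, 1}, false));
        // ... so its row in the next batch (same EMIT boundary) is dropped.
        System.out.println(op.process(new int[] {1, 2}, true));
        // After EMIT the counts are reset, so partition id 1 is selected again.
        System.out.println(op.process(new int[] {1, 1, 1}, true));
    }
}
```

In this model an empty batch with `emit == true` returns an empty selection and still resets the counts, which loosely mirrors the "empty EMIT batch is not ignored" cases above.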

public void testPartitionLimit_PartitionIdSelectedAcrossBatches()
Verifies PartitionLimitRecordBatch works correctly in cases where a partition id spans across batches and the limit condition is met by picking records from multiple batches for the same partition id.

public void testPartitionLimit_IgnoreOnePartitionIdWithOffset()
Verifies PartitionLimitRecordBatch works correctly in cases where the start offset is such that all the records of one partition id are ignored but records in another partition id are selected.

public void testPartitionLimit_LargeOffsetIgnoreAllRecords()
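
The offset-related cases in this class (an offset that skips a whole partition, an offset larger than every partition, equal start and end offsets, no end offset, and a negative start offset) can be modeled as a half-open window `[start, end)` applied to each row's position within its partition. The sketch below is an assumption-laden illustration, not the operator's code: `PartitionOffsetSketch` and `select` are hypothetical names, and both the half-open window and the clamping of a negative start to 0 are assumptions rather than behavior confirmed by this page.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of a per-partition [start, end) offset window. */
public class PartitionOffsetSketch {
    /**
     * Returns the indexes of rows whose 0-based position within their own
     * partition falls in [start, end). A null end means "no end offset";
     * a negative start is treated as 0 (an assumption of this sketch).
     */
    public static List<Integer> select(int[] partitionIds, int start, Integer end) {
        int first = Math.max(0, start);
        Map<Integer, Integer> seen = new HashMap<>(); // rows seen so far, per partition id
        List<Integer> selected = new ArrayList<>();
        for (int row = 0; row < partitionIds.length; row++) {
            // merge returns the updated count, so subtract 1 for the 0-based position
            int position = seen.merge(partitionIds[row], 1, Integer::sum) - 1;
            if (position >= first && (end == null || position < end)) {
                selected.add(row);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        int[] ids = {1, 1, 1, 2};
        System.out.println(select(ids, 1, 3));    // partition 2 has one row, skipped by the offset
        System.out.println(select(ids, 5, 10));   // offset larger than every partition
        System.out.println(select(ids, 1, 1));    // start == end, the "Limit 0" case
        System.out.println(select(ids, 0, null)); // no end offset, every row is kept
        System.out.println(select(ids, -3, 2));   // negative start behaves like 0
    }
}
```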

public void testPartitionLimit_Limit0()
Verifies PartitionLimitRecordBatch works correctly when the start and end offsets are the same. In this case it works as a Limit 0 scenario, where it will not output any rows for any partition id across batches.

public void testPartitionLimit_NoLimit()
Verifies PartitionLimitRecordBatch works correctly for cases where no end offset is specified. This necessarily means selecting all the records in a partition.

public void testPartitionLimit_NegativeOffset()
Verifies PartitionLimitRecordBatch handles a provided negative start offset correctly.

public void testPartitionLimit_MultipleEmit_SingleMultipleBatch()
Verifies PartitionLimitRecordBatch behaves correctly across EMIT boundaries with single or multiple batches within each EMIT boundary. It resets its state correctly across each EMIT boundary and then operates on all the batches within an EMIT boundary at a time.

public void testPartitionLimit_MultipleEmit_SingleMultipleBatch_WithSV2()
Verifies PartitionLimitRecordBatch behaves correctly across EMIT boundaries with single or multiple batches (with sv2) within each EMIT boundary. It resets its state correctly across each EMIT boundary and then operates on all the batches within an EMIT boundary at a time.

public void testPartitionLimit_MultipleEmit_SingleMultipleBatch_WithSV2_FilteredRows()
Verifies PartitionLimitRecordBatch behaves correctly across EMIT boundaries with single or multiple batches (with sv2) within each EMIT boundary. It resets its state correctly across each EMIT boundary and then operates on all the batches within an EMIT boundary at a time.

Copyright © 2022 The Apache Software Foundation. All rights reserved.