Class DefaultClusteredGroupPartitioner
- java.lang.Object
  - org.apache.druid.query.rowsandcols.semantic.DefaultClusteredGroupPartitioner
- All Implemented Interfaces:
ClusteredGroupPartitioner
public class DefaultClusteredGroupPartitioner extends Object implements ClusteredGroupPartitioner
Constructor Summary
Constructors
DefaultClusteredGroupPartitioner(RowsAndColumns rac)
Method Summary
Modifier and Type: int[]
Method: computeBoundaries(List<String> columns)
Description: Computes and returns a list of contiguous boundaries for independent groups.
Modifier and Type: ArrayList<RowsAndColumns>
Method: partitionOnBoundaries(List<String> partitionColumns)
Description: Semantically equivalent to computeBoundaries, but returns a list of RowsAndColumns objects instead of just boundary positions.
Constructor Detail
DefaultClusteredGroupPartitioner
public DefaultClusteredGroupPartitioner(RowsAndColumns rac)
Method Detail
computeBoundaries
public int[] computeBoundaries(List<String> columns)
Description copied from interface: ClusteredGroupPartitioner
Computes and returns the contiguous boundaries of independent groups. All rows in a given group should have the same values for the identified columns. Because this method assumes that the data is clustered, each distinct set of column values should produce only a single entry in the return value.
Note that implementations are not expected to validate that the data is pre-clustered, nor to detect that the same cluster appears non-contiguously. It is up to the caller to ensure that the data is clustered correctly before invoking this method.
- Specified by:
computeBoundaries in interface ClusteredGroupPartitioner
- Parameters:
columns - the columns to partition on
- Returns:
an int[] representing the start (inclusive) and stop (exclusive) offsets of boundaries. Boundaries are contiguous, so the stop of the previous boundary is the start of the subsequent one.
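The boundary contract described above can be illustrated with a minimal, self-contained sketch. It does not use the Druid API: `BoundarySketch`, its `computeBoundaries(List<Object[]>)` helper, and the row representation (one `Object[]` of partition-column values per row) are hypothetical stand-ins, and the empty-input result is a choice made for this sketch only. It assumes the returned array holds N + 1 offsets for N groups, as the "stop of the previous boundary is the start of the subsequent one" wording suggests.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of the Druid API.
class BoundarySketch {
    // Each row is an Object[] holding that row's values for the partition columns.
    // A new boundary is emitted wherever a row differs from the row before it.
    public static int[] computeBoundaries(List<Object[]> rows) {
        if (rows.isEmpty()) {
            return new int[0]; // sketch-only choice for empty input
        }
        List<Integer> boundaries = new ArrayList<>();
        boundaries.add(0); // start (inclusive) of the first group
        for (int i = 1; i < rows.size(); i++) {
            if (!Arrays.equals(rows.get(i - 1), rows.get(i))) {
                boundaries.add(i); // stop of the previous group, start of the next
            }
        }
        boundaries.add(rows.size()); // stop (exclusive) of the last group
        return boundaries.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        // Clustered input: equal rows are adjacent.
        List<Object[]> clustered = Arrays.asList(
            new Object[]{"a", 1}, new Object[]{"a", 1},
            new Object[]{"b", 1},
            new Object[]{"c", 2}, new Object[]{"c", 2});
        System.out.println(Arrays.toString(computeBoundaries(clustered))); // [0, 2, 3, 5]

        // Non-clustered input: no validation happens, so "a" yields two separate groups.
        List<Object[]> unclustered = Arrays.asList(
            new Object[]{"a"}, new Object[]{"b"}, new Object[]{"a"});
        System.out.println(Arrays.toString(computeBoundaries(unclustered))); // [0, 1, 2, 3]
    }
}
```

The second call shows why the caller is responsible for clustering: a single pass that compares adjacent rows cannot tell that the first and third rows belong to the same logical group.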
partitionOnBoundaries
public ArrayList<RowsAndColumns> partitionOnBoundaries(List<String> partitionColumns)
Description copied from interface: ClusteredGroupPartitioner
Semantically equivalent to computeBoundaries, but returns a list of RowsAndColumns objects instead of just boundary positions. This is useful because it allows the concrete implementation to return RowsAndColumns objects that are aware of the internal representation of the data and can therefore provide optimized implementations of other semantic interfaces when the "child" RowsAndColumns are used.
- Specified by:
partitionOnBoundaries in interface ClusteredGroupPartitioner
- Parameters:
partitionColumns - the columns to partition on
- Returns:
a list of RowsAndColumns representing the data grouped by the partition columns.
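As a rough illustration of returning partitions instead of offsets, the sketch below slices a clustered list into one sub-list per group. It is not the Druid implementation: `PartitionSketch` and its helper are hypothetical, a single column of `String` keys stands in for a RowsAndColumns, and `List.subList` views stand in for "child" objects that share the parent's underlying storage rather than copying it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical illustration only; not part of the Druid API.
class PartitionSketch {
    // Splits an already-clustered list of keys into one view per run of equal keys.
    public static ArrayList<List<String>> partitionOnBoundaries(List<String> clusteredKeys) {
        ArrayList<List<String>> partitions = new ArrayList<>();
        int start = 0;
        for (int i = 1; i <= clusteredKeys.size(); i++) {
            boolean atEnd = i == clusteredKeys.size();
            if (atEnd || !Objects.equals(clusteredKeys.get(i), clusteredKeys.get(i - 1))) {
                // subList returns a view backed by the parent list, not a copy
                partitions.add(clusteredKeys.subList(start, i));
                start = i;
            }
        }
        return partitions;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "a", "b", "c", "c");
        System.out.println(partitionOnBoundaries(keys)); // [[a, a], [b], [c, c]]
    }
}
```

Each partition covers exactly the range between two consecutive boundary offsets, so the group sizes here (2, 1, 2) match what a boundary array of 0, 2, 3, 5 would describe.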