Class JoinDataSource
java.lang.Object
    org.apache.druid.query.JoinDataSource
All Implemented Interfaces: DataSource
public class JoinDataSource extends Object implements DataSource
Represents a join of two datasources.
Logically, this datasource contains the result of:
(1) prefixing all right-side columns with "rightPrefix"
(2) then, joining the left and (prefixed) right sides using the provided type and condition
Any columns from the left-hand side that start with "rightPrefix", and are at least one character longer than the prefix, will be shadowed. It is up to the caller to ensure that no important columns are shadowed by the chosen prefix.
When analyzed by DataSourceAnalysis, the right-hand side of this datasource will become a PreJoinableClause object.
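The shadowing rule above can be checked before choosing a prefix. The following is a minimal sketch, assuming only the rule as stated in this description; `PrefixShadowingSketch`, `isShadowed`, and `shadowedColumns` are hypothetical names for illustration and are not part of the Druid API.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Druid: models the shadowing rule described above.
final class PrefixShadowingSketch {
    // A left-side column is shadowed when it starts with the prefix and is
    // at least one character longer than the prefix.
    static boolean isShadowed(String leftColumn, String rightPrefix) {
        return leftColumn.startsWith(rightPrefix)
            && leftColumn.length() > rightPrefix.length();
    }

    // Returns the left-side columns that the chosen prefix would shadow,
    // in the order they were given.
    static Set<String> shadowedColumns(List<String> leftColumns, String rightPrefix) {
        Set<String> shadowed = new LinkedHashSet<>();
        for (String column : leftColumns) {
            if (isShadowed(column, rightPrefix)) {
                shadowed.add(column);
            }
        }
        return shadowed;
    }
}
```

A caller could run `shadowedColumns` over the left-side column names and pick a different prefix whenever the result contains a column the query still needs.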
-
-
Method Summary
Modifier and Type, Method, Description
static JoinDataSource create(DataSource left, DataSource right, String rightPrefix, String condition, JoinType joinType, DimFilter leftFilter, ExprMacroTable macroTable, JoinableFactoryWrapper joinableFactoryWrapper)
    Create a join dataSource from a string condition.
static JoinDataSource create(DataSource left, DataSource right, String rightPrefix, JoinConditionAnalysis conditionAnalysis, JoinType joinType, DimFilter leftFilter, JoinableFactoryWrapper joinableFactoryWrapper)
    Create a join dataSource from an existing JoinConditionAnalysis.
Function<SegmentReference,SegmentReference> createSegmentMapFunction(Query query, AtomicLong cpuTimeAccumulator)
    Returns a function describing how a segment should be modified.
boolean equals(Object o)
DataSourceAnalysis getAnalysis()
    Get the analysis for a data source.
byte[] getCacheKey()
    Compute a cache key prefix for a data source.
List<DataSource> getChildren()
    Returns datasources that this datasource depends on.
String getCondition()
JoinConditionAnalysis getConditionAnalysis()
JoinableFactoryWrapper getJoinableFactoryWrapper()
JoinType getJoinType()
DataSource getLeft()
DimFilter getLeftFilter()
DataSource getRight()
String getRightPrefix()
Set<String> getTableNames()
    Returns the names of all table datasources involved in this query.
Set<String> getVirtualColumnCandidates()
    Computes a set of column names for left table expressions in join condition which may already have been defined as a virtual column in the virtual column registry.
int hashCode()
boolean isCacheable(boolean isBroker)
    Returns true if queries on this dataSource are cacheable at both the result level and per-segment level.
boolean isConcrete()
    Returns true if this datasource can be the base datasource of query processing.
boolean isGlobal()
    Returns true if all servers have a full copy of this datasource.
String toString()
DataSource withChildren(List<DataSource> children)
    Return a new DataSource, identical to this one, with different children.
DataSource withUpdatedDataSource(DataSource newSource)
    Returns an updated datasource based on the specified new source.
-
-
-
Method Detail
-
create
public static JoinDataSource create(DataSource left, DataSource right, String rightPrefix, String condition, JoinType joinType, @Nullable DimFilter leftFilter, ExprMacroTable macroTable, @Nullable JoinableFactoryWrapper joinableFactoryWrapper)
Create a join dataSource from a string condition.
-
create
public static JoinDataSource create(DataSource left, DataSource right, String rightPrefix, JoinConditionAnalysis conditionAnalysis, JoinType joinType, DimFilter leftFilter, @Nullable JoinableFactoryWrapper joinableFactoryWrapper)
Create a join dataSource from an existing JoinConditionAnalysis.
-
getTableNames
public Set<String> getTableNames()
Description copied from interface: DataSource
Returns the names of all table datasources involved in this query. Does not include names for non-tables, like lookups or inline datasources.
Specified by: getTableNames in interface DataSource
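To illustrate the contract, here is a small self-contained model. It is a sketch only: `TableNamesSketch` and its nested types are hypothetical stand-ins rather than Druid classes, and it assumes that a join reports the union of the table names found on its two sides.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-ins for datasources; none of these are Druid classes.
final class TableNamesSketch {
    interface Source {
        Set<String> tableNames();
    }

    // A table contributes its own name.
    static final class Table implements Source {
        private final String name;
        Table(String name) { this.name = name; }
        @Override public Set<String> tableNames() { return Set.of(name); }
    }

    // A non-table (for example a lookup or inline datasource) contributes nothing.
    static final class NonTable implements Source {
        @Override public Set<String> tableNames() { return Set.of(); }
    }

    // Assumption: a join reports the union of the names from both sides.
    static final class Join implements Source {
        private final Source left;
        private final Source right;
        Join(Source left, Source right) { this.left = left; this.right = right; }
        @Override public Set<String> tableNames() {
            Set<String> names = new TreeSet<>(left.tableNames());
            names.addAll(right.tableNames());
            return names;
        }
    }
}
```

In this model, joining a table named "orders" with a non-table right side yields only "orders".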
-
getLeft
public DataSource getLeft()
-
getRight
public DataSource getRight()
-
getRightPrefix
public String getRightPrefix()
-
getCondition
public String getCondition()
-
getConditionAnalysis
public JoinConditionAnalysis getConditionAnalysis()
-
getJoinType
public JoinType getJoinType()
-
getJoinableFactoryWrapper
@Nullable public JoinableFactoryWrapper getJoinableFactoryWrapper()
-
getChildren
public List<DataSource> getChildren()
Description copied from interface: DataSource
Returns datasources that this datasource depends on. Will be empty for leaf datasources like 'table'.
Specified by: getChildren in interface DataSource
-
withChildren
public DataSource withChildren(List<DataSource> children)
Description copied from interface: DataSource
Return a new DataSource, identical to this one, with different children. The number of children must be equal to the number of children that this datasource already has.
Specified by: withChildren in interface DataSource
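The size requirement can be sketched as follows. This is an illustrative model, not the Druid implementation: `WithChildrenSketch` is a hypothetical class, children are plain strings, and the choice of two children (left and right) and of `IllegalArgumentException` are assumptions made for the example.

```java
import java.util.List;

// Hypothetical model of a two-child datasource; not the Druid implementation.
final class WithChildrenSketch {
    private final String left;
    private final String right;

    WithChildrenSketch(String left, String right) {
        this.left = left;
        this.right = right;
    }

    // Assumption: the children of a join are its left and right sides, in that order.
    List<String> getChildren() {
        return List.of(left, right);
    }

    // Returns a copy with new children; rejects a list of a different size.
    // The original instance is left unchanged.
    WithChildrenSketch withChildren(List<String> children) {
        int expected = getChildren().size();
        if (children.size() != expected) {
            throw new IllegalArgumentException(
                "Expected " + expected + " children, got " + children.size());
        }
        return new WithChildrenSketch(children.get(0), children.get(1));
    }
}
```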
-
isCacheable
public boolean isCacheable(boolean isBroker)
Description copied from interface: DataSource
Returns true if queries on this dataSource are cacheable at both the result level and per-segment level. Currently, dataSources that do not actually reference segments (like 'inline') are not cacheable since cache keys are always based on segment identifiers.
Specified by: isCacheable in interface DataSource
-
isGlobal
public boolean isGlobal()
Description copied from interface: DataSource
Returns true if all servers have a full copy of this datasource. True for things like inline, lookup, etc, or for queries of those.
Currently this is coupled with joinability: if this returns true, then the query engine expects that there exists a JoinableFactory which might build a Joinable for this datasource directly. If a subquery 'inline' join is required to join this datasource on the right-hand side, then this value must be false for now.
In the future, instead of directly using this method, the query planner and engine should consider JoinableFactory.isDirectlyJoinable(DataSource) when determining if the right-hand side is directly joinable, which would allow decoupling this property from joins.
Specified by: isGlobal in interface DataSource
-
isConcrete
public boolean isConcrete()
Description copied from interface: DataSource
Returns true if this datasource can be the base datasource of query processing. Base datasources drive query processing. If the base datasource is TableDataSource, for example, queries are processed in parallel on data servers. If the base datasource is InlineDataSource, queries are processed on the Broker. See DataSourceAnalysis.getBaseDataSource() for further discussion.
Datasources that are *not* concrete must be pre-processed in some way before they can be processed by the main query stack. For example, QueryDataSource must be executed first and substituted with its results.
Specified by: isConcrete in interface DataSource
-
getVirtualColumnCandidates
public Set<String> getVirtualColumnCandidates()
Computes the set of column names used by left-table expressions in the join condition that may already be defined as virtual columns in the virtual column registry. This helps remove extraneous virtual columns so that only the relevant ones are used.
Returns: a set of column names which might be virtual columns on the left table in the join condition
-
createSegmentMapFunction
public Function<SegmentReference,SegmentReference> createSegmentMapFunction(Query query, AtomicLong cpuTimeAccumulator)
Description copied from interface: DataSource
Returns a function describing how a segment should be modified.
Specified by: createSegmentMapFunction in interface DataSource
Parameters:
query - the input query
cpuTimeAccumulator - the cpu time accumulator
Returns: the segment function
-
withUpdatedDataSource
public DataSource withUpdatedDataSource(DataSource newSource)
Description copied from interface: DataSource
Returns an updated datasource based on the specified new source.
Specified by: withUpdatedDataSource in interface DataSource
Parameters:
newSource - the new datasource to be used to update an existing query
Returns: the updated datasource to be used
-
getCacheKey
public byte[] getCacheKey()
Description copied from interface: DataSource
Compute a cache key prefix for a data source. This includes the data sources that participate in the RHS of a join as well as any query-specific constructs associated with the join data source, such as a base table filter. This key prefix can be used in the segment-level cache or the result-level cache. The function can return the following:
- Non-empty byte array: there is a join datasource involved and caching is possible. The result includes the join condition expression, the join type, and the cache key returned by the joinable factory for each PreJoinableClause.
- NULL: there is a join but caching is not possible. This may happen if one of the datasources participating in the JOIN is not cacheable.
Specified by: getCacheKey in interface DataSource
Returns: the cache key to be used as part of query cache key
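The two return cases can be modelled with a short sketch. This is not how Druid builds its keys: `JoinCacheKeySketch` is a hypothetical class, the byte layout and separator are invented for the example, and a `null` entry in `clauseKeys` is assumed to stand for a right-hand clause that is not cacheable.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical model of the null-or-non-empty cache key contract; not Druid code.
final class JoinCacheKeySketch {
    // clauseKeys holds one key per right-hand clause.
    // Assumption: a null entry marks a clause that cannot be cached.
    static byte[] cacheKey(String condition, String joinType, List<byte[]> clauseKeys) {
        for (byte[] clauseKey : clauseKeys) {
            if (clauseKey == null) {
                return null; // one uncacheable participant makes the whole join uncacheable
            }
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        append(out, condition.getBytes(StandardCharsets.UTF_8));
        append(out, joinType.getBytes(StandardCharsets.UTF_8));
        for (byte[] clauseKey : clauseKeys) {
            append(out, clauseKey);
        }
        return out.toByteArray();
    }

    // Writes one part followed by a separator byte so that adjacent parts
    // do not run together in this sketch.
    private static void append(ByteArrayOutputStream out, byte[] part) {
        out.write(part, 0, part.length);
        out.write(0xFF);
    }
}
```

Callers in this model treat a `null` result as "do not cache" and otherwise use the bytes as a prefix of the full query cache key.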
-
getAnalysis
public DataSourceAnalysis getAnalysis()
Description copied from interface: DataSource
Get the analysis for a data source.
Specified by: getAnalysis in interface DataSource
Returns: the DataSourceAnalysis object for the callee data source
-
-