Class CascadesPlanner
java.lang.Object
    com.apple.foundationdb.record.query.plan.temp.CascadesPlanner

All Implemented Interfaces:
    QueryPlanner
@API(EXPERIMENTAL) public class CascadesPlanner extends Object implements QueryPlanner
A Cascades-style query planner that converts a RecordQuery to a RecordQueryPlan, possibly using secondary indexes defined in a RecordMetaData to execute the query efficiently.

Cascades is a framework for query optimization introduced by Graefe in 1995. In Cascades, all parsed queries, query plans, and intermediate state between the two are represented in a unified tree of
RelationalExpression, which includes types such as RecordQueryPlan and QueryComponent. This highly flexible data structure reifies essentially the entire state of the planner (i.e., partially planned elements, current optimization, goals, etc.) and allows individual planning steps to be modular and stateless by keeping all state in the RelationalExpression tree.

Like many optimization frameworks, Cascades is driven by sets of
PlannerRules that can be defined for RelationalExpressions, PartialMatches and MatchPartitions, each of which describes a particular transformation and encapsulates the logic for determining its applicability and applying it. The planner searches through its PlannerRuleSet to find a matching rule and then executes that rule, creating zero or more additional PlannerExpressions and/or zero or more additional PartialMatches. A rule is defined by:
- An ExpressionMatcher that defines a finite-depth tree of matchers that inspect the structure (i.e., the type-level information) of some subgraph of the current planner expression, the current partial match, or the current match partition.
- A PlannerRule.onMatch(PlannerRuleCall) method that is run for each successful match, producing zero or more new expressions and/or zero or more new partial matches.
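The two parts of a rule listed above can be illustrated with a small, self-contained sketch. Everything in it (Expr, ToyRule, MERGE_FILTERS, applyRules) is a hypothetical stand-in invented for illustration and is not part of the Record Layer API; the real PlannerRule, ExpressionMatcher and PlannerRuleCall types are considerably richer. The sketch only shows the shape: a structural matcher of fixed depth, and a callback that yields zero or more new expressions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ToyRuleSketch {
    /** Hypothetical stand-in for an expression tree node (not RelationalExpression). */
    static final class Expr {
        final String kind;
        final List<Expr> children;

        Expr(String kind, Expr... children) {
            this.kind = kind;
            this.children = Arrays.asList(children);
        }

        @Override
        public String toString() {
            return children.isEmpty() ? kind : kind + children;
        }
    }

    /** Hypothetical rule shape: a structural matcher plus a callback that yields new expressions. */
    interface ToyRule {
        boolean matches(Expr root);

        List<Expr> onMatch(Expr root);
    }

    /** Matches the two-level shape Filter(Filter(x)) and yields the alternative Filter(x). */
    static final ToyRule MERGE_FILTERS = new ToyRule() {
        @Override
        public boolean matches(Expr root) {
            // Only the structure (node kinds and arity) is inspected, down to a fixed depth.
            return root.kind.equals("Filter")
                    && root.children.size() == 1
                    && root.children.get(0).kind.equals("Filter")
                    && root.children.get(0).children.size() == 1;
        }

        @Override
        public List<Expr> onMatch(Expr root) {
            Expr inner = root.children.get(0).children.get(0);
            return Collections.singletonList(new Expr("Filter", inner));
        }
    };

    /** Runs every rule whose matcher accepts the expression and collects what the rules yield. */
    static List<Expr> applyRules(List<ToyRule> rules, Expr expr) {
        List<Expr> yielded = new ArrayList<>();
        for (ToyRule rule : rules) {
            if (rule.matches(expr)) {
                yielded.addAll(rule.onMatch(expr));
            }
        }
        return yielded;
    }

    public static void main(String[] args) {
        Expr nested = new Expr("Filter", new Expr("Filter", new Expr("Scan")));
        System.out.println(applyRules(Collections.singletonList(MERGE_FILTERS), nested));
    }
}
```

In this toy, a rule that does not match simply contributes nothing, which mirrors the "zero or more" wording above. The yielded expression is an additional alternative and does not replace the original.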
Since rules can be applied speculatively and need not be "reductive" in any reasonable sense, it is common for cyclic rule application to occur. Furthermore, the number of possible expression trees being considered at any time can be enormous, since every rule might apply to many of the existing trees under consideration by the planner. To mitigate this, Cascades uses aggressive memoization, which is represented by the memo data structure. The memo provides an efficient interface for storing a forest of expressions, where there might be substantial overlap between different trees in the forest. The memo is composed of expression groups (or just groups), which are equivalence classes of expressions. In this implementation, the memo structure is an implicit data structure represented by GroupExpressionRefs, each of which represents a group expression in Cascades and contains a set of RelationalExpressions. In turn, RelationalExpressions have some number of children, each of which is a GroupExpressionRef and which can be traversed by the planner via the RelationalExpression.getQuantifiers() method.

A Cascades planner operates by repeatedly executing a
CascadesPlanner.Task from the task execution stack (in this case), which performs some actions and may schedule other tasks by pushing them onto the stack. The tasks in this particular planner are the implementors of the CascadesPlanner.Task interface.

Since a Cascades-style planner produces many possible query plans, it needs some way to decide which ones to select. This is generally done with a cost model that scores plans according to some cost metric. For now, we use the CascadesCostModel, which is a heuristic model implemented as a Comparator.

Simplified enqueue/execute overview:
CascadesPlanner.OptimizeGroup
    if (not explored)
        enqueues
            this (again)
            CascadesPlanner.ExploreExpression for each group member
        sets explored to true
    else
        prune to find best plan; done

CascadesPlanner.ExploreGroup
    enqueues
        CascadesPlanner.ExploreExpression for each group member
    sets explored to true

CascadesPlanner.ExploreExpression
    enqueues
        all transformations (CascadesPlanner.TransformMatchPartition) for match partitions of current (group, expression)
        all transformations (CascadesPlanner.TransformExpression) for current (group, expression)
        CascadesPlanner.ExploreGroup for all ranged over groups

after execution of any TransformXXX
    enqueues
        CascadesPlanner.AdjustMatch for each yielded PartialMatch
        CascadesPlanner.OptimizeInputs followed by CascadesPlanner.ExploreExpression for each yielded RecordQueryPlan
        CascadesPlanner.ExploreExpression for each yielded RelationalExpression that is not a RecordQueryPlan

CascadesPlanner.AdjustMatch
    enqueues
        all transformations (CascadesPlanner.TransformPartialMatch) for current (group, expression, partial match)

CascadesPlanner.OptimizeInputs
    enqueues
        CascadesPlanner.OptimizeGroup for all ranged over groups
- Transforms on expressions (CascadesPlanner.TransformExpression): These are the classical transforms creating new variations in the expression memoization structure. The root for the corresponding rules is always of type RelationalExpression.
- Transforms on partial matches (CascadesPlanner.TransformPartialMatch): These transforms are executed when a partial match is found and typically only yield other new partial matches for the current (group, expression) pair. The root for the corresponding rules is always of type PartialMatch.
- Transforms on match partitions (CascadesPlanner.TransformMatchPartition): These transforms are executed only after all transforms (both CascadesPlanner.TransformExpressions and CascadesPlanner.TransformPartialMatches) have been executed for a current (group, expression). Note that this kind of transformation task can be executed repeatedly for a given group, but it is guaranteed to be executed only once for a (group, expression) pair. The root for the corresponding rules is always of type MatchPartition. These are the rules that react to all synthesized matches for an expression at once.
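The interplay of the task stack, the explored flag on a group, and a Comparator-based cost model described above can be sketched in a few lines. This is a heavily simplified toy under stated assumptions and not the planner's actual implementation: Group, Task, COST, and the string-valued "expressions" are hypothetical, only two of the task kinds are modeled, and the "transformation" is hard-coded.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.Deque;
import java.util.List;

public class ToyTaskLoopSketch {
    /** Hypothetical stand-in for a memo group: equivalent alternatives plus an explored flag. */
    static final class Group {
        final List<String> members = new ArrayList<>();
        boolean explored = false;
        String best = null;
    }

    /** Hypothetical task shape; the real interface is CascadesPlanner.Task. */
    interface Task {
        void execute(Deque<Task> stack, List<String> log);
    }

    /** Hypothetical heuristic cost model: shorter plan text is considered cheaper. */
    static final Comparator<String> COST = Comparator.comparingInt(String::length);

    static Task optimizeGroup(Group group) {
        return (stack, log) -> {
            log.add("OptimizeGroup");
            if (!group.explored) {
                // Push this task first so that it runs again after exploration (LIFO order).
                stack.push(optimizeGroup(group));
                for (String member : new ArrayList<>(group.members)) {
                    stack.push(exploreExpression(group, member));
                }
                group.explored = true;
            } else {
                // Prune: keep only the cheapest member according to the cost model.
                group.best = Collections.min(group.members, COST);
            }
        };
    }

    static Task exploreExpression(Group group, String expression) {
        return (stack, log) -> {
            log.add("ExploreExpression(" + expression + ")");
            // Stand-in for a transformation: one expression yields an equivalent alternative,
            // which joins the group and is scheduled for exploration itself.
            if (expression.equals("FullScanThenFilter")) {
                group.members.add("IndexScan");
                stack.push(exploreExpression(group, "IndexScan"));
            }
        };
    }

    /** Pops and executes tasks until the stack is empty, then returns the group's best member. */
    static String plan(Group root, List<String> log) {
        Deque<Task> stack = new ArrayDeque<>();
        stack.push(optimizeGroup(root));
        while (!stack.isEmpty()) {
            stack.pop().execute(stack, log);
        }
        return root.best;
    }

    public static void main(String[] args) {
        Group root = new Group();
        root.members.add("FullScanThenFilter");
        List<String> log = new ArrayList<>();
        System.out.println(plan(root, log) + " via " + log);
    }
}
```

The first OptimizeGroup execution only schedules work; the second one, which runs once the exploration tasks above it on the stack have finished, performs the pruning. Because tasks run in LIFO order, pushing the task again before pushing the exploration tasks is what makes it run last.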
-
-
Nested Class Summary
static interface CascadesPlanner.Task
    Represents actual tasks in the task stack of the planner.

Nested classes/interfaces inherited from interface com.apple.foundationdb.record.query.plan.QueryPlanner:
    QueryPlanner.IndexScanPreference
-
-
Constructor Summary
CascadesPlanner(RecordMetaData metaData, RecordStoreState recordStoreState)
CascadesPlanner(RecordMetaData metaData, RecordStoreState recordStoreState, PlannerRuleSet ruleSet)
-
Method Summary
RecordQueryPlannerConfiguration getConfiguration()
RecordMetaData getRecordMetaData()
    Get the RecordMetaData for this planner.
RecordStoreState getRecordStoreState()
    Get the RecordStoreState for this planner.
RecordQueryPlan plan(RecordQuery query)
    Create a plan to get the results of the provided query.
QueryPlanResult planQuery(RecordQuery query)
    Create a plan to get the results of the provided query.
void setConfiguration(RecordQueryPlannerConfiguration configuration)
void setIndexScanPreference(QueryPlanner.IndexScanPreference indexScanPreference)
    Set whether RecordQueryIndexPlan is preferred over RecordQueryScanPlan even when it does not satisfy any additional conditions.
void setMaxTaskQueueSize(int maxTaskQueueSize)
    Set the size limit of the Cascades planner task queue.
void setMaxTotalTaskCount(int maxTotalTaskCount)
    Set a limit on the number of tasks that can be executed as part of the Cascades planner planning.
-
-
-
Constructor Detail
-
CascadesPlanner
public CascadesPlanner(@Nonnull RecordMetaData metaData, @Nonnull RecordStoreState recordStoreState)
-
CascadesPlanner
public CascadesPlanner(@Nonnull RecordMetaData metaData, @Nonnull RecordStoreState recordStoreState, @Nonnull PlannerRuleSet ruleSet)
-
-
Method Detail
-
plan
@Nonnull public RecordQueryPlan plan(@Nonnull RecordQuery query)
Description copied from interface: QueryPlanner
Create a plan to get the results of the provided query.
Specified by:
    plan in interface QueryPlanner
Parameters:
    query - a query for records on this planner's metadata
Returns:
    a plan that will return the results of the provided query when executed
-
planQuery
@Nonnull public QueryPlanResult planQuery(@Nonnull RecordQuery query)
Description copied from interface: QueryPlanner
Create a plan to get the results of the provided query. This method returns a QueryPlanResult that contains the same plan as returned by QueryPlanner.plan(RecordQuery) with additional information provided in the QueryPlanInfo.
Specified by:
    planQuery in interface QueryPlanner
Parameters:
    query - a query for records on this planner's metadata
Returns:
    a QueryPlanResult that contains the plan for the query with additional information
-
getRecordMetaData
@Nonnull public RecordMetaData getRecordMetaData()
Description copied from interface: QueryPlanner
Get the RecordMetaData for this planner.
Specified by:
    getRecordMetaData in interface QueryPlanner
Returns:
    the meta-data
-
getRecordStoreState
@Nonnull public RecordStoreState getRecordStoreState()
Description copied from interface: QueryPlanner
Get the RecordStoreState for this planner.
Specified by:
    getRecordStoreState in interface QueryPlanner
Returns:
    the record store state
-
setIndexScanPreference
public void setIndexScanPreference(@Nonnull QueryPlanner.IndexScanPreference indexScanPreference)
Description copied from interface: QueryPlanner
Set whether RecordQueryIndexPlan is preferred over RecordQueryScanPlan even when it does not satisfy any additional conditions. Scanning without an index is more efficient, but will have to skip over unrelated record types. For that reason, it is safer to use an index, except when there is only one record type. If the meta-data has more than one record type but the record store does not, this can be overridden.
Specified by:
    setIndexScanPreference in interface QueryPlanner
Parameters:
    indexScanPreference - whether to prefer index scan over record scan
-
setMaxTaskQueueSize
public void setMaxTaskQueueSize(int maxTaskQueueSize)
Set the size limit of the Cascades planner task queue. If the planner tries to add a task to the queue beyond the maximum size, planning will fail. The default value is 0, which means "unbounded".
Parameters:
    maxTaskQueueSize - the maximum size of the queue
-
setMaxTotalTaskCount
public void setMaxTotalTaskCount(int maxTotalTaskCount)
Set a limit on the number of tasks that can be executed as part of the Cascades planner planning. If the planner tries to execute a task after the maximum number has been exceeded, planning will fail. The default value is 0, which means "unbounded".
Parameters:
    maxTotalTaskCount - the maximum number of tasks
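The two limits described for setMaxTaskQueueSize and setMaxTotalTaskCount can be pictured with a small stand-alone sketch. It is not the planner's code: the class ToyTaskLimitsSketch, its push and runAll methods, the use of IllegalStateException, and the exact boundary at which each check fires are all assumptions made for illustration. Only the "0 means unbounded" convention and the "fail when the limit is exceeded" behavior come from the descriptions above.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class ToyTaskLimitsSketch {
    private final Deque<Runnable> taskStack = new ArrayDeque<>();
    private int maxTaskQueueSize = 0;   // 0 means unbounded
    private int maxTotalTaskCount = 0;  // 0 means unbounded
    private int executedTasks = 0;

    void setMaxTaskQueueSize(int maxTaskQueueSize) {
        this.maxTaskQueueSize = maxTaskQueueSize;
    }

    void setMaxTotalTaskCount(int maxTotalTaskCount) {
        this.maxTotalTaskCount = maxTotalTaskCount;
    }

    /** Adds a task, failing if that would grow the stack beyond the configured size. */
    void push(Runnable task) {
        if (maxTaskQueueSize > 0 && taskStack.size() >= maxTaskQueueSize) {
            // The exception type is an assumption of this sketch.
            throw new IllegalStateException("task queue size limit reached");
        }
        taskStack.push(task);
    }

    /** Executes tasks until the stack is empty, failing once the total task budget is used up. */
    int runAll() {
        while (!taskStack.isEmpty()) {
            if (maxTotalTaskCount > 0 && executedTasks >= maxTotalTaskCount) {
                throw new IllegalStateException("total task count limit reached");
            }
            executedTasks++;
            taskStack.pop().run();
        }
        return executedTasks;
    }

    public static void main(String[] args) {
        ToyTaskLimitsSketch unbounded = new ToyTaskLimitsSketch();
        for (int i = 0; i < 1000; i++) {
            unbounded.push(() -> { });
        }
        System.out.println("executed " + unbounded.runAll() + " tasks without limits");
    }
}
```

The queue-size limit bounds how much pending work exists at any one moment, while the total-count limit bounds the overall amount of planning work, so a query can hit one without hitting the other.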
-
getConfiguration
@Nonnull public RecordQueryPlannerConfiguration getConfiguration()
-
setConfiguration
public void setConfiguration(@Nonnull RecordQueryPlannerConfiguration configuration)
-
-