@API(value=EXPERIMENTAL) public class CascadesPlanner extends Object implements QueryPlanner
A query planner that converts a RecordQuery to a RecordQueryPlan, possibly using secondary indexes defined in a
RecordMetaData to execute the query efficiently.
Cascades is a framework for query optimization introduced by Graefe in 1995. In Cascades, all parsed queries,
query plans, and intermediate state between the two are represented in a unified tree of PlannerExpression nodes,
which includes types such as RecordQueryPlan and QueryComponent.
This highly flexible data structure reifies essentially the entire state of the planner (e.g., partially planned
elements and current optimization goals) and allows individual planning steps to be modular and stateless by
keeping all state in the PlannerExpression tree.
Like many optimization frameworks, Cascades is driven by a set of PlannerRules, each of which describes a
particular transformation and encapsulates the logic for determining its applicability and applying it. The planner
searches through its PlannerRuleSet to find a matching rule and then executes that rule, creating zero or
more additional PlannerExpressions. A rule is defined by:
- An ExpressionMatcher that defines a finite-depth tree of matchers that inspect the structure (i.e., the
  type-level information) of some sub-graph of the current planner expression.
- A PlannerRule.onMatch(PlannerRuleCall) method that is run for each successful match, producing zero
  or more new expressions.
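The two parts of a rule can be illustrated with a minimal, self-contained sketch. The types below (`Expr`, `Matcher`, `Rule`, `MergeFilters`) are hypothetical stand-ins invented for this example; they are not the Record Layer's `PlannerExpression`, `ExpressionMatcher`, or `PlannerRule`, and the real `onMatch` receives a `PlannerRuleCall` rather than an expression. The sketch only shows the shape of the idea: a finite-depth structural matcher paired with a callback that yields zero or more new expressions.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified stand-ins; these are NOT the Record Layer classes.
final class RuleSketch {
    /** A bare-bones expression: an operator name plus child expressions. */
    static final class Expr {
        final String op;
        final List<Expr> children;

        Expr(String op, Expr... children) {
            this.op = op;
            this.children = Arrays.asList(children);
        }

        @Override
        public String toString() {
            return children.isEmpty() ? op : op + children;
        }
    }

    /** A finite-depth matcher tree that inspects only operator names and arity. */
    static final class Matcher {
        final String op; // "*" is a wildcard that matches any subtree
        final List<Matcher> children;

        Matcher(String op, Matcher... children) {
            this.op = op;
            this.children = Arrays.asList(children);
        }

        boolean matches(Expr e) {
            if (op.equals("*")) {
                return true;
            }
            if (!op.equals(e.op) || children.size() != e.children.size()) {
                return false;
            }
            for (int i = 0; i < children.size(); i++) {
                if (!children.get(i).matches(e.children.get(i))) {
                    return false;
                }
            }
            return true;
        }
    }

    /** A rule pairs a matcher with a callback that yields zero or more new expressions. */
    interface Rule {
        Matcher matcher();

        List<Expr> onMatch(Expr matched);
    }

    /** Illustrative rule: collapse Filter(Filter(x)) into a single Filter(x). */
    static final class MergeFilters implements Rule {
        public Matcher matcher() {
            return new Matcher("Filter", new Matcher("Filter", new Matcher("*")));
        }

        public List<Expr> onMatch(Expr matched) {
            Expr input = matched.children.get(0).children.get(0);
            return Collections.singletonList(new Expr("Filter", input));
        }
    }

    /** Runs the rule only when its matcher accepts the expression. */
    static List<Expr> apply(Rule rule, Expr e) {
        return rule.matcher().matches(e) ? rule.onMatch(e) : Collections.<Expr>emptyList();
    }

    public static void main(String[] args) {
        Expr stacked = new Expr("Filter", new Expr("Filter", new Expr("Scan")));
        System.out.println(apply(new MergeFilters(), stacked));
    }
}
```

Because the matcher looks only at structure, the applicability check stays cheap and the transformation logic is confined to `onMatch`.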
Since rules can be applied speculatively and need not be "reductive" in any reasonable sense, it is common for cyclic
rule application to occur. Furthermore, the number of possible expression trees being considered at any time can be
enormous, since every rule might apply to many of the existing trees under consideration by the planner. To mitigate
this, Cascades uses aggressive memoization, which is implemented by the memo data structure. The memo
provides an efficient interface for storing a forest of expressions, where there might be substantial overlap between
different trees in the forest. The memo is composed of expression groups (or just groups), which are
equivalence classes of expressions. In this implementation, the memo structure is an implicit data structure
represented by GroupExpressionRefs, each of which represents a group expression in Cascades and contains
a set of PlannerExpressions. In turn, each PlannerExpression has some number of children, each of which is a
GroupExpressionRef; the planner can traverse them via the PlannerExpression.getPlannerExpressionChildren() method.
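The following sketch shows why grouping expressions into equivalence classes keeps the forest compact. `Group` and `Expr` here are hypothetical simplifications, not `GroupExpressionRef` or `PlannerExpression`; the point is only that an expression's children are groups rather than concrete expressions, so shared subtrees are stored once and duplicate members are rejected. The tree-counting helper assumes the group graph is acyclic.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

// Hypothetical, simplified memo; these are NOT the Record Layer classes.
final class MemoSketch {
    /** A group: an equivalence class of interchangeable expressions. */
    static final class Group {
        final Set<Expr> members = new LinkedHashSet<>();

        /** Returns true only if the expression was not already a member. */
        boolean insert(Expr e) {
            return members.add(e);
        }
    }

    /** An expression whose children are groups, not concrete expressions. */
    static final class Expr {
        final String op;
        final List<Group> children;

        Expr(String op, Group... children) {
            this.op = op;
            this.children = Arrays.asList(children);
        }

        // Two expressions are equal when they apply the same operator to the
        // same child groups (groups are compared by identity).
        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Expr)) {
                return false;
            }
            Expr other = (Expr) o;
            return op.equals(other.op) && children.equals(other.children);
        }

        @Override
        public int hashCode() {
            return Objects.hash(op, children);
        }
    }

    /** Counts the concrete trees a group stands for (assumes no cycles). */
    static long countTrees(Group g) {
        long total = 0;
        for (Expr e : g.members) {
            long product = 1;
            for (Group child : e.children) {
                product *= countTrees(child);
            }
            total += product;
        }
        return total;
    }

    public static void main(String[] args) {
        Group leaf = new Group();
        leaf.insert(new Expr("Scan"));
        leaf.insert(new Expr("IndexScan"));

        Group root = new Group();
        root.insert(new Expr("Join", leaf, leaf));

        // Three stored expressions stand for several distinct trees.
        System.out.println(countTrees(root));
    }
}
```

The number of represented trees grows multiplicatively with the alternatives in each child group, while storage grows only additively.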
A Cascades planner operates by repeatedly executing a Task popped from the task execution stack. Each task
performs some actions and may schedule other tasks by pushing them onto the stack. The tasks in this particular
planner are the implementors of the Task interface.
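The execution loop described above can be sketched as follows. The `Task` interface and the two task classes are hypothetical and exist only for this illustration; the planner's actual `Task` implementations and their signatures are not shown in this documentation. A log is kept so the last-in, first-out order is visible.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical task loop; these are NOT the planner's actual Task classes.
final class TaskStackSketch {
    /** A unit of work that may schedule follow-up work by pushing onto the stack. */
    interface Task {
        void execute(Deque<Task> stack, List<String> log);
    }

    /** Schedules one rule-application task per expression in the group. */
    static final class OptimizeGroup implements Task {
        final String group;
        final List<String> expressions;

        OptimizeGroup(String group, String... expressions) {
            this.group = group;
            this.expressions = Arrays.asList(expressions);
        }

        public void execute(Deque<Task> stack, List<String> log) {
            log.add("optimize " + group);
            for (String expression : expressions) {
                stack.push(new ApplyRules(expression));
            }
        }
    }

    /** A leaf task that schedules nothing further. */
    static final class ApplyRules implements Task {
        final String expression;

        ApplyRules(String expression) {
            this.expression = expression;
        }

        public void execute(Deque<Task> stack, List<String> log) {
            log.add("apply rules to " + expression);
        }
    }

    /** Pops and executes tasks until the stack is empty; returns the execution log. */
    static List<String> run(Task initial) {
        Deque<Task> stack = new ArrayDeque<>();
        List<String> log = new ArrayList<>();
        stack.push(initial);
        while (!stack.isEmpty()) {
            stack.pop().execute(stack, log);
        }
        return log;
    }

    public static void main(String[] args) {
        for (String line : run(new OptimizeGroup("G", "A", "B"))) {
            System.out.println(line);
        }
    }
}
```

Since the most recently pushed task runs first, work scheduled by a task is completed before the planner returns to older entries on the stack.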
Since a Cascades-style planner produces many possible query plans, it needs some way to decide which ones to select.
This is generally done with a cost model that scores plans according to some cost metric. For now, we use the
CascadesCostModel, which is a heuristic model implemented as a Comparator.
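A comparator-based cost model can be sketched as below. The `Plan` class and the two criteria (fewer full scans, then fewer operators) are assumptions made up for this example; this documentation does not state which heuristics `CascadesCostModel` actually applies. The sketch shows only how a `Comparator` lets the planner pick a preferred plan without computing an absolute cost.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical heuristic; the real CascadesCostModel criteria are not documented here.
final class CostModelSketch {
    /** A candidate plan summarized by two made-up statistics. */
    static final class Plan {
        final String name;
        final int fullScans;
        final int operators;

        Plan(String name, int fullScans, int operators) {
            this.name = name;
            this.fullScans = fullScans;
            this.operators = operators;
        }
    }

    /** Prefers fewer full scans; ties are broken by fewer operators. */
    static final Comparator<Plan> HEURISTIC =
            Comparator.comparingInt((Plan p) -> p.fullScans)
                    .thenComparingInt(p -> p.operators);

    /** Returns the plan that the heuristic ranks lowest (cheapest). */
    static Plan best(List<Plan> candidates) {
        return Collections.min(candidates, HEURISTIC);
    }

    public static void main(String[] args) {
        List<Plan> candidates = Arrays.asList(
                new Plan("scan+filter", 1, 2),
                new Plan("index+filter", 0, 2),
                new Plan("index", 0, 1));
        System.out.println(best(candidates).name);
    }
}
```

A relative ordering of this kind is enough to choose among the members of a group, which is all the planner needs when selecting a final plan.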
See Also: GroupExpressionRef, PlannerExpression, PlannerRule, CascadesCostModel
Nested classes/interfaces inherited from interface QueryPlanner: QueryPlanner.IndexScanPreference
Constructor and Description |
---|
CascadesPlanner(RecordMetaData metaData, RecordStoreState recordStoreState) |
CascadesPlanner(RecordMetaData metaData, RecordStoreState recordStoreState, PlannerRuleSet ruleSet) |
Modifier and Type | Method and Description |
---|---|
RecordQueryPlan | plan(RecordQuery query) Create a plan to get the results of the provided query. |
void | setIndexScanPreference(QueryPlanner.IndexScanPreference indexScanPreference) Set whether RecordQueryIndexPlan is preferred over RecordQueryScanPlan even when it does not satisfy any additional conditions. |
public CascadesPlanner(@Nonnull RecordMetaData metaData, @Nonnull RecordStoreState recordStoreState)
public CascadesPlanner(@Nonnull RecordMetaData metaData, @Nonnull RecordStoreState recordStoreState, @Nonnull PlannerRuleSet ruleSet)
@Nonnull public RecordQueryPlan plan(@Nonnull RecordQuery query)
Description copied from interface: QueryPlanner
Create a plan to get the results of the provided query.
Specified by: plan in interface QueryPlanner
Parameters: query - a query for records on this planner's metadata
public void setIndexScanPreference(@Nonnull QueryPlanner.IndexScanPreference indexScanPreference)
Description copied from interface: QueryPlanner
Set whether RecordQueryIndexPlan is preferred over RecordQueryScanPlan even when it does not satisfy any
additional conditions.
Scanning without an index is more efficient, but will have to skip over unrelated record types.
For that reason, it is safer to use an index, except when there is only one record type.
If the meta-data has more than one record type but the record store does not, this can be overridden.
Specified by: setIndexScanPreference in interface QueryPlanner
Parameters: indexScanPreference - whether to prefer index scan over record scan