Partial join order in a specific level.
Set of item ids participating in this partial plan.
The plan tree with the lowest cost for these items found so far.
Join conditions included in the plan.
The cost of this plan tree is the sum of costs of all intermediate joins.
Map[set of item ids, join plan for these items]
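The fields and map described above could be modeled roughly as follows. This is a minimal sketch, not the actual definitions: the `String` plan, the `Double` cost, and the `keepBest` helper are simplified, hypothetical stand-ins for the real plan tree, cost type, and update logic.

```scala
object JoinPlanSketch {
  // Hypothetical, simplified stand-in: the plan tree is a String and the cost a Double.
  final case class JoinPlan(
      itemIds: Set[Int],      // item ids participating in this partial plan
      plan: String,           // cheapest plan tree found so far for these items
      joinConds: Set[String], // join conditions included in the plan
      cost: Double)           // sum of the costs of all intermediate joins

  // Map[set of item ids, join plan for these items]
  type JoinPlanMap = Map[Set[Int], JoinPlan]

  // Keep `candidate` only if no plan exists yet for the same item set,
  // or if it is strictly cheaper than the existing one.
  def keepBest(level: JoinPlanMap, candidate: JoinPlan): JoinPlanMap =
    level.get(candidate.itemIds) match {
      case Some(existing) if existing.cost <= candidate.cost => level
      case _ => level + (candidate.itemIds -> candidate)
    }
}
```

Keying the map by the set of item ids means that two plans joining the same items in different orders compete for a single slot.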
Returns true if expr can be evaluated using only the output of plan. This method can be used to determine when it is acceptable to move expression evaluation within a query plan.
For example, consider a join between two relations R(a, b) and S(c, d).
- canEvaluate(EqualTo(a, b), R) returns true
- canEvaluate(EqualTo(a, c), R) returns false
- canEvaluate(Literal(1), R) returns true, as literals can be evaluated on any plan
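The three cases above can be reproduced with a small self-contained model. This is a sketch under assumptions: `Expr`, `Attr`, `Literal`, `EqualTo`, and `Relation` here are hypothetical toy types, and the check is assumed to reduce to "every attribute the expression references is in the plan's output".

```scala
object CanEvaluateSketch {
  // Toy expression tree; each node reports the attribute names it references.
  sealed trait Expr { def references: Set[String] }
  final case class Attr(name: String) extends Expr {
    def references: Set[String] = Set(name)
  }
  final case class Literal(value: Any) extends Expr {
    // A literal references no attributes at all.
    def references: Set[String] = Set.empty
  }
  final case class EqualTo(left: Expr, right: Expr) extends Expr {
    def references: Set[String] = left.references ++ right.references
  }

  // Toy plan: a relation name plus the attributes it outputs.
  final case class Relation(name: String, output: Set[String])

  // Assumed rule: expr is evaluable on plan iff all of its references
  // are produced by plan.
  def canEvaluate(expr: Expr, plan: Relation): Boolean =
    expr.references.subsetOf(plan.output)
}
```

Because the empty set is a subset of every set, the literal case falls out of the same rule without special handling.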
Returns true iff expr could be evaluated as a condition within join.
Reorder the joins using a dynamic programming algorithm. This implementation is based on the paper: Access Path Selection in a Relational Database Management System. http://www.inf.ed.ac.uk/teaching/courses/adbs/AccessPath.pdf
First we put all items (basic joined nodes) into level 0. Then we build all two-way joins at level 1 from plans at level 0 (single items), all 3-way joins from plans at previous levels (two-way joins and single items), all 4-way joins, and so on, until we have built all n-way joins and can pick the best plan among them.
When building m-way joins, we only keep the best plan (with the lowest cost) for the same set of m items. E.g., for 3-way joins, we keep only the best plan for items {A, B, C} among plans (A J B) J C, (A J C) J B and (B J C) J A.
We also prune cartesian product candidates when building a new plan if there exists no join condition involving references from both left and right. This pruning strategy significantly reduces the search space.
E.g., given A J B J C J D with join conditions A.k1 = B.k1 and B.k2 = C.k2 and C.k3 = D.k3, plans maintained for each level are as follows:
level 0: p({A}), p({B}), p({C}), p({D})
level 1: p({A, B}), p({B, C}), p({C, D})
level 2: p({A, B, C}), p({B, C, D})
level 3: p({A, B, C, D})
where p({A, B, C, D}) is the final output plan.
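The level-by-level search with cartesian product pruning can be sketched as below. All names (`Plan`, `Cond`, `connected`, `search`) are hypothetical, join conditions are reduced to the pair of item ids they reference, and the output row estimate `max(left rows, right rows)` is an arbitrary assumption made only so the example has a cost to compare.

```scala
object JoinReorderDpSketch {
  final case class Plan(items: Set[Int], tree: String, rows: Double, cost: Double)

  // A join condition, reduced to the two item ids it references.
  final case class Cond(left: Int, right: Int)

  // True if some condition references one item from each side.
  def connected(a: Set[Int], b: Set[Int], conds: Seq[Cond]): Boolean =
    conds.exists(c => (a(c.left) && b(c.right)) || (a(c.right) && b(c.left)))

  // Returns one map per level; level k holds the best plan per set of k + 1 items.
  def search(names: Vector[String], rows: Vector[Double],
             conds: Seq[Cond]): Vector[Map[Set[Int], Plan]] = {
    val level0: Map[Set[Int], Plan] =
      names.indices.map(i => Set(i) -> Plan(Set(i), names(i), rows(i), 0.0)).toMap
    var levels = Vector(level0)
    for (lev <- 1 until names.size) {
      var next = Map.empty[Set[Int], Plan]
      for {
        // Pair level k with level (lev - 1 - k) so the result has lev + 1 items.
        k <- 0 to (lev - 1) / 2
        l <- levels(k).values
        r <- levels(lev - 1 - k).values
        // Skip overlapping sides and prune cartesian products.
        if (l.items & r.items).isEmpty && connected(l.items, r.items, conds)
      } {
        val outRows = math.max(l.rows, r.rows) // assumed estimate, illustration only
        val cand = Plan(l.items ++ r.items, s"(${l.tree} J ${r.tree})",
          outRows, l.cost + r.cost + outRows)
        // Keep only the cheapest plan for each item set.
        if (next.get(cand.items).forall(_.cost > cand.cost)) next += (cand.items -> cand)
      }
      levels :+= next
    }
    levels
  }
}
```

Calling `search` with items A, B, C, D and the chain of conditions from the example keeps the same item sets per level as listed above; without the `connected` check, level 1 alone would hold all six pairs.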
For cost evaluation, since physical costs for operators are not available currently, we use cardinalities and sizes to compute costs.
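One way to build a cost from cardinality and size is sketched below. The `Cost` shape, the `betterThan` comparison, and the 0.7 weight are illustrative assumptions, not the actual formula: the sketch only shows that costs add up across intermediate joins and that two costs can be compared by a weighted mix of relative cardinality and relative size.

```scala
object CostSketch {
  // Hypothetical cost: estimated row count and estimated size in bytes.
  final case class Cost(card: BigInt, size: BigInt) {
    // The cost of a plan tree is the sum of the costs of its intermediate joins.
    def +(other: Cost): Cost = Cost(card + other.card, size + other.size)
  }

  // Assumed comparison: a is better than b if the weighted sum of the
  // cardinality ratio and the size ratio is below 1.
  // cardWeight is an arbitrary illustrative value.
  def betterThan(a: Cost, b: Cost, cardWeight: Double = 0.7): Boolean =
    if (b.card == BigInt(0) || b.size == BigInt(0)) false
    else {
      val relCard = BigDecimal(a.card) / BigDecimal(b.card)
      val relSize = BigDecimal(a.size) / BigDecimal(b.size)
      relCard * BigDecimal(cardWeight) + relSize * BigDecimal(1 - cardWeight) < BigDecimal(1)
    }
}
```

Using ratios rather than raw sums keeps cardinality and size on a comparable scale, since byte sizes are usually orders of magnitude larger than row counts.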