the StarSchemaInfo used to build this StarSchema Graph.
the node that represents the Fact Table
maps a tableName to the StarTable node in the StarSchema Graph.
provides a mapping from a columnName to its table.
The seq of expressions representing one side of a join must all be AttributeReferences, and they must all come from the same table. If this condition is met, the table's name is returned.
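As an illustration of the check described above, here is a minimal sketch. It assumes a simplified stand-in for Catalyst's expression types and a plain `Map[String, String]` from column name to table name; the object name, `tableOfJoinSide`, and the `Literal` stand-in are hypothetical and not taken from the actual API.

```scala
object JoinSideSketch {
  // Hypothetical stand-ins for Catalyst's Expression hierarchy (assumption).
  sealed trait Expression
  final case class AttributeReference(name: String) extends Expression
  final case class Literal(value: Any) extends Expression

  /** Returns the table name if every expression is an AttributeReference
    * and all of them resolve to the same table; otherwise None.
    */
  def tableOfJoinSide(
      exprs: Seq[Expression],
      attrToTable: Map[String, String]): Option[String] = {
    // Resolve each expression to its table. Non-attribute expressions and
    // unknown column names resolve to None.
    val tables: Seq[Option[String]] = exprs.map {
      case AttributeReference(n) => attrToTable.get(n)
      case _                     => None
    }
    val allResolved = tables.nonEmpty && tables.forall(_.isDefined)
    if (allResolved && tables.flatten.distinct.size == 1) tables.head
    else None
  }
}
```

In this sketch an empty key sequence is treated as not matching any table; whether the real method does the same is not stated in the documentation.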
Checks whether the join predicate, represented by the left and right join keys, matches a join in the StarSchema. So a join like
lineitem li join part p on li.l_partkey = p.p_partkey
is represented as
Seq(AttributeReference("l_partkey")), Seq(AttributeReference("p_partkey"))
The following constraints must be met for the joining condition to be a join from this StarSchema:
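The constraints themselves are not reproduced on this page, so the sketch below is only a guess at how such a match could work. It assumes two conditions: each side of the join resolves to a single table, and the resulting (table, columns) pairs correspond to a declared join in either direction. `StarJoin`, `sideTable`, and `isStarJoin` are hypothetical names, and `AttributeReference` is a simplified stand-in for the Catalyst class.

```scala
object JoinMatchSketch {
  // Simplified stand-in for Catalyst's AttributeReference (assumption).
  final case class AttributeReference(name: String)

  // Hypothetical description of one join declared in the star schema.
  final case class StarJoin(
      leftTable: String, leftCols: Seq[String],
      rightTable: String, rightCols: Seq[String])

  // The single table that all keys belong to, if there is exactly one.
  def sideTable(
      keys: Seq[AttributeReference],
      attrToTable: Map[String, String]): Option[String] = {
    val allKnown = keys.nonEmpty && keys.forall(k => attrToTable.contains(k.name))
    val tables = keys.flatMap(k => attrToTable.get(k.name)).distinct
    if (allKnown && tables.size == 1) tables.headOption else None
  }

  // True if the key sequences match a declared join, in either direction.
  def isStarJoin(
      left: Seq[AttributeReference],
      right: Seq[AttributeReference],
      attrToTable: Map[String, String],
      joins: Seq[StarJoin]): Boolean = {
    val l = left.map(_.name)
    val r = right.map(_.name)
    (sideTable(left, attrToTable), sideTable(right, attrToTable)) match {
      case (Some(lt), Some(rt)) =>
        joins.exists { j =>
          (j.leftTable == lt && j.rightTable == rt && j.leftCols == l && j.rightCols == r) ||
          (j.leftTable == rt && j.rightTable == lt && j.leftCols == r && j.rightCols == l)
        }
      case _ => false
    }
  }
}
```

With a single declared join between `lineitem.l_partkey` and `part.p_partkey`, the key pair from the example above would match regardless of which side is passed as left.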
Represents a StarSchema. The Star Schemas we support have the following constraints:
The first 2 points are an issue only in the most involved star schema models; for example, we show below how TPCH can be modeled. The 3rd restriction is an implementation issue: when performing QueryPlan rewrites we don't have access to the table an Attribute belongs to. For now we get around this by forcing column names to be unique across the Star Schema.
Tpch Model:
Because of our restrictions we have had to model the Nation table as separate CustNation and SuppNation tables. A similar separation has to be done for CustRegion and SuppRegion. Having to set up separate entities for Supplier and Customer Nation is not atypical when writing SQL directly; these would be views on the same Nation Dimension table. Currently we are more restrictive than this: we require the 2 views to be tables in the Metastore (this is because during Plan rewrite we lose the Table association in AttributeReferences). But note that this doesn't require the data to be copied; both tables can point to the same underlying data in the storage layer.
We also have to rename the columns in the 2 Nation (and Region) tables, so that we can infer the Attribute-to-Table associations (within the Star Schema) in a Query Plan.
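To illustrate why the renaming matters, the following sketch builds a columnName-to-table map and rejects schemas in which a column name occurs in more than one table. `buildAttrMap` and the column names used in the usage note (`cn_name`, `sn_name`, and so on) are hypothetical and chosen only for illustration; the actual construction of the mapping is not shown in this documentation.

```scala
object AttrMapSketch {
  /** Builds a columnName -> tableName map.
    * Returns Left with the offending column names if any column name
    * appears in more than one table (or twice within one table).
    */
  def buildAttrMap(
      tables: Map[String, Seq[String]]): Either[Set[String], Map[String, String]] = {
    // Flatten to (column, table) pairs.
    val pairs: Seq[(String, String)] =
      tables.toSeq.flatMap { case (table, cols) => cols.map(c => c -> table) }
    // A column name listed more than once is ambiguous.
    val duplicates: Set[String] =
      pairs.groupBy(_._1).collect { case (col, owners) if owners.size > 1 => col }.toSet
    if (duplicates.isEmpty) Right(pairs.toMap) else Left(duplicates)
  }
}
```

For example, passing `custnation` with columns `cn_nationkey, cn_name` and `suppnation` with `sn_nationkey, sn_name` yields a `Right` with four entries, whereas giving both tables the unrenamed `n_nationkey, n_name` columns yields a `Left` naming both columns.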