Optimized cast for a column in a row to double.
Cast a given column in a schema to epoch time in long milliseconds.
A relation having a parent-child relationship with a base relation.
::DeveloperApi:: Implemented by objects that produce relations for a specific kind of data source with a given schema. When Spark SQL is given a DDL operation with a USING clause (specifying the implemented SchemaRelationProvider) and a user-defined schema, this interface is used to pass in the parameters specified by the user.
Users may specify the fully qualified class name of a given data source. When that class is not found, Spark SQL will append the class name DefaultSource to the path, allowing for less verbose invocation. For example, 'org.apache.spark.sql.json' would resolve to the data source 'org.apache.spark.sql.json.DefaultSource'. A new instance of this class will be instantiated each time a DDL call is made.
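The resolution rule above can be sketched as follows; `resolve_provider` and the class-lookup callback are illustrative names, not part of Spark SQL's actual internals:

```python
# Sketch of the data-source resolution rule described above.
# The names (resolve_provider, class_exists) are illustrative only.

def resolve_provider(path, class_exists):
    """Return the provider class name for a USING clause path.

    Tries the path as a fully qualified class name first; if that
    class is not found, appends '.DefaultSource' to the path.
    """
    if class_exists(path):
        return path
    return path + ".DefaultSource"

# Example: only the DefaultSource class actually exists on the classpath.
known = {"org.apache.spark.sql.json.DefaultSource"}
print(resolve_provider("org.apache.spark.sql.json", known.__contains__))
# -> org.apache.spark.sql.json.DefaultSource
```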
The difference between a SchemaRelationProvider and an ExternalSchemaRelationProvider is that the latter accepts the schema and other clauses in the DDL string and passes them to the backend as-is, while the schema specified for the former is parsed by Spark SQL. A relation provider can inherit both SchemaRelationProvider and ExternalSchemaRelationProvider if it can support both a Spark SQL schema and a backend-specific schema.
Some extensions to JdbcDialect used by the Snappy implementation.
A relation having a parent-child relationship with one or more DependentRelations as children.
::DeveloperApi:: A BaseRelation that can eliminate unneeded columns and filter using selected predicates before producing an RDD containing all matching tuples as UnsafeRow objects.
The actual filter should be the conjunction of all filters, i.e. they should be "and"ed together.
The pushed down filters are currently purely an optimization as they will all be evaluated again. This means it is safe to use them with methods that produce false positives such as filtering partitions based on a bloom filter.
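The contract above (pushed-down filters may admit false positives, since every filter is evaluated again on the rows that survive) can be illustrated with a small sketch; the names and the coarse partition-pruning step are invented for illustration and are not Spark's actual code:

```python
# Illustration of why pushed-down filters may safely produce false
# positives: the source uses them only to prune work, and the exact
# predicates are re-evaluated on every surviving row.

def scan(partitions, pushed_pred, exact_preds):
    # Source-side: prune partitions with a cheap, approximate check
    # (false positives are fine, false negatives are not).
    survivors = [p for p in partitions if pushed_pred(p)]
    # Engine-side: re-evaluate the exact conjunction of all filters.
    return [row for p in survivors for row in p["rows"]
            if all(pred(row) for pred in exact_preds)]

partitions = [
    {"min": 0, "max": 9, "rows": [3, 7]},
    {"min": 10, "max": 19, "rows": [12, 18]},
]
pushed = lambda p: p["max"] > 10   # approximate: partition may match
exact = [lambda r: r > 15]         # exact filter, re-applied per row
print(scan(partitions, pushed, exact))  # -> [18]
```

Row 12 is a false positive of the pushed check, but the exact filter removes it, so correctness does not depend on the pushed filters at all.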
Since 1.3.0.
::DeveloperApi::
An extension to InsertableRelation that allows for data to be inserted (possibly having a different schema) into the target relation after comparing against the result of insertSchema.
A class for tracking the statistics of a set of numbers (count, mean and variance) in a numerically robust way. Includes support for merging two StatVarianceCounters.
Taken from Spark's StatCounter implementation, with max and min removed.
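A numerically robust count/mean/variance tracker with merge support can be sketched as below. This is a minimal Python rendering of the technique (Welford's online update plus Chan's parallel merge formula), not the actual Scala class, and its method names are illustrative:

```python
# Sketch of a count/mean/variance tracker with merge support, using
# Welford's online update and Chan's merge formula for combining two
# counters without loss of numerical stability.

class StatVarianceCounter:
    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the mean

    def add(self, value):
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    def merge(self, other):
        if other.count == 0:
            return self
        if self.count == 0:
            self.count, self.mean, self.m2 = other.count, other.mean, other.m2
            return self
        delta = other.mean - self.mean
        total = self.count + other.count
        self.mean += delta * other.count / total
        self.m2 += other.m2 + delta * delta * self.count * other.count / total
        self.count = total
        return self

    @property
    def variance(self):
        return self.m2 / self.count if self.count > 0 else float("nan")
```

Merging two counters built over disjoint halves of a data set yields the same count, mean, and variance as a single counter over the whole set, which is what makes the structure usable in a distributed aggregation.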
Tracks the child DependentRelations for all ParentRelations. This is an optimization for faster access, to avoid scanning the entire catalog.
Support for DML and other operations on external tables.