Package io.getquill.context.spark

package spark


Type Members

  1. sealed trait Binding extends AnyRef
  2. case class DatasetBinding[T](ds: Dataset[T]) extends Binding with Product with Serializable
  3. trait Decoders extends AnyRef
  4. trait Encoders extends AnyRef
  5. class SparkDialect extends SparkIdiom
  6. trait SparkIdiom extends SqlIdiom with CannotReturn
  7. case class ValueBinding(str: String) extends Binding with Product with Serializable

Value Members

  1. object AliasNestedQueryColumns
  2. object SparkDialect extends SparkDialect
  3. object SparkDialectRecursor

    This helper object is needed to instantiate SparkDialect instances that have multipleSelect enabled. It is a simple alternative to query rewriting that allows Quill table aliases to be selected in a way that Spark can understand.

    Quill will represent table variable returns as identifiers. For example:
    Query[Foo].join(Query[Bar]).on { case (f, b) => f.id == b.fk }
    will become:
    select f.*, b.* from Foo f join Bar b on f.id == b.fk
    For this to work properly, all we have to do is change the query to:
    select struct(f.*), struct(b.*) from Foo f join Bar b on f.id == b.fk

    Since the expansion of the f and b identifiers in the query happens inside a tokenizer, all that is needed for a tokenizer to be able to add the struct part is to introduce a multipleSelect variable that guides the expansion. The SparkDialectRecursor has been introduced specifically for this reason, i.e. so that SparkDialect contexts can be recursively declared with a new multipleSelect value.

    Multiple selection enabling typically needs to happen in two instances:

    • Multiple table aliases are selected as multiple SelectValues. This typically happens when the output of a query is a single tuple with multiple case classes in it (as in the example above), or a true nested entity (these are supported in Spark). The runSuperAstParser method covers this case.
    • Multiple table aliases are inside a single case-class SelectValue. This typically happens when an ad-hoc case class is used. The runCaseClassWithMultipleSelect method covers this case.
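    Conceptually, the multipleSelect flag only changes how each table alias is expanded in the emitted select clause. The following is a minimal, self-contained sketch of that expansion (hypothetical code for illustration only; Quill's actual tokenizer works on its AST, not on strings):

    ```scala
    // Sketch of the multipleSelect expansion described above.
    // When multipleSelect is enabled, each table alias `x` is emitted
    // as `struct(x.*)` instead of `x.*`, so Spark can treat each
    // aliased table as a single nested value.
    object MultipleSelectSketch {
      def selectClause(aliases: List[String], multipleSelect: Boolean): String =
        aliases
          .map(a => if (multipleSelect) s"struct($a.*)" else s"$a.*")
          .mkString("select ", ", ", "")

      def main(args: Array[String]): Unit = {
        println(selectClause(List("f", "b"), multipleSelect = false))
        // select f.*, b.*
        println(selectClause(List("f", "b"), multipleSelect = true))
        // select struct(f.*), struct(b.*)
      }
    }
    ```

    In the real dialect this decision is made per select value while tokenizing, which is why SparkDialectRecursor re-instantiates the dialect with the appropriate multipleSelect value rather than rewriting the query afterwards.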
  4. package norm
