com.youtube.vitess.proto.vtgate

SplitQueryRequest

Related Docs: object SplitQueryRequest | package vtgate

final case class SplitQueryRequest(callerId: Option[CallerID] = None, keyspace: String = "", query: Option[BoundQuery] = None, splitColumn: Seq[String] = _root_.scala.collection.Seq.empty, splitCount: Long = 0L, numRowsPerQueryPart: Long = 0L, algorithm: Algorithm = ..., useSplitQueryV2: Boolean = false) extends GeneratedMessage with Message[SplitQueryRequest] with Updatable[SplitQueryRequest] with Product with Serializable

SplitQueryRequest is the payload to SplitQuery.

SplitQuery takes a "SELECT" query and generates a list of queries called "query-parts". Each query-part consists of the original query with an added WHERE clause that restricts the query-part to operate only on rows whose values in the columns listed in the "split_column" field of the request (see below) are in a particular range.

It is guaranteed that the set of rows obtained by executing each query-part on a database snapshot and merging the results (without deduplication) equals the set of rows obtained by executing the original query on the same snapshot, excluding rows that contain NULL in any of the split columns.
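
The guarantee above can be illustrated with a small in-memory model. This is only a sketch: `Row`, `Part`, `merge`, and `expected` are hypothetical names, and the half-open range is an assumed stand-in for the WHERE clause that SplitQuery adds to each query-part.

```scala
object QueryPartsGuarantee {
  // A row with a nullable split column; None models SQL NULL.
  final case class Row(id: Option[Long], payload: String)

  // Hypothetical stand-in for a query-part: a half-open range [lo, hi) on the split column.
  final case class Part(lo: Long, hi: Long) {
    // Rows whose split column is NULL never match any part.
    def matches(r: Row): Boolean = r.id.exists(v => v >= lo && v < hi)
  }

  // Run every part against the same snapshot and concatenate without deduplication.
  def merge(snapshot: Seq[Row], parts: Seq[Part]): Seq[Row] =
    parts.flatMap(p => snapshot.filter(p.matches))

  // What the original query returns once rows with a NULL split column are excluded.
  def expected(snapshot: Seq[Row]): Seq[Row] = snapshot.filter(_.id.isDefined)
}
```

As long as the parts' ranges are disjoint and together cover every non-NULL value, `merge` and `expected` contain the same rows.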

This is typically called by the MapReduce master when reading from Vitess. There it's desirable that the sets of rows returned by the query-parts have roughly the same size.

callerId

caller_id identifies the caller. This is the effective caller ID, set by the application to further identify the caller.

keyspace

keyspace to target the query to.

query

The query and bind variables to produce splits for. The given query must be a simple query of the form SELECT <cols> FROM <table> WHERE <filter>. It must not contain subqueries or any of the keywords JOIN, GROUP BY, ORDER BY, LIMIT, or DISTINCT. Furthermore, <table> must be a single “concrete” table; it cannot be a view.

splitColumn

Each generated query-part will be restricted to rows whose values in the columns listed in this field are in a particular range. The list of columns named here must be a prefix of the list of columns defining some index or primary key of the table referenced in 'query'. For many tables, using the primary key columns (in order) is sufficient, and this is the default if this field is omitted. See the comment on the 'algorithm' field for more restrictions and information.
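
The prefix rule can be checked on the client before sending a request. The sketch below assumes the caller already knows the table's index definitions; `effectiveSplitColumns` and `isPrefixOfSomeIndex` are hypothetical helpers, not part of the generated API, and column names are compared exactly as given.

```scala
object SplitColumnCheck {
  // An empty request falls back to the primary key columns (in order), as described above.
  def effectiveSplitColumns(requested: Seq[String], primaryKey: Seq[String]): Seq[String] =
    if (requested.isEmpty) primaryKey else requested

  // True when `columns` is a leading prefix of the column list of at least one index.
  // `indexes` should include the primary key as one of its entries.
  def isPrefixOfSomeIndex(columns: Seq[String], indexes: Seq[Seq[String]]): Boolean =
    indexes.exists(_.startsWith(columns))
}
```

For an index on (user_id, created_at), Seq("user_id") passes the check, while Seq("created_at") does not because it is not a leading prefix.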

splitCount

You can specify either an estimate of the number of query-parts to generate or an estimate of the number of rows each query-part should return. Thus, exactly one of split_count or num_rows_per_query_part should be nonzero. The parameter that is not given is calculated from the given one using the formula split_count * num_rows_per_query_part = table_size, where table_size is an approximation of the number of rows in the table. Note that if split_count is given, it is regarded as an estimate: the number of query-parts returned may differ slightly (in particular, if it is not a whole multiple of the number of Vitess shards).
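
The relationship between the two parameters can be sketched as follows. `SplitSizing.resolve` is a hypothetical helper, and rounding up is an assumption; the documentation above does not say how the server rounds or how it estimates table_size.

```scala
object SplitSizing {
  // Returns (splitCount, numRowsPerQueryPart), deriving the missing value from
  // split_count * num_rows_per_query_part = table_size.
  // Exactly one of the two inputs must be nonzero; anything else is rejected.
  def resolve(splitCount: Long, numRowsPerQueryPart: Long, tableSize: Long): Either[String, (Long, Long)] =
    (splitCount, numRowsPerQueryPart) match {
      case (c, 0L) if c > 0 => Right((c, ceilDiv(tableSize, c)))
      case (0L, n) if n > 0 => Right((ceilDiv(tableSize, n), n))
      case _ => Left("exactly one of split_count or num_rows_per_query_part should be nonzero")
    }

  // Integer division rounded up (an assumed rounding rule, for illustration only).
  private def ceilDiv(a: Long, b: Long): Long = (a + b - 1) / b
}
```

With an estimated table size of 1000 rows, asking for 4 query-parts gives about 250 rows per part, and asking for 300 rows per part gives 4 query-parts.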

algorithm

The algorithm to use to split the query. The split algorithm is performed on each database shard in parallel. The lists of query-parts generated by the shards are merged and returned to the caller. Two algorithms are supported:

EQUAL_SPLITS: If this algorithm is selected, only the first 'split_column' given is used (or the first primary key column if the 'split_column' field is empty). In the rest of this algorithm's description, we refer to this column as "the split column". The split column must have a numeric type (integral or floating point). The algorithm works by taking the interval [min, max], where min and max are the minimum and maximum values of the split column in the table-shard, respectively, and partitioning it into 'split_count' sub-intervals of equal size. The added WHERE clause of each query-part restricts that part to rows whose value in the split column belongs to a particular sub-interval. This is fast, but requires that the distribution of values of the split column be uniform in [min, max] for the number of rows returned by each query-part to be roughly the same.

FULL_SCAN: If this algorithm is used, the split_column must be the primary key columns (in order). This algorithm performs a full scan of the table-shard referenced in 'query' to get "boundary" rows that are num_rows_per_query_part apart when the table is ordered by the columns listed in 'split_column'. It then restricts each query-part to the rows located between two successive boundary rows. This algorithm supports multiple split columns of any type, but is slower than EQUAL_SPLITS.
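
The two strategies can be sketched in memory as follows. This is an illustration under stated assumptions, not the server implementation: `equalSplits` and `fullScanBoundaries` are hypothetical names, the treatment of interval endpoints is assumed, and a sorted sequence of keys stands in for the ordered table-shard.

```scala
object SplitAlgorithms {
  // EQUAL_SPLITS sketch: cut [min, max] into `splitCount` sub-intervals of equal width.
  // Each pair is (lo, hi); the last interval is pinned to `max` to avoid rounding drift.
  def equalSplits(min: Double, max: Double, splitCount: Int): Seq[(Double, Double)] = {
    require(splitCount > 0 && min <= max)
    val width = (max - min) / splitCount
    (0 until splitCount).map { i =>
      val lo = min + i * width
      val hi = if (i == splitCount - 1) max else min + (i + 1) * width
      (lo, hi)
    }
  }

  // FULL_SCAN sketch: walk the keys in split-column order and record every
  // `numRowsPerQueryPart`-th key as a boundary. Successive boundaries then
  // delimit the query-parts. K may be a tuple for multi-column keys.
  def fullScanBoundaries[K](sortedKeys: Seq[K], numRowsPerQueryPart: Int): Seq[K] = {
    require(numRowsPerQueryPart > 0)
    sortedKeys.zipWithIndex.collect {
      case (k, i) if i > 0 && i % numRowsPerQueryPart == 0 => k
    }
  }
}
```

`equalSplits` never looks at the rows, which is why it is fast but sensitive to skew. `fullScanBoundaries` visits every key, so each part holds at most `numRowsPerQueryPart` rows regardless of how the values are distributed.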

useSplitQueryV2

TODO(erez): This field is no longer used by the server code. Remove this field after this new server code is released to prod. We must keep it for now, so that clients can still send it to the old server code currently in production.

Annotations
@SerialVersionUID()
Linear Supertypes
Product, Equals, Updatable[SplitQueryRequest], Message[SplitQueryRequest], GeneratedMessage, Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new SplitQueryRequest(callerId: Option[CallerID] = None, keyspace: String = "", query: Option[BoundQuery] = None, splitColumn: Seq[String] = _root_.scala.collection.Seq.empty, splitCount: Long = 0L, numRowsPerQueryPart: Long = 0L, algorithm: Algorithm = ..., useSplitQueryV2: Boolean = false)

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. def addAllSplitColumn(__vs: TraversableOnce[String]): SplitQueryRequest

  5. def addSplitColumn(__vs: String*): SplitQueryRequest

  6. val algorithm: Algorithm

    The algorithm to use to split the query.

  7. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  8. val callerId: Option[CallerID]

    caller_id identifies the caller.

  9. def clearCallerId: SplitQueryRequest

  10. def clearQuery: SplitQueryRequest

  11. def clearSplitColumn: SplitQueryRequest

  12. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  13. def companion: SplitQueryRequest.type

    Definition Classes
    SplitQueryRequest → GeneratedMessage
  14. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  15. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  16. def getCallerId: CallerID

  17. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  18. def getField(__field: FieldDescriptor): PValue

    Definition Classes
    SplitQueryRequest → GeneratedMessage
  19. def getFieldByNumber(__fieldNumber: Int): Any

    Definition Classes
    SplitQueryRequest → GeneratedMessage
  20. def getQuery: BoundQuery

  21. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  22. val keyspace: String

    keyspace to target the query to.

  23. def mergeFrom(_input__: CodedInputStream): SplitQueryRequest

    Definition Classes
    SplitQueryRequest → Message
  24. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  25. final def notify(): Unit

    Definition Classes
    AnyRef
  26. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  27. val numRowsPerQueryPart: Long

  28. val query: Option[BoundQuery]

    The query and bind variables to produce splits for.

  29. final def serializedSize: Int

    Definition Classes
    SplitQueryRequest → GeneratedMessage
  30. val splitColumn: Seq[String]

    Each generated query-part will be restricted to rows whose values in the columns listed in this field are in a particular range.

  31. val splitCount: Long

    You can specify either an estimate of the number of query-parts to generate or an estimate of the number of rows each query-part should return.

  32. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  33. def toByteArray: Array[Byte]

    Definition Classes
    GeneratedMessage
  34. def toByteString: ByteString

    Definition Classes
    GeneratedMessage
  35. def toPMessage: PMessage

    Definition Classes
    GeneratedMessage
  36. def toString(): String

    Definition Classes
    SplitQueryRequest → AnyRef → Any
  37. def update(ms: (Lens[SplitQueryRequest, SplitQueryRequest]) ⇒ Mutation[SplitQueryRequest]*): SplitQueryRequest

    Definition Classes
    Updatable
  38. val useSplitQueryV2: Boolean

    TODO(erez): This field is no longer used by the server code.

  39. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  40. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  41. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  42. def withAlgorithm(__v: Algorithm): SplitQueryRequest

  43. def withCallerId(__v: CallerID): SplitQueryRequest

  44. def withKeyspace(__v: String): SplitQueryRequest

  45. def withNumRowsPerQueryPart(__v: Long): SplitQueryRequest

  46. def withQuery(__v: BoundQuery): SplitQueryRequest

  47. def withSplitColumn(__v: Seq[String]): SplitQueryRequest

  48. def withSplitCount(__v: Long): SplitQueryRequest

  49. def withUseSplitQueryV2(__v: Boolean): SplitQueryRequest

  50. def writeDelimitedTo(output: OutputStream): Unit

    Definition Classes
    GeneratedMessage
  51. def writeTo(_output__: CodedOutputStream): Unit

    Definition Classes
    SplitQueryRequest → GeneratedMessage
  52. def writeTo(output: OutputStream): Unit

    Definition Classes
    GeneratedMessage
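
The with*, add*, and clear* members listed above each return a new SplitQueryRequest rather than mutating the receiver. The sketch below shows that calling pattern on a simplified stand-in class so that it runs without the generated code. The stand-in is an assumption made for illustration: it mirrors only a subset of the fields, and the keyspace and column names are hypothetical.

```scala
object RequestSketch {
  // Simplified stand-in mirroring a subset of the generated class's fields and methods.
  // The real class also carries callerId, query, algorithm, and useSplitQueryV2.
  final case class SplitQueryRequest(
      keyspace: String = "",
      splitColumn: Seq[String] = Seq.empty,
      splitCount: Long = 0L,
      numRowsPerQueryPart: Long = 0L) {
    def withKeyspace(v: String): SplitQueryRequest = copy(keyspace = v)
    def withSplitColumn(v: Seq[String]): SplitQueryRequest = copy(splitColumn = v)
    def addSplitColumn(vs: String*): SplitQueryRequest = copy(splitColumn = splitColumn ++ vs)
    def clearSplitColumn: SplitQueryRequest = copy(splitColumn = Seq.empty)
    def withSplitCount(v: Long): SplitQueryRequest = copy(splitCount = v)
  }

  // Start from the defaults and derive a new value at each step.
  val base: SplitQueryRequest = SplitQueryRequest()
  val request: SplitQueryRequest =
    base.withKeyspace("commerce").addSplitColumn("user_id").withSplitCount(8L)
}
```

Because every call returns a fresh value, `base` still holds the defaults after `request` is built, so a partially configured request can be shared and specialized safely.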

Deprecated Value Members

  1. def getAllFields: Map[FieldDescriptor, Any]

    Definition Classes
    GeneratedMessage
    Annotations
    @deprecated
    Deprecated

    (Since version 0.6.0) Use toPMessage

  2. def getField(field: FieldDescriptor): Any

    Definition Classes
    GeneratedMessage
    Annotations
    @deprecated
    Deprecated

    (Since version 0.6.0) Use getField that accepts a ScalaPB descriptor and returns PValue
