SplitQueryRequest

SplitQueryRequest is the payload to SplitQuery.

SplitQuery takes a "SELECT" query and generates a list of queries called "query-parts". Each query-part consists of the original query with an added WHERE clause that restricts the query-part to operate only on rows whose values in the columns listed in the "split_column" field of the request (see below) are in a particular range.

It is guaranteed that the set of rows obtained by executing each query-part on a database snapshot and merging the results (without deduplication) is equal to the set of rows obtained by executing the original query on the same snapshot, excluding the rows that contain a NULL value in any of the split_column columns.

This is typically called by the MapReduce master when reading from Vitess. In that setting, it is desirable that the sets of rows returned by the query-parts have roughly the same size.
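
The guarantee above can be modelled in memory. The following is only an illustrative sketch: the names `Row`, `SplitRange`, `queryPart`, and `rangesFrom` are hypothetical, and the use of half-open ranges is an assumption, since the text only says that each query-part covers "a particular range". A NULL split value is modelled as `None`.

```scala
object SplitGuaranteeSketch {
  // A row whose split-column value may be NULL (modelled as None).
  final case class Row(id: Int, splitValue: Option[Int])

  // Half-open range [lo, hi); None means "unbounded on that side".
  // Half-open bounds are an assumption made for this sketch.
  final case class SplitRange(lo: Option[Int], hi: Option[Int]) {
    def contains(v: Int): Boolean = lo.forall(v >= _) && hi.forall(v < _)
  }

  // Rows that a query-part restricted to `range` would return.
  // A NULL split value never falls inside any range.
  def queryPart(rows: Seq[Row], range: SplitRange): Seq[Row] =
    rows.filter(_.splitValue.exists(range.contains))

  // Consecutive ranges built from sorted boundaries; together they
  // cover every non-NULL value exactly once.
  def rangesFrom(boundaries: Seq[Int]): Seq[SplitRange] = {
    val edges: Seq[Option[Int]] = None +: boundaries.map(Option(_)) :+ None
    edges.zip(edges.tail).map { case (lo, hi) => SplitRange(lo, hi) }
  }
}
```

Merging the results of `queryPart` over all ranges from `rangesFrom` yields each row with a non-NULL split value exactly once, and omits the rows whose split value is NULL.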

callerId: caller_id identifies the caller. This is the effective caller ID, set by the application to further identify the caller.
keyspace: keyspace to target the query to.
query: The query and bind variables to produce splits for. The given query must be a simple query of the form SELECT <cols> FROM <table> WHERE <filter>. It must not contain subqueries nor any of the keywords JOIN, GROUP BY, ORDER BY, LIMIT, DISTINCT. Furthermore, <table> must be a single “concrete” table. It cannot be a view.
splitColumn: Each generated query-part will be restricted to rows whose values in the columns listed in this field are in a particular range. The list of columns named here must be a prefix of the list of columns defining some index or primary key of the table referenced in 'query'. For many tables, using the primary key columns (in order) is sufficient; this is the default if this field is omitted. See the comment on the 'algorithm' field for more restrictions and information.
splitCount: You can specify either an estimate of the number of query-parts to generate or an estimate of the number of rows each query-part should return. Thus, exactly one of split_count or num_rows_per_query_part should be nonzero. The parameter that is not given is calculated from the given one using the formula: split_count * num_rows_per_query_part = table_size, where table_size is an approximation of the number of rows in the table. Note that if "split_count" is given, it is regarded as an estimate. The number of query-parts returned may differ slightly (in particular, if it's not a whole multiple of the number of Vitess shards).
algorithm: The algorithm to use to split the query. The split algorithm is performed on each database shard in parallel. The lists of query-parts generated by the shards are merged and returned to the caller. Two algorithms are supported:
EQUAL_SPLITS: If this algorithm is selected, then only the first 'split_column' given is used (or the first primary key column if the 'split_column' field is empty). In the rest of this algorithm's description, we refer to this column as "the split column". The split column must have numeric type (integral or floating point). The algorithm works by taking the interval [min, max], where min and max are the minimum and maximum values of the split column in the table-shard, respectively, and partitioning it into 'split_count' sub-intervals of equal size. The added WHERE clause of each query-part restricts that part to rows whose value in the split column belongs to a particular sub-interval. This is fast, but requires that the distribution of values of the split column be uniform in [min, max] for the number of rows returned by each query-part to be roughly the same.
FULL_SCAN: If this algorithm is used, then the split_column must be the primary key columns (in order). This algorithm performs a full scan of the table-shard referenced in 'query' to get "boundary" rows that are num_rows_per_query_part apart when the table is ordered by the columns listed in 'split_column'. It then restricts each query-part to the rows located between two successive boundary rows. This algorithm supports multiple split columns of any type, but is slower than EQUAL_SPLITS.
useSplitQueryV2: TODO(erez): This field is no longer used by the server code. Remove this field after this new server code is released to prod. We must keep it for now, so that clients can still send it to the old server code currently in production.
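
The two algorithms described under 'algorithm' can be sketched as boundary computations. This is a simplified in-memory illustration under stated assumptions, not the server implementation: `equalSplitBoundaries` and `fullScanBoundaries` are hypothetical names, the split column is modelled as `Double` for EQUAL_SPLITS, and FULL_SCAN is modelled by sorting an in-memory sequence of keys rather than scanning a table-shard.

```scala
object SplitAlgorithmsSketch {
  // EQUAL_SPLITS: cut [min, max] into splitCount sub-intervals of equal
  // width and return the splitCount - 1 interior boundaries.
  def equalSplitBoundaries(min: Double, max: Double, splitCount: Int): Seq[Double] = {
    require(splitCount > 0, "splitCount must be positive")
    require(min <= max, "min must not exceed max")
    val width = (max - min) / splitCount
    (1 until splitCount).map(i => min + i * width)
  }

  // FULL_SCAN: order the keys and keep the first key of every group of
  // numRowsPerQueryPart rows (except the first group) as a boundary row.
  // K can be a tuple, which models a multi-column split key.
  def fullScanBoundaries[K: Ordering](keys: Seq[K], numRowsPerQueryPart: Int): Seq[K] = {
    require(numRowsPerQueryPart > 0, "numRowsPerQueryPart must be positive")
    keys.sorted
      .grouped(numRowsPerQueryPart)
      .toList
      .drop(1)
      .map(_.head)
  }
}
```

The EQUAL_SPLITS sketch needs only the minimum and maximum, which is why it is cheap, but the resulting parts are balanced only when values are spread evenly over [min, max]. The FULL_SCAN sketch inspects every key, so each part holds at most `numRowsPerQueryPart` rows regardless of how the values are distributed.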

Annotations: @SerialVersionUID()

Linear Supertypes

Product, Equals, Updatable[SplitQueryRequest], Message[SplitQueryRequest], GeneratedMessage, Serializable, Serializable, AnyRef, Any

Instance Constructors

new SplitQueryRequest(callerId: Option[CallerID] = None, keyspace: String = "", query: Option[BoundQuery] = None, splitColumn: Seq[String] = _root_.scala.collection.Seq.empty, splitCount: Long = 0L, numRowsPerQueryPart: Long = 0L, algorithm: Algorithm = ..., useSplitQueryV2: Boolean = false)

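
The relation split_count * num_rows_per_query_part = table_size, together with the rule that exactly one of the two values should be nonzero, can be sketched as follows. The names `Sizing` and `resolve` are hypothetical, and rounding up is an assumption of this sketch; the text above only says that table_size is an approximation.

```scala
object SplitSizingSketch {
  // Both values known after applying
  // split_count * num_rows_per_query_part = table_size.
  final case class Sizing(splitCount: Long, numRowsPerQueryPart: Long)

  // Derives the value that was left at zero from the one that was given.
  // Returns Left with a message when zero or both values are set.
  def resolve(splitCount: Long, numRowsPerQueryPart: Long, tableSize: Long): Either[String, Sizing] = {
    require(tableSize >= 0, "tableSize must not be negative")
    // Integer division rounded up (an assumption of this sketch).
    def ceilDiv(a: Long, b: Long): Long = (a + b - 1) / b
    (splitCount, numRowsPerQueryPart) match {
      case (s, 0L) if s > 0 => Right(Sizing(s, math.max(1L, ceilDiv(tableSize, s))))
      case (0L, n) if n > 0 => Right(Sizing(math.max(1L, ceilDiv(tableSize, n)), n))
      case _ => Left("exactly one of split_count or num_rows_per_query_part should be nonzero")
    }
  }
}
```

For example, with an approximate table size of 1000 rows, a request for 4 query-parts resolves to 250 rows per part in this sketch, while a request that sets both values, or neither, is rejected.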

Value Members

final def !=(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def ##(): Int

Definition Classes
AnyRef → Any
final def ==(arg0: Any): Boolean

Definition Classes
AnyRef → Any
def addAllSplitColumn(__vs: TraversableOnce[String]): SplitQueryRequest
def addSplitColumn(__vs: String*): SplitQueryRequest
val algorithm: Algorithm

The algorithm to use to split the query.
final def asInstanceOf[T0]: T0

Definition Classes
Any
val callerId: Option[CallerID]

caller_id identifies the caller.
def clearCallerId: SplitQueryRequest
def clearQuery: SplitQueryRequest
def clearSplitColumn: SplitQueryRequest
def clone(): AnyRef

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( ... )
def companion: SplitQueryRequest.type

Definition Classes
SplitQueryRequest → GeneratedMessage
final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
def finalize(): Unit

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
def getCallerId: CallerID
final def getClass(): Class[_]

Definition Classes
AnyRef → Any
def getField(__field: FieldDescriptor): PValue

Definition Classes
SplitQueryRequest → GeneratedMessage
def getFieldByNumber(__fieldNumber: Int): Any

Definition Classes
SplitQueryRequest → GeneratedMessage
def getQuery: BoundQuery
final def isInstanceOf[T0]: Boolean

Definition Classes
Any
val keyspace: String

keyspace to target the query to.
def mergeFrom(_input__: CodedInputStream): SplitQueryRequest

Definition Classes
SplitQueryRequest → Message
final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def notify(): Unit

Definition Classes
AnyRef
final def notifyAll(): Unit

Definition Classes
AnyRef
val numRowsPerQueryPart: Long
val query: Option[BoundQuery]

The query and bind variables to produce splits for.
final def serializedSize: Int

Definition Classes
SplitQueryRequest → GeneratedMessage
val splitColumn: Seq[String]

Each generated query-part will be restricted to rows whose values in the columns listed in this field are in a particular range.
val splitCount: Long

You can specify either an estimate of the number of query-parts to generate or an estimate of the number of rows each query-part should return.
final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
def toByteArray: Array[Byte]

Definition Classes
GeneratedMessage
def toByteString: ByteString

Definition Classes
GeneratedMessage
def toPMessage: PMessage

Definition Classes
GeneratedMessage
def toString(): String

Definition Classes
SplitQueryRequest → AnyRef → Any
def update(ms: (Lens[SplitQueryRequest, SplitQueryRequest]) ⇒ Mutation[SplitQueryRequest]*): SplitQueryRequest

Definition Classes
Updatable
val useSplitQueryV2: Boolean

TODO(erez): This field is no longer used by the server code.
final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
def withAlgorithm(__v: Algorithm): SplitQueryRequest
def withCallerId(__v: CallerID): SplitQueryRequest
def withKeyspace(__v: String): SplitQueryRequest
def withNumRowsPerQueryPart(__v: Long): SplitQueryRequest
def withQuery(__v: BoundQuery): SplitQueryRequest
def withSplitColumn(__v: Seq[String]): SplitQueryRequest
def withSplitCount(__v: Long): SplitQueryRequest
def withUseSplitQueryV2(__v: Boolean): SplitQueryRequest
def writeDelimitedTo(output: OutputStream): Unit

Definition Classes
GeneratedMessage
def writeTo(_output__: CodedOutputStream): Unit

Definition Classes
SplitQueryRequest → GeneratedMessage
def writeTo(output: OutputStream): Unit

Definition Classes
GeneratedMessage
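
The `with...`, `add...`, and `clear...` members listed above each return a SplitQueryRequest rather than Unit, which suggests an immutable update style. A minimal sketch of that style follows, using a hypothetical stand-in class `RequestSketch` that carries only a subset of the fields so the example stays self-contained. It is not the generated class, and the append behaviour of `addSplitColumn` as well as the keyspace and column names are assumptions made for illustration.

```scala
object RequestUpdateSketch {
  // Hypothetical stand-in with a subset of SplitQueryRequest's fields
  // and the same defaults as the constructor shown earlier.
  final case class RequestSketch(
      keyspace: String = "",
      splitColumn: Seq[String] = Seq.empty,
      splitCount: Long = 0L,
      numRowsPerQueryPart: Long = 0L) {
    // Each method returns a modified copy; the receiver is unchanged.
    def withKeyspace(v: String): RequestSketch = copy(keyspace = v)
    def withSplitColumn(v: Seq[String]): RequestSketch = copy(splitColumn = v)
    def addSplitColumn(vs: String*): RequestSketch = copy(splitColumn = splitColumn ++ vs)
    def clearSplitColumn: RequestSketch = copy(splitColumn = Seq.empty)
    def withSplitCount(v: Long): RequestSketch = copy(splitCount = v)
    def withNumRowsPerQueryPart(v: Long): RequestSketch = copy(numRowsPerQueryPart = v)
  }

  // Usage: set exactly one of splitCount / numRowsPerQueryPart.
  val example: RequestSketch =
    RequestSketch()
      .withKeyspace("commerce")                // hypothetical keyspace name
      .addSplitColumn("user_id", "order_id")   // hypothetical column names
      .withSplitCount(8L)
}
```

Because every call returns a new value, a partially configured request can be shared and specialised in several places without one caller affecting another.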

Deprecated Value Members

def getAllFields: Map[FieldDescriptor, Any]

Definition Classes
GeneratedMessage
Annotations
@deprecated
Deprecated
(Since version 0.6.0) Use toPMessage
def getField(field: FieldDescriptor): Any

Definition Classes
GeneratedMessage
Annotations
@deprecated
Deprecated
(Since version 0.6.0) Use getField that accepts a ScalaPB descriptor and returns PValue
