quasar.yggdrasil.table.BlockStoreColumnarTableModule
Sorts the KV table by ascending or descending order based on a seq of transformations applied to the rows.
The transspecs to use to obtain the values to sort on
The transspec to use to obtain the non-sorting values
Whether to sort ascending or descending
If true, rows with the same key values collapse into a single row; otherwise a unique row ID is added to the key so that multiple equal values are preserved
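The effect of the ordering and uniqueness flags can be illustrated with a minimal sketch over plain Scala collections. The `Row` type, the `sortRows` name, and the choice to keep the first row per key are assumptions made for illustration; the actual method works on tables and transspecs.

```scala
object SortSketch {
  // Hypothetical row model: one integer sort key plus a payload.
  final case class Row(key: Int, value: String)

  def sortRows(rows: Vector[Row], ascending: Boolean, unique: Boolean): Vector[Row] = {
    val keyOrd: Ordering[Int] = if (ascending) Ordering.Int else Ordering.Int.reverse
    if (unique) {
      // Equal keys collapse into one row; this sketch keeps the first row seen.
      rows.groupBy(_.key).toVector.sortBy(_._1)(keyOrd).map(_._2.head)
    } else {
      // A row ID (the input position) becomes part of the key, so rows with
      // equal keys all survive and keep their input order.
      rows.zipWithIndex
        .sortBy { case (row, id) => (row.key, id) }(Ordering.Tuple2(keyOrd, Ordering.Int))
        .map(_._1)
    }
  }
}
```

With keys 2, 1, 2 the non-unique sort returns all three rows, while the unique sort returns two.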
For each distinct path in the table, load all columns identified by the specified jtype and concatenate the resulting slices into a new table.
Sorts the KV table by ascending or descending order of a transformation applied to the rows.
The transspec to use to obtain the values to sort on
Whether to sort ascending or descending
If true, rows with the same key values collapse into a single row; otherwise a unique row ID is added to the key so that multiple equal values are preserved
Converts a table to an internal table, if possible. If the table is already an InternalTable or a SingletonTable, then the conversion will always succeed. If the table is an ExternalTable, then if it has fewer than limit rows, it will be converted to an InternalTable; otherwise it will stay an ExternalTable.
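The conversion rule can be sketched as a pattern match. The three case classes below are hypothetical stand-ins for the table kinds named above, and returning an Either (Left when the table stays external) is an assumption of this sketch, not a confirmed signature.

```scala
object ConversionSketch {
  // Hypothetical stand-ins for the table kinds; rows are modelled as Ints.
  sealed trait Table { def rows: Vector[Int] }
  final case class SingletonTable(row: Int) extends Table {
    def rows: Vector[Int] = Vector(row)
  }
  final case class InternalTable(rows: Vector[Int]) extends Table
  final case class ExternalTable(rows: Vector[Int]) extends Table

  // Right: the table is (now) internal. Left: the table stays external.
  def toInternalTable(table: Table, limit: Int): Either[ExternalTable, InternalTable] =
    table match {
      case t: InternalTable    => Right(t)
      case SingletonTable(row) => Right(InternalTable(Vector(row)))
      case t: ExternalTable =>
        // Only external tables with fewer than `limit` rows are converted.
        if (t.rows.length < limit) Right(InternalTable(t.rows)) else Left(t)
    }
}
```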
Returns a table where each slice (except maybe the last) has slice size length. Also removes slices of size zero. If an optional maxLength0 size is provided, then the slices need only land in the range between length and maxLength0.
For slices being loaded from ingest, it is often the case that we are missing a
few rows at the end, so we shouldn't be too strict.
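A minimal sketch of the strict case (no maxLength0), assuming slices can be modelled as vectors of rows. The `canonicalize` name and signature here are illustrative only, and the relaxed range variant is deliberately left out.

```scala
object CanonicalizeSketch {
  // Regroups the rows so that every emitted slice except possibly the last
  // holds exactly `length` rows; empty input slices contribute nothing.
  def canonicalize(slices: Seq[Vector[Int]], length: Int): Vector[Vector[Int]] = {
    require(length > 0, "length must be positive")
    slices.iterator
      .flatMap(slice => slice)
      .grouped(length)
      .map(_.toVector)
      .toVector
  }
}
```

For example, slices of sizes 3, 0, 1 and 3 regrouped with length 2 come out as sizes 2, 2, 2 and 1.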
Cogroups this table with another table, using equality on the specified transformation on rows of the table.
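As a rough illustration of cogrouping on key equality, the sketch below merges two sequences and tags each output as left-only, matched, or right-only. It assumes both inputs are already sorted by key; the `Cogrouped` types and the `cogroup` signature are hypothetical, not the module's API.

```scala
object CogroupSketch {
  sealed trait Cogrouped[+A, +B]
  final case class LeftOnly[A](a: A) extends Cogrouped[A, Nothing]
  final case class Both[A, B](a: A, b: B) extends Cogrouped[A, B]
  final case class RightOnly[B](b: B) extends Cogrouped[Nothing, B]

  // Assumes `left` and `right` are sorted ascending by their keys.
  def cogroup[K, A, B](left: Vector[A], right: Vector[B])(leftKey: A => K, rightKey: B => K)(
      implicit ord: Ordering[K]): Vector[Cogrouped[A, B]] = {
    val out = Vector.newBuilder[Cogrouped[A, B]]
    var i = 0
    var j = 0
    while (i < left.length && j < right.length) {
      val c = ord.compare(leftKey(left(i)), rightKey(right(j)))
      if (c < 0) { out += LeftOnly(left(i)); i += 1 }
      else if (c > 0) { out += RightOnly(right(j)); j += 1 }
      else {
        // Equal keys: pair every left row in the run with every right row.
        val k = leftKey(left(i))
        val ls = left.drop(i).takeWhile(a => ord.equiv(leftKey(a), k))
        val rs = right.drop(j).takeWhile(b => ord.equiv(rightKey(b), k))
        for (a <- ls; b <- rs) out += Both(a, b)
        i += ls.length
        j += rs.length
      }
    }
    // Whatever remains on either side has no partner.
    left.drop(i).foreach(a => out += LeftOnly(a))
    right.drop(j).foreach(b => out += RightOnly(b))
    out.result()
  }
}
```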
Removes all rows in the table for which definedness is satisfied. Remaps the indices.
Performs a full cartesian cross on this table with the specified table, applying the specified transformation to merge the two tables into a single table.
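In collection terms, a full cross pairs every row on the left with every row on the right and merges each pair. The sketch below uses a plain function where the real method takes a transformation; the names are illustrative.

```scala
object CrossSketch {
  // Every left row is combined with every right row, left-major order.
  def cross[A, B, C](left: Seq[A], right: Seq[B])(merge: (A, B) => C): Vector[C] =
    (for (a <- left; b <- right) yield merge(a, b)).toVector
}
```

The result has left.size * right.size rows, which is why crossing large tables is expensive.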
Yields a new table with distinct rows. Assumes this table is sorted.
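The sortedness assumption matters because a sorted input lets duplicates be detected by comparing only adjacent rows. A minimal sketch of that idea over a plain vector (the `distinctSorted` helper is hypothetical):

```scala
object DistinctSketch {
  // Keeps a row only when it differs from the row kept immediately before it.
  def distinctSorted[A](rows: Vector[A]): Vector[A] =
    rows.foldLeft(Vector.empty[A]) { (acc, row) =>
      if (acc.lastOption.contains(row)) acc else acc :+ row
    }
}
```

On unsorted input this approach leaves non-adjacent duplicates in place, for example 1, 2, 1 is returned unchanged.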
Forces the table to a backing store and provides a restartable table over the results.
In order to call partitionMerge, the table must be sorted according to the values specified by the partitionBy transspec.
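One way to see why the sort order is required: if partitions are taken to be contiguous runs of rows with the same partition key, an unsorted table splits one logical partition into several. The sketch below assumes that run-based reading and uses plain functions in place of the transspec and the per-partition table function.

```scala
object PartitionMergeSketch {
  // Splits `rows` into contiguous runs of equal partition key, applies `f`
  // to each run, and concatenates the results in order.
  def partitionMerge[A, K, B](rows: Vector[A])(partitionBy: A => K)(f: Vector[A] => Vector[B]): Vector[B] = {
    val out = Vector.newBuilder[B]
    var rest = rows
    while (rest.nonEmpty) {
      val k = partitionBy(rest.head)
      val (run, tail) = rest.span(a => partitionBy(a) == k)
      out ++= f(run)
      rest = tail
    }
    out.result()
  }
}
```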
Folds over the table to produce a single value (stored in a singleton table).
A one-pass algorithm for sampling. This runs in time O(H_n*m^2 + n) = O(m^2 lg n + n), so it is not super optimal. Another good option is to try Alissa's approach; keep 2 buffers of size m. Load one up fully, shuffle, then replace, but we aren't 100% sure it is uniform and independent.
Of course, the hope is that this will not be used once we have efficient sampling that runs in O(m lg n) time.
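For orientation, the sketch below shows a standard one-pass technique, reservoir sampling (Algorithm R), which draws m rows uniformly without replacement in O(n) time. It is not the algorithm described above, nor the two-buffer approach; the `reservoirSample` name and signature are illustrative.

```scala
import scala.collection.mutable.ArrayBuffer
import scala.util.Random

object SampleSketch {
  // Reservoir sampling (Algorithm R): one pass, O(n) time, O(m) space.
  def reservoirSample[A](rows: Iterator[A], m: Int, rng: Random): Vector[A] = {
    require(m >= 0, "sample size must be non-negative")
    val reservoir = ArrayBuffer.empty[A]
    var seen = 0 // number of rows consumed before the current one
    rows.foreach { row =>
      if (reservoir.length < m) {
        reservoir += row // fill the reservoir with the first m rows
      } else {
        // Replace a random slot with probability m / (seen + 1).
        val j = rng.nextInt(seen + 1)
        if (j < m) reservoir(j) = row
      }
      seen += 1
    }
    reservoir.toVector
  }
}
```

When the input has fewer than m rows, the sample is simply the whole input in order.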
Return an indication of table size, if known
Forces a table to an external table, possibly de-optimizing it.
Performs a one-pass transformation of the keys and values in the table. If the key transform is not identity, the resulting table will have unknown sort order.
Zips two tables together in their current sorted order. If the tables are not normalized first and thus have different slice sizes, then, since the zipping is done per slice, the result can differ from the one obtained by zipping the normalized tables.
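The following sketch shows one way the mismatch could arise. It assumes that a per-slice zip truncates each pair of slices to the shorter one, which is an assumption of this illustration and not a statement about the actual implementation.

```scala
object ZipSketch {
  // Zips slice i of the left table with slice i of the right table.
  // Assumption: within a pair, extra rows of the longer slice are dropped.
  def zipPerSlice(left: Seq[Vector[Int]], right: Seq[Vector[Int]]): Vector[(Int, Int)] =
    left.zip(right).flatMap { case (l, r) => l.zip(r) }.toVector

  // Reference behaviour: zip row by row, ignoring slice boundaries.
  def zipNormalized(left: Seq[Vector[Int]], right: Seq[Vector[Int]]): Vector[(Int, Int)] =
    left.flatten.zip(right.flatten).toVector
}
```

With left slices of sizes 3 and 1 and right slices of sizes 2 and 2, the per-slice zip yields three rows and misaligns the last one, while the normalized zip yields four aligned rows.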