public class IndexedTable extends AbstractDataset implements Table
This dataset uses two tables: a data table that stores the rows as written, and a second table that stores the index entries.
The indexed values need not be unique. When data is read back by index value, a Scanner is returned, allowing the client to iterate through all matching rows. Both exact matches and range lookups on indexed values are supported.
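The lookup behavior described above can be illustrated with a toy in-memory model. This is only a sketch and not the IndexedTable implementation: the class `IndexLookupSketch` and its methods are hypothetical names, strings stand in for byte arrays, and the range bounds (start inclusive, end exclusive) follow the scanByIndex parameter descriptions on this page.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy model of a non-unique secondary index on a single column.
// Hypothetical names throughout; this is not CDAP code.
class IndexLookupSketch {
    // indexed value -> keys of all rows holding that value, sorted by value
    private final NavigableMap<String, List<String>> index = new TreeMap<>();

    public void put(String rowKey, String indexedValue) {
        index.computeIfAbsent(indexedValue, v -> new ArrayList<>()).add(rowKey);
    }

    // Exact match: every row whose indexed value equals the given value.
    public List<String> readByValue(String value) {
        return new ArrayList<>(index.getOrDefault(value, Collections.emptyList()));
    }

    // Range lookup: start is inclusive, end is exclusive.
    public List<String> scanByValue(String start, String end) {
        List<String> rows = new ArrayList<>();
        for (List<String> keys : index.subMap(start, true, end, false).values()) {
            rows.addAll(keys);
        }
        return rows;
    }

    public static void main(String[] args) {
        IndexLookupSketch idx = new IndexLookupSketch();
        idx.put("row1", "red");
        idx.put("row2", "green");
        idx.put("row3", "red");
        System.out.println(idx.readByValue("red"));          // [row1, row3]
        System.out.println(idx.readByValue("blue"));         // []
        System.out.println(idx.scanByValue("green", "red")); // [row2]
    }
}
```

Because values are not unique, an exact match can return several rows, and a value with no entries yields an empty result rather than an error.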
Index entries are created by storing additional rows in a second table. Each index row is keyed by column name, column value, and original row key, with the fields separated by a single null byte delimiter. Index column names should not contain the null byte, because this can degrade the performance of index reads.
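As a rough illustration of the key layout described above, the following sketch concatenates the three fields with a null byte between them. The class `IndexKeySketch` and the method `indexRowKey` are hypothetical; the actual implementation may differ in detail (for example, when a dynamic indexing prefix is used).

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper showing the described index row key layout:
// <column name> 0x00 <column value> 0x00 <original row key>
class IndexKeySketch {
    private static final byte DELIMITER = 0;

    static byte[] indexRowKey(byte[] column, byte[] value, byte[] rowKey) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(column, 0, column.length);
        out.write(DELIMITER);
        out.write(value, 0, value.length);
        out.write(DELIMITER);
        out.write(rowKey, 0, rowKey.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] key = indexRowKey(
            "col1".getBytes(StandardCharsets.UTF_8),
            "red".getBytes(StandardCharsets.UTF_8),
            "row1".getBytes(StandardCharsets.UTF_8));
        // 4 + 1 + 3 + 1 + 4 bytes
        System.out.println(key.length); // 13
    }
}
```

With this layout, all entries for the same column and value share a common byte prefix, so a store that keeps keys in sorted order holds them next to each other.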
The columns to index can be configured in the DatasetProperties used when the dataset instance is created. Multiple column names should be listed as a comma-separated string (with no spaces):
public class MyApp extends AbstractApplication {
  public void configure() {
    setName("MyApp");
    ...
    createDataset("indexedData", IndexedTable.class,
                  DatasetProperties.builder().add(
                    IndexedTableDefinition.INDEX_COLUMNS_CONF_KEY, "col1,col2").build());
    ...
  }
}
Note that this means the names of indexed columns cannot contain the comma character, as it would break parsing of the configuration property.
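To see why commas and spaces matter, consider a minimal sketch that assumes the property is parsed with a plain split on the comma character. The real parsing code is not shown on this page, so `IndexColumnsConfigSketch` and `parseColumns` are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser: splits the configured value on commas and nothing else.
class IndexColumnsConfigSketch {
    static List<String> parseColumns(String conf) {
        List<String> columns = new ArrayList<>();
        for (String name : conf.split(",")) {
            columns.add(name); // no trimming, so stray spaces become part of the name
        }
        return columns;
    }

    public static void main(String[] args) {
        System.out.println(parseColumns("col1,col2"));       // [col1, col2]
        // A column meant to be named "first,name" is read as two columns.
        System.out.println(parseColumns("first,name,col2")); // [first, name, col2]
        // A space after the comma ends up inside the second column name.
        System.out.println(parseColumns("col1, col2").get(1).length()); // 5
    }
}
```

Under this assumption, a comma inside a column name silently produces extra index columns instead of an error.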
See Also: INDEX_COLUMNS_CONF_KEY
Modifier and Type | Field and Description |
---|---|
static String | DYNAMIC_INDEXING: Configuration that specifies that the index columns will be specified at runtime, rather than at configure time. |
static String | DYNAMIC_INDEXING_PREFIX: Configuration that specifies the prefix to use to distinguish indexed rows of different tables when dynamic indexing is enabled. |
static String | INDEX_COLUMNS_CONF_KEY: Configuration key for defining column names to index in the DatasetSpecification properties. |
static String | TYPE: Type name |
Fields inherited from interface Table: DEFAULT_COLUMN_FAMILY, PROPERTY_COLUMN_FAMILY, PROPERTY_CONFLICT_LEVEL, PROPERTY_READLESS_INCREMENT, PROPERTY_SCHEMA, PROPERTY_SCHEMA_ROW_FIELD, PROPERTY_TTL
Constructor and Description |
---|
IndexedTable(String name, Table table, Table index, SortedSet<byte[]> columnsToIndex): Configuration time constructor. |
IndexedTable(String name, Table table, Table index, SortedSet<byte[]> columnsToIndex, byte[] keyPrefix): Configuration time constructor. |
Modifier and Type | Method and Description |
---|---|
boolean | compareAndSwap(byte[] row, byte[] column, byte[] expected, byte[] newValue): Perform a swap operation by primary key. |
SplitReader<byte[],Row> | createSplitReader(Split split): Creates a reader for the split of a dataset. |
RecordScanner<StructuredRecord> | createSplitRecordScanner(Split split): Creates a reader for the split of a dataset. |
void | delete(byte[] row): Deletes all columns of the specified row. |
void | delete(byte[] row, byte[] column): Deletes specified column of the specified row. |
void | delete(byte[] row, byte[][] columns): Deletes specified columns of the specified row. |
void | delete(Delete delete): Perform a delete on the data table. |
Row | get(byte[] row): Reads values of all columns of the specified row. |
byte[] | get(byte[] row, byte[] column): Reads the value of the specified column of the specified row. |
Row | get(byte[] row, byte[][] columns): Reads the values of the specified columns of the specified row. |
Row | get(byte[] row, byte[] startColumn, byte[] stopColumn, int limit): Reads the values of all columns in the specified row that are between the specified start (inclusive) and stop (exclusive) columns. |
Row | get(Get get): Read a row by row key from the data table. |
List<Row> | get(List<Get> gets): Reads values for the rows and columns defined by the Get parameters. |
Type | getRecordType(): The type of records that the dataset exposes as a schema. |
List<Split> | getSplits(): Returns all splits of the dataset. |
List<Split> | getSplits(int numSplits, byte[] start, byte[] stop): Returns splits for a range of keys in the table. |
void | increment(byte[] row, byte[][] columns, long[] amounts): Increments (atomically) the specified row and columns by the specified amounts, without returning the new values. |
void | increment(byte[] row, byte[] column, long amount): Increments (atomically) the specified row and column by the specified amount, without returning the new value. |
void | increment(Increment increment): Increments (atomically) the specified row and columns by the specified amounts, without returning the new values. |
Row | incrementAndGet(byte[] row, byte[][] columns, long[] amounts): Increments (atomically) the specified row and columns by the specified amounts, and returns the new values. |
long | incrementAndGet(byte[] row, byte[] column, long amount): Increments (atomically) the specified row and column by the specified amount, and returns the new value. |
Row | incrementAndGet(Increment increment): Increments (atomically) the specified row and columns by the specified amounts, and returns the new values. |
void | put(byte[] row, byte[][] columns, byte[][] values): Writes the specified values for the specified columns of the specified row. |
void | put(byte[] row, byte[] column, byte[] value): Writes the specified value for the specified column of the specified row. |
void | put(Put put): Writes a put to the data table. |
Scanner | readByIndex(byte[] column, byte[] value): Reads table rows by the given secondary index key. |
Scanner | scan(byte[] startRow, byte[] stopRow): Scans table. |
Scanner | scan(Scan scan) |
Scanner | scanByIndex(byte[] column, byte[] startValue, byte[] endValue): Reads table rows within the given secondary index key range. |
void | write(byte[] bytes, Put put): Writes the {key, value} record into a dataset. |
void | write(StructuredRecord structuredRecord): Writes the record into a dataset. |
Methods inherited from class AbstractDataset: close, commitTx, getName, getTransactionAwareName, getTxChanges, postTxCommit, rollbackTx, setMetricsCollector, startTx, toString, updateTx
public static final String TYPE
public static final String INDEX_COLUMNS_CONF_KEY
public static final String DYNAMIC_INDEXING
public static final String DYNAMIC_INDEXING_PREFIX
public IndexedTable(String name, Table table, Table index, SortedSet<byte[]> columnsToIndex)
Parameters:
name - the name of the table
table - table to use as the table
index - table to use as the index
columnsToIndex - the names of the data columns to index
public IndexedTable(String name, Table table, Table index, SortedSet<byte[]> columnsToIndex, byte[] keyPrefix)
Parameters:
name - the name of the table
table - table to use as the table
index - table to use as the index
columnsToIndex - the names of the data columns to index
keyPrefix - the dynamic indexing prefix. See DYNAMIC_INDEXING_PREFIX
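public Row get(byte[] row)
Description copied from interface: Table
Reads values of all columns of the specified row.
NOTE: Depending on the implementation of this interface and use-case, calling this method can be less efficient than calling the same method with columns as parameters because it can require making a round trip to the persistent store.
NOTE: Objects passed as parameters can be reused by the underlying implementation and may be present in the data structures returned from this method.
public byte[] get(byte[] row, byte[] column)
Description copied from interface: Table
Reads the value of the specified column of the specified row.
public Row get(byte[] row, byte[][] columns)
Description copied from interface: Table
Reads the values of the specified columns of the specified row.
NOTE: Objects passed as parameters can be reused by the underlying implementation and may be present in the data structures returned from this method.
public Row get(byte[] row, byte[] startColumn, byte[] stopColumn, int limit)
Description copied from interface: Table
Reads the values of all columns in the specified row that are between the specified start (inclusive) and stop (exclusive) columns.
NOTE: Objects passed as parameters can be reused by the underlying implementation and may be present in the data structures returned from this method.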
public List<Row> get(List<Get> gets)
Description copied from interface: Table
Reads values for the rows and columns defined by the Get parameters. When running in distributed mode, and retrieving multiple rows at the same time, this method should be preferred to multiple Table.get(Get) calls, as the operations will be batched into a single remote call per server.
public Scanner readByIndex(byte[] column, byte[] value)
Reads table rows by the given secondary index key. If no rows match, a Scanner with no results will be returned.
Throws:
IllegalArgumentException - if the given column is not configured for indexing.
public Scanner scanByIndex(byte[] column, @Nullable byte[] startValue, @Nullable byte[] endValue)
Reads table rows within the given secondary index key range. If no rows fall within the range, a Scanner with no results will be returned.
Parameters:
column - the column to use for the index lookup
startValue - the inclusive start of the range within which rows must fall to be returned in the scan. null means start from the first row of the table
endValue - the exclusive end of the range within which rows must fall to be returned in the scan. null means end with the last row of the table
Throws:
IllegalArgumentException - if the given column is not configured for indexing.
public void put(Put put)
Writes a put to the data table. If any of the columns in the Put are configured to be indexed, the appropriate indexes will be updated with the indexed values referencing the data table row.
public void put(byte[] row, byte[] column, byte[] value)
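Description copied from interface: Table
Writes the specified value for the specified column of the specified row.
public void put(byte[] row, byte[][] columns, byte[][] values)
Description copied from interface: Table
Writes the specified values for the specified columns of the specified row.
NOTE: Depending on the implementation, this can work faster than calling Table.put(byte[], byte[], byte[]) multiple times (especially in transactions that alter many columns of one row).
public void delete(Delete delete)
Perform a delete on the data table.
public void delete(byte[] row)
Description copied from interface: Table
Deletes all columns of the specified row.
NOTE: Depending on the implementation of this interface and use-case, calling this method can be less efficient than calling the same method with columns as parameters because it can require a round trip to the persistent store.
public void delete(byte[] row, byte[] column)
Description copied from interface: Table
Deletes specified column of the specified row.
public void delete(byte[] row, byte[][] columns)
Description copied from interface: Table
Deletes specified columns of the specified row.
NOTE: Depending on the implementation, this can work faster than calling Table.delete(byte[], byte[]) multiple times (especially in transactions that delete many columns of the same row).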
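public boolean compareAndSwap(byte[] row, byte[] column, byte[] expected, byte[] newValue)
Perform a swap operation by primary key.
Specified by:
compareAndSwap in interface Table
Parameters:
row - row to modify
column - column to modify
expected - expected value before change
newValue - value to set
public long incrementAndGet(byte[] row, byte[] column, long amount)
Increments (atomically) the specified row and column by the specified amount, and returns the new value.
Specified by:
incrementAndGet in interface Table
Parameters:
row - row whose value to increment
column - column to increment
amount - amount to increment by
See Also:
Table.incrementAndGet(byte[], byte[], long)
public Row incrementAndGet(byte[] row, byte[][] columns, long[] amounts)
Increments (atomically) the specified row and columns by the specified amounts, and returns the new values.
Specified by:
incrementAndGet in interface Table
Parameters:
row - row whose values to increment
columns - columns to increment
amounts - amounts to increment columns by (in the same order as the columns)
Returns:
Row with a subset of changed columns
See Also:
Table.incrementAndGet(byte[], byte[][], long[])
public Row incrementAndGet(Increment increment)
Increments (atomically) the specified row and columns by the specified amounts, and returns the new values.
Specified by:
incrementAndGet in interface Table
Parameters:
increment - defines changes
Returns:
Row with a subset of changed columns
See Also:
Table.incrementAndGet(Increment)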
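public void increment(byte[] row, byte[] column, long amount)
Increments (atomically) the specified row and column by the specified amount, without returning the new value. Performing this operation on an indexed column throws an IllegalArgumentException.
Specified by:
increment in interface Table
Parameters:
row - row whose values to increment
column - column to increment
amount - amount to increment by
See Also:
Table.increment(byte[], byte[], long)
public void increment(byte[] row, byte[][] columns, long[] amounts)
Increments (atomically) the specified row and columns by the specified amounts, without returning the new values. Performing this operation on an indexed column throws an IllegalArgumentException.
Specified by:
increment in interface Table
Parameters:
row - row whose values to increment
columns - columns to increment
amounts - amounts to increment columns by (same order as columns)
See Also:
Table.increment(byte[], byte[][], long[])
public void increment(Increment increment)
Increments (atomically) the specified row and columns by the specified amounts, without returning the new values. Performing this operation on an indexed column throws an IllegalArgumentException.
Specified by:
increment in interface Table
Parameters:
increment - the row and column increment amounts
See Also:
Table.increment(Increment)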
public Scanner scan(@Nullable byte[] startRow, @Nullable byte[] stopRow)
Description copied from interface: Table
Scans table.
public List<Split> getSplits(int numSplits, byte[] start, byte[] stop)
Description copied from interface: Table
Returns splits for a range of keys in the table.
Specified by:
getSplits in interface Table
Parameters:
numSplits - Desired number of splits. If greater than zero, at most this many splits will be returned. If less than or equal to zero, any number of splits can be returned.
start - if non-null, the returned splits will only cover keys that are greater than or equal to start
stop - if non-null, the returned splits will only cover keys that are less than stop
Returns:
list of Split
public Type getRecordType()
Description copied from interface: RecordScannable
The type of records that the dataset exposes as a schema.
Specified by:
getRecordType in interface RecordScannable<StructuredRecord>
getRecordType in interface RecordWritable<StructuredRecord>
public void write(StructuredRecord structuredRecord) throws IOException
Description copied from interface: RecordWritable
Writes the record into a dataset.
Specified by:
write in interface RecordWritable<StructuredRecord>
Parameters:
structuredRecord - record to write into the dataset.
Throws:
IOException - when the record could not be written to the dataset.
public List<Split> getSplits()
Description copied from interface: BatchReadable
Returns all splits of the dataset. For feeding the whole dataset into a batch job.
Specified by:
getSplits in interface BatchReadable<byte[],Row>
getSplits in interface RecordScannable<StructuredRecord>
Returns:
A list of Splits.
public RecordScanner<StructuredRecord> createSplitRecordScanner(Split split)
Description copied from interface: RecordScannable
Creates a reader for the split of a dataset.
Specified by:
createSplitRecordScanner in interface RecordScannable<StructuredRecord>
Parameters:
split - The split to create a reader for.
Returns:
A RecordScanner.
public SplitReader<byte[],Row> createSplitReader(Split split)
Description copied from interface: BatchReadable
Creates a reader for the split of a dataset.
Specified by:
createSplitReader in interface BatchReadable<byte[],Row>
Parameters:
split - The split to create a reader for.
Returns:
A SplitReader.
public void write(byte[] bytes, Put put)
Description copied from interface: Table
Writes the {key, value} record into a dataset.
Specified by:
write in interface BatchWritable<byte[],Put>
write in interface Table
Parameters:
bytes - always ignored, and it is safe to write null as the key.
put - a Put for one row of the table.
Copyright © 2023 Cask Data, Inc. Licensed under the Apache License, Version 2.0.