Interface KeyValueService

    • Method Detail

      • getDelegates

        Collection<? extends KeyValueService> getDelegates()
        Gets all key value services this key value service delegates to directly.

        This can be used to decompose a complex key value service using table splits, tiers, or other delegating operations into its subcomponents.
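
        The decomposition described above can be sketched as a depth-first walk over the delegate tree. This is a minimal sketch, not AtlasDB code: the Service interface, its name() method, and the service(...) factory are hypothetical stand-ins that model only the delegate relationship, and a service with no delegates is assumed to be a leaf.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical stand-in types; only the shape of getDelegates() mirrors the interface above.
final class DelegateTreeSketch {
    interface Service {
        String name();
        Collection<? extends Service> getDelegates();
    }

    // Builds a service that delegates directly to the given children.
    static Service service(String name, Service... delegates) {
        List<Service> children = Arrays.asList(delegates);
        return new Service() {
            @Override public String name() { return name; }
            @Override public Collection<? extends Service> getDelegates() { return children; }
        };
    }

    // Depth-first walk; a service with no delegates is treated as a leaf.
    static List<String> leafNames(Service root) {
        List<String> leaves = new ArrayList<>();
        collect(root, leaves);
        return leaves;
    }

    private static void collect(Service node, List<String> leaves) {
        if (node.getDelegates().isEmpty()) {
            leaves.add(node.name());
            return;
        }
        for (Service child : node.getDelegates()) {
            collect(child, leaves);
        }
    }
}
```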

      • getRows

        @Idempotent
        Map<Cell,​Value> getRows​(TableReference tableRef,
                                      Iterable<byte[]> rows,
                                      ColumnSelection columnSelection,
                                      long timestamp)
        Gets values from the key-value store.
        Parameters:
        tableRef - the name of the table to retrieve values from.
        rows - set containing the rows to retrieve values for.
        columnSelection - specifies the set of columns to fetch.
        timestamp - specifies the maximum timestamp (exclusive) at which to retrieve each row's value.
        Returns:
        map of retrieved values. Values which do not exist (either because they were deleted or never created in the first place) are simply not returned.
        Throws:
        IllegalArgumentException - if any of the requests were invalid (e.g., attempting to retrieve values from a non-existent table).
      • getRowsColumnRange

        @Idempotent
        Map<byte[],​RowColumnRangeIterator> getRowsColumnRange​(TableReference tableRef,
                                                                    Iterable<byte[]> rows,
                                                                    BatchColumnRangeSelection batchColumnRangeSelection,
                                                                    long timestamp)
        Gets values from the key-value store for the specified rows and column range as separate iterators for each row.
        Parameters:
        tableRef - the name of the table to retrieve values from.
        rows - set containing the rows to retrieve values for. Behavior is undefined if rows contains duplicates (as defined by Arrays.equals(byte[], byte[])).
        batchColumnRangeSelection - specifies the column range and the per-row batchSize to fetch.
        timestamp - specifies the maximum timestamp (exclusive) at which to retrieve each row's value.
        Returns:
        map of row names to RowColumnRangeIterator. Each RowColumnRangeIterator can iterate over the values that are spanned by the batchColumnRangeSelection in increasing order by column name.
        Throws:
        IllegalArgumentException - if rows contains duplicates.
      • getRowsColumnRange

        @Idempotent
        RowColumnRangeIterator getRowsColumnRange​(TableReference tableRef,
                                                  Iterable<byte[]> rows,
                                                  ColumnRangeSelection columnRangeSelection,
                                                  int cellBatchHint,
                                                  long timestamp)
        Gets values from the key-value store for the specified rows and column range as a single iterator. This method should be at least as performant as getRowsColumnRange(TableReference, Iterable, BatchColumnRangeSelection, long), and may be more performant in some cases.
        Parameters:
        tableRef - the name of the table to retrieve values from.
        rows - set containing the rows to retrieve values for. Behavior is undefined if rows contains duplicates (as defined by Arrays.equals(byte[], byte[])).
        columnRangeSelection - specifies the column range to fetch.
        cellBatchHint - specifies the batch size for fetching the values.
        timestamp - specifies the maximum timestamp (exclusive) at which to retrieve each row's value.
        Returns:
        a RowColumnRangeIterator that can iterate over all the retrieved values. Results for different rows are in the same order as they are provided in rows. All columns for a given row are adjacent and sorted by increasing column name.
        Throws:
        IllegalArgumentException - if rows contains duplicates.
      • get

        @Idempotent
        Map<Cell,​Value> get​(TableReference tableRef,
                                  Map<Cell,​Long> timestampByCell)
        Gets values from the key-value store.
        Parameters:
        tableRef - the name of the table to retrieve values from.
        timestampByCell - specifies, for each cell, the maximum timestamp (exclusive) at which to retrieve that cell's value.
        Returns:
        map of retrieved values. Values which do not exist (either because they were deleted or never created in the first place) are simply not returned.
        Throws:
        IllegalArgumentException - if any of the requests were invalid (e.g., attempting to retrieve values from a non-existent table).
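
        The exclusive-timestamp read described above can be illustrated with an in-memory model. This is a sketch under stated assumptions, not the real implementation: cells and values are plain strings, and the class and its write method are hypothetical. For each requested cell it returns the newest version written strictly before that cell's timestamp and omits cells that have no such version.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory stand-in; cells and values are plain strings here.
final class ExclusiveTimestampReadSketch {
    private final Map<String, TreeMap<Long, String>> versionsByCell = new HashMap<>();

    void write(String cell, long timestamp, String value) {
        versionsByCell.computeIfAbsent(cell, unused -> new TreeMap<>()).put(timestamp, value);
    }

    // For each requested cell, returns the newest value written strictly
    // before that cell's timestamp. Cells with no such value are omitted.
    Map<String, String> get(Map<String, Long> timestampByCell) {
        Map<String, String> result = new HashMap<>();
        timestampByCell.forEach((cell, maxTimestampExclusive) -> {
            TreeMap<Long, String> versions = versionsByCell.get(cell);
            if (versions == null) {
                return; // never written: not returned
            }
            // lowerEntry gives the greatest key strictly less than the bound.
            Map.Entry<Long, String> newest = versions.lowerEntry(maxTimestampExclusive);
            if (newest != null) {
                result.put(cell, newest.getValue());
            }
        });
        return result;
    }
}
```

        With versions written at timestamps 5 and 10, a read at timestamp 10 sees the version from 5, and a read at timestamp 5 sees nothing.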
      • getLatestTimestamps

        @Idempotent
        Map<Cell,​Long> getLatestTimestamps​(TableReference tableRef,
                                                 Map<Cell,​Long> timestampByCell)
        Gets timestamp values from the key-value store.
        Parameters:
        tableRef - the name of the table to retrieve values from.
        timestampByCell - map containing the cells to retrieve timestamps for. The map specifies, for each key, the maximum timestamp (exclusive) at which to retrieve that key's value.
        Returns:
        map of retrieved timestamps. Cells which do not exist (either because they were deleted or never created in the first place) are simply not returned.
        Throws:
        IllegalArgumentException - if any of the requests were invalid (e.g., attempting to retrieve values from a non-existent table).
      • put

        @Idempotent
        void put​(TableReference tableRef,
                 Map<Cell,​byte[]> values,
                 long timestamp)
          throws KeyAlreadyExistsException
        Puts values into the key-value store. This call does not guarantee atomicity across cells. On failure, it is possible that some of the requests will have succeeded (without having been rolled back). Similarly, concurrent batched requests may interleave.

        If the key-value store supports durability, this call guarantees that the requests have successfully been written to disk before returning.

        Putting a null value is the same as putting the empty byte[]. If you want to delete a value try delete(TableReference, Multimap).

        May throw KeyAlreadyExistsException if storing a different value to an existing key, but this is not guaranteed even if the key exists; see putUnlessExists(TableReference, Map).

        Must not throw KeyAlreadyExistsException when overwriting a cell with the original value (idempotent).

        Parameters:
        tableRef - the name of the table to put values into.
        values - map containing the key-value entries to put.
        timestamp - must be non-negative and not equal to Long.MAX_VALUE
        Throws:
        KeyAlreadyExistsException
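
        The retry-safety rules above (same value accepted, different value possibly rejected, null treated as empty) can be sketched with a write-once in-memory store. Everything here is an illustrative assumption: the class, the string cell keys, and the local KeyExistsException stand in for the real types, and a real implementation is allowed not to throw at all on a conflicting value.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical write-once store; the exception type is a local stand-in.
final class IdempotentPutSketch {
    static final class KeyExistsException extends RuntimeException {
        KeyExistsException(String message) {
            super(message);
        }
    }

    // Key is "cell@timestamp"; values are raw bytes.
    private final Map<String, byte[]> store = new HashMap<>();

    void put(String cell, byte[] value, long timestamp) {
        if (timestamp < 0 || timestamp == Long.MAX_VALUE) {
            throw new IllegalArgumentException("timestamp out of range: " + timestamp);
        }
        // A null value is stored as the empty byte array.
        byte[] normalized = value == null ? new byte[0] : value.clone();
        String key = cell + "@" + timestamp;
        byte[] existing = store.putIfAbsent(key, normalized);
        if (existing != null && !Arrays.equals(existing, normalized)) {
            throw new KeyExistsException("different value already stored at " + key);
        }
        // Same value again: accepted silently, so a retry is safe.
    }
}
```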
      • putWithTimestamps

        @NonIdempotent
        @Idempotent
        void putWithTimestamps​(TableReference tableRef,
                               com.google.common.collect.Multimap<Cell,​Value> cellValues)
                        throws KeyAlreadyExistsException
        Puts values into the key-value store with individually specified timestamps. This call does not guarantee atomicity across cells. On failure, it is possible that some of the requests will have succeeded (without having been rolled back). Similarly, concurrent batched requests may interleave.

        If the key-value store supports durability, this call guarantees that the requests have successfully been written to disk before returning.

        This method may be non-idempotent. On some write-once implementations retrying this call may result in failure. The way around this is to delete and retry.

        Putting a null value is the same as putting the empty byte[]. If you want to delete a value try delete(TableReference, Multimap).

        May throw KeyAlreadyExistsException if storing a different value to an existing key, but this is not guaranteed even if the key exists; see putUnlessExists(TableReference, Map).

        Must not throw KeyAlreadyExistsException when overwriting a cell with the original value (idempotent).

        Parameters:
        tableRef - the name of the table to put values into.
        cellValues - map containing the key-value entries to put with non-negative timestamps less than Long.MAX_VALUE.
        Throws:
        KeyAlreadyExistsException
      • putUnlessExists

        void putUnlessExists​(TableReference tableRef,
                             Map<Cell,​byte[]> values)
                      throws KeyAlreadyExistsException
        Puts values into the key-value store. This call does not guarantee atomicity across cells. On failure, it is possible that some of the requests will have succeeded (without having been rolled back). Similarly, concurrent batched requests may interleave. However, concurrent writes to the same Cell will not both report success. One of them will throw KeyAlreadyExistsException.

        A single Cell will only ever take on one value.

        If the call completes successfully then you know that your value was written and no other value was written first. If a KeyAlreadyExistsException is thrown it may be because the underlying call did a retry and your value was actually put successfully. It is recommended that you check the stored value to account for this case.

        Retry should be done by the underlying implementation to ensure that other exceptions besides KeyAlreadyExistsException are not thrown spuriously.

        Parameters:
        tableRef - the name of the table to put values into.
        values - map containing the key-value entries to put.
        Throws:
        KeyAlreadyExistsException - If you are putting a Cell with the same timestamp as one that already exists.
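
        The recommendation to check the stored value after a KeyAlreadyExistsException can be sketched as follows. This is a minimal illustration with hypothetical names (PutUnlessExistsSketch, putAndConfirm, a local KeyExistsException) built on a ConcurrentHashMap; it is not the real implementation. On a conflict, the caller reads the cell back and treats the write as successful only if the stored bytes equal its own.

```java
import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical in-memory stand-in; cells are plain strings.
final class PutUnlessExistsSketch {
    static final class KeyExistsException extends RuntimeException {}

    private final ConcurrentMap<String, byte[]> store = new ConcurrentHashMap<>();

    // Atomic per cell: of several racing writers, exactly one succeeds.
    void putUnlessExists(String cell, byte[] value) {
        if (store.putIfAbsent(cell, value.clone()) != null) {
            throw new KeyExistsException();
        }
    }

    // Caller-side handling: on a conflict, read back to see whether our own
    // value is the one that was stored (for example after an internal retry).
    boolean putAndConfirm(String cell, byte[] value) {
        try {
            putUnlessExists(cell, value);
            return true;
        } catch (KeyExistsException e) {
            return Arrays.equals(store.get(cell), value);
        }
    }
}
```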
      • setOnce

        void setOnce​(TableReference tableRef,
                     Map<Cell,​byte[]> values)
        Puts a value into the key-value store explicitly overwriting existing entries written by putUnlessExists(TableReference, Map) and checkAndSet(CheckAndSetRequest). Once this method has been called for a cell, calling it again for the same cell and different value, or attempting a CAS on the cell is undefined.

        WARNING: Use this method if and only if you wish to set a value in an atomic table. Otherwise, use put(TableReference, Map, long).

        Parameters:
        tableRef - the name of the table to put values into.
        values - map containing the key-value entries to put.
      • supportsCheckAndSet

        default boolean supportsCheckAndSet()
        Check whether CAS is supported. This check can go away when JDBC KVS is deleted.
        Returns:
        true iff checkAndSet is supported (for all delegates/tables, if applicable)
      • checkAndSet

        void checkAndSet​(CheckAndSetRequest checkAndSetRequest)
                  throws CheckAndSetException
        Performs a check-and-set into the key-value store. Please see CheckAndSetRequest for information about how to create this request.

        Note that this call does not guarantee atomicity across Cells. If you attempt to achieve this guarantee by performing multiple checkAndSet calls in a single transaction, and one of the calls fails, then you will need to manually roll back successful checkAndSet operations, as data will have been overwritten. It is therefore not recommended to attempt to perform checkAndSet operations alongside other operations in a single transaction.

        If the call completes successfully, then you know that the Cell initially had the value you expected, although the Cell could have taken on another value and then been written back to the expected value since said value was obtained. If a CheckAndSetException is thrown, it is likely that the value stored was not as you expected. In this case, you may want to check the stored value and determine why it was different from the expected value.

        Parameters:
        checkAndSetRequest - the request, including table, cell, old value and new value.
        Throws:
        CheckAndSetException - if the stored value for the cell was not as expected.
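
        The success and failure cases above, including the case where a cell changes and is later written back to the expected value, can be sketched with a ConcurrentHashMap. This is an illustration only: the class, the string cells and values, and the local CasFailedException are hypothetical, and treating a null expected value as "the cell must be absent" is an assumption of this sketch rather than a statement about CheckAndSetRequest.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical in-memory stand-in; cells and values are plain strings.
final class CheckAndSetSketch {
    static final class CasFailedException extends RuntimeException {
        CasFailedException(String message) {
            super(message);
        }
    }

    private final ConcurrentMap<String, String> store = new ConcurrentHashMap<>();

    // Sketch assumption: expected == null means "the cell must currently be absent".
    void checkAndSet(String cell, String expected, String update) {
        boolean swapped = expected == null
                ? store.putIfAbsent(cell, update) == null
                : store.replace(cell, expected, update);
        if (!swapped) {
            throw new CasFailedException(
                    "expected " + expected + " but found " + store.get(cell));
        }
    }

    String read(String cell) {
        return store.get(cell);
    }
}
```

        A sequence such as a -> b -> a followed by a check against "a" still succeeds, which is why a successful call only shows that the cell held the expected value at the moment of the swap.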
      • delete

        @Idempotent
        void delete​(TableReference tableRef,
                    com.google.common.collect.Multimap<Cell,​Long> keys)
        Deletes values from the key-value store.

        This call does not guarantee atomicity for deletes across (Cell, ts) pairs. However, implementations MUST delete timestamps in increasing order for each Cell. This means that if there is a request to delete (c, 1) and (c, 2), the system will never be in a state where (c, 2) was successfully deleted but (c, 1) still remains. If there is a failure, some of the deletes may have succeeded. Similarly, concurrent batched requests may interleave.

        If the key-value store supports durability, this call guarantees that the requests have successfully been written to disk before returning.

        If a key value store supports garbage collection, then a call to delete should mean the value will not be read in the future. If GC isn't supported, then delete may be implemented as a best-effort attempt to delete the values.

        Some systems may require more nodes to be up to ensure that a delete is successful. If this is the case then this method may throw if the delete can't be completed on all nodes.

        Parameters:
        tableRef - the name of the table to delete values from.
        keys - map containing the keys to delete values for; the map should specify, for each cell, the timestamps of the values to delete.
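
        The ordering requirement above (timestamps deleted in increasing order within each cell) can be sketched as a function that computes the order in which deletes would be issued. The names are hypothetical and a Map of lists stands in for the Multimap; this shows one way to satisfy the ordering, not how any particular store does it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical helper; cells are plain strings and a Map of lists replaces the Multimap.
final class OrderedDeleteSketch {
    // Returns the (cell, timestamp) deletes in the order they would be issued:
    // ascending timestamp within each cell, so an older version never outlives a newer one.
    static List<String> deletionOrder(Map<String, List<Long>> timestampsByCell) {
        List<String> order = new ArrayList<>();
        timestampsByCell.forEach((cell, timestamps) -> {
            List<Long> ascending = new ArrayList<>(timestamps);
            Collections.sort(ascending);
            for (long ts : ascending) {
                order.add(cell + "@" + ts);
            }
        });
        return order;
    }
}
```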
      • deleteRange

        @Idempotent
        void deleteRange​(TableReference tableRef,
                         RangeRequest range)
        Deletes values in a range from the key-value store. Does not guarantee an atomic delete throughout the entire range. Currently does not allow a column selection to mean only delete certain columns in a range. Some systems may require more nodes to be up to ensure that a delete is successful. If this is the case then this method may throw if the delete can't be completed on all nodes.
        Parameters:
        tableRef - the name of the table to delete values from.
        range - the range to delete
      • deleteRows

        @Idempotent
        void deleteRows​(TableReference tableRef,
                        Iterable<byte[]> rows)
        Deletes multiple complete rows from the key-value store. Does not guarantee atomicity in any way (deletes may be partial within *any* of the rows provided, and there is no guarantee of any correlation or lack thereof between success of the deletes for each of the rows provided). Some systems may require more nodes to be up to ensure that a delete is successful. If this is the case then this method may throw if the delete can't be completed on all nodes. Please be aware that if it does throw, some deletes may have been applied on some nodes. This method MAY require linearly many calls to the database in the number of rows, so should be used with caution.
        Parameters:
        tableRef - the name of the table to delete values from.
        rows - rows to delete
      • deleteAllTimestamps

        @Idempotent
        void deleteAllTimestamps​(TableReference tableRef,
                                 Map<Cell,​TimestampRangeDelete> deletes)
                          throws InsufficientConsistencyException
        For each cell, deletes all timestamps prior to the associated maximum timestamp. If this operation fails, it is acceptable for this method to leave an inconsistent state. However, implementations of this method must guarantee that, for each cell, if a value at the associated timestamp is inconsistently deleted, then all other values of that cell in the relevant range must have already been consistently deleted.
        Parameters:
        tableRef - the name of the table to delete the timestamps in.
        deletes - cells to be deleted, and the ranges of timestamps to delete for each cell
        Throws:
        InsufficientConsistencyException
      • truncateTable

        @Idempotent
        void truncateTable​(TableReference tableRef)
                    throws InsufficientConsistencyException
        Truncate a table in the key-value store.

        This is preferred to dropping and re-adding a table, as live schema changes can be a complicated topic for distributed databases.

        Parameters:
        tableRef - the name of the table to truncate.
        Throws:
        InsufficientConsistencyException - if not all hosts respond successfully
        RuntimeException - or a subclass of RuntimeException if the table does not exist
      • getRange

        @Idempotent
        com.palantir.common.base.ClosableIterator<RowResult<Value>> getRange​(TableReference tableRef,
                                                                             RangeRequest rangeRequest,
                                                                             long timestamp)
        For each row in the specified range, returns the most recent version strictly before timestamp. Remember to close any ClosableIterators you get in a finally block.
        Parameters:
        rangeRequest - the range to load.
        timestamp - specifies the maximum timestamp (exclusive) at which to retrieve each row's value.
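
        The advice to close the returned iterator can be sketched with try-with-resources, which behaves like closing in a finally block. This assumes the iterator type can be used as an AutoCloseable resource; ClosingIterator and ListBackedIterator below are hypothetical stand-ins, not the real ClosableIterator. If the real type cannot be used this way, an explicit finally block achieves the same result.

```java
import java.util.Iterator;
import java.util.List;

final class CloseRangeIteratorSketch {
    // Local stand-in for a closable iterator; assumed usable in try-with-resources.
    interface ClosingIterator<T> extends Iterator<T>, AutoCloseable {
        @Override
        void close(); // narrowed so that callers need no catch block
    }

    static final class ListBackedIterator implements ClosingIterator<String> {
        private final Iterator<String> delegate;
        boolean closed = false;

        ListBackedIterator(List<String> rows) {
            this.delegate = rows.iterator();
        }

        @Override public boolean hasNext() { return delegate.hasNext(); }
        @Override public String next() { return delegate.next(); }
        @Override public void close() { closed = true; }
    }

    // Counts rows and releases the iterator even if iteration throws.
    static int countRows(ListBackedIterator rows) {
        int count = 0;
        try (ListBackedIterator it = rows) {
            while (it.hasNext()) {
                it.next();
                count++;
            }
        }
        return count;
    }
}
```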
      • getCandidateCellsForSweeping

        com.palantir.common.base.ClosableIterator<List<CandidateCellForSweeping>> getCandidateCellsForSweeping​(TableReference tableRef,
                                                                                                               CandidateCellForSweepingRequest request)
        For a given range of rows, returns all candidate cells for sweeping (and their timestamps).

        A candidate cell is a cell that has at least one timestamp that is less than request.sweepTimestamp() and is not in the set specified by request.timestampsToIgnore().

        This method will scan the semi-open range of rows from the start row specified in the request to the end of the table. If the given start row name is an empty byte array, the whole table will be scanned.

        The returned cells will be lexicographically ordered.

        We return an iterator of lists instead of a "flat" iterator of results so that we preserve the information about batching. The caller can always use Iterators.concat() or similar if this is undesired.
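
        The candidate rule and the flattening of batches described above can be sketched with plain JDK collections. The class and method names are hypothetical, timestamps are modelled as a list per cell, and the flatten helper is a JDK-only substitute for Iterators.concat(); it eagerly copies the batches into one list rather than returning a lazy view.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical helpers illustrating the rules stated in the text.
final class SweepCandidateSketch {
    // A cell is a candidate if at least one of its timestamps is below
    // sweepTimestamp and is not in the ignore set.
    static boolean isCandidate(List<Long> timestamps, long sweepTimestamp, Set<Long> timestampsToIgnore) {
        return timestamps.stream()
                .anyMatch(ts -> ts < sweepTimestamp && !timestampsToIgnore.contains(ts));
    }

    // Flattens an iterator of batches into one list, dropping batch boundaries.
    static <T> List<T> flatten(Iterator<List<T>> batches) {
        List<T> flat = new ArrayList<>();
        while (batches.hasNext()) {
            flat.addAll(batches.next());
        }
        return flat;
    }
}
```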

      • getFirstBatchForRanges

        @Idempotent
        Map<RangeRequest,​com.palantir.util.paging.TokenBackedBasicResultsPage<RowResult<Value>,​byte[]>> getFirstBatchForRanges​(TableReference tableRef,
                                                                                                                                           Iterable<RangeRequest> rangeRequests,
                                                                                                                                           long timestamp)
        For each range passed in the result will have the first page of results for that range.

        The page size for each range is dictated by the parameter RangeRequest.getBatchHint(). If no batch size hint is specified for a range, then it will just get the first row in that range.

        It is possible that the results may be empty if the first cells after the start of the range all have timestamps greater than the requested timestamp. In this case BasicResultsPage.moreResultsAvailable() will return true and the token for the next page will be set.

        It may be possible to get back a result with BasicResultsPage.moreResultsAvailable() set to true when there aren't more left. The next call will return zero results and have moreResultsAvailable set to false.
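
        Because a page may be empty while more results remain, a caller should stop on moreResultsAvailable being false rather than on an empty page. The loop below is a sketch under stated assumptions: Page is a hypothetical stand-in for the results page, tokens are plain ints, and fetchNext is a hypothetical callback for loading the page that follows a token. How subsequent pages are actually fetched is not specified here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

final class PagingSketch {
    // Local stand-in for a results page with a continuation token.
    static final class Page {
        final List<String> results;
        final boolean moreResultsAvailable;
        final int nextToken;

        Page(List<String> results, boolean moreResultsAvailable, int nextToken) {
            this.results = results;
            this.moreResultsAvailable = moreResultsAvailable;
            this.nextToken = nextToken;
        }
    }

    // Drains all pages. An empty page does not end the scan;
    // only moreResultsAvailable == false does.
    static List<String> readAll(Page first, IntFunction<Page> fetchNext) {
        List<String> all = new ArrayList<>();
        Page page = first;
        while (true) {
            all.addAll(page.results);
            if (!page.moreResultsAvailable) {
                return all;
            }
            page = fetchNext.apply(page.nextToken);
        }
    }
}
```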

      • dropTables

        @Idempotent
        void dropTables​(Set<TableReference> tableRefs)
                 throws InsufficientConsistencyException
        Drops many tables in idempotent fashion. If you are dropping many tables at once, use this call as the implementation can be much faster/less error-prone on some KVSs. Also deletes corresponding table metadata. Do not fall into the trap of performing drop & immediate re-create of tables; instead use 'truncate' for this task.
        Throws:
        InsufficientConsistencyException
      • getAllTableNames

        @Idempotent
        Set<TableReference> getAllTableNames()
        Return the list of tables stored in this key value service. This will contain system tables (such as the _transaction table), but will not contain the names of any tables used internally by the key value service (a common example is a _metadata table for storing table metadata).
      • getMetadataForTable

        @Idempotent
        byte[] getMetadataForTable​(TableReference tableRef)
        Gets the metadata for a given table. Also useful for checking to see if a table exists.
        Returns:
        a byte array representing the metadata for the table. Array is empty if no table with the given name exists. Consider TableMetadata#BYTES_HYDRATOR for hydrating.
      • getMetadataForTables

        @Idempotent
        Map<TableReference,​byte[]> getMetadataForTables()
        Gets the metadata for all known user-created Atlas tables. Consider not using this if you will be running against an Atlas instance with a large number of tables.
        Returns:
        a Map from TableReference to byte array representing the metadata for the table. Consider TableMetadata#BYTES_HYDRATOR for hydrating.
      • putMetadataForTable

        @Idempotent
        void putMetadataForTable​(TableReference tableRef,
                                 byte[] metadata)
      • putMetadataForTables

        @Idempotent
        void putMetadataForTables​(Map<TableReference,​byte[]> tableRefToMetadata)
      • addGarbageCollectionSentinelValues

        @Idempotent
        void addGarbageCollectionSentinelValues​(TableReference tableRef,
                                                Iterable<Cell> cells)
        Adds a value with timestamp = Value.INVALID_VALUE_TIMESTAMP to each of the given cells. If a value already exists at that timestamp, nothing is written for that cell.
      • getAllTimestamps

        @Idempotent
        com.google.common.collect.Multimap<Cell,​Long> getAllTimestamps​(TableReference tableRef,
                                                                             Set<Cell> cells,
                                                                             long timestamp)
                                                                      throws com.palantir.common.exception.AtlasDbDependencyException
        Gets timestamp values from the key-value store. For each cell, this returns all associated timestamps < given_ts.

        This method has stronger consistency guarantees than regular read requests. This must return all timestamps stored anywhere in the system. An example of where this matters is a system with QUORUM reads and writes. Under normal operations, reads only need to talk to a quorum of hosts. However, this call MUST be implemented by talking to ALL the nodes where a value could be stored.

        Parameters:
        tableRef - the name of the table to retrieve timestamps from.
        cells - set containing cells to retrieve timestamps for.
        timestamp - maximum timestamp to get (exclusive)
        Returns:
        multimap of timestamps by cell
        Throws:
        com.palantir.common.exception.AtlasDbDependencyException
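
        The difference between a quorum read and the all-nodes read required above can be illustrated by modelling each replica as a set of timestamps for one cell. This is a toy model with hypothetical names, not a description of any real backing store: the result is the union over every replica, so a timestamp held by only one node is still returned.

```java
import java.util.List;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical model: each replica holds some of the timestamps written for one cell.
final class AllReplicasReadSketch {
    // Union of all timestamps strictly below the bound, taken across every replica.
    static SortedSet<Long> allTimestamps(List<Set<Long>> replicas, long maxTimestampExclusive) {
        SortedSet<Long> union = new TreeSet<>();
        for (Set<Long> replica : replicas) {
            for (long ts : replica) {
                if (ts < maxTimestampExclusive) {
                    union.add(ts);
                }
            }
        }
        return union;
    }
}
```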
      • compactInternally

        void compactInternally​(TableReference tableRef)
        Does whatever can be done to compact or cleanup a table. Intended to be called after many deletions are performed. This call must be implemented so that it completes synchronously.
      • compactInternally

        default void compactInternally​(TableReference tableRef,
                                       boolean inMaintenanceMode)
        Some compaction operations might block reads and writes. These operations will trigger only if inMaintenanceMode is set to true.
      • isInitialized

        default boolean isInitialized()
        Returns true iff the KeyValueService has been initialized and is ready to use. Note that this check ignores the cluster's availability - use getClusterAvailabilityStatus() if you wish to verify that we can talk to the backing store.
      • performanceIsSensitiveToTombstones

        default boolean performanceIsSensitiveToTombstones()
        Whether or not read performance degrades significantly when many deleted cells are in the requested range. This is used by sweep to determine if it should wait a while between runs after deleting a large number of cells.
      • getRowKeysInRange

        @Deprecated
        List<byte[]> getRowKeysInRange​(TableReference tableRef,
                                       byte[] startRow,
                                       byte[] endRow,
                                       int maxResults)
        Deprecated.
        If you wish to use this method, contact the AtlasDB team for support.
        Returns a sorted list of row keys in the specified range. This method is not guaranteed to be implemented for all implementations of KeyValueService. It may be changed or removed at any time without warning.
        Parameters:
        tableRef - table for which the request is made.
        startRow - inclusive start of the row key range. Use empty byte array for unbounded.
        endRow - inclusive end of the row key range. Use empty byte array for unbounded.
        maxResults - the request only returns the first maxResults rows in range.