Class SingleColumnHistogramWriter


  • public class SingleColumnHistogramWriter
    extends HistogramWriter
    Histogram writer for tables with a histogram on a single column (in practice, all standard Yamcs tables).
    • Constructor Detail

      • SingleColumnHistogramWriter

        public SingleColumnHistogramWriter(RdbTable table,
                                           String histoColumn)
    • Method Detail

      • startQueueing

        public CompletableFuture<org.rocksdb.Snapshot> startQueueing(String dbPartition)
                                                              throws IOException
        Returns a completable future that completes with a snapshot. From the moment the snapshot is taken, histogram data is queued, so that the snapshot plus the queued histogram data accurately represents the state of the table.

        The queue stores only the histogram data; the data itself (the table records) is written to the database by the table writer (we definitely do not want to block that!).

        The reason we don't create the snapshot directly is to avoid a race condition with a fast writer that may have already written the data and is just waiting to add the histogram. Creating the snapshot in the writer thread avoids the data being counted twice.
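
        The handoff described above can be sketched as follows. This is a minimal illustration for the single-writer case, not the Yamcs implementation: the class and method names are hypothetical, a plain list stands in for the RocksDB table, and a copy of that list stands in for the snapshot.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch: the writer thread, not the rebuilder, takes the snapshot.
final class SnapshotHandoffSketch {
    private final List<String> table = new ArrayList<>(); // stand-in for the table
    private final List<String> queue = new ArrayList<>(); // queued histogram updates
    private CompletableFuture<List<String>> pending;      // non-null while a rebuild waits
    private boolean queueing;

    /** Called by the rebuilder: ask the next writer to take a snapshot. */
    synchronized CompletableFuture<List<String>> startQueueing() {
        pending = new CompletableFuture<>();
        return pending;
    }

    /** Called by the writer: write the record, then handle its histogram update. */
    synchronized void write(String record) {
        table.add(record);
        if (pending != null) {
            // The snapshot is taken after this writer's own write, so the record
            // is already part of the snapshot and must not be queued as well.
            queueing = true;
            pending.complete(new ArrayList<>(table));
            pending = null;
        } else if (queueing) {
            queue.add(record);
        }
        // Otherwise the histogram would be updated directly (omitted here).
    }

    synchronized List<String> queued() {
        return new ArrayList<>(queue);
    }
}
```

        In this sketch the record that triggers the snapshot ends up in the snapshot only, and every later record ends up in the queue only.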

        Unfortunately, this is still not 100% safe if there are two threads writing to the table:

         t1 thread 0: start histogram rebuild, wait to get a snapshot
         t2 thread 1: write a record to table
         t3 thread 2: write a record to table
         t4 thread 1: take snapshot and enable queueing
         t5 thread 2: add the record to the histogram queue
         t6 thread 0: rebuild the histograms based on the snapshot
         t7 thread 0: stop queueing, add the queued data to the histograms. 
          The data added in the queue at step t5 will be counted twice because it was already part of the snapshot.
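
        The interleaving above can be replayed deterministically in a single thread. The sketch below is only an illustration under assumed names (`DoubleCountReplay`, records `r1` and `r2`); lists stand in for the table, the snapshot and the queue.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical single-threaded replay of steps t1..t7.
final class DoubleCountReplay {
    static Map<String, Integer> replay() {
        List<String> table = new ArrayList<>();
        List<String> queue = new ArrayList<>();
        boolean queueing = false;

        // t1: thread 0 requests a snapshot (nothing to do in this replay)
        table.add("r1");                                  // t2: thread 1 writes a record
        table.add("r2");                                  // t3: thread 2 writes a record
        List<String> snapshot = new ArrayList<>(table);   // t4: thread 1 takes the snapshot
        queueing = true;                                  //     and enables queueing
        if (queueing) {
            queue.add("r2");                              // t5: thread 2 queues its record
        }

        Map<String, Integer> histogram = new HashMap<>();
        for (String r : snapshot) {                       // t6: rebuild from the snapshot
            histogram.merge(r, 1, Integer::sum);
        }
        queueing = false;                                 // t7: stop queueing and
        for (String r : queue) {                          //     apply the queued data
            histogram.merge(r, 1, Integer::sum);
        }
        return histogram;
    }
}
```

        Under these assumptions `r1` is counted once and `r2` is counted twice, because `r2` is present both in the snapshot and in the queue.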
         

        If there is no table writer, the rebuilder would wait forever for the snapshot; to avoid this, we terminate the future after a few milliseconds. This too can introduce a race condition.
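
        One way to bound such a wait is sketched below. It is an assumption about how a timeout could be handled, not a description of the Yamcs code: the helper `awaitOrTakeLocally` and its parameters are hypothetical, and the fallback here is for the waiting thread to take the snapshot itself.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical helper: wait briefly for a writer, then fall back to a local snapshot.
final class SnapshotTimeoutSketch {
    static <T> T awaitOrTakeLocally(CompletableFuture<T> fromWriter,
                                    long timeoutMillis,
                                    Supplier<T> takeLocally) {
        try {
            return fromWriter.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // No writer showed up in time. A writer that has written its record but
            // not yet added the histogram can now be counted twice: this is the race.
            T local = takeLocally.get();
            // complete() returns false if a writer completed the future meanwhile.
            return fromWriter.complete(local) ? local : fromWriter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }
}
```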

        To avoid those race conditions, we would need RocksDB to provide the sequence number of each write, so that it could be compared with the snapshot sequence number to know whether the data had already been written when the snapshot was taken.
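
        If a per-write sequence number were available, the comparison could look like the sketch below. The `Queued` type and the `notInSnapshot` helper are hypothetical, and plain `long` values stand in for the sequence numbers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: drop queued entries that the snapshot already contains.
final class SequenceFilterSketch {
    /** A queued histogram update tagged with the sequence number of its table write. */
    static final class Queued {
        final String value;
        final long writeSeq;

        Queued(String value, long writeSeq) {
            this.value = value;
            this.writeSeq = writeSeq;
        }
    }

    /** Keeps only the entries written after the snapshot was taken. */
    static List<String> notInSnapshot(List<Queued> queue, long snapshotSeq) {
        List<String> result = new ArrayList<>();
        for (Queued q : queue) {
            if (q.writeSeq > snapshotSeq) {
                result.add(q.value);
            }
        }
        return result;
    }
}
```

        With a snapshot at sequence number 2, an entry written at sequence number 2 would be skipped and an entry written at sequence number 3 would be kept.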

        An alternative would be to synchronise all the writers.

        However, since a histogram rebuild is an infrequent operation and most tables have at most one steady writer (a case that works correctly), the problem is unlikely to appear in practice. In addition, because histograms are statistical in nature, a counter that is off by one is not considered a major problem.

        Specified by:
        startQueueing in class HistogramWriter
        Throws:
        IOException
      • stopQueueing

        public void stopQueueing(String dbPartition)
        Description copied from class: HistogramWriter
        Called from the histogram rebuilder to stop queueing and resume updating the histograms, starting with the queued updates.
        Specified by:
        stopQueueing in class HistogramWriter
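
        The stop-and-drain behaviour can be sketched as follows. This is a simplified illustration with hypothetical names, using an in-memory map as the histogram; it is not the Yamcs implementation.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical sketch: queued updates are applied first, then direct updates resume.
final class StopQueueingSketch {
    private final Map<String, Integer> histogram = new HashMap<>();
    private final Queue<String> queue = new ArrayDeque<>();
    private boolean queueing = true;

    synchronized void addHistogram(String value) {
        if (queueing) {
            queue.add(value);                       // held back during the rebuild
        } else {
            histogram.merge(value, 1, Integer::sum);
        }
    }

    synchronized void stopQueueing() {
        queueing = false;
        String value;
        while ((value = queue.poll()) != null) {    // apply the queued updates in order
            histogram.merge(value, 1, Integer::sum);
        }
    }

    synchronized int count(String value) {
        return histogram.getOrDefault(value, 0);
    }
}
```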