Class SingleColumnHistogramWriter
-
Field Summary
Fields inherited from class org.yamcs.yarch.rocksdb.HistogramWriter
columnWriters, table, tableDefinition, tablespace
-
Constructor Summary
-
Method Summary
Modifier and Type | Method | Description
void | addHistogram(Row row) |
CompletableFuture<org.rocksdb.Snapshot> | startQueueing(String dbPartition) | return a completable future which returns a snapshot after which the histogram data is being queued, such that the snapshot+queued histogram data represents accurately the state of the table.
void | stopQueueing(String dbPartition) | called from the histogram rebuilder to stop queuing and start again updating histograms starting with the ones queued
Methods inherited from class org.yamcs.yarch.rocksdb.HistogramWriter
newWriter
-
Constructor Details
-
SingleColumnHistogramWriter
-
-
Method Details
-
addHistogram
- Specified by:
addHistogram
in class HistogramWriter
- Throws:
IOException
org.rocksdb.RocksDBException
-
startQueueing
Returns a completable future that completes with a snapshot, after which the histogram data is queued, so that the snapshot plus the queued histogram data accurately represents the state of the table.
The queue stores only the histogram data; the data itself (table records) is written to the database by the table writer (we definitely do not want to block that!).
The reason we do not create the snapshot directly is to avoid a race condition with a fast writer that may have already written the data and is just waiting to add the histogram. Creating the snapshot in the writer thread avoids the data being counted twice.
Unfortunately, this is still not 100% safe if there are two threads writing to the table:
t1 thread 0: start histogram rebuild, wait to get a snapshot
t2 thread 1: write a record to table
t3 thread 2: write a record to table
t4 thread 1: take snapshot and enable queueing
t5 thread 2: add the record to the histogram queue
t6 thread 0: rebuild the histograms based on the snapshot
t7 thread 0: stop queueing, add the queued data to the histograms.
The data added in the queue at step t5 will be counted twice because it was already part of the snapshot.
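The timeline above can be replayed deterministically to see where the extra count comes from. The following is only an illustrative sketch, not Yamcs code: the class `DoubleCountReplay`, the in-memory list standing in for the table, and the record names are all hypothetical, and the threads are simulated sequentially in the order t2 to t7.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical single-threaded replay of steps t2..t7; not the Yamcs implementation.
final class DoubleCountReplay {

    /** Returns {number of records in the table, histogram count after the rebuild}. */
    static int[] replay() {
        List<String> table = new ArrayList<>();          // stands in for the RocksDB table
        Queue<String> histoQueue = new ArrayDeque<>();   // queued histogram updates

        // t2, t3: both writer threads put their record into the table.
        table.add("rec-thread1");
        table.add("rec-thread2");

        // t4: thread 1 takes the snapshot (it already sees both records)
        // and enables queueing.
        List<String> snapshot = List.copyOf(table);
        boolean queueing = true;

        // t5: thread 2 only now reaches its histogram update, so the update
        // is queued although its record is already visible in the snapshot.
        if (queueing) {
            histoQueue.add("rec-thread2");
        }

        // t6: thread 0 rebuilds the histogram from the snapshot.
        int histogramCount = snapshot.size();

        // t7: thread 0 stops queueing and applies the queued updates.
        queueing = false;
        histogramCount += histoQueue.size();

        return new int[] { table.size(), histogramCount };
    }

    public static void main(String[] args) {
        int[] result = replay();
        System.out.println("records=" + result[0] + " histogramCount=" + result[1]);
    }
}
```

In this replay the table holds two records while the histogram ends up counting three, which is the off-by-one effect described at step t5.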
If there is no table writer, the rebuilder would wait forever for the snapshot; to avoid this, we terminate the future after a few milliseconds. This too can induce a race condition.
To avoid those race conditions, we would need RocksDB to provide the sequence number of each write, so that we could compare it with the snapshot sequence number and thus know whether the data had already been written.
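One way to bound such a wait with the standard `CompletableFuture` API is sketched below. This is an assumption-laden illustration, not the actual Yamcs code: the class `SnapshotTimeoutSketch`, the method `awaitOrFallback`, and the idea of supplying a fallback value on timeout are hypothetical, and the text above does not say how the future is actually terminated.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical helper; names and fallback strategy are assumptions.
final class SnapshotTimeoutSketch {

    /**
     * Waits up to timeoutMillis for a writer thread to complete the future.
     * If no writer shows up in time, the caller completes it with a fallback.
     */
    static <T> T awaitOrFallback(CompletableFuture<T> fromWriter, long timeoutMillis,
            Supplier<T> fallback) {
        try {
            return fromWriter.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // A writer may still complete the future between the timeout and this
            // call; complete() then has no effect and join() returns the writer's
            // value. This window is one place where a race can creep back in.
            fromWriter.complete(fallback.get());
            return fromWriter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("writer failed", e.getCause());
        }
    }

    public static void main(String[] args) {
        CompletableFuture<String> neverCompleted = new CompletableFuture<>();
        System.out.println(awaitOrFallback(neverCompleted, 20, () -> "fallback-snapshot"));
    }
}
```

The sketch uses `get` with a timeout rather than blocking indefinitely, so a table without any active writer cannot stall the caller.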
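If per-write sequence numbers were available, the comparison could look roughly like the sketch below. Everything here is hypothetical: the class `SequenceFilterSketch`, the `Queued` type, and the `writeSeq` field are invented for illustration, since the point of the paragraph above is precisely that this information is not available to the writer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; assumes each queued update carries the sequence
// number of the table write it belongs to, which is not the case today.
final class SequenceFilterSketch {

    /** A queued histogram update tagged with the sequence number of its table write. */
    static final class Queued {
        final long writeSeq;
        final String value;

        Queued(long writeSeq, String value) {
            this.writeSeq = writeSeq;
            this.value = value;
        }
    }

    /** Keeps only the updates whose table write happened after the snapshot. */
    static List<Queued> notInSnapshot(List<Queued> queue, long snapshotSeq) {
        List<Queued> result = new ArrayList<>();
        for (Queued q : queue) {
            // Writes at or below the snapshot sequence number are already
            // counted by the rebuild, so applying them again would double count.
            if (q.writeSeq > snapshotSeq) {
                result.add(q);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Queued> queue = List.of(new Queued(41, "already in snapshot"),
                new Queued(43, "written after snapshot"));
        for (Queued q : notInSnapshot(queue, 42)) {
            System.out.println(q.writeSeq + " " + q.value);
        }
    }
}
```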
An alternative would be to synchronise all the writers.
However, given that a histogram rebuild is an infrequent operation and most tables will have at most one steady writer (a case that works correctly), the problem is unlikely to appear in practice. In addition, since histograms are statistical in nature, having a counter off by one is not considered a major problem.
- Specified by:
startQueueing
in class HistogramWriter
- Throws:
IOException
-
stopQueueing
Description copied from class: HistogramWriter
Called from the histogram rebuilder to stop queueing and resume updating the histograms, starting with the queued ones.
- Specified by:
stopQueueing
in class HistogramWriter
-