Type parameters:
T - the type of element returned by this cursor

@API(value=EXPERIMENTAL) public class ProbableIntersectionCursor<T> extends Object
This cursor differs from the IntersectionCursor in that it does not require that its children produce results in a compatible order. Just like the IntersectionCursor, this cursor does require a comparison key function for each entry, but it uses this comparison key only to determine whether two results returned by child cursors are equal.
This cursor makes very few guarantees about its results. In particular, just as with the UnorderedUnionCursor, results are returned as they come, so the exact ordering of returned results is determined only at runtime. It can also produce duplicates if the same result appears multiple times in its child cursors. Additionally, in order to support resuming this cursor using a continuation, Bloom filters are used internally to remember which results the cursors have already seen. However, as Bloom filters can report false positives, this cursor may return a result that appears in only a proper subset of the child cursors' result sets. The selectivity of the Bloom filter can be adjusted by setting the expectedResults and falsePositivePercentage parameters at cursor creation time. These parameters are fed through to the underlying Guava BloomFilter initializer.
However, this cursor does make the following guarantees:
This cursor can therefore be used to select a narrow candidate set of "probable" elements of the intersection and then perform a (possibly more expensive) alternative filter to verify each result. For example, if each child corresponds to satisfying some conjunct of an "and" predicate by scanning an index, then one could intersect the results with this cursor and then evaluate the predicate on each returned record as a residual filter.
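The residual-filter step described above can be sketched with plain Java collections. This is only an illustration under stated assumptions: the `ResidualFilterSketch` class, the `Person` type, its `id` field (standing in for the comparison key), and the `adultInParis` predicate are all hypothetical, and the candidate list stands in for whatever the cursor would return.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

final class ResidualFilterSketch {
    /** Hypothetical record type; "id" plays the role of the comparison key. */
    static final class Person {
        final long id;
        final int age;
        final String city;

        Person(long id, int age, String city) {
            this.id = id;
            this.age = age;
            this.city = city;
        }
    }

    /**
     * Re-check every candidate against the full predicate and drop repeats,
     * since a probable intersection may emit false positives and duplicates.
     */
    static List<Person> verify(List<Person> candidates, Predicate<Person> fullPredicate) {
        Set<Long> seenIds = new HashSet<>();
        List<Person> verified = new ArrayList<>();
        for (Person candidate : candidates) {
            // Keep a candidate only if it truly satisfies the predicate and is new.
            if (fullPredicate.test(candidate) && seenIds.add(candidate.id)) {
                verified.add(candidate);
            }
        }
        return verified;
    }

    /** The "and" predicate whose conjuncts two hypothetical index scans would each satisfy. */
    static Predicate<Person> adultInParis() {
        Predicate<Person> ageConjunct = p -> p.age >= 18;
        Predicate<Person> cityConjunct = p -> "Paris".equals(p.city);
        return ageConjunct.and(cityConjunct);
    }
}
```

Evaluating the full predicate on each candidate is what turns a "probable" member of the intersection into a confirmed one; the duplicate check is an optional extra for callers that need distinct results.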
See also: BloomFilter
Nested classes inherited from interface RecordCursor: RecordCursor.NoNextReason
Modifier and Type | Field and Description |
---|---|
static long | DEFAULT_EXPECTED_RESULTS: The default number of results to expect to be read from each child cursor of this cursor. |
static double | DEFAULT_FALSE_POSITIVE_PERCENTAGE: The default acceptable false positive percentage when evaluating whether an element is contained within a given cursor's result set. |
Modifier and Type | Method and Description |
---|---|
boolean | accept(RecordCursorVisitor visitor): Accept a visit from a hierarchical visitor, which implements RecordCursorVisitor. |
void | close() |
static <T> ProbableIntersectionCursor<T> | create(Function<? super T,? extends List<Object>> comparisonKeyFunction, List<Function<byte[],RecordCursor<T>>> cursorFunctions, byte[] continuation, FDBStoreTimer timer): Create a cursor merging the results of two or more cursors. |
static <T> ProbableIntersectionCursor<T> | create(Function<? super T,? extends List<Object>> comparisonKeyFunction, List<Function<byte[],RecordCursor<T>>> cursorFunctions, long expectedResults, double falsePositivePercentage, byte[] continuation, FDBStoreTimer timer): Create a cursor merging the results of two or more cursors. |
byte[] | getContinuation(): Deprecated. |
Executor | getExecutor() |
RecordCursor.NoNextReason | getNoNextReason(): Deprecated. |
FDBStoreTimer | getTimer(): Get the FDBStoreTimer used to instrument events of this cursor. |
U | next(): Deprecated. |
CompletableFuture<Boolean> | onHasNext(): Deprecated. |
CompletableFuture<RecordCursorResult<U>> | onNext(): Asynchronously return the next result from this cursor. |
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface RecordCursor: asIterator, asList, empty, empty, filter, filterAsync, filterAsyncInstrumented, filterAsyncInstrumented, filterInstrumented, filterInstrumented, first, flatMapPipelined, flatMapPipelined, flatMapPipelined, forEach, forEachAsync, forEachResult, forEachResultAsync, fromFuture, fromFuture, fromIterator, fromIterator, fromList, fromList, fromList, fromList, getCount, getNext, hasNext, limitRowsTo, limitTo, map, mapEffect, mapEffect, mapFuture, mapPipelined, orElse, reduce, skip, skipThenLimit
Methods inherited from interface Iterator: forEachRemaining, remove
public static final long DEFAULT_EXPECTED_RESULTS
public static final double DEFAULT_FALSE_POSITIVE_PERCENTAGE
@Nonnull public static <T> ProbableIntersectionCursor<T> create(@Nonnull Function<? super T,? extends List<Object>> comparisonKeyFunction, @Nonnull List<Function<byte[],RecordCursor<T>>> cursorFunctions, @Nullable byte[] continuation, @Nullable FDBStoreTimer timer)
Create a cursor merging the results of two or more cursors. This overload uses the default values for expectedResults and falsePositivePercentage.
Type parameters:
T - the type of elements returned by this cursor
Parameters:
comparisonKeyFunction - the function evaluated to compare elements from different cursors
cursorFunctions - a list of functions to produce RecordCursors from a continuation
continuation - any continuation from a previous scan
timer - the timer used to instrument events
See also: create(Function, List, long, double, byte[], FDBStoreTimer)
@Nonnull public static <T> ProbableIntersectionCursor<T> create(@Nonnull Function<? super T,? extends List<Object>> comparisonKeyFunction, @Nonnull List<Function<byte[],RecordCursor<T>>> cursorFunctions, long expectedResults, double falsePositivePercentage, @Nullable byte[] continuation, @Nullable FDBStoreTimer timer)
Create a cursor merging the results of two or more cursors. Unlike the IntersectionCursor, this does not require that results be returned in the same order as their values from the comparison key function. However, it does require that any two elements that evaluate to the same comparison key be equal.
Every result from this cursor is guaranteed to be in at least one child, but some results may appear in only a proper subset of the given cursors. This is because a probabilistic data structure is used internally to allow the cursor to be resumed across continuation boundaries. The caller can adjust the memory usage as well as the false positive rate by changing the values of the expectedResults and falsePositivePercentage parameters. Note that those parameters only matter if the continuation is null. When the continuation is non-null, those parameters are instead read from the serialized versions of the data structure that are included as part of the continuation.
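The trade-off between memory usage and false positive rate described above can be sketched with a small, self-contained Bloom filter. This is not Guava's BloomFilter and not this cursor's internal code: the `SimpleBloomFilter` class, the textbook sizing formulas, and the double-hashing scheme are all assumptions chosen for illustration. The sketch also assumes the false positive rate is given as a probability strictly between 0 and 1, which may not match the scale of falsePositivePercentage.

```java
import java.util.BitSet;
import java.util.List;

/** Illustrative Bloom filter; NOT Guava's BloomFilter or the cursor's internal code. */
final class SimpleBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    /**
     * @param expectedResults number of distinct keys expected to be inserted
     * @param falsePositiveProbability target false positive rate, strictly between 0 and 1
     */
    SimpleBloomFilter(long expectedResults, double falsePositiveProbability) {
        if (expectedResults <= 0 || falsePositiveProbability <= 0.0 || falsePositiveProbability >= 1.0) {
            throw new IllegalArgumentException("invalid Bloom filter parameters");
        }
        double ln2 = Math.log(2.0);
        // Textbook sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hash probes.
        double idealBits = -expectedResults * Math.log(falsePositiveProbability) / (ln2 * ln2);
        this.numBits = (int) Math.max(1L, Math.min(Integer.MAX_VALUE, (long) Math.ceil(idealBits)));
        this.numHashes = Math.max(1, (int) Math.round((double) numBits / expectedResults * ln2));
        this.bits = new BitSet(numBits);
    }

    /** Record that a comparison key has been seen. */
    void put(List<Object> comparisonKey) {
        int h1 = comparisonKey.hashCode();
        int h2 = mix(h1);
        for (int i = 0; i < numHashes; i++) {
            bits.set(index(h1, h2, i));
        }
    }

    /** Returns false only if the key was definitely never added; true may be a false positive. */
    boolean mightContain(List<Object> comparisonKey) {
        int h1 = comparisonKey.hashCode();
        int h2 = mix(h1);
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(h1, h2, i))) {
                return false;
            }
        }
        return true;
    }

    int sizeInBits() {
        return numBits;
    }

    // Double hashing: derive the i-th probe position from two base hashes.
    private int index(int h1, int h2, int i) {
        return Math.floorMod(h1 + i * h2, numBits);
    }

    // Cheap integer scrambler so that h2 is not trivially related to h1.
    private static int mix(int h) {
        h ^= (h >>> 16);
        h *= 0x85ebca6b;
        h ^= (h >>> 13);
        return h | 1; // odd, so successive probes do not collapse onto one bit
    }
}
```

In this sketch, lowering the target false positive rate or raising the expected number of results both increase the number of bits allocated, which is the memory cost the caller pays for fewer spurious matches. A key that was added is never reported as absent; only the reverse error is possible.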
Type parameters:
T - the type of elements returned by this cursor
Parameters:
comparisonKeyFunction - the function evaluated to compare elements from different cursors
cursorFunctions - a list of functions to produce RecordCursors from a continuation
expectedResults - the expected number of results from each child cursor
falsePositivePercentage - an acceptable false positive percentage for each cursor
continuation - any continuation from a previous scan
timer - the timer used to instrument events

@Nonnull public CompletableFuture<RecordCursorResult<U>> onNext()
Description copied from interface: RecordCursor
Asynchronously return the next result from this cursor. The returned future completes with a RecordCursorResult, which represents exactly one of the following:
- The next record of type T produced by the cursor. In addition to the next record, this result includes a RecordCursorContinuation that can be used to continue the cursor after the last record returned. The returned continuation is guaranteed not to be an "end continuation" representing the end of the cursor: specifically, RecordCursorContinuation.isEnd() is always false on the returned continuation.
- A RecordCursor.NoNextReason that explains why no record could be produced. The result includes a continuation that can be used to continue the cursor after the last record returned. If the result's NoNextReason is anything other than RecordCursor.NoNextReason.SOURCE_EXHAUSTED, the returned continuation must not be an end continuation. Conversely, if the result's NoNextReason is SOURCE_EXHAUSTED, then the returned continuation must be an "end continuation".
In either case, the RecordCursorContinuation can be serialized to an opaque byte array using RecordCursorContinuation.toBytes(). This can be passed back into a new cursor of the same type, with all other parameters remaining the same.
Specified by: onNext in interface RecordCursor<U>
See also: RecordCursorResult, RecordCursorContinuation
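The result and continuation contract described for onNext() can be modeled with a small synchronous toy. None of the types below are the library's: `ContinuationLoopSketch`, `Result`, `ListCursor`, and the `LIMIT_REACHED` reason are hypothetical stand-ins, a `null` byte array models the "end continuation", and the real method is asynchronous whereas this sketch is not.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

final class ContinuationLoopSketch {
    /** Toy reasons; LIMIT_REACHED is a hypothetical stand-in for any out-of-band stop. */
    enum NoNextReason { SOURCE_EXHAUSTED, LIMIT_REACHED }

    /** Toy stand-in for a cursor result: either a value or a reason, plus a continuation. */
    static final class Result {
        final String value;         // non-null only when a record was produced
        final NoNextReason reason;  // non-null only when no record was produced
        final byte[] continuation;  // null models the "end continuation"

        Result(String value, NoNextReason reason, byte[] continuation) {
            this.value = value;
            this.reason = reason;
            this.continuation = continuation;
        }

        boolean hasNext() {
            return reason == null;
        }
    }

    /** Toy cursor over a list; its continuation is the index of the next unread element. */
    static final class ListCursor {
        private final List<String> source;
        private final int limit;
        private int position;
        private int returned;

        ListCursor(List<String> source, byte[] continuation, int limit) {
            this.source = source;
            this.limit = limit;
            this.position = continuation == null ? 0 : ByteBuffer.wrap(continuation).getInt();
        }

        Result next() {
            if (position >= source.size()) {
                // Exhausted: the only case that yields the end continuation.
                return new Result(null, NoNextReason.SOURCE_EXHAUSTED, null);
            }
            if (returned >= limit) {
                // Stopped early: the continuation must allow resuming.
                return new Result(null, NoNextReason.LIMIT_REACHED, encode(position));
            }
            String value = source.get(position++);
            returned++;
            return new Result(value, null, encode(position));
        }

        private static byte[] encode(int position) {
            return ByteBuffer.allocate(4).putInt(position).array();
        }
    }

    /** Read everything by repeatedly creating a new cursor from the last continuation. */
    static List<String> readAll(List<String> source, int limitPerCursor) {
        if (limitPerCursor < 1) {
            throw new IllegalArgumentException("limitPerCursor must be at least 1");
        }
        List<String> out = new ArrayList<>();
        byte[] continuation = null;
        while (true) {
            ListCursor cursor = new ListCursor(source, continuation, limitPerCursor);
            Result result = cursor.next();
            while (result.hasNext()) {
                out.add(result.value);
                result = cursor.next();
            }
            if (result.reason == NoNextReason.SOURCE_EXHAUSTED) {
                return out;
            }
            continuation = result.continuation; // resume with the same parameters
        }
    }
}
```

The loop only stops when the reason is SOURCE_EXHAUSTED; for any other reason it builds a fresh cursor with the same parameters and the serialized continuation, which mirrors the resumption rule stated above.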
@Nonnull @Deprecated public CompletableFuture<Boolean> onHasNext()
Description copied from interface: RecordCursor
Specified by: onHasNext in interface RecordCursor<U>
Returns: a future that completes with true if RecordCursor.next() would return a record
See also: AsyncIterator.onHasNext()
@Nullable @Deprecated public U next()
Description copied from interface: RecordCursor
@Nullable @Deprecated public byte[] getContinuation()
Description copied from interface: RecordCursor
Specified by: getContinuation in interface RecordCursor<U>
Returns: null if the underlying source is completely exhausted, independent of any limit passed to the cursor creator. Since such creators generally accept null to mean no continuation (that is, start from the beginning), callers must check for null from getContinuation to keep from starting over.
The result is not always defined if this method is called before onHasNext, or before next after onHasNext has returned true. That is, a continuation is only guaranteed when called "between" records from a while (hasNext) next loop or after its end.

@Nonnull @Deprecated public RecordCursor.NoNextReason getNoNextReason()
Description copied from interface: RecordCursor
Get the reason that the cursor returned false for RecordCursor.hasNext(). If hasNext was not called or returned true last time, the result is undefined and may be an exception.
Specified by: getNoNextReason in interface RecordCursor<U>
public void close()
Specified by: close in interface RecordCursor<U>
Specified by: close in interface AutoCloseable
@Nonnull public Executor getExecutor()
Specified by: getExecutor in interface RecordCursor<U>
@Nullable public FDBStoreTimer getTimer()
Get the FDBStoreTimer used to instrument events of this cursor.

public boolean accept(@Nonnull RecordCursorVisitor visitor)
Description copied from interface: RecordCursor
Accept a visit from a hierarchical visitor, which implements RecordCursorVisitor. By contract, implementations of this method must return the value of visitor.visitLeave(this), which determines whether or not subsequent siblings of this cursor should be visited.
Specified by: accept in interface RecordCursor<U>
Parameters:
visitor - a hierarchical visitor
Returns: true if the subsequent siblings of the cursor should be visited, and false otherwise
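The accept contract above follows the general hierarchical visitor pattern, which can be sketched with toy types. Everything here is hypothetical: `VisitorSketch`, `Visitor`, and `Node` are stand-ins rather than the library's classes, and the `visitEnter` callback is an assumption of the pattern, since only `visitLeave` is named in the text above.

```java
import java.util.ArrayList;
import java.util.List;

final class VisitorSketch {
    /** Toy hierarchical visitor; visitEnter is an assumption of the pattern. */
    interface Visitor {
        boolean visitEnter(Node node); // return false to skip the node's children
        boolean visitLeave(Node node); // return false to skip the node's later siblings
    }

    /** Toy cursor-like tree node with child nodes. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name, Node... childNodes) {
            this.name = name;
            for (Node child : childNodes) {
                this.children.add(child);
            }
        }

        boolean accept(Visitor visitor) {
            if (visitor.visitEnter(this)) {
                for (Node child : children) {
                    if (!child.accept(visitor)) {
                        break; // a child asked to skip its remaining siblings
                    }
                }
            }
            // By contract, accept returns whatever visitLeave returns.
            return visitor.visitLeave(this);
        }
    }

    /** Collect names in visitLeave order, skipping the siblings after the node named stopAfter. */
    static List<String> leaveOrder(Node root, String stopAfter) {
        List<String> order = new ArrayList<>();
        root.accept(new Visitor() {
            @Override
            public boolean visitEnter(Node node) {
                return true;
            }

            @Override
            public boolean visitLeave(Node node) {
                order.add(node.name);
                return !node.name.equals(stopAfter);
            }
        });
        return order;
    }
}
```

Returning the value of `visitLeave` from `accept` is what lets a parent stop iterating over its remaining children as soon as one child's visit reports that its siblings should be skipped.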