RankingQueryRunnerImpl (mimir-core 6.2 API)

java.lang.Object
- gate.mimir.search.RankingQueryRunnerImpl

All Implemented Interfaces:

QueryRunner
```
public class RankingQueryRunnerImpl
extends Object
implements QueryRunner
```
A QueryRunner implementation that can perform ranking. This query runner has two modes of functioning: ranking and non-ranking, depending on whether a MimirScorer is provided during construction or not. All documents are referred to using their rank (i.e. position in the list of results). When working in non-ranking mode, ranking order is the same as document ID order.

Nested Class Summary

Nested Classes
Modifier and Type	Class and Description
`protected class`	`RankingQueryRunnerImpl.BackgroundRunner` The background thread implementation: simply collects `Runnable`s from the `backgroundTasks` queue and runs them.
`protected class`	`RankingQueryRunnerImpl.DocIdsCollector` The first action started when a new `RankingQueryRunnerImpl` is created.
`protected class`	`RankingQueryRunnerImpl.HitsCollector` Collects the document hits (i.e.

Field Summary

Fields
Modifier and Type	Field and Description
`protected boolean`	`allDocIdsCollected` Flag used to mark that all results documents have been counted.
`protected BlockingQueue<Runnable>`	`backgroundTasks` A queue with tasks to be executed by the background thread.
`protected boolean`	`closed` Internal flag used to mark when this query runner has been closed.
`protected int`	`docBlockSize` The number of documents to be ranked (or have their hits collected) as a block.
`protected FutureTask<Object>`	`docIdCollectorFuture` The task that's working on collecting all the document IDs.
`protected it.unimi.dsi.fastutil.objects.ObjectBigList<List<Binding>>`	`documentHits` The sets of hits for each returned document.
`protected it.unimi.dsi.fastutil.longs.LongBigList`	`documentIds` The document IDs for the documents found to contain hits.
`protected it.unimi.dsi.fastutil.doubles.DoubleBigArrayBigList`	`documentScores` If scoring is enabled (`scorer` is not `null`), this list contains the scores for the documents found to contain hits.
`protected it.unimi.dsi.fastutil.longs.LongBigList`	`documentsOrder` The order the documents should be returned in (elements in this list are indexes in `documentIds`).
`protected SortedMap<long[],Future<?>>`	`hitCollectors` Data structure holding references to `Future`s that are currently working (or have worked) on collecting hits for a range of document indexes.
`protected static org.slf4j.Logger`	`logger` Shared logger instance.
`protected QueryEngine`	`queryEngine` The QueryEngine we run inside.
`protected QueryExecutor`	`queryExecutor` The `QueryExecutor` for the query being run.
`protected Thread`	`runningThread` The background thread used for collecting hits.
`protected MimirScorer`	`scorer` The `MimirScorer` to be used for ranking documents.

Fields inherited from interface gate.mimir.search.QueryRunner
DEFAULT_SCORE

Constructor Summary

Constructors
Constructor and Description
`RankingQueryRunnerImpl(QueryExecutor executor, MimirScorer scorer)` Creates a query runner in ranking mode.

Method Summary

All Methods Instance Methods Concrete Methods
Modifier and Type	Method and Description
`void`	`close()` Closes this `QueryRunner` and releases all resources used.
`protected Future<?>`	`collectHits(long[] interval)` Makes sure all the documents in the specified range are queued for hit collection.
`protected long`	`findRank(double documentScore, long start, long end)` Given a document score, finds the correct insertion point into the `documentsOrder` list, within a given range of ranks.
`List<Binding>`	`getDocumentHits(long rank)` Retrieves the hits within a given result document.
`long`	`getDocumentID(long rank)` Gets the ID of a result document.
`protected long`	`getDocumentIndex(long rank)` Given a document rank, return its index in the `documentIds` list.
`Serializable`	`getDocumentMetadataField(long rank, String fieldName)` Obtains an arbitrary document metadata field from the stored document data.
`Map<String,Serializable>`	`getDocumentMetadataFields(long rank, Set<String> fieldNames)` Obtains a set of arbitrary document metadata fields from the stored document data.
`double`	`getDocumentScore(long rank)` Get the score for a given result document.
`long`	`getDocumentsCount()` Gets the number of result documents.
`long`	`getDocumentsCountSync()` Synchronous version of `getDocumentsCount()` that waits if necessary before returning the correct result (instead of returning `-1` if the value is not yet known).
`long`	`getDocumentsCurrentCount()` Gets the number of result documents found so far.
`String[][]`	`getDocumentText(long rank, int termPosition, int length)` Gets a segment of the document text for a given document.
`String`	`getDocumentTitle(long rank)` Obtains the title for a given document.
`String`	`getDocumentURI(long rank)` Obtains the URI for a given document.
`protected long`	`nextNotDeleted()` Find the next document ID for the current query executor which is not marked as deleted in the index.
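`protected void`	`rankDocuments(long rank)` Ranks some more documents, i.e. adds more entries to the `documentsOrder` list, making sure that the document at the provided rank is included (if such a document exists).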
`void`	`renderDocument(long rank, Appendable out)` Render the content of the given document, with the hits for this query highlighted.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - logger
```
protected static org.slf4j.Logger logger
```
    Shared logger instance.
  - queryExecutor
```
protected QueryExecutor queryExecutor
```
    The QueryExecutor for the query being run.
  - queryEngine
```
protected QueryEngine queryEngine
```
    The QueryEngine we run inside.
  - scorer
```
protected MimirScorer scorer
```
    The MimirScorer to be used for ranking documents.
  - docBlockSize
```
protected int docBlockSize
```
    The number of documents to be ranked (or have their hits collected) as a block.
  - documentIds
```
protected it.unimi.dsi.fastutil.longs.LongBigList documentIds
```
    The document IDs for the documents found to contain hits. This list is sorted in ascending documentID order.
  - documentScores
```
protected it.unimi.dsi.fastutil.doubles.DoubleBigArrayBigList documentScores
```
    If scoring is enabled (scorer is not null), this list contains the scores for the documents found to contain hits. This list is aligned to documentIds.
  - documentHits
```
protected it.unimi.dsi.fastutil.objects.ObjectBigList<List<Binding>> documentHits
```
    The sets of hits for each returned document. This data structure is lazily built, so some elements may be null. This list is aligned to documentIds.
  - documentsOrder
```
protected it.unimi.dsi.fastutil.longs.LongBigList documentsOrder
```
    The order the documents should be returned in (elements in this list are indexes in documentIds).
  - hitCollectors
```
protected SortedMap<long[],Future<?>> hitCollectors
```
    Data structure holding references to Futures that are currently working (or have worked) on collecting hits for a range of document indexes.
  - runningThread
```
protected Thread runningThread
```
    The background thread used for collecting hits.
  - backgroundTasks
```
protected BlockingQueue<Runnable> backgroundTasks
```
    A queue with tasks to be executed by the background thread.
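    
    As an illustration of the consumer pattern described for `BackgroundRunner` (take `Runnable`s from `backgroundTasks` and run them), here is a minimal sketch. It is not the actual Mímir implementation: the class name `BackgroundRunnerSketch` and the `STOP` sentinel used to end the loop are assumptions made for this example only.
```java
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch: a worker that drains a queue of Runnables.
class BackgroundRunnerSketch implements Runnable {
  // Assumed shutdown marker; the real class may stop differently.
  static final Runnable STOP = () -> { };

  private final BlockingQueue<Runnable> backgroundTasks;

  BackgroundRunnerSketch(BlockingQueue<Runnable> backgroundTasks) {
    this.backgroundTasks = backgroundTasks;
  }

  @Override
  public void run() {
    try {
      while (true) {
        // Blocks until a task is available.
        Runnable task = backgroundTasks.take();
        if (task == STOP) {
          return;
        }
        task.run();
      }
    } catch (InterruptedException e) {
      // Restore the interrupt flag and let the thread finish.
      Thread.currentThread().interrupt();
    }
  }
}
```
    In this sketch, tasks run strictly in the order in which they were queued, which is what makes a single background thread sufficient for sequential hit collection.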
  - allDocIdsCollected
```
protected volatile boolean allDocIdsCollected
```
    Flag used to mark that all results documents have been counted.
  - docIdCollectorFuture
```
protected volatile FutureTask<Object> docIdCollectorFuture
```
    The task that's working on collecting all the document IDs. When this activity has finished, the precise documents count is known.
  - closed
```
protected volatile boolean closed
```
    Internal flag used to mark when this query runner has been closed.
- Constructor Detail
  - RankingQueryRunnerImpl
```
public RankingQueryRunnerImpl(QueryExecutor executor,
                              MimirScorer scorer)
                       throws IOException
```
    Creates a query runner in ranking mode.
    
    Parameters:
    
    executor - the QueryExecutor for the query being executed.
    
    scorer - the MimirScorer to use for ranking.
    
    Throws:
    
    IOException
- Method Detail
  - getDocumentsCount
```
public long getDocumentsCount()
```
    Description copied from interface: QueryRunner
    
    Gets the number of result documents.
    
    Specified by:
    
    getDocumentsCount in interface QueryRunner
    
    Returns:
    
    -1 if the search has not yet completed, the total number of result documents otherwise.
  - getDocumentsCountSync
```
public long getDocumentsCountSync()
```
    Synchronous version of getDocumentsCount() that waits if necessary before returning the correct result (instead of returning -1 if the value is not yet known).
    
    Specified by:
    
    getDocumentsCountSync in interface QueryRunner
    
    Returns:
    
    the total number of documents found to match the query.
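    
    The difference between the non-blocking and the synchronous count can be sketched with a plain `FutureTask`. This is only an illustration under assumptions: `DocumentCountSketch` and its `FutureTask<Long>` collector are hypothetical stand-ins, not the class's real `docIdCollectorFuture` logic.
```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;

// Hypothetical sketch of a non-blocking count and its blocking variant.
class DocumentCountSketch {
  // Assumed to produce the total number of result documents.
  private final FutureTask<Long> collector;

  DocumentCountSketch(FutureTask<Long> collector) {
    this.collector = collector;
  }

  // Returns -1 while the collector is still running.
  long getDocumentsCount() {
    if (!collector.isDone()) {
      return -1;
    }
    return getDocumentsCountSync();
  }

  // Waits for the collector to finish, then returns the total.
  long getDocumentsCountSync() {
    try {
      return collector.get();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while counting", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("Counting failed", e.getCause());
    }
  }
}
```
    A caller that only wants to show progress can poll the non-blocking method, while a caller that needs the final total can use the synchronous one.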
  - getDocumentsCurrentCount
```
public long getDocumentsCurrentCount()
```
    Description copied from interface: QueryRunner
    
    Gets the number of result documents found so far. After the search completes, the result returned by this call is identical to that of QueryRunner.getDocumentsCount().
    
    Specified by:
    
    getDocumentsCurrentCount in interface QueryRunner
    
    Returns:
    
    the number of result documents known so far.
  - getDocumentID
```
public long getDocumentID(long rank)
                   throws IndexOutOfBoundsException,
                          IOException
```
    Description copied from interface: QueryRunner
    
    Gets the ID of a result document.
    
    Specified by:
    
    getDocumentID in interface QueryRunner
    
    Parameters:
    
    rank - the index of the desired document in the list of documents. This should be a value between 0 and QueryRunner.getDocumentsCount() -1. If the requested document position has not yet been ranked (i.e. we know there is a document at that position, but we don't yet know which one) then the necessary ranking is performed before this method returns.
    
    Returns:
    
    a long value, representing the ID of the requested document.
    
    Throws:
    
    IndexOutOfBoundsException - if the index provided is less than zero, or greater than QueryRunner.getDocumentsCount() -1.
    
    IOException
  - getDocumentScore
```
public double getDocumentScore(long rank)
                        throws IndexOutOfBoundsException,
                               IOException
```
    Description copied from interface: QueryRunner
    
    Get the score for a given result document. The value for the score depends on the scorer used by the QueryEngine (see QueryEngine.setScorerSource(java.util.concurrent.Callable)).
    
    Specified by:
    
    getDocumentScore in interface QueryRunner
    
    Parameters:
    
    rank - the index of the desired document in the list of documents. This should be a value between 0 and QueryRunner.getDocumentsCount() -1.
    
    Returns:
    
    Throws:
    
    IndexOutOfBoundsException
    
    IOException
  - getDocumentHits
```
public List<Binding> getDocumentHits(long rank)
                              throws IndexOutOfBoundsException,
                                     IOException
```
    Description copied from interface: QueryRunner
    
    Retrieves the hits within a given result document.
    
    Specified by:
    
    getDocumentHits in interface QueryRunner
    
    Parameters:
    
    rank - the index of the desired document in the list of documents. This should be a value between 0 and QueryRunner.getDocumentsCount() -1. This method call waits until the requested data is available before returning (document hits are being collected by a background thread).
    
    Returns:
    
    Throws:
    
    IndexOutOfBoundsException
    
    IOException
  - getDocumentIndex
```
protected long getDocumentIndex(long rank)
                         throws IOException,
                                IndexOutOfBoundsException
```
    Given a document rank, return its index in the documentIds list. If ranking is not being performed, then the rank is interpreted as an index against the documentIds list and is simply returned.
    
    Parameters:
    
    rank -
    
    Returns:
    
    Throws:
    
    IOException
    
    IndexOutOfBoundsException
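    
    The rank-to-index mapping described above can be sketched as follows. This is a simplified illustration, not the real method: it uses a plain `long[]` in place of the fastutil `LongBigList`, treats a `null` order array as non-ranking mode, and omits the on-demand ranking the real class may perform. The name `RankIndexSketch` and the extra parameters are assumptions.
```java
// Hypothetical sketch: translate a rank into an index in documentIds.
class RankIndexSketch {
  static long getDocumentIndex(long rank, long[] documentsOrder, long documentsCount) {
    if (rank < 0 || rank >= documentsCount) {
      throw new IndexOutOfBoundsException("Invalid rank: " + rank);
    }
    if (documentsOrder == null) {
      // Non-ranking mode: the rank already is the index.
      return rank;
    }
    // Ranking mode: documentsOrder maps rank -> index in documentIds.
    return documentsOrder[Math.toIntExact(rank)];
  }
}
```
    With `documentsOrder = {2, 0, 1}`, rank 0 resolves to index 2, meaning the third collected document has been placed first.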
  - rankDocuments
```
protected void rankDocuments(long rank)
                      throws IOException
```
    Ranks some more documents, i.e. adds more entries to the documentsOrder list, making sure that the document at the provided rank is included (if such a document exists). If the provided rank is larger than the number of result documents, then all documents will be ranked before this method returns. This is the only method that writes to the documentsOrder list. This method is executed synchronously in the client thread.
    
    Parameters:
    
    rank -
    
    Throws:
    
    IOException
  - findRank
```
protected long findRank(double documentScore,
                        long start,
                        long end)
```
    Given a document score, finds the correct insertion point into the documentsOrder list, within a given range of ranks. This method performs binary search followed by a linear scan so that the returned insertion point is the largest correct one (i.e. later documents with the same score get sorted after earlier ones, thus keeping the sorting stable).
    
    Parameters:
    
    documentScore - the score for the new document.
    
    start - the start of the search range within documentsOrder
    
    end - the end of the search range within documentsOrder
    
    Returns:
    
    the largest correct insertion point
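    
    The binary search followed by a linear scan can be sketched as below. This is an illustration only: it assumes ranks are ordered by descending score (the source does not state the direction), uses plain arrays instead of the fastutil big lists, takes `end` as exclusive, and passes the lists as parameters. The name `FindRankSketch` is hypothetical.
```java
// Hypothetical sketch: stable insertion point in a descending-score order.
class FindRankSketch {
  /**
   * Returns the largest valid insertion point for documentScore within
   * ranks [start, end). order[rank] is an index into scores.
   */
  static int findRank(double documentScore, int[] order, double[] scores,
                      int start, int end) {
    int low = start;
    int high = end - 1;
    while (low <= high) {
      int mid = (low + high) >>> 1;
      double midScore = scores[order[mid]];
      if (midScore > documentScore) {
        // Higher scores come first, so the new document goes after mid.
        low = mid + 1;
      } else if (midScore < documentScore) {
        high = mid - 1;
      } else {
        // Equal score found: scan right past all ties so the new
        // document is placed after earlier ones with the same score.
        int pos = mid + 1;
        while (pos < end && scores[order[pos]] == documentScore) {
          pos++;
        }
        return pos;
      }
    }
    // No equal score: low is the only valid insertion point.
    return low;
  }
}
```
    For scores `{0.9, 0.5, 0.5, 0.1}` already in rank order, a new document scoring 0.5 is inserted at rank 3, after both existing ties.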
  - collectHits
```
protected Future<?> collectHits(long[] interval)
```
    Makes sure all the documents in the specified range are queued for hit collection.
    
    Parameters:
    
    interval - the interval specified by 2 document ranks. The interval is defined as the elements in documentsOrder between ranks interval[0] and (interval[1]-1) inclusive.
    
    Returns:
    
    the future that has been queued for collecting the hits.
  - getDocumentText
```
public String[][] getDocumentText(long rank,
                                  int termPosition,
                                  int length)
                           throws IndexException,
                                  IndexOutOfBoundsException,
                                  IOException
```
    Description copied from interface: QueryRunner
    
    Gets a segment of the document text for a given document.
    
    Specified by:
    
    getDocumentText in interface QueryRunner
    
    Parameters:
    
    rank - the rank of the requested document. This should be a value between 0 and QueryRunner.getDocumentsCount() -1.
    
    termPosition - the first term requested.
    
    length - the number of terms requested.
    
    Returns:
    
    two parallel String arrays, one containing term text, the other containing the spaces in between. The first term is results[0][0], the space following it is results[1][0], etc.
    
    Throws:
    
    IndexException
    
    IndexOutOfBoundsException
    
    IOException
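    
    To show how the two parallel arrays fit together, the following sketch rebuilds a plain string from a result shaped like the one described above. The class and method names are hypothetical, and the null check on spaces is a defensive assumption rather than documented behaviour.
```java
// Hypothetical sketch: interleave terms and the spaces that follow them.
class DocumentTextSketch {
  static String join(String[][] results) {
    String[] terms = results[0];   // results[0][i] is the i-th term
    String[] spaces = results[1];  // results[1][i] is the space after it
    StringBuilder text = new StringBuilder();
    for (int i = 0; i < terms.length; i++) {
      text.append(terms[i]);
      // Assumption: a space entry may be missing or null.
      if (i < spaces.length && spaces[i] != null) {
        text.append(spaces[i]);
      }
    }
    return text.toString();
  }
}
```
    A caller would pass the value returned by `getDocumentText(rank, termPosition, length)` straight to such a helper.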
  - getDocumentURI
```
public String getDocumentURI(long rank)
                      throws IndexException,
                             IndexOutOfBoundsException,
                             IOException
```
    Description copied from interface: QueryRunner
    
    Obtains the URI for a given document.
    
    Specified by:
    
    getDocumentURI in interface QueryRunner
    
    Parameters:
    
    rank - the rank for the requested document. This should be a value between 0 and QueryRunner.getDocumentsCount() -1.
    
    Returns:
    
    the URI provided at indexing time for the document.
    
    Throws:
    
    IndexException
    
    IndexOutOfBoundsException
    
    IOException
  - getDocumentTitle
```
public String getDocumentTitle(long rank)
                        throws IndexException,
                               IndexOutOfBoundsException,
                               IOException
```
    Description copied from interface: QueryRunner
    
    Obtains the title for a given document.
    
    Specified by:
    
    getDocumentTitle in interface QueryRunner
    
    Parameters:
    
    rank - the rank of the requested document. This should be a value between 0 and QueryRunner.getDocumentsCount() -1.
    
    Returns:
    
    the document title (provided at indexing time).
    
    Throws:
    
    IndexException
    
    IndexOutOfBoundsException
    
    IOException
  - getDocumentMetadataField
```
public Serializable getDocumentMetadataField(long rank,
                                             String fieldName)
                                      throws IndexException,
                                             IndexOutOfBoundsException,
                                             IOException
```
    Description copied from interface: QueryRunner
    
    Obtains an arbitrary document metadata field from the stored document data. DocumentMetadataHelpers used at indexing time can add arbitrary Serializable values as metadata fields for the documents being indexed. This method is used at search time to retrieve those values.
    
    Specified by:
    
    getDocumentMetadataField in interface QueryRunner
    
    Parameters:
    
    rank - the rank for the requested document. This should be a value between 0 and QueryRunner.getDocumentsCount() -1.
    
    fieldName - the field name for which the value is sought.
    
    Returns:
    
    Throws:
    
    IndexException
    
    IndexOutOfBoundsException
    
    IOException
  - getDocumentMetadataFields
```
public Map<String,Serializable> getDocumentMetadataFields(long rank,
                                                          Set<String> fieldNames)
                                                   throws IndexException,
                                                          IndexOutOfBoundsException,
                                                          IOException
```
    Description copied from interface: QueryRunner
    
    Obtains a set of arbitrary document metadata fields from the stored document data. DocumentMetadataHelpers used at indexing time can add arbitrary Serializable values as metadata fields for the documents being indexed. This method is used at search time to retrieve those values.
    
    Specified by:
    
    getDocumentMetadataFields in interface QueryRunner
    
    Parameters:
    
    rank - the rank for the requested document. This should be a value between 0 and QueryRunner.getDocumentsCount() -1.
    
    fieldNames - the names of the metadata fields for which the values are requested.
    
    Returns:
    
    a Map linking field names with their values.
    
    Throws:
    
    IndexException
    
    IndexOutOfBoundsException
    
    IOException
  - renderDocument
```
public void renderDocument(long rank,
                           Appendable out)
                    throws IOException,
                           IndexException
```
    Description copied from interface: QueryRunner
    
    Render the content of the given document, with the hits for this query highlighted.
    
    Specified by:
    
    renderDocument in interface QueryRunner
    
    Parameters:
    
    rank - the rank for the requested document. This should be a value between 0 and QueryRunner.getDocumentsCount() -1.
    
    out - an Appendable to which the output is written.
    
    Throws:
    
    IOException
    
    IndexException
  - close
```
public void close()
           throws IOException
```
    Description copied from interface: QueryRunner
    
    Closes this QueryRunner and releases all resources used.
    
    Specified by:
    
    close in interface QueryRunner
    
    Throws:
    
    IOException
  - nextNotDeleted
```
protected long nextNotDeleted()
                       throws IOException
```
    Find the next document ID for the current query executor which is not marked as deleted in the index.
    
    Throws:
    
    IOException
