Interface Index<TId,TVector,TItem extends Item<TId,TVector>,TDistance>

Type Parameters:
TId - Type of the external identifier of an item
TVector - Type of the vector to perform distance calculation on
TItem - Type of items stored in the index
TDistance - Type of distance between items (expected to be a numeric type such as float, double or int)
All Superinterfaces:
Serializable
All Known Implementing Classes:
BruteForceIndex, HnswIndex

public interface Index<TId,TVector,TItem extends Item<TId,TVector>,TDistance> extends Serializable
K-nearest neighbors search index.
  • Field Details

    • DEFAULT_PROGRESS_UPDATE_INTERVAL

      static final int DEFAULT_PROGRESS_UPDATE_INTERVAL
      By default, progress is reported to registered progress listeners after this many items have been indexed.
  • Method Details

    • add

      boolean add(TItem item)
      Adds a new item to the index. If an item with the same identifier already exists in the index, then: if deletes are disabled on this index, the method returns false and the item is not updated; if deletes are enabled and the new item has a higher version than the item currently stored in the index, the old item is removed and the new item is added; otherwise the method returns false and the item is not updated.
      Parameters:
      item - the item to add to the index
      Returns:
      true if the item was added to the index
      Throws:
      IllegalArgumentException - thrown when the item has the wrong dimensionality
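The replacement rule described for add can be sketched with a toy model. This is only an illustration of the documented behavior, not the library's implementation: the class `VersionedAddSketch`, its `deletesEnabled` flag and the use of a plain map from id to version are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the documented add rule; names here are hypothetical. */
class VersionedAddSketch {

    // Maps an item id to the version currently stored for it.
    private final Map<String, Long> versions = new HashMap<>();
    private final boolean deletesEnabled;

    VersionedAddSketch(boolean deletesEnabled) {
        this.deletesEnabled = deletesEnabled;
    }

    /** Returns true if the item was stored, false if it was rejected. */
    boolean add(String id, long version) {
        Long stored = versions.get(id);
        if (stored == null) {
            versions.put(id, version);   // unknown id: always added
            return true;
        }
        if (!deletesEnabled) {
            return false;                // existing id cannot be replaced
        }
        if (version > stored) {
            versions.put(id, version);   // old item removed, new item added
            return true;
        }
        return false;                    // same or lower version: rejected
    }

    public static void main(String[] args) {
        VersionedAddSketch index = new VersionedAddSketch(true);
        System.out.println(index.add("a", 1L)); // true: new identifier
        System.out.println(index.add("a", 1L)); // false: version is not higher
        System.out.println(index.add("a", 2L)); // true: higher version replaces
    }
}
```

With deletes disabled, the second and third calls would both return false, because an existing identifier can never be replaced.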
    • remove

      boolean remove(TId id, long version)
      Removes an item from the index. If the index does not support deletes or an item with the same identifier exists in the index with a higher version number, then this method will return false and the item will not be removed.
      Parameters:
      id - unique identifier of the item to remove
      version - version of the delete. If your items don't override version, use 0
      Returns:
      true if an item was removed from the index. If the index does not support removals, this is always false
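As a rough illustration of the version check on removal, the sketch below models the rule stated above: a delete is refused when deletes are unsupported or when the stored item has a higher version than the delete. The class `VersionedRemoveSketch` and its `put` helper are hypothetical and exist only for this example.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the documented remove rule; names here are hypothetical. */
class VersionedRemoveSketch {

    private final Map<String, Long> versions = new HashMap<>();
    private final boolean deletesEnabled;

    VersionedRemoveSketch(boolean deletesEnabled) {
        this.deletesEnabled = deletesEnabled;
    }

    /** Helper for the example: stores an id with the given version. */
    void put(String id, long version) {
        versions.put(id, version);
    }

    /** Returns true only if an item was actually removed. */
    boolean remove(String id, long version) {
        if (!deletesEnabled) {
            return false;            // index does not support deletes
        }
        Long stored = versions.get(id);
        if (stored == null) {
            return false;            // nothing to remove
        }
        if (stored > version) {
            return false;            // stored item is newer than the delete
        }
        versions.remove(id);
        return true;
    }

    public static void main(String[] args) {
        VersionedRemoveSketch index = new VersionedRemoveSketch(true);
        index.put("a", 3L);
        System.out.println(index.remove("a", 2L)); // false: stored version is higher
        System.out.println(index.remove("a", 3L)); // true: item removed
    }
}
```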
    • contains

      default boolean contains(TId id)
      Check if an item is contained in this index
      Parameters:
      id - unique identifier of the item
      Returns:
      true if an item is contained in this index, false otherwise
    • addAll

      default void addAll(Collection<TItem> items) throws InterruptedException
      Add multiple items to the index
      Parameters:
      items - the items to add to the index
      Throws:
      InterruptedException - thrown when the thread doing the indexing is interrupted
    • addAll

      default void addAll(Collection<TItem> items, ProgressListener listener) throws InterruptedException
      Add multiple items to the index. Reports progress to the given ProgressListener each time DEFAULT_PROGRESS_UPDATE_INTERVAL items have been indexed.
      Parameters:
      items - the items to add to the index
      listener - listener to report progress to
      Throws:
      InterruptedException - thrown when the thread doing the indexing is interrupted
    • addAll

      default void addAll(Collection<TItem> items, int numThreads, ProgressListener listener, int progressUpdateInterval) throws InterruptedException
      Add multiple items to the index. Reports progress to the given ProgressListener each time progressUpdateInterval items have been indexed.
      Parameters:
      items - the items to add to the index
      numThreads - number of threads to use for parallel indexing
      listener - listener to report progress to
      progressUpdateInterval - after indexing this many items progress will be reported. The last element will always be reported regardless of this setting.
      Throws:
      InterruptedException - thrown when the thread doing the indexing is interrupted
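The reporting cadence described for progressUpdateInterval can be sketched as follows. This is a single-threaded illustration under stated assumptions: the `Listener` interface is a hypothetical stand-in whose shape may differ from the library's ProgressListener, and the actual indexing work is left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of "report every N items, and always report the last one". */
class ProgressIntervalSketch {

    /** Hypothetical callback; the real ProgressListener may look different. */
    interface Listener {
        void update(int workDone, int max);
    }

    static <T> void indexAll(List<T> items, int interval, Listener listener) {
        int max = items.size();
        int done = 0;
        for (T item : items) {
            // A real index would insert the item here.
            done++;
            // Report on every interval boundary and on the final item.
            if (done % interval == 0 || done == max) {
                listener.update(done, max);
            }
        }
    }

    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            items.add(i);
        }
        List<Integer> reported = new ArrayList<>();
        indexAll(items, 4, (done, max) -> reported.add(done));
        System.out.println(reported); // prints [4, 8, 10]
    }
}
```

The final report at 10 comes from the "last element" rule, since 10 is not a multiple of the interval.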
    • size

      int size()
      Returns the size of the index.
      Returns:
      size of the index
    • get

      Optional<TItem> get(TId id)
      Returns an item by its identifier.
      Parameters:
      id - unique identifier of the item to return
      Returns:
      an Optional containing the item, or an empty Optional if no item with this identifier exists
    • items

      Collection<TItem> items()
      Returns all items in the index.
      Returns:
      all items in the index
    • findNearest

      List<SearchResult<TItem,TDistance>> findNearest(TVector vector, int k)
      Find the items closest to the passed in vector.
      Parameters:
      vector - the vector
      k - number of items to return
      Returns:
      the items closest to the passed in vector
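To make the contract of findNearest concrete, here is a minimal exhaustive k-nearest search over plain arrays. It is a sketch of the general technique only: the Euclidean distance, the `double[]` vector type and returning positions instead of SearchResult objects are assumptions for the example, and an approximate index such as HNSW would not scan every item.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Exhaustive k-nearest search; a sketch, not the library's code. */
class BruteForceNearestSketch {

    static double euclidean(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Returns the positions of the k vectors closest to query, nearest first. */
    static List<Integer> findNearest(List<double[]> vectors, double[] query, int k) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < vectors.size(); i++) {
            order.add(i);
        }
        // Sort all positions by their distance to the query vector.
        order.sort(Comparator.comparingDouble(
                (Integer i) -> euclidean(vectors.get(i), query)));
        // Keep at most k results.
        return new ArrayList<>(order.subList(0, Math.min(k, order.size())));
    }

    public static void main(String[] args) {
        List<double[]> vectors = new ArrayList<>();
        vectors.add(new double[] {0.0, 0.0});
        vectors.add(new double[] {1.0, 1.0});
        vectors.add(new double[] {5.0, 5.0});
        vectors.add(new double[] {0.9, 1.2});
        System.out.println(findNearest(vectors, new double[] {1.0, 1.0}, 2)); // prints [1, 3]
    }
}
```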
    • findNeighbors

      default List<SearchResult<TItem,TDistance>> findNeighbors(TId id, int k)
      Find the items closest to the item identified by the given id. If the id does not match an item, an empty list is returned. The item itself is not included in the result.
      Parameters:
      id - id of the item to find the neighbors of
      k - number of items to return
      Returns:
      the items closest to the item
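One plausible way to get this behavior from a nearest-vector search is to ask for k + 1 results and drop the item itself. The sketch below shows that idea with one-dimensional points for brevity; it is an assumption about how such a default method could work, not a statement of how the library implements it, and all names in it are hypothetical.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Sketch: neighbors of an item = nearest to its vector, minus the item itself. */
class FindNeighborsSketch {

    /** Ids of the k points nearest to query, nearest first. */
    static List<String> findNearest(Map<String, Double> points, double query, int k) {
        return points.entrySet().stream()
                .sorted(Comparator.comparingDouble(
                        (Map.Entry<String, Double> e) -> Math.abs(e.getValue() - query)))
                .limit(k)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    static List<String> findNeighbors(Map<String, Double> points, String id, int k) {
        Double self = points.get(id);
        if (self == null) {
            return Collections.emptyList();   // unknown id: empty result
        }
        // Ask for one extra result, because the item is its own nearest match.
        return findNearest(points, self, k + 1).stream()
                .filter(other -> !other.equals(id))
                .limit(k)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Double> points = new HashMap<>();
        points.put("a", 0.0);
        points.put("b", 1.0);
        points.put("c", 3.0);
        points.put("d", 10.0);
        System.out.println(findNeighbors(points, "b", 2));       // prints [a, c]
        System.out.println(findNeighbors(points, "missing", 2)); // prints []
    }
}
```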
    • save

      void save(OutputStream out) throws IOException
      Saves the index to an OutputStream. Saving is not thread safe and you should not modify the index while saving.
      Parameters:
      out - the output stream to write the index to
      Throws:
      IOException - in case of I/O exception
    • save

      default void save(File file) throws IOException
      Saves the index to a file. Saving is not thread safe and you should not modify the index while saving.
      Parameters:
      file - file to write the index to
      Throws:
      IOException - in case of I/O exception
    • save

      default void save(Path path) throws IOException
      Saves the index to a path. Saving is not thread safe and you should not modify the index while saving.
      Parameters:
      path - file to write the index to
      Throws:
      IOException - in case of I/O exception
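Because Index extends Serializable, one way a save method of this shape could work is through standard Java object serialization. The round trip below is a sketch under that assumption: `TinyIndex` is a hypothetical stand-in for a real index, and the matching load step is not documented in this section, so reading back with ObjectInputStream is shown purely for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

/** Sketch of saving a Serializable object to an OutputStream. */
class SaveSketch {

    /** Hypothetical stand-in for an index: a serializable list of ids. */
    static class TinyIndex implements Serializable {
        private static final long serialVersionUID = 1L;
        final List<String> ids = new ArrayList<>();

        void save(OutputStream out) throws IOException {
            // Closing the ObjectOutputStream also flushes and closes out.
            try (ObjectOutputStream oos = new ObjectOutputStream(out)) {
                oos.writeObject(this);
            }
        }
    }

    /** Saves to an in-memory buffer and reads the object back. */
    static TinyIndex roundTrip(TinyIndex index) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            index.save(buffer);
            try (ObjectInputStream ois = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return (TinyIndex) ois.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        index.ids.add("a");
        index.ids.add("b");
        System.out.println(roundTrip(index).ids); // prints [a, b]
    }
}
```

As the method descriptions warn, nothing should modify the index while it is being written; in this sketch that is trivially true because only one thread is involved.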