Package com.arcadedb.index.vector
Class HnswVectorIndexRAM<TId,TVector,TItem extends com.github.jelmerk.knn.Item<TId,TVector>,TDistance>
- java.lang.Object
-
- com.arcadedb.index.vector.HnswVectorIndexRAM<TId,TVector,TItem,TDistance>
-
- Type Parameters:
TId
- Type of the external identifier of an item
TVector
- Type of the vector to perform distance calculation on
TItem
- Type of items stored in the index
TDistance
- Type of distance between items (any numeric type is expected: float, double, int, ...)
- All Implemented Interfaces:
com.github.jelmerk.knn.Index<TId,TVector,TItem,TDistance>, Serializable
public class HnswVectorIndexRAM<TId,TVector,TItem extends com.github.jelmerk.knn.Item<TId,TVector>,TDistance> extends Object implements com.github.jelmerk.knn.Index<TId,TVector,TItem,TDistance>
This work is derived from the excellent work of Jelmer Kuperus at https://github.com/jelmerk/hnswlib. We forked the entire class only because the original was not extensible (it uses private members).
Implementation of Index that implements the HNSW algorithm.
-
-
Nested Class Summary
static class HnswVectorIndexRAM.Builder<TId,TVector,TItem extends com.github.jelmerk.knn.Item<TId,TVector>,TDistance>
- Builder for initializing an HnswVectorIndexRAM instance.
class HnswVectorIndexRAM.ItemIterator
static class HnswVectorIndexRAM.Node<TItem extends com.github.jelmerk.knn.Item>
-
Field Summary
protected HnswVectorIndexRAM.Node<TItem> entryPoint
-
Method Summary
boolean add(TItem item)
VectorIndexBuilder createPersistentIndex(Database database)
List<com.github.jelmerk.knn.SearchResult<TItem,TDistance>> findNearest(TVector destination, int k)
Optional<TItem> get(TId id)
int getDimensions()
- Returns the dimensionality of the items stored in this index.
Comparator<TDistance> getDistanceComparator()
- Returns the comparator used to compare distances.
com.github.jelmerk.knn.DistanceFunction<TVector,TDistance> getDistanceFunction()
- Returns the distance function.
int getEf()
- Returns the size of the dynamic list for the nearest neighbors (used during the search).
int getEfConstruction()
- Returns the efConstruction parameter, which has the same meaning as ef but controls the trade-off between indexing time and index precision.
Integer getEntryPoint()
int getM()
- Returns the number of bi-directional links created for every new element during construction.
int getMaxItemCount()
- Returns the maximum number of items the index can hold.
Collection<TItem> items()
HnswVectorIndexRAM.ItemIterator iterateNodes()
static <TId,TVector,TItem extends com.github.jelmerk.knn.Item<TId,TVector>,TDistance> HnswVectorIndexRAM<TId,TVector,TItem,TDistance> load(File file)
- Restores an HnswVectorIndexRAM from a File.
static <TId,TVector,TItem extends com.github.jelmerk.knn.Item<TId,TVector>,TDistance> HnswVectorIndexRAM<TId,TVector,TItem,TDistance> load(File file, ClassLoader classLoader)
- Restores an HnswVectorIndexRAM from a File.
static <TId,TVector,TItem extends com.github.jelmerk.knn.Item<TId,TVector>,TDistance> HnswVectorIndexRAM<TId,TVector,TItem,TDistance> load(InputStream inputStream)
- Restores an HnswVectorIndexRAM from an InputStream.
static <TId,TVector,TItem extends com.github.jelmerk.knn.Item<TId,TVector>,TDistance> HnswVectorIndexRAM<TId,TVector,TItem,TDistance> load(InputStream inputStream, ClassLoader classLoader)
- Restores an HnswVectorIndexRAM from an InputStream.
static <TId,TVector,TItem extends com.github.jelmerk.knn.Item<TId,TVector>,TDistance> HnswVectorIndexRAM<TId,TVector,TItem,TDistance> load(Path path)
- Restores an HnswVectorIndexRAM from a Path.
static <TId,TVector,TItem extends com.github.jelmerk.knn.Item<TId,TVector>,TDistance> HnswVectorIndexRAM<TId,TVector,TItem,TDistance> load(Path path, ClassLoader classLoader)
- Restores an HnswVectorIndexRAM from a Path.
static <TId,TVector,TItem extends com.github.jelmerk.knn.Item<TId,TVector>,TDistance extends Comparable<TDistance>> HnswVectorIndexRAM.Builder<TId,TVector,TItem,TDistance> newBuilder(int dimensions, com.github.jelmerk.knn.DistanceFunction<TVector,TDistance> distanceFunction, int maxItemCount)
- Starts the process of building a new HNSW index.
static <TId,TVector,TItem extends com.github.jelmerk.knn.Item<TId,TVector>,TDistance> HnswVectorIndexRAM.Builder<TId,TVector,TItem,TDistance> newBuilder(int dimensions, com.github.jelmerk.knn.DistanceFunction<TVector,TDistance> distanceFunction, Comparator<TDistance> distanceComparator, int maxItemCount)
- Starts the process of building a new HNSW index.
boolean remove(TId id, long version)
void resize(int newSize)
- Changes the maximum capacity of the index.
void save(OutputStream out)
void setEf(int ef)
- Sets the size of the dynamic list for the nearest neighbors (used during the search).
int size()
-
-
-
Field Detail
-
entryPoint
protected volatile HnswVectorIndexRAM.Node<TItem> entryPoint
-
-
Method Detail
-
size
public int size()
-
items
public Collection<TItem> items()
-
iterateNodes
public HnswVectorIndexRAM.ItemIterator iterateNodes()
-
remove
public boolean remove(TId id, long version)
-
add
public boolean add(TItem item)
-
findNearest
public List<com.github.jelmerk.knn.SearchResult<TItem,TDistance>> findNearest(TVector destination, int k)
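This page gives no description for findNearest. Assuming it returns the k items closest to the given vector, as the name and the SearchResult return type suggest, its results are approximate because HNSW is an approximate nearest-neighbor method. An exact brute-force search can serve as a reference when checking result quality on small data sets. The sketch below is not part of this class: BruteForceKnn, Hit and euclidean are hypothetical stand-ins, and a plain BiFunction replaces com.github.jelmerk.knn.DistanceFunction so that the example needs only the JDK.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.BiFunction;

/** Exact k-nearest-neighbor search by scanning every item (hypothetical helper). */
final class BruteForceKnn {

    /** Pairs an item with its distance to the query (stand-in for SearchResult). */
    static final class Hit<T, D> {
        final T item;
        final D distance;

        Hit(T item, D distance) {
            this.item = item;
            this.distance = distance;
        }
    }

    /** Returns at most k hits, ordered from closest to farthest according to the comparator. */
    static <V, D> List<Hit<V, D>> findNearest(List<V> items, V query, int k,
                                              BiFunction<V, V, D> distanceFunction,
                                              Comparator<D> distanceComparator) {
        List<Hit<V, D>> hits = new ArrayList<>(items.size());
        for (V item : items) {
            hits.add(new Hit<>(item, distanceFunction.apply(item, query)));
        }
        // Sort all candidates by distance; O(n log n), acceptable for a reference check.
        hits.sort((a, b) -> distanceComparator.compare(a.distance, b.distance));
        int limit = Math.min(Math.max(k, 0), hits.size());
        return new ArrayList<>(hits.subList(0, limit));
    }

    /** Euclidean distance between two vectors of equal length. */
    static Float euclidean(float[] u, float[] v) {
        double sum = 0;
        for (int i = 0; i < u.length; i++) {
            double d = u[i] - v[i];
            sum += d * d;
        }
        return (float) Math.sqrt(sum);
    }
}
```

Comparing the items returned by this scan with those returned by the index for the same query and k shows how many true neighbors the approximate search found.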
-
resize
public void resize(int newSize)
Changes the maximum capacity of the index.
- Parameters:
newSize
- new size of the index
-
getDimensions
public int getDimensions()
Returns the dimensionality of the items stored in this index.
- Returns:
- the dimensionality of the items stored in this index
-
getM
public int getM()
Returns the number of bi-directional links created for every new element during construction.
- Returns:
- the number of bi-directional links created for every new element during construction
-
getEf
public int getEf()
Returns the size of the dynamic list for the nearest neighbors (used during the search).
- Returns:
- the size of the dynamic list for the nearest neighbors
-
setEf
public void setEf(int ef)
Sets the size of the dynamic list for the nearest neighbors (used during the search).
- Parameters:
ef
- The size of the dynamic list for the nearest neighbors
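In HNSW, a larger ef generally improves search accuracy at the cost of slower queries, so ef is usually tuned by measuring recall against exact results. The sketch below shows one way to compute that recall. RecallAtK is a hypothetical helper, not part of this class, and it assumes the exact identifiers come from a separate exhaustive search and contain no duplicates.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Computes the fraction of true nearest neighbors that an approximate search returned. */
final class RecallAtK {

    /**
     * @param exactIds       identifiers of the true k nearest neighbors (assumed distinct)
     * @param approximateIds identifiers returned by the approximate index for the same query
     * @return a value between 0.0 and 1.0; 1.0 when exactIds is empty
     */
    static <TId> double recall(List<TId> exactIds, List<TId> approximateIds) {
        if (exactIds.isEmpty()) {
            return 1.0;
        }
        Set<TId> approximate = new HashSet<>(approximateIds);
        int found = 0;
        for (TId id : exactIds) {
            if (approximate.contains(id)) {
                found++;
            }
        }
        return (double) found / exactIds.size();
    }
}
```

A typical tuning loop would call setEf with increasing values, rerun a fixed set of queries, and stop once the average recall reaches the target.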
-
getEfConstruction
public int getEfConstruction()
Returns the efConstruction parameter, which has the same meaning as ef but controls the trade-off between indexing time and index precision.
- Returns:
- the efConstruction parameter
-
getDistanceFunction
public com.github.jelmerk.knn.DistanceFunction<TVector,TDistance> getDistanceFunction()
Returns the distance function.
- Returns:
- the distance function
-
getDistanceComparator
public Comparator<TDistance> getDistanceComparator()
Returns the comparator used to compare distances.
- Returns:
- the comparator used to compare distances
-
getMaxItemCount
public int getMaxItemCount()
Returns the maximum number of items the index can hold.
- Returns:
- the maximum number of items the index can hold
-
save
public void save(OutputStream out) throws IOException
-
load
public static <TId,TVector,TItem extends com.github.jelmerk.knn.Item<TId,TVector>,TDistance> HnswVectorIndexRAM<TId,TVector,TItem,TDistance> load(File file) throws IOException
Restores an HnswVectorIndexRAM from a File.
- Type Parameters:
TId
- Type of the external identifier of an item
TVector
- Type of the vector to perform distance calculation on
TItem
- Type of items stored in the index
TDistance
- Type of distance between items (any numeric type is expected: float, double, int, ...)
- Parameters:
file
- File to restore the index from
- Returns:
- The restored index
- Throws:
IOException
- in case of an I/O exception
-
load
public static <TId,TVector,TItem extends com.github.jelmerk.knn.Item<TId,TVector>,TDistance> HnswVectorIndexRAM<TId,TVector,TItem,TDistance> load(File file, ClassLoader classLoader) throws IOException
Restores an HnswVectorIndexRAM from a File.
- Type Parameters:
TId
- Type of the external identifier of an item
TVector
- Type of the vector to perform distance calculation on
TItem
- Type of items stored in the index
TDistance
- Type of distance between items (any numeric type is expected: float, double, int, ...)
- Parameters:
file
- File to restore the index from
classLoader
- the classloader to use
- Returns:
- The restored index
- Throws:
IOException
- in case of an I/O exception
-
load
public static <TId,TVector,TItem extends com.github.jelmerk.knn.Item<TId,TVector>,TDistance> HnswVectorIndexRAM<TId,TVector,TItem,TDistance> load(Path path) throws IOException
Restores an HnswVectorIndexRAM from a Path.
- Type Parameters:
TId
- Type of the external identifier of an item
TVector
- Type of the vector to perform distance calculation on
TItem
- Type of items stored in the index
TDistance
- Type of distance between items (any numeric type is expected: float, double, int, ...)
- Parameters:
path
- Path to restore the index from
- Returns:
- The restored index
- Throws:
IOException
- in case of an I/O exception
-
load
public static <TId,TVector,TItem extends com.github.jelmerk.knn.Item<TId,TVector>,TDistance> HnswVectorIndexRAM<TId,TVector,TItem,TDistance> load(Path path, ClassLoader classLoader) throws IOException
Restores an HnswVectorIndexRAM from a Path.
- Type Parameters:
TId
- Type of the external identifier of an item
TVector
- Type of the vector to perform distance calculation on
TItem
- Type of items stored in the index
TDistance
- Type of distance between items (any numeric type is expected: float, double, int, ...)
- Parameters:
path
- Path to restore the index from
classLoader
- the classloader to use
- Returns:
- The restored index
- Throws:
IOException
- in case of an I/O exception
-
load
public static <TId,TVector,TItem extends com.github.jelmerk.knn.Item<TId,TVector>,TDistance> HnswVectorIndexRAM<TId,TVector,TItem,TDistance> load(InputStream inputStream) throws IOException
Restores an HnswVectorIndexRAM from an InputStream.
- Type Parameters:
TId
- Type of the external identifier of an item
TVector
- Type of the vector to perform distance calculation on
TItem
- Type of items stored in the index
TDistance
- Type of distance between items (any numeric type is expected: float, double, int, ...)
- Parameters:
inputStream
- InputStream to restore the index from
- Returns:
- The restored index
- Throws:
IOException
- in case of an I/O exception
IllegalArgumentException
- in case the file cannot be read
-
load
public static <TId,TVector,TItem extends com.github.jelmerk.knn.Item<TId,TVector>,TDistance> HnswVectorIndexRAM<TId,TVector,TItem,TDistance> load(InputStream inputStream, ClassLoader classLoader) throws IOException
Restores an HnswVectorIndexRAM from an InputStream.
- Type Parameters:
TId
- Type of the external identifier of an item
TVector
- Type of the vector to perform distance calculation on
TItem
- Type of items stored in the index
TDistance
- Type of distance between items (any numeric type is expected: float, double, int, ...)
- Parameters:
inputStream
- InputStream to restore the index from
classLoader
- the classloader to use
- Returns:
- The restored index
- Throws:
IOException
- in case of an I/O exception
IllegalArgumentException
- in case the file cannot be read
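The Serializable interface and the ClassLoader overloads of load suggest that the index is persisted with standard Java object serialization, although this page does not confirm it. Under that assumption, the sketch below shows the general pattern: an ObjectInputStream subclass resolves classes through a caller-supplied ClassLoader, and a ClassNotFoundException is reported as an IllegalArgumentException. ClassLoaderObjectInputStream and SerializationSketch are hypothetical names used for illustration only; they are not the actual implementation of save or load.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.OutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

/** ObjectInputStream that resolves classes through a caller-supplied ClassLoader. */
final class ClassLoaderObjectInputStream extends ObjectInputStream {
    private final ClassLoader classLoader;

    ClassLoaderObjectInputStream(InputStream in, ClassLoader classLoader) throws IOException {
        super(in);
        this.classLoader = classLoader;
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc) throws IOException, ClassNotFoundException {
        try {
            return Class.forName(desc.getName(), false, classLoader);
        } catch (ClassNotFoundException e) {
            // Fall back to the default lookup, which also handles primitive types.
            return super.resolveClass(desc);
        }
    }
}

/** Save/load helpers mirroring the shape of the documented methods (assumed behavior). */
final class SerializationSketch {

    /** Writes the value and closes the given stream. */
    static void save(Serializable value, OutputStream out) throws IOException {
        try (ObjectOutputStream oos = new ObjectOutputStream(out)) {
            oos.writeObject(value);
        }
    }

    @SuppressWarnings("unchecked")
    static <T> T load(InputStream in, ClassLoader classLoader) throws IOException {
        try (ObjectInputStream ois = new ClassLoaderObjectInputStream(in, classLoader)) {
            return (T) ois.readObject();
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("Could not read input.", e);
        }
    }

    /** Serializes to memory and reads the value back; wraps I/O failures for convenience. */
    static <T extends Serializable> T roundTrip(T value, ClassLoader classLoader) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            save(value, buffer);
            return load(new ByteArrayInputStream(buffer.toByteArray()), classLoader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Passing an explicit ClassLoader matters when the item classes stored in the index are not visible to the loader that loaded the index class itself, for example in plugin or application-server environments.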
-
newBuilder
public static <TId,TVector,TItem extends com.github.jelmerk.knn.Item<TId,TVector>,TDistance extends Comparable<TDistance>> HnswVectorIndexRAM.Builder<TId,TVector,TItem,TDistance> newBuilder(int dimensions, com.github.jelmerk.knn.DistanceFunction<TVector,TDistance> distanceFunction, int maxItemCount)
Starts the process of building a new HNSW index.
- Type Parameters:
TVector
- Type of the vector to perform distance calculation on
TDistance
- Type of distance between items (any numeric type is expected: float, double, int, ...)
- Parameters:
dimensions
- the dimensionality of the vectors stored in the index
distanceFunction
- the distance function
maxItemCount
- maximum number of items the index can hold
- Returns:
- a builder
-
newBuilder
public static <TId,TVector,TItem extends com.github.jelmerk.knn.Item<TId,TVector>,TDistance> HnswVectorIndexRAM.Builder<TId,TVector,TItem,TDistance> newBuilder(int dimensions, com.github.jelmerk.knn.DistanceFunction<TVector,TDistance> distanceFunction, Comparator<TDistance> distanceComparator, int maxItemCount)
Starts the process of building a new HNSW index.
- Type Parameters:
TVector
- Type of the vector to perform distance calculation on
TDistance
- Type of distance between items (any numeric type is expected: float, double, int, ...)
- Parameters:
dimensions
- the dimensionality of the vectors stored in the index
distanceFunction
- the distance function
distanceComparator
- used to compare distances
maxItemCount
- maximum number of items the index can hold
- Returns:
- a builder
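The two newBuilder overloads differ only in how distances are ordered: the first requires TDistance to be Comparable, the second accepts an explicit Comparator. The sketch below illustrates that relationship with JDK-only stand-ins. DistanceFn is a hypothetical substitute for com.github.jelmerk.knn.DistanceFunction, and the delegation to Comparator.naturalOrder() is an assumption about how the Comparable overload behaves, not something this page states.

```java
import java.util.Comparator;

/** Hypothetical stand-in for com.github.jelmerk.knn.DistanceFunction. */
interface DistanceFn<V, D> {
    D distance(V u, V v);
}

/** Shows how a distance function and a distance comparator work together. */
final class DistanceOrdering {

    /** Euclidean distance over float vectors of equal length. */
    static final DistanceFn<float[], Float> EUCLIDEAN = (u, v) -> {
        double sum = 0;
        for (int i = 0; i < u.length; i++) {
            double d = u[i] - v[i];
            sum += d * d;
        }
        return (float) Math.sqrt(sum);
    };

    /** Returns whichever of a or b the comparator ranks first relative to the query. */
    static <V, D> V closer(V query, V a, V b, DistanceFn<V, D> fn, Comparator<D> comparator) {
        D toA = fn.distance(query, a);
        D toB = fn.distance(query, b);
        return comparator.compare(toA, toB) <= 0 ? a : b;
    }

    /** Convenience overload for Comparable distances: assumes natural ordering. */
    static <V, D extends Comparable<D>> V closer(V query, V a, V b, DistanceFn<V, D> fn) {
        return closer(query, a, b, fn, Comparator.naturalOrder());
    }
}
```

With a Comparable distance type such as Float the shorter overload is usually enough; the Comparator variant is there for distance types that are not Comparable or that need a non-default ordering.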
-
createPersistentIndex
public VectorIndexBuilder createPersistentIndex(Database database)
-
getEntryPoint
public Integer getEntryPoint()
-
-