Package com.arcadedb.index.vector
Class HnswVectorIndex<TId,TVector,TDistance>
- java.lang.Object
-
- com.arcadedb.engine.Component
-
- com.arcadedb.index.vector.HnswVectorIndex<TId,TVector,TDistance>
-
- All Implemented Interfaces:
Index, IndexInternal
public class HnswVectorIndex<TId,TVector,TDistance> extends Component implements Index, IndexInternal
This work is derived from the excellent work by Jelmer Kuperus at https://github.com/jelmerk/hnswlib.
Implementation of Index that implements the HNSW algorithm. TODO: Check whether the global lock interferes with ArcadeDB's transaction approach.
- Author:
- Luca Garulli
- See Also:
- Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs
-
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static interface
HnswVectorIndex.BuildVectorIndexCallback<TId,TVector>
static interface
HnswVectorIndex.IgnoreVertexCallback
static class
HnswVectorIndex.IndexFactoryHandler
static class
HnswVectorIndex.PaginatedComponentFactoryHandlerUnique
-
Nested classes/interfaces inherited from interface com.arcadedb.index.Index
Index.BuildIndexCallback
-
Nested classes/interfaces inherited from interface com.arcadedb.index.IndexInternal
IndexInternal.INDEX_STATUS
-
-
Field Summary
Fields
Modifier and Type | Field | Description
static int
CURRENT_VERSION
Vertex
entryPoint
RID
entryPointRIDToLoad
static String
FILE_EXT
-
Constructor Summary
Constructors
Modifier | Constructor | Description
protected
HnswVectorIndex(DatabaseInternal database, String indexName, String filePath, int id, int version)
Used at load time.
protected
HnswVectorIndex(VectorIndexBuilder builder)
-
Method Summary
Modifier and Type | Method | Description
boolean
add(Vertex vertex)
void
addAll(List<com.github.jelmerk.knn.Item<TId,TVector>> embeddings, HnswVectorIndex.BuildVectorIndexCallback callback)
void
addIndexOnBucket(IndexInternal index)
long
build(int buildIndexBatchSize, Index.BuildIndexCallback callback)
long
build(HnswVectorIndexRAM origin, int buildIndexBatchSize, HnswVectorIndex.BuildVectorIndexCallback vertexCreationCallback, Index.BuildIndexCallback edgeCallback)
void
close()
boolean
compact()
long
countEntries()
void
drop()
boolean
equals(Object obj)
List<com.github.jelmerk.knn.SearchResult<Vertex,TDistance>>
findNearest(TVector destination, int k, HnswVectorIndex.IgnoreVertexCallback ignoreVertexCallback)
List<Pair<Identifiable,? extends Number>>
findNeighborsFromId(TId id, int k)
List<Pair<Identifiable,? extends Number>>
findNeighborsFromId(TId id, int k, HnswVectorIndex.IgnoreVertexCallback ignoreVertexCallback)
List<Pair<Identifiable,? extends Number>>
findNeighborsFromVector(TVector vector, int k)
List<Pair<Identifiable,? extends Number>>
findNeighborsFromVector(TVector vector, int k, HnswVectorIndex.IgnoreVertexCallback ignoreVertexCallback)
List<Pair<Identifiable,? extends Number>>
findNeighborsFromVertex(Vertex start, int k, HnswVectorIndex.IgnoreVertexCallback ignoreVertexCallback)
IndexCursor
get(Object[] keys)
Retrieves the set of RIDs associated with a key.
IndexCursor
get(Object[] keys, int limit)
Retrieves the set of RIDs associated with a key, with a limit on the number of results.
int
getAssociatedBucketId()
IndexInternal
getAssociatedIndex()
byte[]
getBinaryKeyTypes()
Component
getComponent()
int
getDimensionFromVertex(Vertex vertex)
int
getDimensions()
Returns the dimensionality of the items stored in this index.
Comparator<TDistance>
getDistanceComparator()
Returns the comparator used to compare distances.
com.github.jelmerk.knn.DistanceFunction<TVector,TDistance>
getDistanceFunction()
Returns the distance function.
String
getEdgeType(int level)
int
getEf()
Returns the size of the dynamic list for the nearest neighbors (used during the search).
int
getEfConstruction()
Returns the parameter that has the same meaning as ef, but controls the trade-off between index build time and index precision.
List<Integer>
getFileIds()
<TId> TId
getIdFromVertex(Vertex vertex)
List<? extends Index>
getIndexesByKeys(Object[] keys)
IndexInternal[]
getIndexesOnBuckets()
Type[]
getKeyTypes()
int
getM()
Returns the number of bi-directional links created for every new element during construction.
int
getMaxItemCount()
Returns the maximum number of items the index can hold.
String
getMostRecentFileName()
String
getName()
LSMTreeIndexAbstract.NULL_STRATEGY
getNullStrategy()
int
getPageSize()
List<String>
getPropertyNames()
Map<String,Long>
getStats()
List<IndexInternal>
getSubIndexes()
Schema.INDEX_TYPE
getType()
TypeIndex
getTypeIndex()
String
getTypeName()
TypeIndex
getUnderlyingIndex()
<TVector> TVector
getVectorFromVertex(Vertex vertex)
int
hashCode()
boolean
ignoreVertex(Vertex vertex, HnswVectorIndex.IgnoreVertexCallback ignoreVertexCallback)
boolean
isAutomatic()
boolean
isCompacting()
boolean
isDeletedFromVertex(Vertex vertex)
boolean
isUnique()
boolean
isValid()
IndexCursor
iterator(boolean ascendingOrder)
IndexCursor
iterator(boolean ascendingOrder, Object[] fromKeys, boolean inclusive)
void
onAfterCommit()
void
onAfterSchemaLoad()
void
put(Object[] keys, RID[] rid)
Adds multiple values for one key in the index.
IndexCursor
range(boolean ascending, Object[] beginKeys, boolean beginKeysInclusive, Object[] endKeys, boolean endKeysInclusive)
void
remove(Object[] keys)
Removes the keys from the index.
void
remove(Object[] keys, Identifiable rid)
Removes a keys/record entry from the index.
void
removeIndexOnBucket(IndexInternal index)
void
save()
void
save(OutputStream out)
boolean
scheduleCompaction()
void
setMetadata(String name, String[] propertyNames, int associatedBucketId)
void
setNullStrategy(LSMTreeIndexAbstract.NULL_STRATEGY nullStrategy)
boolean
setStatus(IndexInternal.INDEX_STATUS[] expectedStatuses, IndexInternal.INDEX_STATUS newStatus)
void
setTypeIndex(TypeIndex typeIndex)
boolean
supportsOrderedIterations()
JSONObject
toJSON()
String
toString()
-
Methods inherited from class com.arcadedb.engine.Component
getDatabase, getFileId, getMainComponent, getVersion, onAfterLoad
-
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface com.arcadedb.index.IndexInternal
getFileId
-
-
-
-
Field Detail
-
FILE_EXT
public static final String FILE_EXT
- See Also:
- Constant Field Values
-
CURRENT_VERSION
public static final int CURRENT_VERSION
- See Also:
- Constant Field Values
-
entryPointRIDToLoad
public volatile RID entryPointRIDToLoad
-
entryPoint
public volatile Vertex entryPoint
-
-
Constructor Detail
-
HnswVectorIndex
protected HnswVectorIndex(VectorIndexBuilder builder)
-
HnswVectorIndex
protected HnswVectorIndex(DatabaseInternal database, String indexName, String filePath, int id, int version) throws IOException
Used at load time.
- Throws:
IOException
-
-
Method Detail
-
onAfterSchemaLoad
public void onAfterSchemaLoad()
- Overrides:
onAfterSchemaLoad
in class Component
-
onAfterCommit
public void onAfterCommit()
- Overrides:
onAfterCommit
in class Component
-
getName
public String getName()
-
findNeighborsFromId
public List<Pair<Identifiable,? extends Number>> findNeighborsFromId(TId id, int k)
-
findNeighborsFromId
public List<Pair<Identifiable,? extends Number>> findNeighborsFromId(TId id, int k, HnswVectorIndex.IgnoreVertexCallback ignoreVertexCallback)
-
findNeighborsFromVertex
public List<Pair<Identifiable,? extends Number>> findNeighborsFromVertex(Vertex start, int k, HnswVectorIndex.IgnoreVertexCallback ignoreVertexCallback)
-
findNeighborsFromVector
public List<Pair<Identifiable,? extends Number>> findNeighborsFromVector(TVector vector, int k)
-
findNeighborsFromVector
public List<Pair<Identifiable,? extends Number>> findNeighborsFromVector(TVector vector, int k, HnswVectorIndex.IgnoreVertexCallback ignoreVertexCallback)
-
addAll
public void addAll(List<com.github.jelmerk.knn.Item<TId,TVector>> embeddings, HnswVectorIndex.BuildVectorIndexCallback callback)
-
add
public boolean add(Vertex vertex)
-
getUnderlyingIndex
public TypeIndex getUnderlyingIndex()
-
findNearest
public List<com.github.jelmerk.knn.SearchResult<Vertex,TDistance>> findNearest(TVector destination, int k, HnswVectorIndex.IgnoreVertexCallback ignoreVertexCallback)
-
getDimensions
public int getDimensions()
Returns the dimensionality of the items stored in this index.
- Returns:
- the dimensionality of the items stored in this index
-
getM
public int getM()
Returns the number of bi-directional links created for every new element during construction.
- Returns:
- the number of bi-directional links created for every new element during construction
-
getEf
public int getEf()
Returns the size of the dynamic list for the nearest neighbors (used during the search).
- Returns:
- the size of the dynamic list for the nearest neighbors
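To illustrate what a dynamic candidate list of size ef does during a search, here is a minimal, self-contained sketch of a single-layer best-first search in the style of the HNSW paper listed under See Also. It is not ArcadeDB's implementation: the class EfSearchSketch, the method searchLayer, the array-based graph and the Euclidean distance are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Set;

// Hypothetical sketch: none of these names belong to ArcadeDB or hnswlib.
class EfSearchSketch {

  // Euclidean distance, assumed here only for illustration.
  static double distance(double[] a, double[] b) {
    double sum = 0;
    for (int i = 0; i < a.length; i++) {
      double d = a[i] - b[i];
      sum += d * d;
    }
    return Math.sqrt(sum);
  }

  // Returns up to ef node ids found near the query, nearest first.
  // vectors[i] is the vector of node i; neighbors[i] lists the nodes linked to node i.
  static List<Integer> searchLayer(double[][] vectors, int[][] neighbors,
      double[] query, int entryPoint, int ef) {
    Comparator<Integer> byDistance =
        Comparator.comparingDouble(id -> distance(vectors[id], query));
    PriorityQueue<Integer> candidates = new PriorityQueue<>(byDistance);      // nearest first
    PriorityQueue<Integer> best = new PriorityQueue<>(byDistance.reversed()); // farthest first
    Set<Integer> visited = new HashSet<>();

    candidates.add(entryPoint);
    best.add(entryPoint);
    visited.add(entryPoint);

    while (!candidates.isEmpty()) {
      int current = candidates.poll();
      // Stop when the closest open candidate is farther than the worst kept result.
      if (distance(vectors[current], query) > distance(vectors[best.peek()], query))
        break;
      for (int n : neighbors[current]) {
        if (!visited.add(n))
          continue; // already seen
        if (best.size() < ef
            || distance(vectors[n], query) < distance(vectors[best.peek()], query)) {
          candidates.add(n);
          best.add(n);
          if (best.size() > ef)
            best.poll(); // keep the dynamic list bounded to ef entries
        }
      }
    }
    List<Integer> result = new ArrayList<>(best);
    result.sort(byDistance);
    return result;
  }

  public static void main(String[] args) {
    // Five points on a line, linked as a chain 0-1-2-3-4.
    double[][] vectors = {{0}, {1}, {2}, {3}, {4}};
    int[][] neighbors = {{1}, {0, 2}, {1, 3}, {2, 4}, {3}};
    System.out.println(searchLayer(vectors, neighbors, new double[] {3.2}, 0, 2)); // prints [3, 4]
  }
}
```

A larger ef keeps more candidates alive during the traversal, which generally improves recall at the cost of more distance computations.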
-
getEfConstruction
public int getEfConstruction()
Returns the parameter that has the same meaning as ef, but controls the trade-off between index build time and index precision.
- Returns:
- the parameter that has the same meaning as ef, but controls the trade-off between index build time and index precision
-
getDistanceFunction
public com.github.jelmerk.knn.DistanceFunction<TVector,TDistance> getDistanceFunction()
Returns the distance function.
- Returns:
- the distance function
-
getDistanceComparator
public Comparator<TDistance> getDistanceComparator()
Returns the comparator used to compare distances.
- Returns:
- the comparator used to compare distances
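As a rough illustration of how a distance function and a distance comparator work together, the sketch below ranks items by exact (brute-force) nearest-neighbor search, which is the result an HNSW index approximates. Everything in it is an assumption for illustration: ExactKnnSketch, nearest and cosineDistance are hypothetical names, and a plain BiFunction stands in for com.github.jelmerk.knn.DistanceFunction.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.BiFunction;

// Hypothetical sketch: not part of ArcadeDB or hnswlib.
class ExactKnnSketch {

  // Cosine distance = 1 - cosine similarity; assumed here as an example metric.
  static float cosineDistance(float[] a, float[] b) {
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.length; i++) {
      dot += a[i] * b[i];
      normA += a[i] * a[i];
      normB += b[i] * b[i];
    }
    return (float) (1.0 - dot / (Math.sqrt(normA) * Math.sqrt(normB)));
  }

  // Returns the positions of the k items closest to the query, nearest first.
  // The distance function produces distances; the comparator orders them.
  static List<Integer> nearest(List<float[]> items, float[] query, int k,
      BiFunction<float[], float[], Float> distance, Comparator<Float> comparator) {
    List<Integer> ids = new ArrayList<>();
    for (int i = 0; i < items.size(); i++)
      ids.add(i);
    // Distances are recomputed on each comparison, which is fine for a small sketch.
    ids.sort((x, y) -> comparator.compare(
        distance.apply(items.get(x), query),
        distance.apply(items.get(y), query)));
    return new ArrayList<>(ids.subList(0, Math.min(k, ids.size())));
  }

  public static void main(String[] args) {
    List<float[]> items = List.of(
        new float[] {1, 0}, new float[] {0, 1}, new float[] {1, 1});
    List<Integer> result = nearest(items, new float[] {1, 0.1f}, 2,
        ExactKnnSketch::cosineDistance, Comparator.naturalOrder());
    System.out.println(result); // prints [0, 2]
  }
}
```

Keeping the comparator separate from the distance function lets the distance type be anything that can be ordered, which matches the TDistance type parameter of this class.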
-
getMaxItemCount
public int getMaxItemCount()
Returns the maximum number of items the index can hold.
- Returns:
- the maximum number of items the index can hold
-
save
public void save(OutputStream out) throws IOException
- Throws:
IOException
-
getIdFromVertex
public <TId> TId getIdFromVertex(Vertex vertex)
-
getVectorFromVertex
public <TVector> TVector getVectorFromVertex(Vertex vertex)
-
isDeletedFromVertex
public boolean isDeletedFromVertex(Vertex vertex)
-
ignoreVertex
public boolean ignoreVertex(Vertex vertex, HnswVectorIndex.IgnoreVertexCallback ignoreVertexCallback)
-
getDimensionFromVertex
public int getDimensionFromVertex(Vertex vertex)
-
getEdgeType
public String getEdgeType(int level)
-
save
public void save()
-
toJSON
public JSONObject toJSON()
- Specified by:
toJSON
in interface IndexInternal
-
getAssociatedIndex
public IndexInternal getAssociatedIndex()
- Specified by:
getAssociatedIndex
in interface IndexInternal
-
drop
public void drop()
- Specified by:
drop
in interface IndexInternal
-
getStats
public Map<String,Long> getStats()
- Specified by:
getStats
in interface IndexInternal
-
getNullStrategy
public LSMTreeIndexAbstract.NULL_STRATEGY getNullStrategy()
- Specified by:
getNullStrategy
in interface Index
-
setNullStrategy
public void setNullStrategy(LSMTreeIndexAbstract.NULL_STRATEGY nullStrategy)
- Specified by:
setNullStrategy
in interface Index
-
supportsOrderedIterations
public boolean supportsOrderedIterations()
- Specified by:
supportsOrderedIterations
in interface Index
-
isAutomatic
public boolean isAutomatic()
- Specified by:
isAutomatic
in interface Index
-
getPageSize
public int getPageSize()
- Specified by:
getPageSize
in interface IndexInternal
-
build
public long build(int buildIndexBatchSize, Index.BuildIndexCallback callback)
- Specified by:
build
in interface IndexInternal
-
build
public long build(HnswVectorIndexRAM origin, int buildIndexBatchSize, HnswVectorIndex.BuildVectorIndexCallback vertexCreationCallback, Index.BuildIndexCallback edgeCallback)
-
getSubIndexes
public List<IndexInternal> getSubIndexes()
-
setMetadata
public void setMetadata(String name, String[] propertyNames, int associatedBucketId)
- Specified by:
setMetadata
in interface IndexInternal
-
setStatus
public boolean setStatus(IndexInternal.INDEX_STATUS[] expectedStatuses, IndexInternal.INDEX_STATUS newStatus)
- Specified by:
setStatus
in interface IndexInternal
-
getComponent
public Component getComponent()
- Specified by:
getComponent
in interface IndexInternal
-
getKeyTypes
public Type[] getKeyTypes()
- Specified by:
getKeyTypes
in interface IndexInternal
-
getBinaryKeyTypes
public byte[] getBinaryKeyTypes()
- Specified by:
getBinaryKeyTypes
in interface IndexInternal
-
getFileIds
public List<Integer> getFileIds()
- Specified by:
getFileIds
in interface IndexInternal
-
setTypeIndex
public void setTypeIndex(TypeIndex typeIndex)
- Specified by:
setTypeIndex
in interface IndexInternal
-
getTypeIndex
public TypeIndex getTypeIndex()
- Specified by:
getTypeIndex
in interface IndexInternal
-
getAssociatedBucketId
public int getAssociatedBucketId()
- Specified by:
getAssociatedBucketId
in interface Index
-
addIndexOnBucket
public void addIndexOnBucket(IndexInternal index)
-
removeIndexOnBucket
public void removeIndexOnBucket(IndexInternal index)
-
getIndexesOnBuckets
public IndexInternal[] getIndexesOnBuckets()
-
iterator
public IndexCursor iterator(boolean ascendingOrder)
-
iterator
public IndexCursor iterator(boolean ascendingOrder, Object[] fromKeys, boolean inclusive)
-
range
public IndexCursor range(boolean ascending, Object[] beginKeys, boolean beginKeysInclusive, Object[] endKeys, boolean endKeysInclusive)
-
get
public IndexCursor get(Object[] keys)
Description copied from interface: Index
Retrieves the set of RIDs associated with a key.
-
get
public IndexCursor get(Object[] keys, int limit)
Description copied from interface: Index
Retrieves the set of RIDs associated with a key, with a limit on the number of results.
-
put
public void put(Object[] keys, RID[] rid)
Description copied from interface: Index
Adds multiple values for one key in the index.
-
remove
public void remove(Object[] keys)
Description copied from interface: Index
Removes the keys from the index.
-
remove
public void remove(Object[] keys, Identifiable rid)
Description copied from interface: Index
Removes a keys/record entry from the index.
-
countEntries
public long countEntries()
- Specified by:
countEntries
in interface Index
-
compact
public boolean compact() throws IOException, InterruptedException
- Specified by:
compact
in interface IndexInternal
- Throws:
IOException
InterruptedException
-
isCompacting
public boolean isCompacting()
- Specified by:
isCompacting
in interface IndexInternal
-
isValid
public boolean isValid()
- Specified by:
isValid
in interface IndexInternal
-
scheduleCompaction
public boolean scheduleCompaction()
- Specified by:
scheduleCompaction
in interface IndexInternal
-
getMostRecentFileName
public String getMostRecentFileName()
- Specified by:
getMostRecentFileName
in interface IndexInternal
-
getType
public Schema.INDEX_TYPE getType()
-
getTypeName
public String getTypeName()
- Specified by:
getTypeName
in interface Index
-
getPropertyNames
public List<String> getPropertyNames()
- Specified by:
getPropertyNames
in interface Index
-
close
public void close()
- Specified by:
close
in interface IndexInternal
- Specified by:
close
in class Component
-
-