Package com.arcadedb.schema
Class VectorIndexBuilder
- java.lang.Object
-
- com.arcadedb.schema.IndexBuilder<HnswVectorIndex>
-
- com.arcadedb.schema.VectorIndexBuilder
-
public class VectorIndexBuilder extends IndexBuilder<HnswVectorIndex>
Builder class for vector indexes.
- Author:
- Luca Garulli
-
-
Field Summary
Fields
Modifier and Type    Field    Description
static int    DEFAULT_EF
static int    DEFAULT_EF_CONSTRUCTION
static int    DEFAULT_M
-
Fields inherited from class com.arcadedb.schema.IndexBuilder
BUILD_BATCH_SIZE
-
-
Constructor Summary
Constructors
Constructor    Description
VectorIndexBuilder(Database database, HnswVectorIndexRAM origin)
-
Method Summary
-
Methods inherited from class com.arcadedb.schema.IndexBuilder
getCallback, getDatabase, getFilePath, getIndexImplementation, getIndexName, getIndexType, getKeyTypes, getNullStrategy, getPageSize, isUnique, withBatchSize, withCallback, withFilePath, withIgnoreIfExists, withIndexName, withKeyTypes, withMaxAttempts, withNullStrategy, withPageSize, withType, withUnique
-
-
-
-
Field Detail
-
DEFAULT_M
public static final int DEFAULT_M
- See Also:
- Constant Field Values
-
DEFAULT_EF
public static final int DEFAULT_EF
- See Also:
- Constant Field Values
-
DEFAULT_EF_CONSTRUCTION
public static final int DEFAULT_EF_CONSTRUCTION
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
VectorIndexBuilder
public VectorIndexBuilder(Database database, HnswVectorIndexRAM origin)
-
-
Method Detail
-
create
public HnswVectorIndex create()
- Specified by:
create in class IndexBuilder<HnswVectorIndex>
-
withDistanceFunction
public VectorIndexBuilder withDistanceFunction(com.github.jelmerk.knn.DistanceFunction distanceFunction)
-
withDistanceComparator
public VectorIndexBuilder withDistanceComparator(Comparator distanceComparator)
-
withDimensions
public VectorIndexBuilder withDimensions(int dimensions)
-
withMaxItemCount
public VectorIndexBuilder withMaxItemCount(int maxItemCount)
-
withM
public VectorIndexBuilder withM(int m)
Sets the number of bi-directional links created for every new element during construction. A reasonable range for m is 2-100. Higher values of m work better on datasets with high intrinsic dimensionality and/or when high recall is required, while lower values work better on datasets with low intrinsic dimensionality and/or when low recall is acceptable. The parameter also determines the algorithm's memory consumption. For example, for random vectors with d = 4 the optimal m for search is around 6, while high-dimensional datasets (word embeddings, good face descriptors) require higher values (e.g. m = 48 or 64) for optimal performance at high recall. The range m = 12-48 is fine for most use cases. When m is changed, the other parameters have to be updated as well. Nonetheless, the ef and efConstruction parameters can be roughly estimated by assuming that m * efConstruction is a constant.
- Parameters:
m
- the number of bi-directional links created for every new element during construction
- Returns:
- the builder
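The rule of thumb that the product of m and efConstruction stays roughly constant can be turned into a small helper. The following is only a sketch: the class and method names are hypothetical and not part of ArcadeDB, and the baseline values in `main` are illustrative numbers, not the values of DEFAULT_M or DEFAULT_EF_CONSTRUCTION (which this page does not list).

```java
// Hypothetical helper, NOT part of ArcadeDB: rescales efConstruction when m
// changes, assuming m * efConstruction should stay roughly constant.
final class HnswParamSketch {

  static int rescaleEfConstruction(int oldM, int oldEfConstruction, int newM) {
    if (oldM <= 0 || oldEfConstruction <= 0 || newM <= 0)
      throw new IllegalArgumentException("all parameters must be positive");

    // Use long arithmetic so the product cannot overflow an int.
    final long product = (long) oldM * oldEfConstruction;

    // Round to the nearest integer and never return less than 1.
    return (int) Math.max(1L, Math.round((double) product / newM));
  }

  public static void main(String[] args) {
    // Illustrative baseline (an assumption): m = 16, efConstruction = 200.
    System.out.println(rescaleEfConstruction(16, 200, 32));
  }
}
```

The result is only a starting point for `withEfConstruction(...)`; the recall check described for efConstruction remains the way to confirm the choice.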
-
withEfConstruction
public VectorIndexBuilder withEfConstruction(int efConstruction)
The parameter has the same meaning as ef, but controls the trade-off between index build time and index precision. A bigger efConstruction leads to longer construction but better index quality. Beyond some point, increasing efConstruction no longer improves the quality of the index. One way to check whether the chosen efConstruction is adequate is to measure the recall of an m nearest neighbor search with ef = efConstruction: if the recall is lower than 0.9, there is room for improvement.
- Parameters:
efConstruction
- controls the index time / index precision
- Returns:
- the builder
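The recall check can be sketched as follows, assuming recall is defined as the fraction of the exact nearest neighbors that the approximate search also returns. All names here are hypothetical and not part of ArcadeDB; the exact ids would come from a brute-force scan and the approximate ids from a search on the index, neither of which is shown.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, NOT part of ArcadeDB: compares the ids returned by an
// approximate search with the ids of the exact nearest neighbors.
final class RecallSketch {

  // Threshold taken from the efConstruction description.
  static final double MIN_RECALL = 0.9;

  // Fraction of the exact neighbors that also appear in the approximate result.
  static double recall(List<Integer> exactIds, List<Integer> approxIds) {
    if (exactIds.isEmpty())
      throw new IllegalArgumentException("exactIds must not be empty");

    final Set<Integer> exact = new HashSet<>(exactIds);
    int hits = 0;
    // De-duplicate the approximate ids so a repeated id is counted only once.
    for (Integer id : new HashSet<>(approxIds))
      if (exact.contains(id))
        hits++;
    return (double) hits / exact.size();
  }

  // True when recall is below 0.9, i.e. a larger efConstruction may help.
  static boolean hasRoomForImprovement(double recall) {
    return recall < MIN_RECALL;
  }
}
```

In practice the recall would be averaged over many query vectors rather than taken from a single query.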
-
withEf
public VectorIndexBuilder withEf(int ef)
Sets the size of the dynamic list for the nearest neighbors (used during the search). A higher ef leads to a more accurate but slower search. The value of ef can be anything between k (the number of nearest neighbors requested) and the size of the dataset.
- Parameters:
ef
- size of the dynamic list for the nearest neighbors
- Returns:
- the builder
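The valid range for ef can be enforced before the value is passed to `withEf(...)`. This is a minimal sketch with hypothetical names, not part of ArcadeDB; whether the library itself validates or clamps ef is not stated on this page.

```java
// Hypothetical helper, NOT part of ArcadeDB: keeps ef inside the range
// [k, datasetSize] described for the ef parameter.
final class EfRangeSketch {

  static int clampEf(int requestedEf, int k, int datasetSize) {
    if (k <= 0)
      throw new IllegalArgumentException("k must be positive");
    if (datasetSize < k)
      throw new IllegalArgumentException("datasetSize must be at least k");

    // Cap at the dataset size first, then raise to k if still too small.
    return Math.max(k, Math.min(requestedEf, datasetSize));
  }
}
```

Values already inside the range are returned unchanged, so the helper can be applied unconditionally.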
-
withVertexType
public VectorIndexBuilder withVertexType(String vertexType)
-
withEdgeType
public VectorIndexBuilder withEdgeType(String edgeType)
-
withVectorProperty
public VectorIndexBuilder withVectorProperty(String vectorPropertyName, Type vectorPropertyType)
-
withIdProperty
public VectorIndexBuilder withIdProperty(String idPropertyName)
-
withDeletedProperty
public VectorIndexBuilder withDeletedProperty(String deletedPropertyName)
-
withCache
public VectorIndexBuilder withCache(Map<RID,Vertex> cache)
-
withVertexCreationCallback
public VectorIndexBuilder withVertexCreationCallback(HnswVectorIndex.BuildVectorIndexCallback callback)
-
getDimensions
public int getDimensions()
-
getDistanceFunction
public com.github.jelmerk.knn.DistanceFunction getDistanceFunction()
-
getDistanceComparator
public Comparator getDistanceComparator()
-
getM
public int getM()
-
getEf
public int getEf()
-
getEfConstruction
public int getEfConstruction()
-
getMaxItemCount
public int getMaxItemCount()
-
getVertexType
public String getVertexType()
-
getIdPropertyName
public String getIdPropertyName()
-
getDeletedPropertyName
public String getDeletedPropertyName()
-
getEdgeType
public String getEdgeType()
-
getVectorPropertyName
public String getVectorPropertyName()
-
-