Class EmbeddingStoreIngestor

java.lang.Object
dev.langchain4j.store.embedding.EmbeddingStoreIngestor

public class EmbeddingStoreIngestor extends Object
EmbeddingStoreIngestor ingests documents into an embedding store. It manages the whole pipeline: splitting documents into text segments, generating an embedding for each segment with the provided embedding model, and storing the embeddings in the embedding store. Optionally, it can transform documents before they are split (for example, to clean or reformat the data) and transform segments after they have been split.
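The order of the pipeline stages described above can be illustrated with a minimal, library-free sketch. Everything in it is a hypothetical stand-in rather than the LangChain4j API: documents and segments are plain strings, the transformers, splitter and embedding model are `java.util.function` interfaces, and `InMemoryStore` is an invented class. It only shows the sequence of steps and how a null transformer is skipped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.UnaryOperator;

public class IngestionPipelineSketch {

    // Hypothetical stand-in for an embedding store: keeps vectors and segments in memory.
    static final class InMemoryStore {
        final List<float[]> vectors = new ArrayList<>();
        final List<String> segments = new ArrayList<>();

        void add(float[] vector, String segment) {
            vectors.add(vector);
            segments.add(segment);
        }
    }

    // Runs the stages in order: transform document, split, transform segment, embed, store.
    static void ingest(String document,
                       UnaryOperator<String> documentTransformer,  // may be null
                       Function<String, List<String>> splitter,
                       UnaryOperator<String> segmentTransformer,   // may be null
                       Function<String, float[]> embeddingModel,
                       InMemoryStore store) {
        String text = documentTransformer == null
                ? document
                : documentTransformer.apply(document);
        for (String segment : splitter.apply(text)) {
            String transformed = segmentTransformer == null
                    ? segment
                    : segmentTransformer.apply(segment);
            store.add(embeddingModel.apply(transformed), transformed);
        }
    }

    public static void main(String[] args) {
        InMemoryStore store = new InMemoryStore();
        ingest("  First sentence. Second sentence.  ",
               String::trim,                                        // clean the document
               text -> Arrays.asList(text.split("(?<=\\.)\\s+")),   // split after each period
               String::toLowerCase,                                 // normalise each segment
               segment -> new float[] {segment.length()},           // toy "embedding"
               store);
        System.out.println(store.segments); // prints [first sentence., second sentence.]
    }
}
```

The toy embedding (segment length as a one-element vector) exists only to keep the sketch runnable; a real embedding model returns a dense vector computed by a neural network.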
  • Constructor Details

    • EmbeddingStoreIngestor

      public EmbeddingStoreIngestor(DocumentTransformer documentTransformer, DocumentSplitter documentSplitter, TextSegmentTransformer textSegmentTransformer, EmbeddingModel embeddingModel, EmbeddingStore<TextSegment> embeddingStore)
      Creates a new EmbeddingStoreIngestor.
      Parameters:
      documentTransformer - the document transformer to use, or null if no transformation is needed.
      documentSplitter - the document splitter to use.
      textSegmentTransformer - the text segment transformer to use, or null if no transformation is needed.
      embeddingModel - the embedding model to use.
      embeddingStore - the embedding store to use.
  • Method Details

    • ingest

      public void ingest(Document document)
      Ingests a single document.
      Parameters:
      document - the document to ingest.
    • ingest

      public void ingest(Document... documents)
      Ingests multiple documents passed as varargs.
      Parameters:
      documents - the documents to ingest.
    • ingest

      public void ingest(List<Document> documents)
      Ingests a list of documents.
      Parameters:
      documents - the documents to ingest.
    • builder

      public static EmbeddingStoreIngestor.Builder builder()
      Creates a new EmbeddingStoreIngestor builder.
      Returns:
      the builder.
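
The three ingest overloads listed above accept a single document, a varargs array, or a list. A common way to structure such overloads is to have the first two delegate to the list-based one. The sketch below shows that pattern with hypothetical stand-ins (`IngestOverloadsSketch`, documents as plain strings); whether EmbeddingStoreIngestor delegates internally in this way is an assumption, not something this page states.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class IngestOverloadsSketch {

    // Records every document that reaches the list-based overload.
    final List<String> ingested = new ArrayList<>();

    // Single document: wrap it in a one-element list and delegate.
    public void ingest(String document) {
        ingest(Collections.singletonList(document));
    }

    // Varargs: view the array as a list and delegate.
    public void ingest(String... documents) {
        ingest(Arrays.asList(documents));
    }

    // List: the one place where the actual work would happen.
    public void ingest(List<String> documents) {
        ingested.addAll(documents);
    }

    public static void main(String[] args) {
        IngestOverloadsSketch sketch = new IngestOverloadsSketch();
        sketch.ingest("a");                  // single-document overload
        sketch.ingest("b", "c");             // varargs overload
        sketch.ingest(List.of("d", "e"));    // list overload
        System.out.println(sketch.ingested); // prints [a, b, c, d, e]
    }
}
```

From the caller's side the three forms are interchangeable; the choice depends only on how the documents are already held.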