Class ClassPathDocumentLoader
java.lang.Object
dev.langchain4j.data.document.loader.ClassPathDocumentLoader
DocumentLoader implementation for loading documents using a ClassPathSource
- Author:
- Eric Deandrea
-
Method Summary
Modifier and Type | Method | Description
static dev.langchain4j.data.document.Document | loadDocument(String pathOnClasspath) | Loads a Document from the specified file path.
static dev.langchain4j.data.document.Document | loadDocument(String pathOnClasspath, dev.langchain4j.data.document.DocumentParser documentParser) | Loads a Document from the specified file path.
static List<dev.langchain4j.data.document.Document> | loadDocuments(String directoryOnClasspath) | Loads Documents from the specified directory.
static List<dev.langchain4j.data.document.Document> | loadDocuments(String directoryOnClasspath, dev.langchain4j.data.document.DocumentParser documentParser) | Loads Documents from the specified directory.
static List<dev.langchain4j.data.document.Document> | loadDocuments(String directoryOnClasspath, PathMatcher pathMatcher) | Loads matching Documents from the specified directory.
static List<dev.langchain4j.data.document.Document> | loadDocuments(String directoryOnClasspath, PathMatcher pathMatcher, dev.langchain4j.data.document.DocumentParser documentParser) | Loads matching Documents from the specified directory.
static List<dev.langchain4j.data.document.Document> | loadDocumentsRecursively(String directoryOnClasspath) | Recursively loads Documents from the specified directory and its subdirectories.
static List<dev.langchain4j.data.document.Document> | loadDocumentsRecursively(String directoryOnClasspath, dev.langchain4j.data.document.DocumentParser documentParser) | Recursively loads Documents from the specified directory and its subdirectories.
static List<dev.langchain4j.data.document.Document> | loadDocumentsRecursively(String directoryOnClasspath, PathMatcher pathMatcher) | Recursively loads matching Documents from the specified directory and its subdirectories.
static List<dev.langchain4j.data.document.Document> | loadDocumentsRecursively(String directoryOnClasspath, PathMatcher pathMatcher, dev.langchain4j.data.document.DocumentParser documentParser) | Recursively loads matching Documents from the specified directory and its subdirectories.
-
Method Details
-
loadDocument
public static dev.langchain4j.data.document.Document loadDocument(String pathOnClasspath)
Loads a Document from the specified file path.
The file is parsed using the default DocumentParser. The default DocumentParser is loaded through SPI (see DocumentParserFactory). If no DocumentParserFactory is available on the classpath, a TextDocumentParser is used.
The returned Document contains all the textual information from the file.
- Parameters:
pathOnClasspath - The path on the classpath to the file.
- Returns:
- document
- Throws:
IllegalArgumentException - If the specified path is not a file.
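The SPI lookup with a plain-text fallback described above can be pictured with the JDK's own ServiceLoader. This is only a sketch of the general pattern, not the library's code: the Parser, ParserFactory and PlainTextParser types below are hypothetical stand-ins invented for illustration and are not the LangChain4j interfaces.

```java
import java.util.Iterator;
import java.util.ServiceLoader;

public class DefaultParserLookup {

    /** Hypothetical stand-in for a parser; not the LangChain4j DocumentParser. */
    public interface Parser {
        String parse(String raw);
    }

    /** Hypothetical stand-in for a factory that would be discovered through SPI. */
    public interface ParserFactory {
        Parser create();
    }

    /** Plain-text fallback used when no factory is registered. */
    public static final class PlainTextParser implements Parser {
        @Override
        public String parse(String raw) {
            return raw;
        }
    }

    /** Returns a parser from the first registered factory, or the plain-text fallback. */
    public static Parser defaultParser() {
        Iterator<ParserFactory> factories = ServiceLoader.load(ParserFactory.class).iterator();
        if (factories.hasNext()) {
            return factories.next().create();
        }
        // No provider is registered under META-INF/services, so fall back.
        return new PlainTextParser();
    }

    public static void main(String[] args) {
        Parser parser = defaultParser();
        System.out.println(parser.getClass().getSimpleName());
    }
}
```

When no provider is registered, defaultParser() returns the fallback, which mirrors the documented behaviour of using a TextDocumentParser when no DocumentParserFactory is found.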
-
loadDocument
public static dev.langchain4j.data.document.Document loadDocument(String pathOnClasspath, dev.langchain4j.data.document.DocumentParser documentParser)
Loads a Document from the specified file path.
The file is parsed using the specified DocumentParser.
The returned Document contains all the textual information from the file.
- Parameters:
pathOnClasspath - The path on the classpath to the file.
documentParser - The parser to be used for parsing text from the file.
- Returns:
- document
- Throws:
IllegalArgumentException - If the specified path is not a file.
-
loadDocuments
public static List<dev.langchain4j.data.document.Document> loadDocuments(String directoryOnClasspath)
Loads Documents from the specified directory. Does not recurse into subdirectories.
The files are parsed using the default DocumentParser. The default DocumentParser is loaded through SPI (see DocumentParserFactory). If no DocumentParserFactory is available on the classpath, a TextDocumentParser is used.
Skips any Documents that fail to load.
- Parameters:
directoryOnClasspath - The path to the directory on the classpath with files.
- Returns:
- list of documents
- Throws:
IllegalArgumentException - If the specified path is not a directory.
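The "skip what fails, keep the rest" behaviour of a non-recursive directory load can be sketched with plain java.nio. The class below is an illustration under assumptions, not the loader's implementation: loadAll, demo and the loader callback are hypothetical names, and the sketch works on an ordinary file-system directory rather than a classpath resource.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class SkipOnFailure {

    /**
     * Loads every regular file directly inside directory (no recursion),
     * skipping files for which the loader throws.
     */
    public static <T> List<T> loadAll(Path directory, Function<Path, T> loader) {
        if (!Files.isDirectory(directory)) {
            throw new IllegalArgumentException(directory + " is not a directory");
        }
        List<Path> files;
        try (Stream<Path> entries = Files.list(directory)) {
            files = entries.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<T> loaded = new ArrayList<>();
        for (Path file : files) {
            try {
                loaded.add(loader.apply(file));
            } catch (RuntimeException e) {
                // A failing file is skipped; the remaining files are still loaded.
            }
        }
        return loaded;
    }

    /** Builds a temporary directory and loads it with one deliberately failing file. */
    public static List<String> demo() {
        try {
            Path dir = Files.createTempDirectory("skip-demo");
            Files.write(dir.resolve("a.txt"), "alpha".getBytes(StandardCharsets.UTF_8));
            Files.write(dir.resolve("bad.txt"), "broken".getBytes(StandardCharsets.UTF_8));
            Files.createDirectory(dir.resolve("sub"));
            Files.write(dir.resolve("sub").resolve("c.txt"), "nested".getBytes(StandardCharsets.UTF_8));
            return loadAll(dir, file -> {
                String name = file.getFileName().toString();
                if (name.startsWith("bad")) {
                    throw new IllegalStateException("cannot load " + name);
                }
                return name;
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In the demo, the nested file is never visited because only direct children are listed, and the failing file is dropped without aborting the whole load.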
-
loadDocuments
public static List<dev.langchain4j.data.document.Document> loadDocuments(String directoryOnClasspath, dev.langchain4j.data.document.DocumentParser documentParser)
Loads Documents from the specified directory. Does not recurse into subdirectories.
The files are parsed using the specified DocumentParser.
Skips any Documents that fail to load.
- Parameters:
directoryOnClasspath - The path to the directory on the classpath with files.
documentParser - The parser to be used for parsing text from each file.
- Returns:
- list of documents
- Throws:
IllegalArgumentException - If the specified path is not a directory.
-
loadDocuments
public static List<dev.langchain4j.data.document.Document> loadDocuments(String directoryOnClasspath, PathMatcher pathMatcher)
Loads matching Documents from the specified directory. Does not recurse into subdirectories.
The files are parsed using the default DocumentParser. The default DocumentParser is loaded through SPI (see DocumentParserFactory). If no DocumentParserFactory is available on the classpath, a TextDocumentParser is used.
Skips any Documents that fail to load.
- Parameters:
directoryOnClasspath - The path to the directory on the classpath with files.
pathMatcher - Only files whose paths match the provided PathMatcher will be loaded. For example, using FileSystems.getDefault().getPathMatcher("glob:*.txt") will load all files from directoryOnClasspath with a txt extension. When traversing the directory, each file path is converted from absolute to relative (relative to directoryOnClasspath) before being matched by the pathMatcher. Thus, pathMatcher should use relative patterns.
- Returns:
- list of documents
- Throws:
IllegalArgumentException - If the specified path is not a directory.
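To see why the matcher should use relative patterns, the following sketch relativizes a file path against its directory before matching, as the parameter description says. It uses only java.nio; the class name, the matchesRelative helper and the docs directory are hypothetical and chosen for illustration.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

public class RelativeGlobDemo {

    /** Relativizes file against root, then tests the relative path with the matcher. */
    public static boolean matchesRelative(Path root, Path file, PathMatcher matcher) {
        return matcher.matches(root.relativize(file));
    }

    public static void main(String[] args) {
        PathMatcher txtOnly = FileSystems.getDefault().getPathMatcher("glob:*.txt");
        // Hypothetical directory; it does not need to exist for path matching.
        Path root = Paths.get("docs").toAbsolutePath();

        // Relative form "notes.txt" matches the pattern.
        System.out.println(matchesRelative(root, root.resolve("notes.txt"), txtOnly));
        // Relative form "notes.pdf" does not.
        System.out.println(matchesRelative(root, root.resolve("notes.pdf"), txtOnly));
        // The absolute form of the same txt file is not matched by the relative pattern.
        System.out.println(txtOnly.matches(root.resolve("notes.txt")));
    }
}
```

A glob such as *.txt is tested against the whole path it is given, so it only succeeds once the directory prefix has been stripped.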
-
loadDocuments
public static List<dev.langchain4j.data.document.Document> loadDocuments(String directoryOnClasspath, PathMatcher pathMatcher, dev.langchain4j.data.document.DocumentParser documentParser)
Loads matching Documents from the specified directory. Does not recurse into subdirectories.
The files are parsed using the specified DocumentParser.
Skips any Documents that fail to load.
- Parameters:
directoryOnClasspath - The path to the directory on the classpath with files.
pathMatcher - Only files whose paths match the provided PathMatcher will be loaded. For example, using FileSystems.getDefault().getPathMatcher("glob:*.txt") will load all files from directoryOnClasspath with a txt extension. When traversing the directory, each file path is converted from absolute to relative (relative to directoryOnClasspath) before being matched by the pathMatcher. Thus, pathMatcher should use relative patterns.
documentParser - The parser to be used for parsing text from each file.
- Returns:
- list of documents
- Throws:
IllegalArgumentException - If the specified path is not a directory.
-
loadDocumentsRecursively
public static List<dev.langchain4j.data.document.Document> loadDocumentsRecursively(String directoryOnClasspath)
Recursively loads Documents from the specified directory and its subdirectories.
The files are parsed using the default DocumentParser. The default DocumentParser is loaded through SPI (see DocumentParserFactory). If no DocumentParserFactory is available on the classpath, a TextDocumentParser is used.
Skips any Documents that fail to load.
- Parameters:
directoryOnClasspath - The path to the directory on the classpath with files.
- Returns:
- list of documents
- Throws:
IllegalArgumentException - If the specified path is not a directory.
-
loadDocumentsRecursively
public static List<dev.langchain4j.data.document.Document> loadDocumentsRecursively(String directoryOnClasspath, dev.langchain4j.data.document.DocumentParser documentParser)
Recursively loads Documents from the specified directory and its subdirectories.
The files are parsed using the specified DocumentParser.
Skips any Documents that fail to load.
- Parameters:
directoryOnClasspath - The path to the directory on the classpath with files.
documentParser - The parser to be used for parsing text from each file.
- Returns:
- list of documents
- Throws:
IllegalArgumentException - If the specified path is not a directory.
-
loadDocumentsRecursively
public static List<dev.langchain4j.data.document.Document> loadDocumentsRecursively(String directoryOnClasspath, PathMatcher pathMatcher)
Recursively loads matching Documents from the specified directory and its subdirectories.
The files are parsed using the default DocumentParser. The default DocumentParser is loaded through SPI (see DocumentParserFactory). If no DocumentParserFactory is available on the classpath, a TextDocumentParser is used.
Skips any Documents that fail to load.
- Parameters:
directoryOnClasspath - The path to the directory on the classpath with files.
pathMatcher - Only files whose paths match the provided PathMatcher will be loaded. For example, using FileSystems.getDefault().getPathMatcher("glob:**.txt") will load all files from directoryOnClasspath and its subdirectories with a txt extension. When traversing the directory tree, each file path is converted from absolute to relative (relative to directoryOnClasspath) before being matched by the pathMatcher. Thus, pathMatcher should use relative patterns. Note that a *.txt pattern (with a single asterisk) matches files only in directoryOnClasspath itself, not in its subdirectories.
- Returns:
- list of documents
- Throws:
IllegalArgumentException - If the specified path is not a directory.
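The difference between the single-asterisk and double-asterisk patterns mentioned above can be checked directly against relative paths with the JDK glob matcher. This is a small illustrative sketch; the class name, the matches helper and the sample file names are hypothetical.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.Paths;

public class RecursiveGlobDemo {

    /** Tests a relative path against a glob pattern using the default file system. */
    public static boolean matches(String glob, Path relativePath) {
        return FileSystems.getDefault().getPathMatcher("glob:" + glob).matches(relativePath);
    }

    public static void main(String[] args) {
        Path topLevel = Paths.get("a.txt");        // file directly in the directory
        Path nested = Paths.get("sub", "b.txt");   // file in a subdirectory

        // "**" crosses directory boundaries, so both files match.
        System.out.println(matches("**.txt", topLevel));
        System.out.println(matches("**.txt", nested));

        // "*" stops at directory boundaries, so only the top-level file matches.
        System.out.println(matches("*.txt", topLevel));
        System.out.println(matches("*.txt", nested));
    }
}
```

For a recursive load, a pattern such as **.txt is therefore the one that reaches files in subdirectories.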
-
loadDocumentsRecursively
public static List<dev.langchain4j.data.document.Document> loadDocumentsRecursively(String directoryOnClasspath, PathMatcher pathMatcher, dev.langchain4j.data.document.DocumentParser documentParser)
Recursively loads matching Documents from the specified directory and its subdirectories.
The files are parsed using the specified DocumentParser.
Skips any Documents that fail to load.
- Parameters:
directoryOnClasspath - The path to the directory on the classpath with files.
pathMatcher - Only files whose paths match the provided PathMatcher will be loaded. For example, using FileSystems.getDefault().getPathMatcher("glob:**.txt") will load all files from directoryOnClasspath and its subdirectories with a txt extension. When traversing the directory tree, each file path is converted from absolute to relative (relative to directoryOnClasspath) before being matched by the pathMatcher. Thus, pathMatcher should use relative patterns. Note that a *.txt pattern (with a single asterisk) matches files only in directoryOnClasspath itself, not in its subdirectories.
documentParser - The parser to be used for parsing text from each file.
- Returns:
- list of documents
- Throws:
IllegalArgumentException - If the specified path is not a directory.
-