Class ClassPathDocumentLoader
java.lang.Object
dev.langchain4j.data.document.loader.ClassPathDocumentLoader
DocumentLoader implementation for loading documents using a ClassPathSource.
Author:
Eric Deandrea
-
Method Summary
Modifier and Type | Method | Description
static dev.langchain4j.data.document.Document | loadDocument(String pathOnClasspath) | Loads a Document from the specified file path.
static dev.langchain4j.data.document.Document | loadDocument(String pathOnClasspath, dev.langchain4j.data.document.DocumentParser documentParser) | Loads a Document from the specified file path.
static List<dev.langchain4j.data.document.Document> | loadDocuments(String directoryOnClasspath) | Loads Documents from the specified directory.
static List<dev.langchain4j.data.document.Document> | loadDocuments(String directoryOnClasspath, dev.langchain4j.data.document.DocumentParser documentParser) | Loads Documents from the specified directory.
static List<dev.langchain4j.data.document.Document> | loadDocuments(String directoryOnClasspath, PathMatcher pathMatcher) | Loads matching Documents from the specified directory.
static List<dev.langchain4j.data.document.Document> | loadDocuments(String directoryOnClasspath, PathMatcher pathMatcher, dev.langchain4j.data.document.DocumentParser documentParser) | Loads matching Documents from the specified directory.
static List<dev.langchain4j.data.document.Document> | loadDocumentsRecursively(String directoryOnClasspath) | Recursively loads Documents from the specified directory and its subdirectories.
static List<dev.langchain4j.data.document.Document> | loadDocumentsRecursively(String directoryOnClasspath, dev.langchain4j.data.document.DocumentParser documentParser) | Recursively loads Documents from the specified directory and its subdirectories.
static List<dev.langchain4j.data.document.Document> | loadDocumentsRecursively(String directoryOnClasspath, PathMatcher pathMatcher) | Recursively loads matching Documents from the specified directory and its subdirectories.
static List<dev.langchain4j.data.document.Document> | loadDocumentsRecursively(String directoryOnClasspath, PathMatcher pathMatcher, dev.langchain4j.data.document.DocumentParser documentParser) | Recursively loads matching Documents from the specified directory and its subdirectories.
-
Method Details
-
loadDocument
public static dev.langchain4j.data.document.Document loadDocument(String pathOnClasspath)
Loads a Document from the specified file path.
The file is parsed using the default DocumentParser. The default DocumentParser is loaded through SPI (see DocumentParserFactory). If no DocumentParserFactory is available on the classpath, a TextDocumentParser is used.
The returned Document contains all the textual information from the file.
Parameters:
pathOnClasspath - The path on the classpath to the file.
Returns:
document
Throws:
IllegalArgumentException - If the specified path is not a file.
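The SPI fallback described above can be pictured with the JDK's ServiceLoader. The following is a minimal sketch under stated assumptions, not the library's actual code: Parser, ParserFactory and the "plain-text" fallback are hypothetical stand-ins for DocumentParser, DocumentParserFactory and TextDocumentParser.

```java
import java.util.Optional;
import java.util.ServiceLoader;

class SpiFallbackSketch {

    /** Hypothetical stand-in for a document parser. */
    public interface Parser {
        String name();
    }

    /** Hypothetical stand-in for a factory that providers would register via SPI. */
    public interface ParserFactory {
        Parser create();
    }

    /** Returns the first SPI-provided parser, or a plain-text fallback if none is registered. */
    static Parser defaultParser() {
        Optional<ParserFactory> factory = ServiceLoader.load(ParserFactory.class).findFirst();
        if (factory.isPresent()) {
            return factory.get().create();
        }
        // Fallback, analogous to the documented default when no factory is found.
        return () -> "plain-text";
    }

    public static void main(String[] args) {
        // No provider is registered for ParserFactory in this sketch,
        // so the fallback parser is selected.
        System.out.println(defaultParser().name());
    }
}
```

Because no provider is registered here, the sketch always takes the fallback branch. With a real provider on the classpath, the factory branch would be taken instead.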
-
loadDocument
public static dev.langchain4j.data.document.Document loadDocument(String pathOnClasspath, dev.langchain4j.data.document.DocumentParser documentParser)
Loads a Document from the specified file path.
The file is parsed using the specified DocumentParser.
The returned Document contains all the textual information from the file.
Parameters:
pathOnClasspath - The path on the classpath to the file.
documentParser - The parser to be used for parsing text from the file.
Returns:
document
Throws:
IllegalArgumentException - If the specified path is not a file.
-
loadDocuments
public static List<dev.langchain4j.data.document.Document> loadDocuments(String directoryOnClasspath)
Loads Documents from the specified directory. Does not use recursion.
The files are parsed using the default DocumentParser. The default DocumentParser is loaded through SPI (see DocumentParserFactory). If no DocumentParserFactory is available on the classpath, a TextDocumentParser is used.
Skips any Documents that fail to load.
Parameters:
directoryOnClasspath - The path to the directory on the classpath with files.
Returns:
list of documents
Throws:
IllegalArgumentException - If the specified path is not a directory.
-
loadDocuments
public static List<dev.langchain4j.data.document.Document> loadDocuments(String directoryOnClasspath, dev.langchain4j.data.document.DocumentParser documentParser)
Loads Documents from the specified directory. Does not use recursion.
The files are parsed using the specified DocumentParser.
Skips any Documents that fail to load.
Parameters:
directoryOnClasspath - The path to the directory on the classpath with files.
documentParser - The parser to be used for parsing text from each file.
Returns:
list of documents
Throws:
IllegalArgumentException - If the specified path is not a directory.
-
loadDocuments
public static List<dev.langchain4j.data.document.Document> loadDocuments(String directoryOnClasspath, PathMatcher pathMatcher)
Loads matching Documents from the specified directory. Does not use recursion.
The files are parsed using the default DocumentParser. The default DocumentParser is loaded through SPI (see DocumentParserFactory). If no DocumentParserFactory is available on the classpath, a TextDocumentParser is used.
Skips any Documents that fail to load.
Parameters:
directoryOnClasspath - The path to the directory on the classpath with files.
pathMatcher - Only files whose paths match the provided PathMatcher will be loaded. For example, using FileSystems.getDefault().getPathMatcher("glob:*.txt") will load all files from directoryOnClasspath with a txt extension. When traversing the directory, each file path is converted from absolute to relative (relative to directoryOnClasspath) before being matched by the pathMatcher. Thus, pathMatcher should use relative patterns.
Returns:
list of documents
Throws:
IllegalArgumentException - If the specified path is not a directory.
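To see why the pathMatcher should use relative patterns, the sketch below applies a glob to a relativized path and to the same path left unrelativized. It uses only JDK classes. The directory name "docs" and the file names are hypothetical, and the helper methods illustrate the matching rule rather than the loader's internals.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

class RelativeGlobSketch {

    /** Relativizes the file against the directory, then applies the glob to the relative path. */
    static boolean matchesRelative(String directory, String file, String glob) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher(glob);
        Path relative = Paths.get(directory).relativize(Paths.get(file));
        return matcher.matches(relative);
    }

    /** Applies the glob to the path exactly as given, without relativizing it. */
    static boolean matchesAsGiven(String file, String glob) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher(glob);
        return matcher.matches(Paths.get(file));
    }

    public static void main(String[] args) {
        // "docs/notes.txt" relative to "docs" is "notes.txt", which "*.txt" matches.
        System.out.println(matchesRelative("docs", "docs/notes.txt", "glob:*.txt"));
        // A different extension does not match.
        System.out.println(matchesRelative("docs", "docs/notes.md", "glob:*.txt"));
        // Without relativizing, the directory prefix keeps "*.txt" from matching.
        System.out.println(matchesAsGiven("docs/notes.txt", "glob:*.txt"));
    }
}
```

Only the first call matches. A single asterisk does not cross directory separators, so a pattern such as "glob:*.txt" succeeds only once the directory prefix has been removed.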
-
loadDocuments
public static List<dev.langchain4j.data.document.Document> loadDocuments(String directoryOnClasspath, PathMatcher pathMatcher, dev.langchain4j.data.document.DocumentParser documentParser)
Loads matching Documents from the specified directory. Does not use recursion.
The files are parsed using the specified DocumentParser.
Skips any Documents that fail to load.
Parameters:
directoryOnClasspath - The path to the directory on the classpath with files.
pathMatcher - Only files whose paths match the provided PathMatcher will be loaded. For example, using FileSystems.getDefault().getPathMatcher("glob:*.txt") will load all files from directoryOnClasspath with a txt extension. When traversing the directory, each file path is converted from absolute to relative (relative to directoryOnClasspath) before being matched by the pathMatcher. Thus, pathMatcher should use relative patterns.
documentParser - The parser to be used for parsing text from each file.
Returns:
list of documents
Throws:
IllegalArgumentException - If the specified path is not a directory.
-
loadDocumentsRecursively
public static List<dev.langchain4j.data.document.Document> loadDocumentsRecursively(String directoryOnClasspath)
Recursively loads Documents from the specified directory and its subdirectories.
The files are parsed using the default DocumentParser. The default DocumentParser is loaded through SPI (see DocumentParserFactory). If no DocumentParserFactory is available on the classpath, a TextDocumentParser is used.
Skips any Documents that fail to load.
Parameters:
directoryOnClasspath - The path to the directory on the classpath with files.
Returns:
list of documents
Throws:
IllegalArgumentException - If the specified path is not a directory.
-
loadDocumentsRecursively
public static List<dev.langchain4j.data.document.Document> loadDocumentsRecursively(String directoryOnClasspath, dev.langchain4j.data.document.DocumentParser documentParser)
Recursively loads Documents from the specified directory and its subdirectories.
The files are parsed using the specified DocumentParser.
Skips any Documents that fail to load.
Parameters:
directoryOnClasspath - The path to the directory on the classpath with files.
documentParser - The parser to be used for parsing text from each file.
Returns:
list of documents
Throws:
IllegalArgumentException - If the specified path is not a directory.
-
loadDocumentsRecursively
public static List<dev.langchain4j.data.document.Document> loadDocumentsRecursively(String directoryOnClasspath, PathMatcher pathMatcher)
Recursively loads matching Documents from the specified directory and its subdirectories.
The files are parsed using the default DocumentParser. The default DocumentParser is loaded through SPI (see DocumentParserFactory). If no DocumentParserFactory is available on the classpath, a TextDocumentParser is used.
Skips any Documents that fail to load.
Parameters:
directoryOnClasspath - The path to the directory on the classpath with files.
pathMatcher - Only files whose paths match the provided PathMatcher will be loaded. For example, using FileSystems.getDefault().getPathMatcher("glob:**.txt") will load all files with a txt extension from directoryOnClasspath and its subdirectories. When traversing the directory tree, each file path is converted from absolute to relative (relative to directoryOnClasspath) before being matched by the pathMatcher. Thus, pathMatcher should use relative patterns. Note that the *.txt pattern (with a single asterisk) matches files only in directoryOnClasspath itself, not files in its subdirectories.
Returns:
list of documents
Throws:
IllegalArgumentException - If the specified path is not a directory.
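The difference between the single-asterisk and double-asterisk patterns can be checked with the JDK's glob matcher alone. The sketch below is illustrative only. The relative paths are hypothetical, and the helper applies a glob to an already relative path, as the description above says the loader does.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

class RecursiveGlobSketch {

    /** Applies the given glob (for example "glob:**.txt") to a relative path. */
    static boolean matches(String glob, String relativePath) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher(glob);
        return matcher.matches(Paths.get(relativePath));
    }

    public static void main(String[] args) {
        String[] relativePaths = {"a.txt", "sub/a.txt", "sub/deep/a.txt", "sub/a.md"};
        for (String path : relativePaths) {
            // "*" stays within one directory level; "**" also crosses directory boundaries.
            System.out.println(path
                    + " single=" + matches("glob:*.txt", path)
                    + " double=" + matches("glob:**.txt", path));
        }
    }
}
```

In this sketch, "glob:*.txt" matches only "a.txt", while "glob:**.txt" matches every path ending in .txt at any depth. Neither pattern matches "sub/a.md".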
-
loadDocumentsRecursively
public static List<dev.langchain4j.data.document.Document> loadDocumentsRecursively(String directoryOnClasspath, PathMatcher pathMatcher, dev.langchain4j.data.document.DocumentParser documentParser)
Recursively loads matching Documents from the specified directory and its subdirectories.
The files are parsed using the specified DocumentParser.
Skips any Documents that fail to load.
Parameters:
directoryOnClasspath - The path to the directory on the classpath with files.
pathMatcher - Only files whose paths match the provided PathMatcher will be loaded. For example, using FileSystems.getDefault().getPathMatcher("glob:**.txt") will load all files with a txt extension from directoryOnClasspath and its subdirectories. When traversing the directory tree, each file path is converted from absolute to relative (relative to directoryOnClasspath) before being matched by the pathMatcher. Thus, pathMatcher should use relative patterns. Note that the *.txt pattern (with a single asterisk) matches files only in directoryOnClasspath itself, not files in its subdirectories.
documentParser - The parser to be used for parsing text from each file.
Returns:
list of documents
Throws:
IllegalArgumentException - If the specified path is not a directory.
-