Class NestedJarHandler


  • public class NestedJarHandler
    extends Object
    Unzip a jarfile within a jarfile to a temporary file on disk. Also handles the download of jars from http(s) URLs to temp files.

    Somewhat paradoxically, the fastest way to support scanning zipfiles-within-zipfiles is to unzip the inner zipfile to a temporary file on disk. The inner zipfile can only be read using ZipInputStream, not ZipFile, because the ZipFile constructors accept only a File or a file path, not a stream. ZipInputStream has no way to read the zip directory up front (the central directory is stored at the end of the zipfile), so with ZipInputStream rather than ZipFile you have to decompress the entire zipfile just to read all the directory entries. Since the zipfile may contain many non-whitelisted entries, this could be a lot of wasted work.

    FastClasspathScanner makes two passes: the first reads the zipfile directory and applies the whitelist and blacklist criteria to it (a fast operation when using ZipFile), and the second reads only the whitelisted (non-blacklisted) entries. In the general case the ZipFile API is therefore going to be faster than ZipInputStream, which makes decompressing the inner zipfile to disk the only efficient option.
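
    The two-pass approach described above can be illustrated with a minimal sketch that uses only java.util.zip. This is not FastClasspathScanner's own code: the class TwoPassZipScanSketch, its method names, and the simple prefix-based whitelist are assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical illustration class, not part of FastClasspathScanner.
final class TwoPassZipScanSketch {
    /** Pass 1: read entry names from the zip directory only; no entry data is decompressed. */
    static List<String> listWhitelisted(File zip, String whitelistedPrefix) {
        List<String> names = new ArrayList<>();
        try (ZipFile zipFile = new ZipFile(zip)) {
            Enumeration<? extends ZipEntry> entries = zipFile.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (!entry.isDirectory() && entry.getName().startsWith(whitelistedPrefix)) {
                    names.add(entry.getName());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    /** Pass 2: decompress only the entries that were selected in pass 1. */
    static Map<String, byte[]> readEntries(File zip, List<String> names) {
        Map<String, byte[]> contents = new LinkedHashMap<>();
        try (ZipFile zipFile = new ZipFile(zip)) {
            for (String name : names) {
                ZipEntry entry = zipFile.getEntry(name);
                if (entry == null) {
                    continue; // name not present in this zipfile
                }
                try (InputStream in = zipFile.getInputStream(entry)) {
                    contents.put(name, readFully(in));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return contents;
    }

    private static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        for (int n; (n = in.read(chunk)) != -1;) {
            buf.write(chunk, 0, n);
        }
        return buf.toByteArray();
    }

    /** Builds a two-entry demo zipfile; each entry's content is its own name. */
    static File createDemoZip() {
        try {
            File zip = File.createTempFile("demo", ".zip");
            zip.deleteOnExit();
            try (ZipOutputStream out = new ZipOutputStream(new FileOutputStream(zip))) {
                for (String name : new String[] { "com/example/A.class", "org/other/B.class" }) {
                    out.putNextEntry(new ZipEntry(name));
                    out.write(name.getBytes(StandardCharsets.UTF_8));
                    out.closeEntry();
                }
            }
            return zip;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

    With the demo zipfile, listWhitelisted(zip, "com/example/") selects only com/example/A.class, so readEntries never touches the data of org/other/B.class. A ZipInputStream would have to stream past every entry to reach the same result.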

    • Field Detail

      • TEMP_FILENAME_LEAF_SEPARATOR

        public static final String TEMP_FILENAME_LEAF_SEPARATOR
        The separator between the random part of a temp filename and the leafname.
        See Also:
        Constant Field Values
    • Method Detail

      • getZipFileRecycler

        public Recycler<ZipFile,IOException> getZipFileRecycler​(File zipFile,
                                                                LogNode log)
                                                         throws Exception
        Get a ZipFile recycler given the (non-nested) canonical path of a jarfile.
        Parameters:
        zipFile - The zipfile.
        log - The log.
        Returns:
        The ZipFile recycler.
        Throws:
        Exception - If the zipfile could not be opened.
      • getInnermostNestedJar

        public Map.Entry<File,Set<String>> getInnermostNestedJar​(String nestedJarPath,
                                                                 LogNode log)
                                                          throws Exception
        Get a File for a given (possibly nested) jarfile path, unzipping the first N-1 segments of an N-segment '!'-delimited path to temporary files, then returning the File reference for the N-th temporary file.

        If the path does not contain '!', returns the File represented by the path.

        All path segments should end in a jarfile extension, e.g. ".jar" or ".zip".

        Parameters:
        nestedJarPath - The nested jar path.
        log - The log.
        Returns:
        An Entry<File, Set<String>>, where the File is the innermost jar, and the Set<String> is the set of all relative paths of scanning roots within the innermost jar (may be empty, or may contain strings like "target/classes" or similar). If there was an issue with the path, returns null.
        Throws:
        Exception - If the innermost jarfile could not be extracted.
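
        As a rough illustration of the '!'-delimited path format accepted by getInnermostNestedJar, the sketch below splits a nested jar path into its segments. The class NestedJarPathSketch and its splitSegments method are hypothetical helpers written for this example; they do not reproduce the handler's actual parsing, and no extraction to temporary files is performed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class, not part of FastClasspathScanner.
final class NestedJarPathSketch {
    /**
     * Splits a path such as "/tmp/outer.jar!/lib/inner.jar" into
     * ["/tmp/outer.jar", "lib/inner.jar"]. A path without '!' yields one segment.
     */
    static List<String> splitSegments(String nestedJarPath) {
        List<String> segments = new ArrayList<>();
        String[] parts = nestedJarPath.split("!");
        for (int i = 0; i < parts.length; i++) {
            String segment = parts[i];
            if (i > 0) {
                // Inner segments are relative to the enclosing jar, so drop leading slashes.
                while (segment.startsWith("/")) {
                    segment = segment.substring(1);
                }
            }
            if (!segment.isEmpty()) {
                segments.add(segment);
            }
        }
        return segments;
    }
}
```

        In a scheme like this, each segment after the first names an entry inside the jar identified by the preceding segments, so a handler would extract the first N-1 segments in order before returning the last.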
      • getOutermostJar

        public File getOutermostJar​(File jarFile)
        Given a File reference for an inner nested jarfile, find the outermost jarfile it was extracted from.
        Parameters:
        jarFile - The jarfile.
        Returns:
        The outermost jar that the jarfile was contained within.
      • unzipToTempDir

        public File unzipToTempDir​(File jarFile,
                                   String packageRoot,
                                   LogNode log)
                            throws IOException
        Unzip a given package root within a zipfile to a temporary directory, starting several more threads to perform the unzip in parallel, then return the temporary directory. The temporary directory and all of its contents will be removed when NestedJarHandler#close(LogNode) is called.

        N.B. standalone code for parallel unzip can be found at https://github.com/lukehutch/quickunzip

        Parameters:
        jarFile - The jarfile.
        packageRoot - The package root to extract from the jar.
        log - The log.
        Returns:
        The File object for the temporary directory the package root was extracted to.
        Throws:
        IOException - If the package root could not be extracted from the jar.
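
        The general idea of a parallel unzip of one package root can be sketched as follows. This is a simplified stand-in, not the implementation behind unzipToTempDir: the class ParallelUnzipSketch, the numThreads parameter, and the choice to open a separate ZipFile per task are assumptions, and cleanup of the temporary directory is left to the caller.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical illustration class, not part of FastClasspathScanner.
final class ParallelUnzipSketch {
    /** Extracts every file under packageRoot into a new temp directory, one task per entry. */
    static File unzipPackageRoot(File jarFile, String packageRoot, int numThreads) {
        String prefix = packageRoot.isEmpty() || packageRoot.endsWith("/") ? packageRoot : packageRoot + "/";
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        try {
            Path tempDir = Files.createTempDirectory("unzip-sketch").toRealPath();
            List<Future<?>> futures = new ArrayList<>();
            for (String name : listFiles(jarFile, prefix)) {
                futures.add(pool.submit(() -> {
                    extract(jarFile, name, prefix, tempDir);
                    return null;
                }));
            }
            for (Future<?> future : futures) {
                future.get(); // surfaces any extraction failure
            }
            return tempDir.toFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted during unzip", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Parallel unzip failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    private static List<String> listFiles(File jarFile, String prefix) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zipFile = new ZipFile(jarFile)) {
            Enumeration<? extends ZipEntry> entries = zipFile.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (!entry.isDirectory() && entry.getName().startsWith(prefix)) {
                    names.add(entry.getName());
                }
            }
        }
        return names;
    }

    private static void extract(File jarFile, String name, String prefix, Path tempDir) throws IOException {
        Path target = tempDir.resolve(name.substring(prefix.length())).normalize();
        if (!target.startsWith(tempDir)) {
            throw new IOException("Entry escapes target directory: " + name);
        }
        Files.createDirectories(target.getParent());
        // Each task opens its own ZipFile so no handle is shared between threads.
        try (ZipFile zipFile = new ZipFile(jarFile); InputStream in = zipFile.getInputStream(zipFile.getEntry(name))) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    /** Builds a small demo jar with two files under BOOT-INF/classes and one outside it. */
    static File createDemoJar() {
        try {
            File jar = File.createTempFile("demo", ".jar");
            jar.deleteOnExit();
            try (ZipOutputStream out = new ZipOutputStream(new FileOutputStream(jar))) {
                for (String name : new String[] { "BOOT-INF/classes/a/A.txt", "BOOT-INF/classes/B.txt", "META-INF/skip.txt" }) {
                    out.putNextEntry(new ZipEntry(name));
                    out.write(name.getBytes(StandardCharsets.UTF_8));
                    out.closeEntry();
                }
            }
            return jar;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

        Opening one ZipFile per task keeps the sketch simple at the cost of repeated open calls; pooling open ZipFile instances per thread, in the spirit of getZipFileRecycler, would avoid that overhead. The check on the resolved target path rejects entry names that would otherwise write outside the temporary directory.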
      • close

        public void close​(LogNode log)
        Delete temporary files and release other resources.
        Parameters:
        log - The log.