Class CompressionUtils


  • public class CompressionUtils
    extends Object
    • Field Detail

      • COMPRESSED_TEXT_WEIGHT_FACTOR

        public static final long COMPRESSED_TEXT_WEIGHT_FACTOR
        See Also:
        Constant Field Values
    • Constructor Detail

      • CompressionUtils

        public CompressionUtils()
    • Method Detail

      • zip

        public static long zip​(File directory,
                               File outputZipFile,
                               boolean fsync)
                        throws IOException
        Zips the contents of directory into the file indicated by outputZipFile. Subdirectories are skipped.
        Parameters:
        directory - The directory whose contents should be added to the zip in the output stream.
        outputZipFile - The output file to write the zipped data to
        fsync - True if the output file should be fsynced to disk
        Returns:
        The number of bytes (uncompressed) read from the input directory.
        Throws:
        IOException
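The behavior described for this overload (flat zip, subdirectories skipped, uncompressed byte count returned, optional fsync) can be pictured with a JDK-only sketch. This is not Druid's implementation; the class name `ZipDirectorySketch` and the method `zipFlat` are hypothetical names introduced for illustration, and the handling of edge cases is an assumption.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Illustrative only: a hypothetical stand-in, not the Druid class.
final class ZipDirectorySketch {
    // Zips the regular files directly under `directory` and returns the
    // number of uncompressed bytes read from them.
    static long zipFlat(File directory, File outputZipFile, boolean fsync) throws IOException {
        File[] children = directory.listFiles();
        if (children == null) {
            throw new IOException("Not a readable directory: " + directory);
        }
        long totalBytes = 0;
        try (FileOutputStream fos = new FileOutputStream(outputZipFile)) {
            ZipOutputStream zos = new ZipOutputStream(fos);
            for (File child : children) {
                if (!child.isFile()) {
                    continue; // subdirectories (and other non-files) are skipped
                }
                zos.putNextEntry(new ZipEntry(child.getName()));
                totalBytes += Files.copy(child.toPath(), zos);
                zos.closeEntry();
            }
            zos.finish(); // writes the central directory but leaves fos open
            zos.flush();
            if (fsync) {
                fos.getChannel().force(true); // push data and metadata to disk
            }
        }
        return totalBytes;
    }
}
```

Calling `finish()` rather than `close()` on the zip stream keeps the underlying file stream open, so the fsync can run after the archive is complete and before the file is closed.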
      • zip

        public static long zip​(File directory,
                               File outputZipFile)
                        throws IOException
        Zips the contents of directory into the file indicated by outputZipFile. Subdirectories are skipped.
        Parameters:
        directory - The directory whose contents should be added to the zip in the output stream.
        outputZipFile - The output file to write the zipped data to
        Returns:
        The number of bytes (uncompressed) read from the input directory.
        Throws:
        IOException
      • zip

        public static long zip​(File directory,
                               OutputStream out)
                        throws IOException
        Zips the contents of the input directory to the output stream. Subdirectories are skipped.
        Parameters:
        directory - The directory whose contents should be added to the zip in the output stream.
        out - The output stream to write the zip data to. Caller is responsible for closing this stream.
        Returns:
        The number of bytes (uncompressed) read from the input directory.
        Throws:
        IOException
      • unzip

        public static FileUtils.FileCopyResult unzip​(com.google.common.io.ByteSource byteSource,
                                                     File outDir,
                                                     com.google.common.base.Predicate<Throwable> shouldRetry,
                                                     boolean cacheLocally)
                                              throws IOException
        Unzips the byteSource to the output directory. If cacheLocally is true, the byteSource is cached to local disk before unzipping. This may give more predictable behavior than unzipping a large file directly off a network stream, for example.
        Parameters:
        byteSource - The ByteSource which supplies the zip data
        outDir - The output directory to put the contents of the zip
        shouldRetry - A predicate expression to determine if a new InputStream should be acquired from ByteSource and the copy attempted again. If you want to retry on any exception, use FileUtils.IS_EXCEPTION.
        cacheLocally - A boolean flag to indicate if the data should be cached locally
        Returns:
        A FileCopyResult containing the result of writing the zip entries to disk
        Throws:
        IOException
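The retry contract described for shouldRetry can be pictured with a JDK-only sketch. It is not Druid's implementation: `Callable<InputStream>` stands in for Guava's `ByteSource`, `java.util.function.Predicate` stands in for Guava's `Predicate`, and `copyWithRetry` and `maxTries` are hypothetical names introduced here. The sketch copies a source to a local file (roughly what caching locally before unzipping implies) and acquires a fresh stream whenever the predicate accepts the failure.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.concurrent.Callable;
import java.util.function.Predicate;

// Illustrative only: a hypothetical stand-in, not the Druid class.
final class RetryingCopySketch {
    // Copies the stream produced by `source` into `target`. On failure, a new
    // stream is requested and the copy restarted, but only while `shouldRetry`
    // accepts the exception and fewer than `maxTries` attempts have been made.
    static long copyWithRetry(Callable<InputStream> source,
                              Path target,
                              Predicate<Throwable> shouldRetry,
                              int maxTries) throws IOException {
        for (int attempt = 1; ; attempt++) {
            try (InputStream in = source.call()) {
                return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            } catch (Exception e) {
                if (attempt >= maxTries || !shouldRetry.test(e)) {
                    if (e instanceof IOException) {
                        throw (IOException) e;
                    }
                    if (e instanceof RuntimeException) {
                        throw (RuntimeException) e;
                    }
                    throw new IOException(e);
                }
                // Otherwise fall through and try again with a fresh stream.
            }
        }
    }
}
```

A predicate such as `t -> t instanceof IOException` retries only I/O failures; a predicate that always returns true plays the role the text assigns to FileUtils.IS_EXCEPTION.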
      • unzip

        public static FileUtils.FileCopyResult unzip​(File pulledFile,
                                                     File outDir)
                                              throws IOException
        Unzip the pulled file to an output directory. This is only expected to work on zips with lone files, and is not intended for zips with directory structures.
        Parameters:
        pulledFile - The file to unzip
        outDir - The directory to store the contents of the file.
        Returns:
        A FileCopyResult of the files which were written to disk
        Throws:
        IOException
      • unzip

        public static FileUtils.FileCopyResult unzip​(InputStream in,
                                                     File outDir)
                                              throws IOException
        Unzips from the input stream to the output directory, using each entry's file name as the file name in the output directory. The behavior of directories in the input stream's zip is undefined. If possible, use unzip(ByteSource, File, Predicate, boolean) instead.
        Parameters:
        in - The input stream of the zip data. This stream is closed.
        outDir - The directory to copy the unzipped data to
        Returns:
        The FileUtils.FileCopyResult containing information on all the files which were written
        Throws:
        IOException
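As a rough picture of stream-based unzipping, the following JDK-only sketch writes each entry under the output directory using the entry's name. It is not Druid's implementation; `UnzipStreamSketch` and `unzipFlat` are hypothetical names. Because the text leaves directory handling undefined, the choices made here (skip directory entries, create parent directories for nested names, reject names that escape the output directory) are assumptions of the sketch.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Illustrative only: a hypothetical stand-in, not the Druid class.
final class UnzipStreamSketch {
    // Extracts every file entry from `in` into `outDir` and returns the files
    // written. The input stream is closed when the method returns.
    static List<File> unzipFlat(InputStream in, File outDir) throws IOException {
        List<File> written = new ArrayList<>();
        Path root = outDir.toPath().toAbsolutePath().normalize();
        try (ZipInputStream zin = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue; // assumption: directory entries are ignored
                }
                Path target = root.resolve(entry.getName()).normalize();
                if (!target.startsWith(root)) {
                    // Guard against entry names such as "../evil.txt".
                    throw new IOException("Entry escapes output directory: " + entry.getName());
                }
                Files.createDirectories(target.getParent());
                // Reads up to the end of the current entry only.
                Files.copy(zin, target, StandardCopyOption.REPLACE_EXISTING);
                written.add(target.toFile());
            }
        }
        return written;
    }
}
```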
      • gunzip

        public static FileUtils.FileCopyResult gunzip​(File pulledFile,
                                                      File outFile)
        Gunzips pulledFile to the output file.
        Parameters:
        pulledFile - The source of the gz data
        outFile - A target file to put the contents
        Returns:
        The result of the file copy
        Throws:
        IOException
      • gunzip

        public static FileUtils.FileCopyResult gunzip​(InputStream in,
                                                      File outFile)
                                               throws IOException
        Unzips the input stream via a gzip filter. Use gunzip(ByteSource, File, Predicate) if possible.
        Parameters:
        in - The input stream to run through the gunzip filter. This stream is closed.
        outFile - The file to output to
        Throws:
        IOException
      • gunzip

        public static long gunzip​(InputStream in,
                                  OutputStream out)
                           throws IOException
        Gunzips from the source stream to the destination stream.
        Parameters:
        in - The input stream which is to be decompressed. This stream is closed.
        out - The output stream to write to. This stream is closed.
        Returns:
        The number of bytes written to the output stream.
        Throws:
        IOException
      • gunzip

        public static FileUtils.FileCopyResult gunzip​(com.google.common.io.ByteSource in,
                                                      File outFile,
                                                      com.google.common.base.Predicate<Throwable> shouldRetry)
        Gunzips the data supplied by in and stores the result in a local file, retrying according to shouldRetry.
        Parameters:
        in - The factory to produce input streams
        outFile - The file to store the result into
        shouldRetry - A predicate to indicate if the Throwable is recoverable
        Returns:
        A FileCopyResult describing the file written to outFile
      • gunzip

        public static FileUtils.FileCopyResult gunzip​(com.google.common.io.ByteSource in,
                                                      File outFile)
        Gunzips the data supplied by in to the output file.
        Parameters:
        in - The ByteSource supplying the compressed data to read
        outFile - The file to write the uncompressed results to
        Returns:
        A FileCopyResult of the file written
      • gzip

        public static long gzip​(InputStream inputStream,
                                OutputStream out)
                         throws IOException
        Copies inputStream to out while wrapping out in a GZIPOutputStream. Closes both input and output.
        Parameters:
        inputStream - The input stream to copy data from. This stream is closed.
        out - The output stream to wrap in a GZIPOutputStream before copying. This stream is closed.
        Returns:
        The size of the data copied
        Throws:
        IOException
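The stream-to-stream gzip described above, together with its gunzip counterpart, can be sketched with `java.util.zip` alone. This is not Druid's implementation; `GzipStreamsSketch` is a hypothetical class, and returning the uncompressed byte count from both methods is an assumption about what "the size of the data copied" means. Both methods close the streams they are given, matching the parameter notes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Illustrative only: a hypothetical stand-in, not the Druid class.
final class GzipStreamsSketch {
    // Compresses `in` into `out`; returns the number of uncompressed bytes read.
    static long gzip(InputStream in, OutputStream out) throws IOException {
        try (InputStream src = in;
             OutputStream raw = out;
             GZIPOutputStream gz = new GZIPOutputStream(raw)) {
            return copy(src, gz);
        }
    }

    // Decompresses `in` into `out`; returns the number of bytes written to `out`.
    static long gunzip(InputStream in, OutputStream out) throws IOException {
        try (InputStream raw = in;
             OutputStream dst = out;
             GZIPInputStream gz = new GZIPInputStream(raw)) {
            return copy(gz, dst);
        }
    }

    // Plain buffered copy that leaves closing to the caller.
    private static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }
}
```

Declaring the caller's streams as resources ahead of the gzip wrapper means they are still closed if constructing the wrapper fails, for example on a malformed gzip header.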
      • gzip

        public static FileUtils.FileCopyResult gzip​(File inFile,
                                                    File outFile,
                                                    com.google.common.base.Predicate<Throwable> shouldRetry)
        Gzips the input file to the output file.
        Parameters:
        inFile - The file to gzip
        outFile - The target file to write the gzip-compressed contents of inFile to
        shouldRetry - Predicate on a potential throwable to determine if the copy should be attempted again.
        Returns:
        The result of the file copy
        Throws:
        IOException
      • gzip

        public static long gzip​(com.google.common.io.ByteSource in,
                                com.google.common.io.ByteSink out,
                                com.google.common.base.Predicate<Throwable> shouldRetry)
      • gzip

        public static FileUtils.FileCopyResult gzip​(File inFile,
                                                    File outFile)
        Gzip-compresses the contents of inFile into outFile.
        Parameters:
        inFile - The source of data
        outFile - The destination for compressed data
        Returns:
        A FileCopyResult of the resulting file at outFile
        Throws:
        IOException
      • isZip

        public static boolean isZip​(String fName)
        Checks to see if fName is a valid name for a "*.zip" file
        Parameters:
        fName - The name of the file in question
        Returns:
        True if fName is properly named for a .zip file, false otherwise
      • isGz

        public static boolean isGz​(String fName)
        Checks to see if fName is a valid name for a "*.gz" file
        Parameters:
        fName - The name of the file in question
        Returns:
        True if fName is a properly named .gz file, false otherwise
      • getGzBaseName

        public static String getGzBaseName​(String fname)
        Get the file name without the .gz extension
        Parameters:
        fname - The name of the gzip file
        Returns:
        fname without the ".gz" extension
        Throws:
        IAE - if fname is not a valid "*.gz" file name
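The three name helpers can be pictured with a short sketch. It is not Druid's implementation: `ArchiveNamesSketch` and `hasSuffix` are hypothetical, the exact validity rule (non-null, ends with the suffix, and longer than the suffix alone) is an assumption, and `IllegalArgumentException` stands in for the IAE named above.

```java
// Illustrative only: a hypothetical stand-in, not the Druid class.
final class ArchiveNamesSketch {
    // Assumed rule: a name is valid if it ends with the suffix and has at
    // least one character before it.
    static boolean hasSuffix(String fName, String suffix) {
        return fName != null
            && fName.length() > suffix.length()
            && fName.endsWith(suffix);
    }

    static boolean isZip(String fName) {
        return hasSuffix(fName, ".zip");
    }

    static boolean isGz(String fName) {
        return hasSuffix(fName, ".gz");
    }

    // Strips a trailing ".gz"; rejects names that are not valid "*.gz" names.
    static String gzBaseName(String fname) {
        if (!isGz(fname)) {
            throw new IllegalArgumentException("Not a valid gz file name: " + fname);
        }
        return fname.substring(0, fname.length() - ".gz".length());
    }
}
```

Under these assumptions, `gzBaseName("part-0.json.gz")` yields `part-0.json`, while a bare `".gz"` or a `.zip` name is rejected.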