Class HDF5Utils
java.lang.Object
org.broadinstitute.hellbender.tools.copynumber.utils.HDF5Utils
TODO move into hdf5-java-bindings
-
Field Summary
-
Method Summary
Modifier and Type / Method / Description

static double[][] readChunkedDoubleMatrix(org.broadinstitute.hdf5.HDF5File file, String path)
Reads a large matrix stored as a set of chunks (submatrices) using the sub-paths and conventions used by writeChunkedDoubleMatrix(org.broadinstitute.hdf5.HDF5File, java.lang.String, double[][], int).

static List<SimpleInterval> readIntervals(org.broadinstitute.hdf5.HDF5File file, String path)
Reads a list of intervals from an HDF5 file using the sub-paths and conventions used by writeIntervals(org.broadinstitute.hdf5.HDF5File, java.lang.String, java.util.List<T>).

static void writeChunkedDoubleMatrix(org.broadinstitute.hdf5.HDF5File file, String path, double[][] matrix, int maxChunkSize)
Given a large matrix, chunks the matrix into equally sized subsets of rows (plus a subset containing the remainder, if necessary) and writes these submatrices to indexed sub-paths to avoid a hard limit in Java HDF5 on the number of elements in a matrix, given by MAX_NUMBER_OF_VALUES_PER_HDF5_MATRIX.

static <T extends SimpleInterval> void writeIntervals(org.broadinstitute.hdf5.HDF5File file, String path, List<T> intervals)
Given an HDF5 file and an HDF5 path, writes a list of intervals to hard-coded sub-paths.
-
Field Details
-
INTERVAL_CONTIG_NAMES_SUB_PATH
- See Also:
-
INTERVAL_MATRIX_SUB_PATH
- See Also:
-
MAX_NUMBER_OF_VALUES_PER_HDF5_MATRIX
public static final int MAX_NUMBER_OF_VALUES_PER_HDF5_MATRIX
- See Also:
-
NUMBER_OF_ROWS_SUB_PATH
- See Also:
-
NUMBER_OF_COLUMNS_SUB_PATH
- See Also:
-
NUMBER_OF_CHUNKS_SUB_PATH
- See Also:
-
CHUNK_INDEX_PATH_SUFFIX
- See Also:
-
-
Method Details
-
readIntervals
public static List<SimpleInterval> readIntervals(org.broadinstitute.hdf5.HDF5File file, String path)
Reads a list of intervals from an HDF5 file using the sub-paths and conventions used by writeIntervals(org.broadinstitute.hdf5.HDF5File, java.lang.String, java.util.List<T>).
writeIntervals
public static <T extends SimpleInterval> void writeIntervals(org.broadinstitute.hdf5.HDF5File file, String path, List<T> intervals)
Given an HDF5 file and an HDF5 path, writes a list of intervals to hard-coded sub-paths. Contig names are represented by a string array, while intervals are represented by a double matrix, in which the contigs are represented by their index in the aforementioned string array.
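The encoding described above (a string array of contig names plus a double matrix that refers to contigs by index) can be sketched without any HDF5 dependency. This is only an illustration: the `Interval` class is a hypothetical stand-in for SimpleInterval, and the three-column layout (contig index, start, end) and first-appearance ordering of contig names are assumptions, not something this page specifies.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class IntervalEncodingSketch {
    /** Hypothetical stand-in for SimpleInterval (not the GATK class). */
    static final class Interval {
        public final String contig;
        public final int start;
        public final int end;

        Interval(String contig, int start, int end) {
            this.contig = contig;
            this.start = start;
            this.end = end;
        }
    }

    /** Maps each distinct contig name to its index, in order of first appearance. */
    static Map<String, Integer> contigIndices(List<Interval> intervals) {
        Map<String, Integer> indices = new LinkedHashMap<>();
        for (Interval interval : intervals) {
            indices.putIfAbsent(interval.contig, indices.size());
        }
        return indices;
    }

    /** The string array of contig names; position in the array is the contig index. */
    static String[] contigNames(List<Interval> intervals) {
        return contigIndices(intervals).keySet().toArray(new String[0]);
    }

    /** One row per interval; assumed columns: contig index, start, end. */
    static double[][] toMatrix(List<Interval> intervals) {
        Map<String, Integer> indices = contigIndices(intervals);
        double[][] matrix = new double[intervals.size()][3];
        for (int row = 0; row < intervals.size(); row++) {
            Interval interval = intervals.get(row);
            matrix[row][0] = indices.get(interval.contig);
            matrix[row][1] = interval.start;
            matrix[row][2] = interval.end;
        }
        return matrix;
    }

    /** Inverse of toMatrix: looks each contig index back up in the name array. */
    static List<Interval> fromMatrix(String[] contigNames, double[][] matrix) {
        List<Interval> intervals = new ArrayList<>();
        for (double[] row : matrix) {
            intervals.add(new Interval(contigNames[(int) row[0]], (int) row[1], (int) row[2]));
        }
        return intervals;
    }
}
```

Storing the contig as an index keeps the interval data purely numeric, so it fits in a single double matrix while the names are written once in a separate array.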
readChunkedDoubleMatrix
public static double[][] readChunkedDoubleMatrix(org.broadinstitute.hdf5.HDF5File file, String path)
Reads a large matrix stored as a set of chunks (submatrices) using the sub-paths and conventions used by writeChunkedDoubleMatrix(org.broadinstitute.hdf5.HDF5File, java.lang.String, double[][], int).
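Conceptually, reading a chunked matrix means stacking the row chunks back together in index order. The sketch below shows only that reassembly step on in-memory arrays; it does not touch an HDF5 file, and the class and method names are hypothetical. It assumes the total row and column counts are known up front (for example, from values stored alongside the chunks).

```java
import java.util.List;

final class ChunkReassemblySketch {
    /**
     * Stacks row chunks, in list order, into one numRows x numColumns matrix.
     * Throws if the chunks do not add up to exactly numRows rows.
     */
    static double[][] concatenateRows(List<double[][]> chunks, int numRows, int numColumns) {
        double[][] full = new double[numRows][numColumns];
        int nextRow = 0;
        for (double[][] chunk : chunks) {
            for (double[] chunkRow : chunk) {
                if (nextRow >= numRows || chunkRow.length != numColumns) {
                    throw new IllegalStateException("Chunk dimensions do not match the expected matrix shape.");
                }
                // Copy the row's values into the corresponding row of the full matrix.
                System.arraycopy(chunkRow, 0, full[nextRow], 0, numColumns);
                nextRow++;
            }
        }
        if (nextRow != numRows) {
            throw new IllegalStateException("Chunks contain fewer rows than expected.");
        }
        return full;
    }
}
```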
writeChunkedDoubleMatrix
public static void writeChunkedDoubleMatrix(org.broadinstitute.hdf5.HDF5File file, String path, double[][] matrix, int maxChunkSize)
Given a large matrix, chunks the matrix into equally sized subsets of rows (plus a subset containing the remainder, if necessary) and writes these submatrices to indexed sub-paths to avoid a hard limit in Java HDF5 on the number of elements in a matrix, given by MAX_NUMBER_OF_VALUES_PER_HDF5_MATRIX. The number of chunks is determined by maxChunkSize, which should be set appropriately for the desired number of columns.
- Parameters:
maxChunkSize - The maximum number of values in each chunk. Decreasing this number will reduce heap usage when writing chunks, which requires subarrays to be copied. However, since a single row is not allowed to be split across multiple chunks, the number of columns must be less than the maximum number of values in each chunk.
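The relationship between maxChunkSize, the number of columns, and the resulting chunks can be illustrated with a minimal in-memory sketch. It assumes each full chunk holds floor(maxChunkSize / numColumns) rows and that a final, smaller chunk holds any remainder; the class and method names are hypothetical, nothing is written to HDF5, and accepting a column count exactly equal to maxChunkSize is a choice made in this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RowChunkingSketch {
    /**
     * Splits a rectangular matrix into consecutive row chunks, each holding
     * at most maxChunkSize values. Rows are never split across chunks.
     */
    static List<double[][]> chunkRows(double[][] matrix, int maxChunkSize) {
        List<double[][]> chunks = new ArrayList<>();
        if (matrix.length == 0) {
            return chunks;
        }
        int numRows = matrix.length;
        int numColumns = matrix[0].length;
        if (numColumns == 0 || numColumns > maxChunkSize) {
            // A single row must fit inside one chunk (assumed boundary: equality is allowed).
            throw new IllegalArgumentException("A row must be non-empty and fit within maxChunkSize values.");
        }
        int rowsPerChunk = maxChunkSize / numColumns; // integer division rounds down
        for (int start = 0; start < numRows; start += rowsPerChunk) {
            int end = Math.min(start + rowsPerChunk, numRows);
            // copyOfRange makes a new outer array that references the same row arrays.
            chunks.add(Arrays.copyOfRange(matrix, start, end));
        }
        return chunks;
    }
}
```

For example, a matrix with 5 rows and 2 columns and maxChunkSize = 4 gives 2 rows per full chunk, so the sketch produces chunks of 2, 2, and 1 rows.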
-