Package htsjdk.samtools.util
Class SortingCollection<T>
java.lang.Object
htsjdk.samtools.util.SortingCollection<T>
- All Implemented Interfaces:
Iterable<T>
Collection to which many records can be added. After all records are added, the collection can be
iterated, and the records will be returned in the order defined by the comparator. Records may be spilled
to a temporary directory if more records are added than will fit in memory. As a result,
the objects returned may not be identical to the objects added to the collection, but they should be
equal as determined by the codec used to write them to disk and read them back.
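To illustrate the "equal but not identical" point: a spilled record is serialized and later rebuilt, so the caller gets a new object back. The sketch below is a minimal, self-contained illustration. The `Read` type and the `encode`/`decode` helpers are hypothetical stand-ins for a record class and its codec; they are not htsjdk API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Objects;

/** Hypothetical record type and codec-style helpers; not part of htsjdk. */
final class RoundTripSketch {
    static final class Read {
        final String name;
        final int position;

        Read(String name, int position) {
            this.name = name;
            this.position = position;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Read
                    && ((Read) o).name.equals(name)
                    && ((Read) o).position == position;
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, position);
        }
    }

    /** Serialize a record to bytes, as would happen when records are spilled. */
    static byte[] encode(Read r) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeUTF(r.name);
            out.writeInt(r.position);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Rebuild a record from bytes; the result is a new object. */
    static Read decode(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            return new Read(in.readUTF(), in.readInt());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Code that relies on reference identity (`==`) for records it has added is therefore fragile; comparing with `equals` is the safer choice.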
When iterating over the collection, the number of file handles required is numRecordsInCollection/maxRecordsInRam. If this becomes a limiting factor, a file handle cache could be added.
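The ratio above can be turned into a quick back-of-the-envelope estimate. The helper below is a hypothetical sketch, not part of the library: it assumes one temporary file per spilled buffer, rounds up for a final partial buffer, and assumes nothing is spilled when all records fit in RAM.

```java
/** Hypothetical helper for estimating file handles; not htsjdk API. */
final class FileHandleEstimate {
    /**
     * Rough number of temporary files open while iterating.
     * Assumes one file per buffer of maxRecordsInRam records.
     */
    static long estimate(long numRecordsInCollection, int maxRecordsInRam) {
        if (maxRecordsInRam <= 0) {
            throw new IllegalArgumentException("maxRecordsInRam must be positive");
        }
        if (numRecordsInCollection <= maxRecordsInRam) {
            return 0; // assumed: everything stayed in memory, nothing was spilled
        }
        // Ceiling division: a trailing partial buffer still needs its own file.
        return (numRecordsInCollection + maxRecordsInRam - 1) / maxRecordsInRam;
    }
}
```

For example, 10,000,000 records with 500,000 records in RAM gives an estimate of 20 handles, which is worth comparing against the process's open-file limit.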
If the Snappy DLL is available and the snappy.disable system property is not set to true, Snappy is used to compress temporary files.
-
Nested Class Summary
static interface SortingCollection.Codec<T>
- Client must implement this interface, which defines the way in which records are written to and read from file.
-
Method Summary
void add(T rec)
void cleanup()
- Delete any temporary files.
void doneAdding()
- This method can be called after caller is done adding to collection, in order to possibly free up memory.
boolean isDestructiveIteration()
CloseableIterator<T> iterator()
- Prepare to iterate through the records in order.
static <T> SortingCollection<T> newInstance(Class<T> componentType, SortingCollection.Codec<T> codec, Comparator<T> comparator, int maxRecordsInRAM)
- Syntactic sugar around the ctor, to save some typing of type parameters.
static <T> SortingCollection<T> newInstance(Class<T> componentType, SortingCollection.Codec<T> codec, Comparator<T> comparator, int maxRecordsInRAM, boolean printRecordSizeSampling)
- Syntactic sugar around the ctor, to save some typing of type parameters.
static <T> SortingCollection<T> newInstance(Class<T> componentType, SortingCollection.Codec<T> codec, Comparator<T> comparator, int maxRecordsInRAM, boolean printRecordSizeSampling, Path... tmpDir)
- Syntactic sugar around the ctor, to save some typing of type parameters.
static <T> SortingCollection<T> newInstance(Class<T> componentType, SortingCollection.Codec<T> codec, Comparator<T> comparator, int maxRecordsInRAM, File... tmpDir)
- Deprecated since 2017-09.
static <T> SortingCollection<T> newInstance(Class<T> componentType, SortingCollection.Codec<T> codec, Comparator<T> comparator, int maxRecordsInRAM, Path... tmpDir)
- Syntactic sugar around the ctor, to save some typing of type parameters.
static <T> SortingCollection<T> newInstance(Class<T> componentType, SortingCollection.Codec<T> codec, Comparator<T> comparator, int maxRecordsInRAM, Collection<File> tmpDirs)
- Deprecated since 2017-09.
static <T> SortingCollection<T> newInstanceFromPaths(Class<T> componentType, SortingCollection.Codec<T> codec, Comparator<T> comparator, int maxRecordsInRAM, Collection<Path> tmpDirs)
- Syntactic sugar around the ctor, to save some typing of type parameters.
void setDestructiveIteration(boolean destructiveIteration)
- Tell this collection that it is allowed to discard data during iteration in order to reduce memory footprint, precluding a second iteration.
void spillToDisk()
- Sort the records in memory, write them to a file, and clear the buffer of records in memory.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface java.lang.Iterable
forEach, spliterator
-
Method Details
-
add
public void add(T rec)
-
doneAdding
public void doneAdding()
This method can be called after caller is done adding to collection, in order to possibly free up memory. If iterator() is called immediately after caller is done adding, this is not necessary, because iterator() triggers the same freeing.
-
isDestructiveIteration
public boolean isDestructiveIteration()
Returns:
- True if this collection is allowed to discard data during iteration in order to reduce memory footprint, precluding a second iteration over the collection.
-
setDestructiveIteration
public void setDestructiveIteration(boolean destructiveIteration)
Tell this collection that it is allowed to discard data during iteration in order to reduce memory footprint, precluding a second iteration. This is true by default.
-
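The trade-off behind setDestructiveIteration can be pictured with a small stand-alone sketch. The class below is hypothetical and is not htsjdk's implementation; it assumes that "discarding data" means dropping the reference to each record as soon as it has been returned, so the garbage collector can reclaim it.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical illustration of destructive iteration; not htsjdk's implementation. */
final class DestructiveIterationSketch<T> implements Iterable<T> {
    private final T[] records;
    private final boolean destructive;

    DestructiveIterationSketch(T[] records, boolean destructive) {
        this.records = records;
        this.destructive = destructive;
    }

    @Override
    public Iterator<T> iterator() {
        return new Iterator<T>() {
            private int next = 0;

            @Override
            public boolean hasNext() {
                return next < records.length;
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                T rec = records[next];
                if (destructive) {
                    // Drop the reference so the record can be garbage collected.
                    records[next] = null;
                }
                next++;
                return rec;
            }
        };
    }
}
```

With `destructive` set to true, a second pass over the same backing array would only see nulls, which is why a second iteration is precluded in that mode.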
spillToDisk
public void spillToDisk()
Sort the records in memory, write them to a file, and clear the buffer of records in memory.
-
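The three steps named in the spillToDisk description (sort, write, clear) can be sketched as follows. This is a simplified illustration with hypothetical names (`SpillSketch`, `ramRecords`, `spilledFiles`), using plain strings written one per line instead of a codec; it is not the library's code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative only; class and field names are hypothetical, not htsjdk's. */
final class SpillSketch {
    final List<String> ramRecords = new ArrayList<>();
    final List<Path> spilledFiles = new ArrayList<>();
    private final Comparator<String> comparator;

    SpillSketch(Comparator<String> comparator) {
        this.comparator = comparator;
    }

    void add(String rec) {
        ramRecords.add(rec);
    }

    /** Sort the buffer, write it to a new temporary file, then clear the buffer. */
    void spill() {
        if (ramRecords.isEmpty()) {
            return;
        }
        ramRecords.sort(comparator);
        try {
            Path file = Files.createTempFile("sorting-sketch", ".tmp");
            file.toFile().deleteOnExit();
            // One record per line stands in for codec-based encoding.
            Files.write(file, ramRecords, StandardCharsets.UTF_8);
            spilledFiles.add(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        ramRecords.clear();
    }
}
```

Each call produces one sorted file, so the files can later be merged without loading them fully into memory.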
iterator
public CloseableIterator<T> iterator()
Prepare to iterate through the records in order. This method may be called more than once, but add() may not be called after this method has been called.
-
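Returning records in order from several individually sorted spill files is commonly done with a k-way merge. The sketch below shows that general technique over in-memory lists; treating it as what iterator() does internally is an assumption, and the names `MergeSketch` and `Cursor` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

/** Illustrative k-way merge; names are hypothetical, not htsjdk's. */
final class MergeSketch {
    /** Pairs the current head of a sorted run with the iterator over the rest of it. */
    private static final class Cursor<T> {
        T head;
        final Iterator<T> rest;

        Cursor(Iterator<T> it) {
            this.rest = it;
            this.head = it.next();
        }
    }

    /** Merge already-sorted runs into one sorted list. */
    static <T> List<T> merge(List<List<T>> sortedRuns, Comparator<T> comparator) {
        Comparator<Cursor<T>> byHead = (a, b) -> comparator.compare(a.head, b.head);
        PriorityQueue<Cursor<T>> queue = new PriorityQueue<>(byHead);
        for (List<T> run : sortedRuns) {
            if (!run.isEmpty()) {
                queue.add(new Cursor<>(run.iterator()));
            }
        }
        List<T> out = new ArrayList<>();
        while (!queue.isEmpty()) {
            Cursor<T> smallest = queue.poll();
            out.add(smallest.head);
            if (smallest.rest.hasNext()) {
                smallest.head = smallest.rest.next();
                queue.add(smallest); // re-insert with its new head
            }
        }
        return out;
    }
}
```

Only one record per run needs to be held at a time, which is why each run (each temporary file, in the on-disk case) must stay open for the duration of the iteration.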
cleanup
public void cleanup()
Delete any temporary files. After this method is called, iterator() may not be called.
-
newInstance
@Deprecated
public static <T> SortingCollection<T> newInstance(Class<T> componentType, SortingCollection.Codec<T> codec, Comparator<T> comparator, int maxRecordsInRAM, File... tmpDir)
Deprecated since 2017-09. Use newInstance(Class, Codec, Comparator, int, Path...) instead.
Syntactic sugar around the ctor, to save some typing of type parameters.
Parameters:
componentType
- Class of the record to be sorted. Necessary because of Java generic lameness.
codec
- For writing records to file and reading them back into RAM
comparator
- Defines output sort order
maxRecordsInRAM
- how many records to accumulate in memory before spilling to disk
tmpDir
- Where to write files of records that will not fit in RAM
-
newInstance
@Deprecated
public static <T> SortingCollection<T> newInstance(Class<T> componentType, SortingCollection.Codec<T> codec, Comparator<T> comparator, int maxRecordsInRAM, Collection<File> tmpDirs)
Deprecated since 2017-09. Use newInstanceFromPaths(Class, Codec, Comparator, int, Collection) instead.
Syntactic sugar around the ctor, to save some typing of type parameters.
Parameters:
componentType
- Class of the record to be sorted. Necessary because of Java generic lameness.
codec
- For writing records to file and reading them back into RAM
comparator
- Defines output sort order
maxRecordsInRAM
- how many records to accumulate in memory before spilling to disk
tmpDirs
- Where to write files of records that will not fit in RAM
-
newInstance
public static <T> SortingCollection<T> newInstance(Class<T> componentType, SortingCollection.Codec<T> codec, Comparator<T> comparator, int maxRecordsInRAM, boolean printRecordSizeSampling)
Syntactic sugar around the ctor, to save some typing of type parameters. Writes files to java.io.tmpdir.
Parameters:
componentType
- Class of the record to be sorted. Necessary because of Java generic lameness.
codec
- For writing records to file and reading them back into RAM
comparator
- Defines output sort order
maxRecordsInRAM
- how many records to accumulate in memory before spilling to disk
printRecordSizeSampling
- If true record size will be sampled and output at DEBUG log level
-
newInstance
public static <T> SortingCollection<T> newInstance(Class<T> componentType, SortingCollection.Codec<T> codec, Comparator<T> comparator, int maxRecordsInRAM, boolean printRecordSizeSampling, Path... tmpDir)
Syntactic sugar around the ctor, to save some typing of type parameters.
Parameters:
componentType
- Class of the record to be sorted. Necessary because of Java generic lameness.
codec
- For writing records to file and reading them back into RAM
comparator
- Defines output sort order
maxRecordsInRAM
- how many records to accumulate in memory before spilling to disk
printRecordSizeSampling
- If true record size will be sampled and output at DEBUG log level
tmpDir
- Where to write files of records that will not fit in RAM
-
newInstance
public static <T> SortingCollection<T> newInstance(Class<T> componentType, SortingCollection.Codec<T> codec, Comparator<T> comparator, int maxRecordsInRAM)
Syntactic sugar around the ctor, to save some typing of type parameters. Writes files to java.io.tmpdir.
Parameters:
componentType
- Class of the record to be sorted. Necessary because of Java generic lameness.
codec
- For writing records to file and reading them back into RAM
comparator
- Defines output sort order
maxRecordsInRAM
- how many records to accumulate in memory before spilling to disk
-
newInstance
public static <T> SortingCollection<T> newInstance(Class<T> componentType, SortingCollection.Codec<T> codec, Comparator<T> comparator, int maxRecordsInRAM, Path... tmpDir)
Syntactic sugar around the ctor, to save some typing of type parameters.
Parameters:
componentType
- Class of the record to be sorted. Necessary because of Java generic lameness.
codec
- For writing records to file and reading them back into RAM
comparator
- Defines output sort order
maxRecordsInRAM
- how many records to accumulate in memory before spilling to disk
tmpDir
- Where to write files of records that will not fit in RAM
-
newInstanceFromPaths
public static <T> SortingCollection<T> newInstanceFromPaths(Class<T> componentType, SortingCollection.Codec<T> codec, Comparator<T> comparator, int maxRecordsInRAM, Collection<Path> tmpDirs)
Syntactic sugar around the ctor, to save some typing of type parameters.
Parameters:
componentType
- Class of the record to be sorted. Necessary because of Java generic lameness.
codec
- For writing records to file and reading them back into RAM
comparator
- Defines output sort order
maxRecordsInRAM
- how many records to accumulate in memory before spilling to disk
tmpDirs
- Where to write files of records that will not fit in RAM
-