Class MarkDuplicatesWithMateCigarIterator

java.lang.Object
picard.sam.markduplicates.MarkDuplicatesWithMateCigarIterator
All Implemented Interfaces:
htsjdk.samtools.SAMRecordIterator, htsjdk.samtools.util.CloseableIterator<htsjdk.samtools.SAMRecord>, Closeable, AutoCloseable, Iterator<htsjdk.samtools.SAMRecord>

public class MarkDuplicatesWithMateCigarIterator extends Object implements htsjdk.samtools.SAMRecordIterator
Iterates over a coordinate-sorted stream of SAM records (the supplied iterator) and either marks or removes duplicates, depending on how it is configured. This class relies on the coordinate sort order as well as the mate cigar (MC) optional SAM tag.
  • Constructor Details

    • MarkDuplicatesWithMateCigarIterator

      public MarkDuplicatesWithMateCigarIterator(htsjdk.samtools.SAMFileHeader header, htsjdk.samtools.util.CloseableIterator<htsjdk.samtools.SAMRecord> iterator, OpticalDuplicateFinder opticalDuplicateFinder, htsjdk.samtools.DuplicateScoringStrategy.ScoringStrategy duplicateScoringStrategy, int toMarkQueueMinimumDistance, boolean removeDuplicates, boolean skipPairsWithNoMateCigar, int maxRecordsInRam, int blockSize, List<File> tmpDirs) throws PicardException
      Initializes the mark duplicates iterator.
      Parameters:
      header - the SAM header
      iterator - an iterator over the SAM records to consider
      opticalDuplicateFinder - the algorithm for optical duplicate detection
      duplicateScoringStrategy - the scoring strategy for choosing duplicates. This cannot be SUM_OF_BASE_QUALITIES.
      toMarkQueueMinimumDistance - the minimum distance, in bases, over which records are buffered while searching for duplicates
      removeDuplicates - true to remove duplicates, false to mark duplicates
      skipPairsWithNoMateCigar - true to not return mapped pairs with no mate cigar, false otherwise
      maxRecordsInRam - the maximum number of records to hold in memory before spilling to disk
      blockSize - the size of the blocks in the underlying buffer/queue
      tmpDirs - the temporary directories to use if we spill records to disk
      Throws:
      PicardException - if the inputs are not in coordinate sort order
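The removeDuplicates flag described above selects between two behaviours: returning every record with duplicates flagged, or dropping duplicates from the output. The following is a minimal, self-contained sketch of that switch only. It does not use Picard or htsjdk: SketchRecord, MarkOrRemoveSketch and MarkOrRemoveDemo are hypothetical names, and the sketch assumes each record already knows whether it is a duplicate, whereas the real class works that out from coordinates, mate cigars and the scoring strategy.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for a SAM record; not htsjdk's SAMRecord. */
final class SketchRecord {
    final String name;
    final boolean isDuplicate; // assumption: decided upstream, outside this sketch
    boolean duplicateFlag;     // set when the record is "marked"

    SketchRecord(String name, boolean isDuplicate) {
        this.name = name;
        this.isDuplicate = isDuplicate;
    }
}

/** Hypothetical iterator showing only the mark-versus-remove switch. */
final class MarkOrRemoveSketch implements Iterator<SketchRecord> {
    private final Iterator<SketchRecord> input;
    private final boolean removeDuplicates;
    private SketchRecord lookahead; // next record to return, or null when exhausted
    private int numDuplicates;

    MarkOrRemoveSketch(Iterator<SketchRecord> input, boolean removeDuplicates) {
        this.input = input;
        this.removeDuplicates = removeDuplicates;
        advance();
    }

    /** Reads ahead to the next record that should be returned. */
    private void advance() {
        lookahead = null;
        while (input.hasNext()) {
            SketchRecord record = input.next();
            if (record.isDuplicate) {
                numDuplicates++;
                if (removeDuplicates) {
                    continue; // remove: the duplicate is never returned
                }
                record.duplicateFlag = true; // mark: flag it and still return it
            }
            lookahead = record;
            return;
        }
    }

    @Override
    public boolean hasNext() {
        return lookahead != null;
    }

    @Override
    public SketchRecord next() {
        if (lookahead == null) {
            throw new NoSuchElementException();
        }
        SketchRecord result = lookahead;
        advance();
        return result;
    }

    /** The count is only final once the iterator has been exhausted. */
    int getNumDuplicates() {
        return numDuplicates;
    }
}

final class MarkOrRemoveDemo {
    public static void main(String[] args) {
        for (boolean remove : new boolean[] {false, true}) {
            List<SketchRecord> records = List.of(
                    new SketchRecord("r1", false),
                    new SketchRecord("r2", true),
                    new SketchRecord("r3", false));
            MarkOrRemoveSketch it = new MarkOrRemoveSketch(records.iterator(), remove);
            int returned = 0;
            while (it.hasNext()) {
                it.next();
                returned++;
            }
            System.out.println("removeDuplicates=" + remove
                    + " returned=" + returned
                    + " duplicates=" + it.getNumDuplicates());
        }
    }
}
```

In this sketch, both modes count one duplicate; marking returns all three records and removing returns two. Because the sketch reads one record ahead, its duplicate count is only complete after the iterator is exhausted.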
  • Method Details

    • logMemoryStats

      public void logMemoryStats(htsjdk.samtools.util.Log log)
    • assertSorted

      public htsjdk.samtools.SAMRecordIterator assertSorted(htsjdk.samtools.SAMFileHeader.SortOrder sortOrder)
      Establishes that records returned by this iterator are expected to be in the specified sort order. Once this method has been called, implementers must throw an IllegalStateException from next() when a record is read that violates the sort order. This method may be called multiple times over the course of an iteration to change the expected sort order. From the time it is called, the iterator validates whatever sort order is set, or stops validating if it is set to null or SAMFileHeader.SortOrder.unsorted. If this method is never called, the iterated records are not validated.
      Specified by:
      assertSorted in interface htsjdk.samtools.SAMRecordIterator
      Parameters:
      sortOrder - The order in which records are expected to be returned
      Returns:
      This SAMRecordIterator
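The assertSorted contract described above can be illustrated with a small, self-contained sketch. It is not the htsjdk or Picard implementation: SortCheckingIterator and SortCheckingDemo are hypothetical names, a java.util.Comparator stands in for SAMFileHeader.SortOrder, and the sketch assumes the previously returned element is kept when the expected order changes.

```java
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

/** Hypothetical sort-validating wrapper; not htsjdk or Picard code. */
final class SortCheckingIterator<T> implements Iterator<T> {
    private final Iterator<T> delegate;
    private Comparator<? super T> expectedOrder; // null means "do not validate"
    private T previous;                          // last element returned, if any

    SortCheckingIterator(Iterator<T> delegate) {
        this.delegate = delegate;
    }

    /** Sets (or, with null, clears) the expected order and returns this iterator. */
    SortCheckingIterator<T> assertSorted(Comparator<? super T> order) {
        this.expectedOrder = order;
        return this;
    }

    @Override
    public boolean hasNext() {
        return delegate.hasNext();
    }

    @Override
    public T next() {
        T current = delegate.next();
        // Validate only when an order has been set and there is a predecessor to compare with.
        if (expectedOrder != null && previous != null
                && expectedOrder.compare(previous, current) > 0) {
            throw new IllegalStateException(
                    "Element " + current + " violates the expected sort order after " + previous);
        }
        previous = current;
        return current;
    }
}

final class SortCheckingDemo {
    public static void main(String[] args) {
        Iterator<Integer> it = new SortCheckingIterator<>(List.of(1, 3, 2).iterator())
                .assertSorted(Comparator.<Integer>naturalOrder());
        it.next(); // 1
        it.next(); // 3
        try {
            it.next(); // 2 follows 3, which breaks ascending order
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Returning the iterator from assertSorted allows the call to be chained at construction time, as the demo does. Passing null switches validation off again.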
    • hasNext

      public boolean hasNext()
      Specified by:
      hasNext in interface Iterator<htsjdk.samtools.SAMRecord>
    • next

      public htsjdk.samtools.SAMRecord next() throws PicardException
      Specified by:
      next in interface Iterator<htsjdk.samtools.SAMRecord>
      Throws:
      PicardException
    • remove

      public void remove()
      Specified by:
      remove in interface Iterator<htsjdk.samtools.SAMRecord>
    • close

      public void close()
      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface Closeable
      Specified by:
      close in interface htsjdk.samtools.util.CloseableIterator<htsjdk.samtools.SAMRecord>
    • getNumRecordsWithNoMateCigar

      public long getNumRecordsWithNoMateCigar()
      Useful for statistics after the iterator has been exhausted and closed.
    • getNumDuplicates

      public int getNumDuplicates()
    • getLibraryIdGenerator

      public LibraryIdGenerator getLibraryIdGenerator()
    • getOpticalDupesByLibraryId

      public htsjdk.samtools.util.Histogram<Short> getOpticalDupesByLibraryId()