Class PositionalDownsampler

All Implemented Interfaces:
PushPullTransformer<GATKRead>

public final class PositionalDownsampler extends ReadsDownsampler
PositionalDownsampler: Downsample each stack of reads at each alignment start to a size <= a target coverage using a ReservoirDownsampler. Stores only O(target coverage) reads in memory at any given time, provided the client regularly calls consumeFinalizedItems(). Unmapped reads with assigned positions are subject to downsampling in the same way as mapped reads, but unmapped reads without assigned positions are not subject to downsampling.
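The description above relies on reservoir sampling to hold each stack to at most the target coverage while storing only a bounded number of reads. As a rough illustration of that idea, here is a minimal, standalone sketch of classic reservoir sampling (Algorithm R) in plain Java. `ReservoirSketch` and all of its members are hypothetical names invented for this example. It has no GATK dependency, and GATK's actual ReservoirDownsampler may differ in its details.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Illustrative only: a generic reservoir sampler (Algorithm R).
// This is NOT GATK's ReservoirDownsampler; all names here are hypothetical.
final class ReservoirSketch<T> {
    private final int capacity;                 // plays the role of targetCoverage
    private final Random rng;
    private final List<T> reservoir = new ArrayList<>();
    private long seen = 0;                      // total items offered so far

    ReservoirSketch(int capacity, long seed) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be > 0");
        }
        this.capacity = capacity;
        this.rng = new Random(seed);
    }

    // Offer one item; the reservoir never holds more than `capacity` items.
    void offer(T item) {
        seen++;
        if (reservoir.size() < capacity) {
            reservoir.add(item);                // fill phase: keep everything
        } else {
            // Pick a slot uniformly in [0, seen); the new item survives
            // only if that slot falls inside the reservoir.
            long slot = (long) (rng.nextDouble() * seen);
            if (slot < capacity) {
                reservoir.set((int) slot, item);
            }
        }
    }

    // Copy of the current survivors.
    List<T> contents() {
        return new ArrayList<>(reservoir);
    }

    // Number of offered items that are not currently held.
    long discarded() {
        return seen - reservoir.size();
    }
}
```

With a capacity of 3 and 100 offered items, the sketch ends up holding exactly 3 items and reports 97 as discarded, which mirrors the "size <= a target coverage" guarantee for a single stack.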
  • Constructor Summary

    Constructors
    Constructor
    Description
    PositionalDownsampler(int targetCoverage, htsjdk.samtools.SAMFileHeader header)
    Construct a PositionalDownsampler
    PositionalDownsampler(int targetCoverage, htsjdk.samtools.SAMFileHeader header, boolean nonRandomDownsamplingMode)
     
  • Method Summary

    Modifier and Type
    Method
    Description
    void
    clearItems()
    Empty the downsampler of all finalized/pending items
    List<GATKRead>
    consumeFinalizedItems()
    Return (and *remove*) all items that have survived downsampling and are waiting to be retrieved.
    boolean
    hasFinalizedItems()
    Are there items that have survived the downsampling process waiting to be retrieved?
    boolean
    hasPendingItems()
    Are there items stored in this downsampler for which it is not yet known whether they will ultimately survive the downsampling process?
    GATKRead
    peekFinalized()
    Peek at the first finalized item stored in this downsampler (or null if there are no finalized items)
    GATKRead
    peekPending()
    Peek at the first pending item stored in this downsampler (or null if there are no pending items)
    boolean
    requiresCoordinateSortOrder()
    Does this downsampler require that reads be fed to it in coordinate order?
    void
    signalEndOfInput()
    Used to tell the downsampler that no more items will be submitted to it, and that it should finalize any pending items.
    void
    signalNoMoreReadsBefore(GATKRead read)
    Tell this downsampler that no more reads located before the provided read (according to the sort order of the read stream) will be fed to it.
    int
    size()
    Get the current number of items in this downsampler. This should be the best estimate of the total number of elements that will come out of the downsampler were consumeFinalizedItems() to be called immediately after this call.
    void
    submit(GATKRead newRead)
    Submit one item to the downsampler for consideration.

    Methods inherited from class org.broadinstitute.hellbender.utils.downsampling.Downsampler

    getNumberOfDiscardedItems, incrementNumberOfDiscardedItems, resetStats, submit

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • PositionalDownsampler

      public PositionalDownsampler(int targetCoverage, htsjdk.samtools.SAMFileHeader header)
      Construct a PositionalDownsampler
      Parameters:
      targetCoverage - Maximum number of reads that may share any given alignment start position. Must be > 0
      header - SAMFileHeader to use to determine contig ordering. Non-null.
    • PositionalDownsampler

      public PositionalDownsampler(int targetCoverage, htsjdk.samtools.SAMFileHeader header, boolean nonRandomDownsamplingMode)
  • Method Details

    • submit

      public void submit(GATKRead newRead)
      Description copied from class: Downsampler
      Submit one item to the downsampler for consideration. Some downsamplers will be able to determine immediately whether the item survives the downsampling process, while others will need to see more items before making that determination.
      Specified by:
      submit in interface PushPullTransformer<GATKRead>
      Specified by:
      submit in class Downsampler<GATKRead>
      Parameters:
      newRead - the individual item to submit to the downsampler for consideration
    • hasFinalizedItems

      public boolean hasFinalizedItems()
      Description copied from class: Downsampler
      Are there items that have survived the downsampling process waiting to be retrieved?
      Specified by:
      hasFinalizedItems in interface PushPullTransformer<GATKRead>
      Specified by:
      hasFinalizedItems in class Downsampler<GATKRead>
      Returns:
      true if this downsampler has > 0 finalized items, otherwise false
    • consumeFinalizedItems

      public List<GATKRead> consumeFinalizedItems()
      Description copied from class: Downsampler
      Return (and *remove*) all items that have survived downsampling and are waiting to be retrieved.
      Specified by:
      consumeFinalizedItems in interface PushPullTransformer<GATKRead>
      Specified by:
      consumeFinalizedItems in class Downsampler<GATKRead>
      Returns:
      a list of all finalized items this downsampler contains, or an empty list if there are none
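The submit / hasFinalizedItems / consumeFinalizedItems / signalEndOfInput methods form a push-pull protocol: the client pushes reads in, drains survivors regularly, and flushes at the end. The sketch below illustrates that calling pattern with a toy stand-in. `ToyPositionalDownsampler` is a hypothetical class written for this example, not the GATK class. It models reads as integer alignment starts on a single contig, and for determinism it keeps the first `cap` reads per start instead of sampling randomly.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for illustration only; NOT the GATK class.
// Assumptions: reads are integer alignment starts on one contig, input is
// coordinate-sorted, and the first `cap` reads per start are kept.
final class ToyPositionalDownsampler {
    private final int cap;
    private final List<Integer> pending = new ArrayList<>();   // current stack, not yet finalized
    private final List<Integer> finalized = new ArrayList<>(); // survivors awaiting retrieval
    private int currentStart = -1;

    ToyPositionalDownsampler(int cap) {
        this.cap = cap;
    }

    void submit(int start) {
        if (start != currentStart) {        // a new position closes the previous stack
            finalizePending();
            currentStart = start;
        }
        if (pending.size() < cap) {
            pending.add(start);             // reads beyond the cap are dropped
        }
    }

    boolean hasFinalizedItems() {
        return !finalized.isEmpty();
    }

    // Return *and remove* all finalized items.
    List<Integer> consumeFinalizedItems() {
        List<Integer> out = new ArrayList<>(finalized);
        finalized.clear();
        return out;
    }

    void signalEndOfInput() {
        finalizePending();
    }

    private void finalizePending() {
        finalized.addAll(pending);
        pending.clear();
    }

    // Client-side loop: drain finalized items as they appear, then flush.
    static List<Integer> run(int cap, int[] sortedStarts) {
        ToyPositionalDownsampler d = new ToyPositionalDownsampler(cap);
        List<Integer> kept = new ArrayList<>();
        for (int s : sortedStarts) {
            d.submit(s);
            if (d.hasFinalizedItems()) {
                kept.addAll(d.consumeFinalizedItems());
            }
        }
        d.signalEndOfInput();               // finalize the last stack
        kept.addAll(d.consumeFinalizedItems());
        return kept;
    }
}
```

Draining inside the loop is what keeps the number of stored items small: in this toy, at most one stack of `cap` pending reads plus the most recently finalized stack is ever held at once.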
    • hasPendingItems

      public boolean hasPendingItems()
      Description copied from class: Downsampler
      Are there items stored in this downsampler for which it is not yet known whether they will ultimately survive the downsampling process?
      Specified by:
      hasPendingItems in class Downsampler<GATKRead>
      Returns:
      true if this downsampler has > 0 pending items, otherwise false
    • peekFinalized

      public GATKRead peekFinalized()
      Description copied from class: Downsampler
      Peek at the first finalized item stored in this downsampler (or null if there are no finalized items)
      Specified by:
      peekFinalized in class Downsampler<GATKRead>
      Returns:
      the first finalized item in this downsampler (the item is not removed from the downsampler by this call), or null if there are none
    • peekPending

      public GATKRead peekPending()
      Description copied from class: Downsampler
      Peek at the first pending item stored in this downsampler (or null if there are no pending items)
      Specified by:
      peekPending in class Downsampler<GATKRead>
      Returns:
      the first pending item stored in this downsampler (the item is not removed from the downsampler by this call), or null if there are none
    • size

      public int size()
      Description copied from class: Downsampler
      Get the current number of items in this downsampler. This should be the best estimate of the total number of elements that will come out of the downsampler were consumeFinalizedItems() to be called immediately after this call. In other words, it should be the number of finalized items plus an estimate of the number of pending items that will ultimately be included as well.
      Specified by:
      size in class Downsampler<GATKRead>
      Returns:
      a positive integer
    • signalEndOfInput

      public void signalEndOfInput()
      Description copied from class: Downsampler
      Used to tell the downsampler that no more items will be submitted to it, and that it should finalize any pending items.
      Specified by:
      signalEndOfInput in interface PushPullTransformer<GATKRead>
      Specified by:
      signalEndOfInput in class Downsampler<GATKRead>
    • clearItems

      public void clearItems()
      Description copied from class: Downsampler
      Empty the downsampler of all finalized/pending items
      Specified by:
      clearItems in class Downsampler<GATKRead>
    • requiresCoordinateSortOrder

      public boolean requiresCoordinateSortOrder()
      Description copied from class: ReadsDownsampler
      Does this downsampler require that reads be fed to it in coordinate order?
      Specified by:
      requiresCoordinateSortOrder in class ReadsDownsampler
      Returns:
      true if reads must be submitted to this downsampler in coordinate order, otherwise false
    • signalNoMoreReadsBefore

      public void signalNoMoreReadsBefore(GATKRead read)
      Description copied from class: ReadsDownsampler
      Tell this downsampler that no more reads located before the provided read (according to the sort order of the read stream) will be fed to it. Allows position-aware downsamplers to finalize pending reads earlier than they would otherwise be able to, particularly when doing per-sample downsampling and reads for certain samples are sparser than average.
      Specified by:
      signalNoMoreReadsBefore in class ReadsDownsampler
      Parameters:
      read - the downsampler will assume that no reads located before this read will ever be submitted to it in the future
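
The effect of signalNoMoreReadsBefore can be pictured with a small toy model: a stack of pending reads at one position can be finalized as soon as the downsampler learns that nothing earlier than some later read will arrive. `ToyEarlyFinalizer` below is a hypothetical class written for this example, not the GATK implementation. It models reads as integer alignment starts on a single contig and omits the downsampling step itself so that only the pending-to-finalized transition is visible.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only; NOT the GATK implementation.
// Assumptions: reads are integer alignment starts on one contig and
// arrive in coordinate order; no reads are actually discarded here.
final class ToyEarlyFinalizer {
    private final List<Integer> pending = new ArrayList<>();
    private final List<Integer> finalized = new ArrayList<>();
    private int pendingStart = -1;

    void submit(int start) {
        // In sorted input, a submitted read carries the same guarantee
        // as an explicit signal for its own position.
        signalNoMoreReadsBefore(start);
        pendingStart = start;
        pending.add(start);
    }

    // A pending stack strictly before `start` can no longer grow, so finalize it.
    void signalNoMoreReadsBefore(int start) {
        if (!pending.isEmpty() && pendingStart < start) {
            finalized.addAll(pending);
            pending.clear();
        }
    }

    boolean hasPendingItems() {
        return !pending.isEmpty();
    }

    boolean hasFinalizedItems() {
        return !finalized.isEmpty();
    }

    // Finalized items plus pending items (all pending items survive in this toy).
    int size() {
        return finalized.size() + pending.size();
    }
}
```

In this model, two reads submitted at position 100 stay pending until either a read at a later position is submitted or the client calls `signalNoMoreReadsBefore` with a later position such as 500. A signal at position 100 itself changes nothing, because more reads at 100 could still arrive.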