Class SRAIndex

java.lang.Object
htsjdk.samtools.SRAIndex
All Implemented Interfaces:
BAMIndex, BrowseableBAMIndex, Closeable, AutoCloseable

public class SRAIndex extends Object implements BrowseableBAMIndex
Emulates a BAM index so that we can request chunks of records from SRAFileReader.

Here is how it works. SRA allows fast reading of alignments by reference position, so we define our "file" range for alignments as the combined length of all references. Reading unaligned reads is then fast if we use read positions for lookup and (internally) filter out aligned fragments. The total SRA "file" range is calculated as the sum of all reference lengths plus the number of reads (both aligned and unaligned) in the SRA archive. Now we can use Chunks to look up aligned and unaligned fragments.

We emulate BAM index bins by mapping SRA reference positions to bin numbers. We then map each bin number to a list of chunks, which represent SRA "file" positions (which are simply reference positions). We only emulate the last level of BAM index bins (each of which refers to a portion of the reference SRA_BIN_SIZE bases long). For all other bins a RuntimeException is thrown (but since nobody except the SRAIndex class creates bins, that is fine).

Because the last level of bins was not meant to refer to fragments that only partially overlap the bin's reference positions, we also return a chunk that extends 5000 bases to the left of the beginning of the bin, to ensure that fragments that start before the bin but still overlap it can be retrieved by the SRA reader. Later we will add support to the NGS API for getting the maximum number of bases that we need to go left to retrieve such fragments.

Created by andrii.nikitiuk on 9/4/15.
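The position arithmetic described above can be sketched roughly as follows. This is a minimal standalone illustration, not the SRAIndex implementation: the class and method names are hypothetical, the reference lengths are made-up values, and clamping the chunk start at zero is an assumption. Only the "sum of reference lengths plus number of reads" rule and the 5000-base left extension come from the description.

```java
import java.util.List;

// Hypothetical sketch of the emulated SRA "file" range; not part of htsjdk.
final class SraFileRangeSketch {

    /** Total emulated "file" range: sum of all reference lengths plus the number of reads. */
    static long totalFileRange(List<Long> referenceLengths, long totalReads) {
        long range = 0;
        for (long length : referenceLengths) {
            range += length;
        }
        return range + totalReads;
    }

    /** Emulated "file" position of a 0-based offset on the reference with the given index. */
    static long filePosition(List<Long> referenceLengths, int referenceIndex, long offsetOnReference) {
        long position = 0;
        for (int i = 0; i < referenceIndex; i++) {
            position += referenceLengths.get(i);
        }
        return position + offsetOnReference;
    }

    /**
     * Start of the chunk returned for a bin: the bin start moved left by leftOverlap bases
     * so that fragments starting before the bin are still read.
     * Clamping at zero is an assumption made for this sketch.
     */
    static long chunkStart(long binStart, long leftOverlap) {
        return Math.max(0L, binStart - leftOverlap);
    }

    public static void main(String[] args) {
        List<Long> lengths = List.of(1000L, 2500L);         // illustrative reference lengths
        System.out.println(totalFileRange(lengths, 300L));  // prints 3800
        System.out.println(filePosition(lengths, 1, 40L));  // prints 1040
        System.out.println(chunkStart(20000L, 5000L));      // prints 15000
    }
}
```

Under these assumptions, unaligned reads would occupy the tail of the range, after the last reference position.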
  • Field Details

    • SRA_BIN_SIZE

      public static final int SRA_BIN_SIZE
      Number of reference bases that a bin in the last level can represent.
    • SRA_CHUNK_SIZE

      public static final int SRA_CHUNK_SIZE
      Size of the chunks that are created when using the SRA index.
  • Constructor Details

  • Method Details

    • getLevelSize

      public int getLevelSize(int levelNumber)
      Gets the size (number of bins) of a given level of a BAM index.
      Specified by:
      getLevelSize in interface BrowseableBAMIndex
      Parameters:
      levelNumber - Level for which to inspect the size.
      Returns:
      Size of the given level.
    • getLevelForBin

      public int getLevelForBin(Bin bin)
      SRA only operates on bins from the last level.
      Specified by:
      getLevelForBin in interface BrowseableBAMIndex
      Parameters:
      bin - The bin for which to determine the level.
      Returns:
      The level of the given bin.
    • getFirstLocusInBin

      public int getFirstLocusInBin(Bin bin)
      Gets the first locus that this bin can index into.
      Specified by:
      getFirstLocusInBin in interface BrowseableBAMIndex
      Parameters:
      bin - The bin to test.
      Returns:
      The first position associated with the given bin number.
    • getLastLocusInBin

      public int getLastLocusInBin(Bin bin)
      Gets the last locus that this bin can index into.
      Specified by:
      getLastLocusInBin in interface BrowseableBAMIndex
      Parameters:
      bin - The bin to test.
      Returns:
      The last position associated with the given bin number.
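As a rough illustration of what getFirstLocusInBin and getLastLocusInBin compute for last-level bins, the sketch below derives the first and last locus from a bin's 0-based ordinal within the last level. The class name, the ordinal parameter, the bin size used in the example, and the 1-based locus convention are assumptions made for this sketch; the actual method takes a Bin and uses SRA_BIN_SIZE.

```java
// Hypothetical sketch of the locus range covered by a last-level bin; not part of htsjdk.
final class SraBinLocusSketch {

    /** First 1-based locus covered by the bin with the given 0-based ordinal. */
    static int firstLocus(int binOrdinal, int binSize) {
        return binOrdinal * binSize + 1;
    }

    /** Last 1-based locus covered by the bin with the given 0-based ordinal. */
    static int lastLocus(int binOrdinal, int binSize) {
        return (binOrdinal + 1) * binSize;
    }

    public static void main(String[] args) {
        int binSize = 100; // illustrative value, not the real SRA_BIN_SIZE
        System.out.println(firstLocus(0, binSize)); // prints 1
        System.out.println(lastLocus(0, binSize));  // prints 100
        System.out.println(firstLocus(2, binSize)); // prints 201
    }
}
```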
    • getBinsOverlapping

      public BinList getBinsOverlapping(int referenceIndex, int startPos, int endPos)
      Provides a list of bins that contain bases at the requested positions.
      Specified by:
      getBinsOverlapping in interface BrowseableBAMIndex
      Parameters:
      referenceIndex - index of the reference sequence of the desired SAMRecords
      startPos - 1-based start of the desired interval, inclusive
      endPos - 1-based end of the desired interval, inclusive
      Returns:
      a list of bins that contain relevant data
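The interval-to-bin mapping behind getBinsOverlapping can be sketched as below, assuming fixed-size last-level bins and a 1-based inclusive interval as documented for startPos and endPos. The class name, the use of 0-based bin ordinals instead of Bin objects, and the argument validation are assumptions made for this sketch rather than the actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of mapping an interval to last-level bin ordinals; not part of htsjdk.
final class SraBinsOverlappingSketch {

    /** Returns the 0-based ordinals of all fixed-size bins touched by [startPos, endPos] (1-based, inclusive). */
    static List<Integer> binOrdinalsOverlapping(int startPos, int endPos, int binSize) {
        if (startPos < 1 || endPos < startPos || binSize < 1) {
            throw new IllegalArgumentException("expected 1 <= startPos <= endPos and binSize >= 1");
        }
        int firstBin = (startPos - 1) / binSize; // bin containing the first base
        int lastBin = (endPos - 1) / binSize;    // bin containing the last base
        List<Integer> ordinals = new ArrayList<>();
        for (int bin = firstBin; bin <= lastBin; bin++) {
            ordinals.add(bin);
        }
        return ordinals;
    }

    public static void main(String[] args) {
        // With an illustrative bin size of 100, positions 90..210 touch bins 0, 1 and 2.
        System.out.println(binOrdinalsOverlapping(90, 210, 100)); // prints [0, 1, 2]
    }
}
```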
    • getSpanOverlapping

      public BAMFileSpan getSpanOverlapping(Bin bin)
      Description copied from interface: BrowseableBAMIndex
      Perform an overlapping query of all bins bounding the given location.
      Specified by:
      getSpanOverlapping in interface BrowseableBAMIndex
      Parameters:
      bin - The bin over which to perform an overlapping query.
      Returns:
      The file pointers
    • getSpanOverlapping

      public BAMFileSpan getSpanOverlapping(int referenceIndex, int startPos, int endPos)
      Description copied from interface: BAMIndex
      Gets the compressed chunks which should be searched for the contents of records contained by the span referenceIndex:startPos-endPos, inclusive. See the BAM spec for more information on how a chunk is represented.
      Specified by:
      getSpanOverlapping in interface BAMIndex
      Parameters:
      referenceIndex - The contig.
      startPos - Genomic start of query.
      endPos - Genomic end of query.
      Returns:
      A file span listing the chunks in the BAM file.
    • getStartOfLastLinearBin

      public long getStartOfLastLinearBin()
      Description copied from interface: BAMIndex
      Gets the start of the last linear bin in the index.
      Specified by:
      getStartOfLastLinearBin in interface BAMIndex
      Returns:
      a position where aligned fragments end
    • getMetaData

      public BAMIndexMetaData getMetaData(int reference)
      Description copied from interface: BAMIndex
      Gets metadata for the given reference, including the number of aligned, unaligned, and noCoordinate records.
      Specified by:
      getMetaData in interface BAMIndex
      Parameters:
      reference - the reference of interest
      Returns:
      meta data for the reference
    • close

      public void close()
      Description copied from interface: BAMIndex
      Close the index and release any associated resources.
      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface BAMIndex
      Specified by:
      close in interface Closeable