Class HtsgetBAMFileReader

All Implemented Interfaces:
SamReader.PrimitiveSamReader

public class HtsgetBAMFileReader extends SamReader.ReaderImplementation
Class for reading and querying BAM files from an htsget source
  • Field Details

  • Constructor Details

    • HtsgetBAMFileReader

      public HtsgetBAMFileReader(URI source, boolean eagerDecode, ValidationStringency validationStringency, SAMRecordFactory samRecordFactory, boolean useAsynchronousIO) throws IOException
      Prepare to read BAM from an htsget source
      Parameters:
      source - http(s) URI of htsget resource including ID
      eagerDecode - if true, decode all BAM fields while reading rather than lazily.
      validationStringency - Controls how to handle invalid reads or header lines.
      samRecordFactory - SAM record factory
      useAsynchronousIO - if true, use asynchronous I/O and prefetching
      Throws:
      IOException
    • HtsgetBAMFileReader

      public HtsgetBAMFileReader(URI source, boolean eagerDecode, ValidationStringency validationStringency, SAMRecordFactory samRecordFactory, boolean useAsynchronousIO, InflaterFactory inflaterFactory) throws IOException
      Prepare to read BAM from an htsget source
      Parameters:
      source - http(s) URI of htsget resource including ID
      eagerDecode - if true, decode all BAM fields while reading rather than lazily.
      validationStringency - Controls how to handle invalid reads or header lines.
      samRecordFactory - SAM record factory
      useAsynchronousIO - if true, use asynchronous I/O and prefetching
      inflaterFactory - InflaterFactory used by BlockCompressedInputStream
      Throws:
      IOException
  • Method Details

    • fromHtsgetURI

      public static HtsgetBAMFileReader fromHtsgetURI(htsjdk.samtools.HtsgetInputResource source, boolean eagerDecode, ValidationStringency validationStringency, SAMRecordFactory samRecordFactory, boolean useAsynchronousIO, InflaterFactory inflaterFactory) throws IOException, URISyntaxException
      Instantiate an HtsgetBAMFileReader from an HtsgetInputResource, first attempting to convert it to an https resource, then to an http resource if the server does not support https.
      Parameters:
      source - the HtsgetInputResource to read from.
      eagerDecode - if true, decode all BAM fields while reading rather than lazily.
      validationStringency - Controls how to handle invalid reads or header lines.
      samRecordFactory - SAM record factory
      useAsynchronousIO - if true, use asynchronous I/O and prefetching
      inflaterFactory - InflaterFactory used by BlockCompressedInputStream
      Throws:
      IOException
      URISyntaxException
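
      The https-then-http conversion described above amounts to swapping the URI scheme while keeping every other component. A minimal sketch using only java.net.URI follows; the class HtsgetSchemeSketch, its method names, and the "htsget" scheme constant are assumptions made for illustration, and the code is not the library's actual implementation of convertHtsgetUriToHttps or convertHtsgetUriToHttp.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of htsjdk: illustrates scheme replacement only.
final class HtsgetSchemeSketch {
    // Assumed name of the custom scheme used for htsget resources.
    static final String HTSGET_SCHEME = "htsget";

    // Rebuild the URI with a new scheme, leaving host, port, path, query and fragment as they were.
    static URI withScheme(URI uri, String scheme) throws URISyntaxException {
        return new URI(scheme, uri.getUserInfo(), uri.getHost(), uri.getPort(),
                uri.getPath(), uri.getQuery(), uri.getFragment());
    }

    static URI toHttps(URI uri) throws URISyntaxException {
        return withScheme(uri, "https");
    }

    static URI toHttp(URI uri) throws URISyntaxException {
        return withScheme(uri, "http");
    }

    public static void main(String[] args) throws URISyntaxException {
        // example.org and the read ID are placeholders.
        URI source = new URI(HTSGET_SCHEME + "://example.org/reads/NA12878");
        System.out.println(toHttps(source));
        System.out.println(toHttp(source));
    }
}
```

      A caller would typically try the https form first and fall back to the http form only if the https request fails, which is the order the method description gives.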
    • setValidationStringency

      public void setValidationStringency(ValidationStringency validationStringency)
      Set error-checking level for subsequent SAMRecord reads.
    • setSAMRecordFactory

      public void setSAMRecordFactory(SAMRecordFactory samRecordFactory)
      Set SAMRecordFactory for subsequent SAMRecord reads.
    • setEagerDecode

      public void setEagerDecode(boolean eagerDecode)
      Set whether to eagerly decode subsequent SAMRecord reads.
    • enableCrcChecking

      public void enableCrcChecking(boolean check)
      Set whether to check CRC for subsequent iterator or query requests.
    • enableFileSource

      public void enableFileSource(SamReader reader, boolean enabled)
      Set whether to write the source of every read into the source SAMRecords.
    • type

      public SamReader.Type type()
    • isQueryable

      public boolean isQueryable()
      Note that this source is queryable by interval despite NOT having an index
      Returns:
      true
    • hasIndex

      public boolean hasIndex()
      Always false as htsget sources do not have indices
      Returns:
      false
    • getIndex

      public BAMIndex getIndex()
      Always null as htsget sources do not have indices
      Returns:
      null
    • getFileHeader

      public SAMFileHeader getFileHeader()
    • isUsingPOST

      public boolean isUsingPOST()
      Indicates whether the specified source supports the POST API.
    • setUsingPOST

      public void setUsingPOST(boolean use)
      Force the source to attempt to use the POST API when requesting multiple intervals. Used for testing.
      Parameters:
      use - whether to use the POST API
    • getIterator

      public CloseableIterator<SAMRecord> getIterator()
      Prepare to iterate through the SAMRecords in file order. Unlike file-based BAM readers, multiple iterators may be open at the same time.
    • getIterator

      public CloseableIterator<SAMRecord> getIterator(SAMFileSpan fileSpan)
      Generally loads data at a given point in the file. Unsupported for HtsgetBAMFileReaders.
      Parameters:
      fileSpan - The file span.
      Returns:
      An iterator over the given file span.
    • getFilePointerSpanningReads

      public SAMFileSpan getFilePointerSpanningReads()
      Generally gets a pointer to the first read in the file. Unsupported for HtsgetBAMFileReaders.
      Returns:
      A pointer to the first read in the file.
    • query

      public CloseableIterator<SAMRecord> query(QueryInterval[] intervals, boolean contained)
      Prepare to iterate through the SAMRecords that match the given intervals. Unlike file-based BAM readers, multiple iterators may be open at the same time.

      Note that an unmapped SAMRecord may still have a reference name and an alignment start for sorting purposes (typically this is the coordinate of its mate), and will be found by this method if the coordinate matches the specified interval.

      Parameters:
      intervals - the intervals to include
      contained - If true, the alignments for the SAMRecords must be completely contained in the interval specified by start and end. If false, the SAMRecords need only overlap the interval.
      Returns:
      Iterator for the matching SAMRecords
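
      The difference between the two values of contained can be written as a pair of coordinate comparisons. The sketch below assumes 1-based, inclusive coordinates on a single reference sequence; QueryMatchSketch and its matches method are hypothetical names used only for illustration and say nothing about how the reader or the htsget server filters records internally.

```java
// Hypothetical illustration of "contained" versus "overlapping" matching.
final class QueryMatchSketch {
    // alignmentStart/alignmentEnd describe the read; start/end describe the query interval.
    // All coordinates are assumed to be 1-based and inclusive.
    static boolean matches(int alignmentStart, int alignmentEnd, int start, int end, boolean contained) {
        if (contained) {
            // The whole alignment must lie inside the interval.
            return alignmentStart >= start && alignmentEnd <= end;
        }
        // Any shared base is enough.
        return alignmentStart <= end && alignmentEnd >= start;
    }

    public static void main(String[] args) {
        // A read spanning 150-250 against the interval 100-200.
        System.out.println(matches(150, 250, 100, 200, false));
        System.out.println(matches(150, 250, 100, 200, true));
    }
}
```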
    • query

      public CloseableIterator<SAMRecord> query(String sequence, int start, int end, boolean contained)
    • query

      public CloseableIterator<SAMRecord> query(List<Locatable> intervals, boolean contained)
      Query intervals directly by contig name instead of index relative to reference, to avoid repeated conversion between name and index representations

      Callers must ensure that the intervals are in increasing order and do not overlap or abut.

      Parameters:
      intervals - intervals to query by
      contained - if true, return only reads that are fully contained in an interval rather than merely overlapping it
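
      Because the ordering requirement is left to the caller, a pre-flight check can be useful before passing intervals to this method. The sketch below is one way to do it under stated assumptions: Interval is a hypothetical stand-in for a Locatable with 1-based inclusive coordinates, and the check only compares neighbouring intervals on the same contig, since the expected order of contigs depends on the sequence dictionary, which is not modelled here.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical pre-flight check; not part of htsjdk.
final class IntervalOrderSketch {
    // Minimal stand-in for a Locatable: contig name plus 1-based inclusive start and end.
    static final class Interval {
        final String contig;
        final int start;
        final int end;

        Interval(String contig, int start, int end) {
            this.contig = contig;
            this.start = start;
            this.end = end;
        }
    }

    // Returns false if two neighbouring intervals on the same contig are out of order,
    // overlap, or abut (the next one starts exactly one base after the previous one ends).
    static boolean isOrderedAndDisjoint(List<Interval> intervals) {
        for (int i = 1; i < intervals.size(); i++) {
            Interval prev = intervals.get(i - 1);
            Interval cur = intervals.get(i);
            if (prev.contig.equals(cur.contig) && cur.start <= prev.end + 1) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Interval> gapped = Arrays.asList(
                new Interval("chr1", 100, 200), new Interval("chr1", 202, 300));
        List<Interval> abutting = Arrays.asList(
                new Interval("chr1", 100, 200), new Interval("chr1", 201, 300));
        System.out.println(isOrderedAndDisjoint(gapped));
        System.out.println(isOrderedAndDisjoint(abutting));
    }
}
```

      Intervals that fail the check could be sorted and merged by the caller before querying; how merging should treat abutting intervals is a design choice this sketch does not make.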
    • queryAlignmentStart

      public CloseableIterator<SAMRecord> queryAlignmentStart(String sequence, int start)
      Prepare to iterate through the SAMRecords with the given alignment start. Unlike file-based BAM readers, multiple iterators may be open at the same time.

      Note that an unmapped SAMRecord may still have a reference name and an alignment start for sorting purposes (typically this is the coordinate of its mate), and will be found by this method if the coordinate matches the specified interval.

      Parameters:
      sequence - Reference sequence sought.
      start - Alignment start sought.
      Returns:
      Iterator for the matching SAMRecords.
    • queryUnmapped

      public CloseableIterator<SAMRecord> queryUnmapped()
      Prepare to iterate through the SAMRecords that are unmapped and do not have a reference name or alignment start. Unlike file-based BAM readers, multiple iterators may be open at the same time.

      Returns:
      Iterator for the matching SAMRecords.
    • close

      public void close()
    • getValidationStringency

      public ValidationStringency getValidationStringency()
    • convertHtsgetUriToHttps

      public static URI convertHtsgetUriToHttps(URI uri) throws URISyntaxException
      Throws:
      URISyntaxException
    • convertHtsgetUriToHttp

      public static URI convertHtsgetUriToHttp(URI uri) throws URISyntaxException
      Throws:
      URISyntaxException