Class Slice

java.lang.Object
htsjdk.samtools.cram.structure.Slice

public class Slice extends Object
A CRAM slice is a logical construct that is just a subset of the blocks in a Container. NOTE: Every Slice has a reference context (it is either single-reference (mapped), multi-reference, or unmapped), depending on the records it contains. Single-ref mapped doesn't mean that the records are necessarily mapped (that is, that their getMappedRead flag is true), only that the records in that slice are PLACED on the corresponding reference contig.
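The placement rule in the note above can be sketched in plain Java. This is a simplified, hypothetical model and not htsjdk's implementation: the class ReferenceContextSketch, its ContextType enum, and the use of -1 for an unplaced record are names and values assumed for illustration. The sketch also assumes that a mix of placed and unplaced records yields a multi-reference context.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of how a slice's reference context could be
// derived from its records. Not the htsjdk implementation.
class ReferenceContextSketch {
    enum ContextType { SINGLE_REFERENCE, MULTIPLE_REFERENCE, UNMAPPED_UNPLACED }

    // Assumed sentinel for a record that is not placed on any reference contig.
    static final int NO_REFERENCE = -1;

    /**
     * Classify a slice using only the reference index on which each record is
     * PLACED. The mapped/unmapped flag of a record is deliberately not consulted.
     */
    static ContextType classify(List<Integer> placedReferenceIndexes) {
        if (placedReferenceIndexes.isEmpty()) {
            throw new IllegalArgumentException("a slice needs at least one record");
        }
        final int first = placedReferenceIndexes.get(0);
        for (final int referenceIndex : placedReferenceIndexes) {
            if (referenceIndex != first) {
                // Records placed on different contigs (or a placed/unplaced mix).
                return ContextType.MULTIPLE_REFERENCE;
            }
        }
        return first == NO_REFERENCE
                ? ContextType.UNMAPPED_UNPLACED
                : ContextType.SINGLE_REFERENCE;
    }

    public static void main(String[] args) {
        System.out.println(classify(Arrays.asList(0, 0, 0)));
        System.out.println(classify(Arrays.asList(0, 1)));
        System.out.println(classify(Arrays.asList(-1, -1)));
    }
}
```

Because only placement is inspected, a slice whose records all sit on contig 0 is classified as single-reference even if some of those records are flagged as unmapped.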
  • Field Details

    • UNINITIALIZED_INDEXING_PARAMETER

      public static final int UNINITIALIZED_INDEXING_PARAMETER
    • EMBEDDED_REFERENCE_ABSENT_CONTENT_ID

      public static final int EMBEDDED_REFERENCE_ABSENT_CONTENT_ID
  • Constructor Details

    • Slice

      public Slice(CRAMVersion cramVersion, CompressionHeader compressionHeader, InputStream inputStream, long containerByteOffset)
      Create a slice by reading a serialized Slice from an input stream.
      Parameters:
      cramVersion - the version of the CRAM stream being read
      compressionHeader - the compression header for the container in which the Slice resides
      inputStream - the input stream to be read
      containerByteOffset - the stream byte offset of the start of the container in which this Slice resides
    • Slice

      public Slice(List<CRAMCompressionRecord> records, CompressionHeader compressionHeader, long containerByteOffset, long globalRecordCounter)
      Create a single Slice from CRAM Compression Records and a Compression Header. The caller is responsible for appropriate subdivision of records into containers and slices (see ContainerFactory).
      Parameters:
      records - input CRAM Compression Records
      compressionHeader - the enclosing Container's Compression Header
      containerByteOffset - the stream byte offset of the start of the container in which this Slice resides
      globalRecordCounter -
  • Method Details

    • getSliceHeaderBlock

      public Block getSliceHeaderBlock()
    • getAlignmentContext

      public AlignmentContext getAlignmentContext()
    • getSliceBlocks

      public SliceBlocks getSliceBlocks()
    • getNumberOfRecords

      public int getNumberOfRecords()
    • getGlobalRecordCounter

      public long getGlobalRecordCounter()
    • getNumberOfBlocks

      public int getNumberOfBlocks()
      Returns:
      the number of blocks as defined by the CRAM spec: 1 for the core block plus the number of external blocks (the slice header block is not included)
    • getContentIDs

      public List<Integer> getContentIDs()
    • getReferenceMD5

      public byte[] getReferenceMD5()
    • getByteOffsetOfSliceHeaderBlock

      public int getByteOffsetOfSliceHeaderBlock()
      The Slice's offset in bytes from the beginning of the Container's Compression Header (or the end of the Container Header), equal to this Slice's entry in ContainerHeader.getLandmarks(). Used by BAI and CRAI indexing.
    • setByteOffsetOfSliceHeaderBlock

      public void setByteOffsetOfSliceHeaderBlock(int byteOffsetOfSliceHeaderBlock)
    • getByteSizeOfSliceBlocks

      public int getByteSizeOfSliceBlocks()
      The Slice's size in bytes. Used by CRAI indexing only.
    • setByteSizeOfSliceBlocks

      public void setByteSizeOfSliceBlocks(int byteSizeOfSliceBlocks)
    • setLandmarkIndex

      public void setLandmarkIndex(int landmarkIndex)
    • getBaseCount

      public long getBaseCount()
    • getSliceTags

      public SAMBinaryTagAndValue getSliceTags()
    • setEmbeddedReferenceContentID

      public void setEmbeddedReferenceContentID(int embeddedReferenceBlockContentID)
      Set the content ID of the embedded reference block. Per the CRAM spec, the value can be -1 (EMBEDDED_REFERENCE_ABSENT_CONTENT_ID) to indicate no embedded reference block is present. If the reference block content ID already has a non-EMBEDDED_REFERENCE_ABSENT_CONTENT_ID value, it cannot be reset. If the embedded reference block has already been set, the provided reference block content ID must agree with the content ID of the existing block.
      Parameters:
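      embeddedReferenceBlockContentID - the content ID of the embedded reference block, or EMBEDDED_REFERENCE_ABSENT_CONTENT_ID if there is none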
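      The set-once rules described for setEmbeddedReferenceContentID can be sketched as a small standalone class. Everything here is hypothetical and for illustration only: EmbeddedReferenceIdSketch, attachBlock, and the standard Java exception types are assumptions, not htsjdk's code. The sketch also takes the strict reading of "cannot be reset" and rejects any second call once a real ID is stored.

```java
// Hypothetical model of the set-once content ID rules; not htsjdk's implementation.
class EmbeddedReferenceIdSketch {
    // Mirrors the documented value of EMBEDDED_REFERENCE_ABSENT_CONTENT_ID.
    static final int ABSENT = -1;

    private int contentId = ABSENT;
    // Content ID of an already-attached embedded reference block, or null if none.
    private Integer attachedBlockContentId = null;

    /** Hypothetical stand-in for attaching an embedded reference block. */
    void attachBlock(int blockContentId) {
        attachedBlockContentId = blockContentId;
    }

    void setContentId(int newContentId) {
        // Rule 1: once a real (non-absent) ID is stored, it cannot be reset.
        if (contentId != ABSENT) {
            throw new IllegalStateException(
                    "content ID already set to " + contentId);
        }
        // Rule 2: the ID must agree with an already-attached block.
        if (attachedBlockContentId != null
                && attachedBlockContentId.intValue() != newContentId) {
            throw new IllegalArgumentException(
                    "content ID " + newContentId + " does not match attached block "
                            + attachedBlockContentId);
        }
        contentId = newContentId;
    }

    /** Returns the stored ID, or ABSENT when no embedded reference is present. */
    int getContentId() {
        return contentId;
    }
}
```

      In this model a caller would typically set the ID once, right after deciding whether the slice carries an embedded reference, and treat either exception as a sign of inconsistent slice state.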
    • getEmbeddedReferenceContentID

      public int getEmbeddedReferenceContentID()
      Get the content ID of the embedded reference block. Per the CRAM spec, the value can be EMBEDDED_REFERENCE_ABSENT_CONTENT_ID (-1) to indicate no embedded reference block is present.
      Returns:
      id of embedded reference block if present, otherwise EMBEDDED_REFERENCE_ABSENT_CONTENT_ID
    • setEmbeddedReferenceBlock

      public void setEmbeddedReferenceBlock(Block embeddedReferenceBlock)
    • getEmbeddedReferenceBlock

      public Block getEmbeddedReferenceBlock()
      Return the embedded reference block, if any.
      Returns:
      embedded reference block. May be null.
    • getCompressionHeader

      public CompressionHeader getCompressionHeader()
    • deserializeCRAMRecords

      public ArrayList<CRAMCompressionRecord> deserializeCRAMRecords(CompressorCache compressorCache, ValidationStringency validationStringency)
      Reads and decodes the underlying blocks and returns a list of CRAMCompressionRecord. Decoding is deferred rather than done when the blocks are first read from the underlying stream, since there are cases where we want to iterate through containers or slices and consume the underlying blocks without paying the price of decoding the records (e.g., during indexing, or when satisfying index queries). The CRAMCompressionRecords returned from this method are not normalized (read bases, quality scores and mates have not been resolved). See normalizeCRAMRecords(java.util.List<htsjdk.samtools.cram.structure.CRAMCompressionRecord>, htsjdk.samtools.cram.build.CRAMReferenceRegion) for more information about normalization.
      Parameters:
      compressorCache - cached compressor objects to use to decode streams
      validationStringency - validation stringency to use
      Returns:
      list of raw (not normalized) CRAMCompressionRecord for this Slice (see normalizeCRAMRecords(java.util.List<htsjdk.samtools.cram.structure.CRAMCompressionRecord>, htsjdk.samtools.cram.build.CRAMReferenceRegion))
    • normalizeCRAMRecords

      public void normalizeCRAMRecords(List<CRAMCompressionRecord> cramCompressionRecords, CRAMReferenceRegion cramReferenceRegion)
      Normalize a list of CRAMCompressionRecord that have been read in from a CRAM stream. Normalization converts raw CRAM records to a state suitable for conversion to SAMRecords, resolving read bases against the reference, as well as quality scores and mates. The records in this list being normalized should be the records from a Slice, not an entire Container, since the relative positions of mate records are determined relative to the Slice (downstream) offsets. NOTE: This mutates (normalizes) the CRAM records in place.
      Parameters:
      cramCompressionRecords - CRAMCompressionRecords to normalize
      cramReferenceRegion - the reference region for this slice
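      The reason normalization works on one Slice at a time can be illustrated with a toy model of downstream mate offsets. This is a hypothetical sketch, not htsjdk's code: MateOffsetSketch and resolveMateIndexes are invented names, and the sketch assumes each record stores the number of records to skip to reach its downstream mate within the same slice, with a negative value meaning there is no downstream mate.

```java
import java.util.Arrays;

// Hypothetical model of mate resolution by slice-relative offsets.
class MateOffsetSketch {
    // Assumed marker for "no downstream mate in this slice".
    static final int NO_DOWNSTREAM_MATE = -1;

    /**
     * recordsToSkip[i] is the assumed number of records between record i and its
     * downstream mate. Returns, for each record, the index of its mate within the
     * same list, or -1 if it has none.
     */
    static int[] resolveMateIndexes(int[] recordsToSkip) {
        final int[] mateIndex = new int[recordsToSkip.length];
        Arrays.fill(mateIndex, -1);
        for (int i = 0; i < recordsToSkip.length; i++) {
            if (recordsToSkip[i] < 0) {
                continue; // no downstream mate for this record
            }
            final int mate = i + recordsToSkip[i] + 1;
            if (mate >= recordsToSkip.length) {
                // The offset points past the end of this list, which is what
                // happens if the list does not line up with the original slice.
                throw new IllegalArgumentException(
                        "record " + i + " points past the end of the slice");
            }
            mateIndex[i] = mate; // forward link to the downstream mate
            mateIndex[mate] = i; // back link from the mate
        }
        return mateIndex;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(
                resolveMateIndexes(new int[] {1, -1, -1, 0, -1})));
    }
}
```

      Since the offsets are indexes into the slice's own record list, passing the concatenated records of several slices would shift those indexes and link the wrong records.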
    • write

      public void write(CRAMVersion cramVersion, OutputStream outputStream)
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • setReferenceMD5

      public void setReferenceMD5(CRAMReferenceRegion cramReferenceRegion)
    • getMultiRefAlignmentSpans

      public Map<ReferenceContext,AlignmentSpan> getMultiRefAlignmentSpans(CompressorCache compressorCache, ValidationStringency validationStringency)
      Uses a Multiple Reference Slice Alignment Reader to determine the reference spans of a MULTI_REF Slice. Used for creating CRAI/BAI index entries.
      Parameters:
      validationStringency - how strict to be when reading CRAM records
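      The idea of deriving one span per reference from a multi-reference slice can be sketched without htsjdk. The types below (MultiRefSpanSketch, PlacedRecord, Span) are hypothetical stand-ins, not the library's ReferenceContext or AlignmentSpan, and the sketch assumes 1-based inclusive coordinates and uses read length as a stand-in for the aligned length.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of per-reference span computation for a multi-ref slice.
class MultiRefSpanSketch {
    static final class PlacedRecord {
        final int referenceIndex;
        final int alignmentStart; // assumed 1-based, inclusive
        final int readLength;     // simplification: used as the aligned length

        PlacedRecord(int referenceIndex, int alignmentStart, int readLength) {
            this.referenceIndex = referenceIndex;
            this.alignmentStart = alignmentStart;
            this.readLength = readLength;
        }
    }

    static final class Span {
        int start;
        int end;         // inclusive
        int recordCount;

        Span(int start, int end) {
            this.start = start;
            this.end = end;
            this.recordCount = 1;
        }

        int length() {
            return end - start + 1;
        }
    }

    /** Group records by reference index and widen each span to cover them all. */
    static Map<Integer, Span> spansByReference(List<PlacedRecord> records) {
        final Map<Integer, Span> spans = new TreeMap<>();
        for (final PlacedRecord record : records) {
            final int end = record.alignmentStart + record.readLength - 1;
            final Span existing = spans.get(record.referenceIndex);
            if (existing == null) {
                spans.put(record.referenceIndex, new Span(record.alignmentStart, end));
            } else {
                existing.start = Math.min(existing.start, record.alignmentStart);
                existing.end = Math.max(existing.end, end);
                existing.recordCount++;
            }
        }
        return spans;
    }
}
```

      Each map entry in this model corresponds to one index entry for the slice, which is how a single multi-reference slice can contribute several entries to an index.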
    • getCRAIEntries

      public List<CRAIEntry> getCRAIEntries(CompressorCache compressorCache)
      Generate CRAI index entries from this Slice and other container parameters, splitting a multiple-reference Slice into one entry per constituent reference sequence.
      Returns:
      a list of CRAI Index Entries derived from this Slice
    • getBAIEntries

      public List<BAIEntry> getBAIEntries(CompressorCache compressorCache)
      Generate BAIEntry index entries from this Slice and other container parameters, splitting a multiple-reference Slice into one entry per constituent reference sequence.
      Returns:
      a list of BAIEntry Index Entries derived from this Slice