Package htsjdk.samtools.cram.structure
Class Slice
java.lang.Object
htsjdk.samtools.cram.structure.Slice
A CRAM slice is a logical construct that is just a subset of the blocks in a Container.
NOTE: Every Slice has a reference context (single-reference (mapped), multi-reference, or
unmapped), depending on the records it contains. Single-reference (mapped) doesn't mean that the records
are necessarily mapped (that is, that their getMappedRead flag is true), only that the records in that slice are PLACED
on the corresponding reference contig.
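The difference between "placed" and "mapped" can be illustrated with a small, self-contained sketch. It does not use htsjdk: the class `SliceContextSketch`, its `Rec` type, and the `classify` method are hypothetical names introduced here, and the classification rule is a simplified assumption, not htsjdk's actual logic.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model for illustration; not part of htsjdk.
final class SliceContextSketch {
    enum Context { SINGLE_REF, MULTI_REF, UNMAPPED }

    // A record is "placed" when it carries a reference index >= 0,
    // independently of whether its mapped flag is set.
    static final class Rec {
        final int referenceIndex; // -1 means not placed on any contig
        final boolean mapped;

        Rec(int referenceIndex, boolean mapped) {
            this.referenceIndex = referenceIndex;
            this.mapped = mapped;
        }
    }

    // Assumed rule: one shared reference index gives a single-reference
    // (or unmapped) context; differing indexes give a multi-reference context.
    static Context classify(List<Rec> records) {
        if (records.isEmpty()) {
            throw new IllegalArgumentException("no records");
        }
        int first = records.get(0).referenceIndex;
        for (Rec r : records) {
            if (r.referenceIndex != first) {
                return Context.MULTI_REF;
            }
        }
        return first < 0 ? Context.UNMAPPED : Context.SINGLE_REF;
    }

    public static void main(String[] args) {
        // The second record is unmapped but still placed on contig 0.
        List<Rec> placed = Arrays.asList(new Rec(0, true), new Rec(0, false));
        System.out.println(classify(placed));
    }
}
```

Note that the `mapped` flag never enters the decision: only placement on a contig does, which is the point the NOTE above makes.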
-
Field Summary
Modifier and Type    Field
static final int     UNINITIALIZED_INDEXING_PARAMETER
static final int     EMBEDDED_REFERENCE_ABSENT_CONTENT_ID
-
Constructor Summary
Constructor
    Description
Slice(CRAMVersion cramVersion, CompressionHeader compressionHeader, InputStream inputStream, long containerByteOffset)
    Create a slice by reading a serialized Slice from an input stream.
Slice(List<CRAMCompressionRecord> records, CompressionHeader compressionHeader, long containerByteOffset, long globalRecordCounter)
    Create a single Slice from CRAM Compression Records and a Compression Header.
-
Method Summary
Modifier and Type  Method
    Description
ArrayList<CRAMCompressionRecord>  deserializeCRAMRecords(CompressorCache compressorCache, ValidationStringency validationStringency)
    Reads and decodes the underlying blocks and returns a list of CRAMCompressionRecord.
getBAIEntries(CompressorCache compressorCache)
    Generate a BAIEntry Index entry from this Slice and other container parameters, splitting Multiple Reference slices into constituent reference sequence entries.
long  getBaseCount()
int  getByteOffsetOfSliceHeaderBlock()
    The Slice's offset in bytes from the beginning of the Container's Compression Header (or the end of the Container Header), equal to the corresponding entry in ContainerHeader.getLandmarks(). Used by BAI and CRAI indexing.
int  getByteSizeOfSliceBlocks()
    The Slice's size in bytes. Used by CRAI indexing only.
getCRAIEntries(CompressorCache compressorCache)
    Generate a CRAI Index entry from this Slice and other container parameters, splitting Multiple Reference slices into constituent reference sequence entries.
getEmbeddedReferenceBlock()
    Return the embedded reference block, if any.
int  getEmbeddedReferenceContentID()
    Get the content ID of the embedded reference block.
long  getGlobalRecordCounter()
Map<ReferenceContext,AlignmentSpan>  getMultiRefAlignmentSpans(CompressorCache compressorCache, ValidationStringency validationStringency)
    Uses a Multiple Reference Slice Alignment Reader to determine the reference spans of a MULTI_REF Slice.
int  getNumberOfBlocks()
int  getNumberOfRecords()
byte[]  getReferenceMD5()
void  normalizeCRAMRecords(List<CRAMCompressionRecord> cramCompressionRecords, CRAMReferenceRegion cramReferenceRegion)
    Normalize a list of CRAMCompressionRecord that have been read in from a CRAM stream.
void  setByteOffsetOfSliceHeaderBlock(int byteOffsetOfSliceHeaderBlock)
void  setByteSizeOfSliceBlocks(int byteSizeOfSliceBlocks)
void  setEmbeddedReferenceBlock(Block embeddedReferenceBlock)
void  setEmbeddedReferenceContentID(int embeddedReferenceBlockContentID)
    Set the content ID of the embedded reference block.
void  setLandmarkIndex(int landmarkIndex)
void  setReferenceMD5(CRAMReferenceRegion cramReferenceRegion)
toString()
void  write(CRAMVersion cramVersion, OutputStream outputStream)
-
Field Details
-
UNINITIALIZED_INDEXING_PARAMETER
public static final int UNINITIALIZED_INDEXING_PARAMETER
-
EMBEDDED_REFERENCE_ABSENT_CONTENT_ID
public static final int EMBEDDED_REFERENCE_ABSENT_CONTENT_ID
-
-
Constructor Details
-
Slice
public Slice(CRAMVersion cramVersion, CompressionHeader compressionHeader, InputStream inputStream, long containerByteOffset)
Create a slice by reading a serialized Slice from an input stream.
Parameters:
cramVersion - the version of the CRAM stream being read
compressionHeader - the compression header for the container in which the Slice resides
inputStream - the input stream to be read
containerByteOffset - the stream byte offset of the start of the container in which this Slice resides
-
Slice
public Slice(List<CRAMCompressionRecord> records, CompressionHeader compressionHeader, long containerByteOffset, long globalRecordCounter)
Create a single Slice from CRAM Compression Records and a Compression Header. The caller is responsible for appropriate subdivision of records into containers and slices (see ContainerFactory).
Parameters:
records - input CRAM Compression Records
compressionHeader - the enclosing Container's Compression Header
containerByteOffset -
globalRecordCounter -
-
-
Method Details
-
getSliceHeaderBlock
-
getAlignmentContext
-
getSliceBlocks
-
getNumberOfRecords
public int getNumberOfRecords()
-
getGlobalRecordCounter
public long getGlobalRecordCounter()
-
getNumberOfBlocks
public int getNumberOfBlocks()
Returns:
the number of blocks as defined by the CRAM spec; this is 1 for the core block plus the number of external blocks (does not include the slice header block)
-
getContentIDs
-
getReferenceMD5
public byte[] getReferenceMD5()
-
getByteOffsetOfSliceHeaderBlock
public int getByteOffsetOfSliceHeaderBlock()
The Slice's offset in bytes from the beginning of the Container's Compression Header (or the end of the Container Header), equal to the corresponding entry in ContainerHeader.getLandmarks(). Used by BAI and CRAI indexing.
-
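For getByteOffsetOfSliceHeaderBlock, the relationship between a landmark and a position in the stream can be sketched as follows. This is a stdlib-only illustration under stated assumptions: `LandmarkOffsetSketch` and `absoluteSliceOffset` are hypothetical names, and the container header size is assumed to be known to the caller.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of htsjdk.
final class LandmarkOffsetSketch {
    /**
     * @param containerByteOffset stream offset of the start of the container
     * @param containerHeaderSize size in bytes of the container header (assumed known)
     * @param landmarks           per-slice offsets measured from the end of the container header
     * @param landmarkIndex       position of the slice within the container
     * @return the slice's offset from the start of the stream
     */
    static long absoluteSliceOffset(long containerByteOffset, int containerHeaderSize,
                                    List<Integer> landmarks, int landmarkIndex) {
        if (landmarkIndex < 0 || landmarkIndex >= landmarks.size()) {
            throw new IllegalArgumentException("no landmark for slice " + landmarkIndex);
        }
        return containerByteOffset + containerHeaderSize + landmarks.get(landmarkIndex);
    }

    public static void main(String[] args) {
        List<Integer> landmarks = Arrays.asList(120, 4500);
        System.out.println(absoluteSliceOffset(1000L, 30, landmarks, 1));
    }
}
```

The landmark alone is relative to the container, which is why an index entry also needs the container's own offset to seek to a slice.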
setByteOffsetOfSliceHeaderBlock
public void setByteOffsetOfSliceHeaderBlock(int byteOffsetOfSliceHeaderBlock)
-
getByteSizeOfSliceBlocks
public int getByteSizeOfSliceBlocks()
The Slice's size in bytes. Used by CRAI indexing only.
-
setByteSizeOfSliceBlocks
public void setByteSizeOfSliceBlocks(int byteSizeOfSliceBlocks)
-
setLandmarkIndex
public void setLandmarkIndex(int landmarkIndex)
-
getBaseCount
public long getBaseCount()
-
getSliceTags
-
setEmbeddedReferenceContentID
public void setEmbeddedReferenceContentID(int embeddedReferenceBlockContentID)
Set the content ID of the embedded reference block. Per the CRAM spec, the value can be -1 (EMBEDDED_REFERENCE_ABSENT_CONTENT_ID) to indicate no embedded reference block is present. If the reference block content ID already has a non-EMBEDDED_REFERENCE_ABSENT_CONTENT_ID value, it cannot be reset. If the embedded reference block has already been set, the provided reference block content ID must agree with the content ID of the existing block.
Parameters:
embeddedReferenceBlockContentID -
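The two rules described for setEmbeddedReferenceContentID can be sketched in isolation. The class below is a hypothetical stand-in, not htsjdk code; the exception type, and the choice to tolerate re-setting the same value, are assumptions made for the example.

```java
// Hypothetical stand-in for the setter's contract; not htsjdk code.
final class EmbeddedReferenceIdSketch {
    static final int ABSENT = -1; // plays the role of EMBEDDED_REFERENCE_ABSENT_CONTENT_ID

    private int contentId = ABSENT;
    private Integer existingBlockContentId; // null until an embedded block is attached

    void attachBlock(int blockContentId) {
        existingBlockContentId = blockContentId;
    }

    void setContentId(int newId) {
        // Rule 1: an ID that is already set cannot be changed to a different value
        // (assumption: setting the same value again is tolerated).
        if (contentId != ABSENT && contentId != newId) {
            throw new IllegalArgumentException("content ID already set to " + contentId);
        }
        // Rule 2: the ID must agree with an embedded block that is already present.
        if (existingBlockContentId != null && existingBlockContentId != newId) {
            throw new IllegalArgumentException("content ID disagrees with existing block");
        }
        contentId = newId;
    }

    int getContentId() {
        return contentId;
    }
}
```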
-
getEmbeddedReferenceContentID
public int getEmbeddedReferenceContentID()
Get the content ID of the embedded reference block. Per the CRAM spec, the value can be EMBEDDED_REFERENCE_ABSENT_CONTENT_ID (-1) to indicate no embedded reference block is present.
Returns:
id of embedded reference block if present, otherwise EMBEDDED_REFERENCE_ABSENT_CONTENT_ID
-
setEmbeddedReferenceBlock
-
getEmbeddedReferenceBlock
Return the embedded reference block, if any.
Returns:
embedded reference block. May be null.
-
getCompressionHeader
-
deserializeCRAMRecords
public ArrayList<CRAMCompressionRecord> deserializeCRAMRecords(CompressorCache compressorCache, ValidationStringency validationStringency)
Reads and decodes the underlying blocks and returns a list of CRAMCompressionRecord. This isn't done initially when the blocks are read from the underlying stream, since there are cases where we want to iterate through containers or slices and consume the underlying blocks without paying the price of decoding the records (e.g., during indexing, or when satisfying index queries). The CRAM records returned from this method are not normalized (read bases, quality scores and mates have not been resolved). See normalizeCRAMRecords(List<CRAMCompressionRecord>, CRAMReferenceRegion) for more information about normalization.
Parameters:
compressorCache - cached compressor objects to use to decode streams
validationStringency - validation stringency to use
Returns:
list of raw (not normalized) CRAMCompressionRecord for this Slice (see normalizeCRAMRecords(List<CRAMCompressionRecord>, CRAMReferenceRegion))
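The deferred decoding described for deserializeCRAMRecords follows a common pattern: keep the raw bytes, answer cheap questions from them, and decode only on request. The sketch below illustrates that pattern with a made-up newline-delimited payload; `LazySliceSketch` and its members are hypothetical and bear no relation to the real CRAM block encoding.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of decode-on-demand; not htsjdk code.
final class LazySliceSketch {
    private final byte[] rawBlock; // undecoded bytes, as read from the stream
    int decodeCalls = 0;           // counts how often the expensive path runs

    LazySliceSketch(byte[] rawBlock) {
        this.rawBlock = rawBlock.clone();
    }

    // Cheap, index-style query: answered without decoding any record.
    int byteSize() {
        return rawBlock.length;
    }

    // Expensive path: only runs when the caller actually needs records.
    List<String> decodeRecords() {
        decodeCalls++;
        List<String> records = new ArrayList<>();
        for (String line : new String(rawBlock, StandardCharsets.UTF_8).split("\n")) {
            if (!line.isEmpty()) {
                records.add(line);
            }
        }
        return records;
    }
}
```

An indexer built on this shape could walk every slice and call only `byteSize()`, never triggering a decode.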
-
normalizeCRAMRecords
public void normalizeCRAMRecords(List<CRAMCompressionRecord> cramCompressionRecords, CRAMReferenceRegion cramReferenceRegion)
Normalize a list of CRAMCompressionRecord that have been read in from a CRAM stream. Normalization converts raw CRAM records to a state suitable for conversion to SAMRecords, resolving read bases against the reference, as well as quality scores and mates. The records in the list being normalized should be the records from a single Slice, not an entire Container, since the positions of mate records are determined relative to the Slice (downstream) offsets.
NOTE: This mutates (normalizes) the CRAM records in place.
Parameters:
cramCompressionRecords - CRAMCompressionRecords to normalize
cramReferenceRegion - the reference region for this slice
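Why normalization must operate on one Slice at a time can be shown with a reduced model of mate resolution. In the sketch below each record stores how many records lie between it and its downstream mate, and normalization links the pair in place. The types, the field name `recordsToNextFragment`, and the `i + n + 1` indexing are assumptions chosen for illustration, not a description of htsjdk's implementation.

```java
import java.util.List;

// Hypothetical, reduced model of in-place mate linking; not htsjdk code.
final class MateLinkSketch {
    static final class Rec {
        final String name;
        final int recordsToNextFragment; // -1 when there is no downstream mate
        Rec next;     // filled in by normalize
        Rec previous; // filled in by normalize

        Rec(String name, int recordsToNextFragment) {
            this.name = name;
            this.recordsToNextFragment = recordsToNextFragment;
        }
    }

    // Mutates the records in place. Offsets are positions within this list,
    // so the list must hold exactly the records of one slice.
    static void normalize(List<Rec> sliceRecords) {
        for (int i = 0; i < sliceRecords.size(); i++) {
            Rec r = sliceRecords.get(i);
            if (r.recordsToNextFragment < 0) {
                continue;
            }
            int mateIndex = i + r.recordsToNextFragment + 1;
            if (mateIndex >= sliceRecords.size()) {
                throw new IllegalStateException("mate offset points outside the slice");
            }
            Rec mate = sliceRecords.get(mateIndex);
            r.next = mate;
            mate.previous = r;
        }
    }
}
```

If records from several slices were concatenated before the call, the relative offsets would resolve to the wrong records or fall outside the list.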
-
write
-
toString
-
setReferenceMD5
-
getMultiRefAlignmentSpans
public Map<ReferenceContext,AlignmentSpan> getMultiRefAlignmentSpans(CompressorCache compressorCache, ValidationStringency validationStringency)
Uses a Multiple Reference Slice Alignment Reader to determine the reference spans of a MULTI_REF Slice. Used for creating CRAI/BAI index entries.
Parameters:
validationStringency - how strict to be when reading CRAM records
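Splitting a multi-reference slice into per-reference spans amounts to grouping records by reference and widening a start/end interval for each group. The following sketch shows that aggregation on plain integers. `MultiRefSpanSketch` and `Span` are hypothetical, and using the read length as the aligned length is a simplifying assumption.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical aggregation sketch; not the htsjdk alignment reader.
final class MultiRefSpanSketch {
    static final class Span {
        int start; // 1-based, inclusive
        int end;   // 1-based, inclusive
        int count; // number of records contributing to the span

        Span(int start, int end) {
            this.start = start;
            this.end = end;
            this.count = 1;
        }

        void add(int otherStart, int otherEnd) {
            start = Math.min(start, otherStart);
            end = Math.max(end, otherEnd);
            count++;
        }
    }

    // Each row is {referenceIndex, alignmentStart, readLength}.
    static Map<Integer, Span> spans(int[][] records) {
        Map<Integer, Span> byReference = new TreeMap<>();
        for (int[] r : records) {
            int end = r[1] + r[2] - 1;
            Span span = byReference.get(r[0]);
            if (span == null) {
                byReference.put(r[0], new Span(r[1], end));
            } else {
                span.add(r[1], end);
            }
        }
        return byReference;
    }
}
```

Each map entry then corresponds to one index entry for the slice, one per reference sequence it touches.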
-
getCRAIEntries
Generate a CRAI Index entry from this Slice and other container parameters, splitting Multiple Reference slices into constituent reference sequence entries.
Returns:
a list of CRAI Index Entries derived from this Slice
-
getBAIEntries
Generate a BAIEntry Index entry from this Slice and other container parameters, splitting Multiple Reference slices into constituent reference sequence entries.
Returns:
a list of BAIEntry Index Entries derived from this Slice
-