Class SAMRecord

java.lang.Object
htsjdk.samtools.SAMRecord
All Implemented Interfaces:
HtsRecord, Locatable, Serializable, Cloneable
Direct Known Subclasses:
BAMRecord, SRALazyRecord

public class SAMRecord extends Object implements HtsRecord, Cloneable, Locatable, Serializable
Java binding for a SAM file record. See http://samtools.sourceforge.net/SAM1.pdf

The presence of a reference name/reference index and alignment start does not necessarily mean that a read is aligned. Those values may merely be set to force a SAMRecord to appear in a certain place in the sort order. The readUnmappedFlag must be checked to determine whether or not a read is mapped. Only if the readUnmappedFlag is false can the reference name/index and alignment start be interpreted as indicating an actual alignment position.

Likewise, the presence of a mate reference name/index and mate alignment start does not necessarily mean that the mate is aligned. These may be set for an unaligned mate if the mate has been forced into a particular place in the sort order per the above paragraph. Only if the mateUnmappedFlag is false can the mate reference name/index and mate alignment start be interpreted as indicating the actual alignment position of the mate.
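
The rule in the two paragraphs above can be sketched without htsjdk as a check on the SAM flag word. This is an illustration only: the class and method names are hypothetical, and the bit values 0x4 (read unmapped) and 0x8 (mate unmapped) are taken from the SAM specification rather than from this page.

```java
/** Hypothetical stand-in, not part of htsjdk: trust a position only when the unmapped bit is clear. */
final class MappedCheckSketch {
    // Bit values of the FLAG field as defined by the SAM specification.
    static final int READ_UNMAPPED = 0x4;
    static final int MATE_UNMAPPED = 0x8;

    /** Returns the alignment start if the read is mapped, otherwise 0 (no position). */
    static int trustedAlignmentStart(int flags, int alignmentStart) {
        return (flags & READ_UNMAPPED) == 0 ? alignmentStart : 0;
    }

    /** Returns the mate alignment start if the mate is mapped, otherwise 0 (no position). */
    static int trustedMateAlignmentStart(int flags, int mateAlignmentStart) {
        return (flags & MATE_UNMAPPED) == 0 ? mateAlignmentStart : 0;
    }

    public static void main(String[] args) {
        // A paired, unmapped read that was given its mapped mate's position for sorting.
        int flags = 0x1 | READ_UNMAPPED;
        System.out.println(trustedAlignmentStart(flags, 1000));     // stored position is placement only
        System.out.println(trustedMateAlignmentStart(flags, 1000)); // mate is mapped, position is real
    }
}
```

With a real SAMRecord the same decision would be made through getReadUnmappedFlag() and getMateUnmappedFlag() rather than by masking bits by hand.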

Note also that there are a number of getters & setters that are linked, i.e. they present different representations of the same underlying data. In these cases there is typically a representation that is preferred because it ought to be faster than some other representation. The following are the preferred representations:

  • getReadNameLength() is preferred to getReadName().length()
  • get/setReadBases() is preferred to get/setReadString()
  • get/setBaseQualities() is preferred to get/setBaseQualityString()
  • get/setReferenceIndex() is preferred to get/setReferenceName() for records with valid SAMFileHeaders
  • get/setMateReferenceIndex() is preferred to get/setMateReferenceName() for records with valid SAMFileHeaders
  • getCigarLength() is preferred to getCigar().getNumElements()
  • get/setCigar() is preferred to get/setCigarString()

setHeader() is called by the SAM reading code, so the get/setReferenceIndex() and get/setMateReferenceIndex() methods will have access to the sequence dictionary to resolve reference and mate reference names to dictionary indices.

setHeader() need not be called explicitly when writing SAMRecords; however, the writers require a record to have a header in order to call get/setReferenceIndex() and get/setMateReferenceIndex(). Therefore, adding records to a writer has a side effect: any record that does not have an assigned header at the time it is added to a writer will be updated and assigned the header associated with the writer.

Some of the get() methods return values that are mutable, due to the limitations of Java. A caller should never change the value returned by a get() method. If you want to change the value of some attribute of a SAMRecord, create a new value object and call the appropriate set() method.
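
The "replace, do not mutate" rule above can be illustrated with a small stand-in class. Everything here is hypothetical (the class, its field, and the helper replaceBase); it only mimics a getter that hands out its internal array, which is the situation the paragraph warns about.

```java
import java.util.Arrays;

/** Hypothetical holder, not htsjdk code: its getter exposes the internal array. */
final class ReplaceNotMutateSketch {
    private byte[] bases = {'A', 'C', 'G', 'T'};

    byte[] getReadBases() { return bases; }            // returns the internal array itself
    void setReadBases(byte[] value) { bases = value; } // installs a new value object

    /** Changes one base by copying the array, editing the copy, and calling the setter. */
    static void replaceBase(ReplaceNotMutateSketch rec, int index, byte newBase) {
        byte[] current = rec.getReadBases();
        byte[] copy = Arrays.copyOf(current, current.length);
        copy[index] = newBase;
        rec.setReadBases(copy);
    }

    public static void main(String[] args) {
        ReplaceNotMutateSketch rec = new ReplaceNotMutateSketch();
        byte[] before = rec.getReadBases();
        replaceBase(rec, 0, (byte) 'N');
        // The array obtained earlier is untouched; the record now holds a different array.
        System.out.println((char) before[0] + " -> " + (char) rec.getReadBases()[0]);
    }
}
```

Going through the setter matters because, as noted for getAttribute(), some implementations track when a value has been changed.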

Note that setIndexingBin() need not be called when writing SAMRecords. It will be computed as necessary. It is only present as an optimization in the event that the value is already known and need not be computed.

By default, extensive validation of SAMRecords is done when they are read. Very limited validation is done when values are set onto SAMRecords.

Notes on Headerless SAMRecords

If the header is null, the following SAMRecord methods may throw exceptions:

  • getReferenceIndex
  • setReferenceIndex
  • getMateReferenceIndex
  • setMateReferenceIndex

Record comparators (i.e. SAMRecordCoordinateComparator and SAMRecordDuplicateComparator) require records with non-null header values.

A record with a null header may be validated by the isValid method, but the reference and mate reference indices, read group, sequence dictionary, and alignment start will not be fully validated unless a header is present.

Also, SAMTextWriter, BAMFileWriter, and CRAMFileWriter all require the reference and mate reference names to be valid in order to be written. At the time a record is added to a writer it will be updated to use the header associated with the writer and the reference and mate reference names must be valid for that header. If the names cannot be resolved using the writer's header, an exception will be thrown.

  • Field Details

    • serialVersionUID

      public static final long serialVersionUID
    • UNKNOWN_MAPPING_QUALITY

      public static final int UNKNOWN_MAPPING_QUALITY
      Alignment score for a good alignment, but where computing a Phred-score is not feasible.
    • NO_MAPPING_QUALITY

      public static final int NO_MAPPING_QUALITY
      Alignment score for an unaligned read.
    • NO_ALIGNMENT_REFERENCE_NAME

      public static final String NO_ALIGNMENT_REFERENCE_NAME
      If a read has this reference name, it is unaligned, but not all unaligned reads have this reference name (see above).
    • NO_ALIGNMENT_REFERENCE_INDEX

      public static final int NO_ALIGNMENT_REFERENCE_INDEX
      If a read has this reference index, it is unaligned, but not all unaligned reads have this reference index (see above).
    • NO_ALIGNMENT_CIGAR

      public static final String NO_ALIGNMENT_CIGAR
      Cigar string for an unaligned read.
    • NO_ALIGNMENT_START

      public static final int NO_ALIGNMENT_START
      If a read has reference name "*", it will have this value for position.
    • NULL_SEQUENCE

      public static final byte[] NULL_SEQUENCE
      This should rarely be used, since a read with no sequence doesn't make much sense.
    • NULL_SEQUENCE_STRING

      public static final String NULL_SEQUENCE_STRING
    • NULL_QUALS

      public static final byte[] NULL_QUALS
      This should rarely be used, since all reads should have quality scores.
    • NULL_QUALS_STRING

      public static final String NULL_QUALS_STRING
    • MAX_INSERT_SIZE

      public static final int MAX_INSERT_SIZE
      abs(insertSize) must be <= this
    • TAGS_TO_REVERSE_COMPLEMENT

      public static final List<String> TAGS_TO_REVERSE_COMPLEMENT
      Tags that are known to need the reverse complement if the read is reverse complemented.
    • TAGS_TO_REVERSE

      public static final List<String> TAGS_TO_REVERSE
      Tags that are known to need the reverse if the read is reverse complemented.
    • mReferenceIndex

      protected Integer mReferenceIndex
    • mMateReferenceIndex

      protected Integer mMateReferenceIndex
  • Constructor Details

  • Method Details

    • getReadName

      public String getReadName()
    • getReadNameLength

      public int getReadNameLength()
      This method is preferred over getReadName().length(), because for BAMRecord it may be faster.
      Returns:
      length not including a null terminator.
    • setReadName

      public void setReadName(String value)
    • getReadString

      public String getReadString()
      Returns:
      read sequence as a string of ACGTN=.
    • setReadString

      public void setReadString(String value)
    • getReadBases

      public byte[] getReadBases()
      Do not modify the value returned by this method. If you want to change the bases, create a new byte[] and call setReadBases() or call setReadString().
      Returns:
      read sequence as ASCII bytes ACGTN=.
    • setReadBases

      public void setReadBases(byte[] value)
    • getReadLength

      public int getReadLength()
      This method is preferred over getReadBases().length, because for BAMRecord it may be faster.
      Returns:
      number of bases in the read.
    • getBaseQualityString

      public String getBaseQualityString()
      Returns:
      Base qualities, encoded as a FASTQ string.
    • setBaseQualityString

      public void setBaseQualityString(String value)
    • getBaseQualities

      public byte[] getBaseQualities()
      Do not modify the value returned by this method. If you want to change the qualities, create a new byte[] and call setBaseQualities() or call setBaseQualityString().
      Returns:
      Base qualities, as binary phred scores (not ASCII).
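
      The difference between the FASTQ string and the binary phred form can be sketched as follows. This is a standalone illustration with hypothetical class and method names, assuming the Phred+33 ASCII encoding that the SAM specification uses for the QUAL field.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper, not htsjdk code: converts between Phred+33 text and binary scores. */
final class PhredSketch {
    static final int FASTQ_OFFSET = 33; // assumption: Phred+33, as in the SAM QUAL field

    /** "!5I" becomes {0, 20, 40}: each character code minus 33. */
    static byte[] fastqToPhred(String fastq) {
        byte[] ascii = fastq.getBytes(StandardCharsets.US_ASCII);
        byte[] phred = new byte[ascii.length];
        for (int i = 0; i < ascii.length; i++) {
            phred[i] = (byte) (ascii[i] - FASTQ_OFFSET);
        }
        return phred;
    }

    /** Inverse conversion: each binary score plus 33, emitted as a character. */
    static String phredToFastq(byte[] phred) {
        StringBuilder sb = new StringBuilder(phred.length);
        for (byte q : phred) {
            sb.append((char) (q + FASTQ_OFFSET));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] scores = fastqToPhred("!5I");
        System.out.println(scores[0] + " " + scores[1] + " " + scores[2]);
        System.out.println(phredToFastq(scores));
    }
}
```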
    • setBaseQualities

      public void setBaseQualities(byte[] value)
    • getOriginalBaseQualities

      public byte[] getOriginalBaseQualities()
      If the original base quality scores have been stored in the "OQ" tag, returns the numeric scores as a byte[].
    • setOriginalBaseQualities

      public void setOriginalBaseQualities(byte[] oq)
      Sets the original base quality scores into the "OQ" tag as a String. Supplied value should be as phred-scaled numeric qualities.
    • getReferenceName

      public String getReferenceName()
      Returns:
      Reference name, or NO_ALIGNMENT_REFERENCE_NAME (*) if the record has no reference name
    • setReferenceName

      public void setReferenceName(String referenceName)
      Sets the reference name for this record. If the record has a valid SAMFileHeader and the reference name is present in the associated sequence dictionary, the record's reference index will also be updated with the corresponding sequence index. If referenceName is NO_ALIGNMENT_REFERENCE_NAME, sets the reference index to NO_ALIGNMENT_REFERENCE_INDEX.
      Parameters:
      referenceName - must not be null
      Throws:
      IllegalArgumentException - if referenceName is null
    • getReferenceIndex

      public Integer getReferenceIndex()
      Returns the reference index for this record. If the reference name for this record has previously been resolved against the sequence dictionary, the corresponding index is returned directly. Otherwise, the record must have a non-null SAMFileHeader that can be used to resolve the index for the record's current reference name, unless the reference name is NO_ALIGNMENT_REFERENCE_NAME. If the record has a header, and the name does not appear in the header's sequence dictionary, the value NO_ALIGNMENT_REFERENCE_INDEX (-1) will be returned. If the record does not have a header, an IllegalStateException is thrown.
      Returns:
      Index in the sequence dictionary of the reference sequence. If the read has no reference sequence, or if the reference name is not found in the sequence index, NO_ALIGNMENT_REFERENCE_INDEX (-1) is returned.
      Throws:
      IllegalStateException - if the reference index must be resolved but cannot be because the SAMFileHeader for the record is null.
    • setReferenceIndex

      public void setReferenceIndex(int referenceIndex)
      Updates the reference index. The record must have a valid SAMFileHeader unless the referenceIndex parameter equals NO_ALIGNMENT_REFERENCE_INDEX, and the reference index must appear in the header's sequence dictionary. If the reference index is valid, the reference name will also be resolved and updated to the name for the sequence dictionary entry corresponding to the index.
      Parameters:
      referenceIndex - Must either equal NO_ALIGNMENT_REFERENCE_INDEX (-1) indicating no reference, or the record must have a SAMFileHeader and the index must exist in the associated sequence dictionary.
      Throws:
      IllegalStateException - if referenceIndex is not equal to NO_ALIGNMENT_REFERENCE_INDEX and the SAMFileHeader is null for this record
      IllegalArgumentException - if referenceIndex is not found in the sequence dictionary in the header for this record.
    • getMateReferenceName

      public String getMateReferenceName()
      Returns:
      Mate reference name, or NO_ALIGNMENT_REFERENCE_NAME (*) if the record has no mate reference name
    • setMateReferenceName

      public void setMateReferenceName(String mateReferenceName)
      Sets the mate reference name for this record. If the record has a valid SAMFileHeader and the mate reference name is present in the associated sequence dictionary, the record's mate reference index will also be updated with the corresponding sequence index. If mateReferenceName is NO_ALIGNMENT_REFERENCE_NAME, sets the mate reference index to NO_ALIGNMENT_REFERENCE_INDEX.
      Parameters:
      mateReferenceName - must not be null
      Throws:
      IllegalArgumentException - if mateReferenceName is null
    • getMateReferenceIndex

      public Integer getMateReferenceIndex()
      Returns the mate reference index for this record. If the mate reference name for this record has previously been resolved against the sequence dictionary, the corresponding index is returned directly. Otherwise, the record must have a non-null SAMFileHeader that can be used to resolve the index for the record's current mate reference name, unless the mate reference name is NO_ALIGNMENT_REFERENCE_NAME. If the record has a header, and the name does not appear in the header's sequence dictionary, the value NO_ALIGNMENT_REFERENCE_INDEX (-1) will be returned. If the record does not have a header, an IllegalStateException is thrown.
      Returns:
      Index in the sequence dictionary of the mate reference sequence. If the read has no mate reference sequence, or if the mate reference name is not found in the sequence index, NO_ALIGNMENT_REFERENCE_INDEX (-1) is returned.
      Throws:
      IllegalStateException - if the mate reference index must be resolved but cannot be because the SAMFileHeader for the record is null.
    • setMateReferenceIndex

      public void setMateReferenceIndex(int mateReferenceIndex)
      Updates the mate reference index. The record must have a valid SAMFileHeader, and the mate reference index must appear in the header's sequence dictionary, unless the mateReferenceIndex parameter equals NO_ALIGNMENT_REFERENCE_INDEX. If the mate reference index is valid, the mate reference name will also be resolved and updated to the name for the sequence dictionary entry corresponding to the index.
      Parameters:
      mateReferenceIndex - Must either equal NO_ALIGNMENT_REFERENCE_INDEX (-1) indicating no reference, or the record must have a SAMFileHeader and the index must exist in the associated sequence dictionary.
      Throws:
      IllegalStateException - if the SAMFileHeader is null for this record
      IllegalArgumentException - if the mate reference index is not found in the sequence dictionary in the header for this record.
    • resolveIndexFromName

      protected static Integer resolveIndexFromName(String referenceName, SAMFileHeader header, boolean strict)
      Static method that resolves and returns the reference index corresponding to a given reference name.
      Parameters:
      referenceName - If referenceName is NO_ALIGNMENT_REFERENCE_NAME, the value NO_ALIGNMENT_REFERENCE_INDEX is returned directly. Otherwise referenceName must be looked up in the header's sequence dictionary.
      header - SAMFileHeader to use when resolving referenceName to an index. Must be non null if the referenceName is not NO_ALIGNMENT_REFERENCE_NAME.
      strict - if true, throws if referenceName does not appear in the header's sequence dictionary
      Throws:
      IllegalStateException - if referenceName is not equal to NO_ALIGNMENT_REFERENCE_NAME and the header is null
      IllegalArgumentException - if strict is true and the name does not appear in header's sequence dictionary. Does not mutate the SAMRecord.
    • resolveNameFromIndex

      protected static String resolveNameFromIndex(int referenceIndex, SAMFileHeader header)
      Static method that resolves and returns the reference name corresponding to a given reference index.
      Parameters:
      referenceIndex - If referenceIndex is NO_ALIGNMENT_REFERENCE_INDEX, the value NO_ALIGNMENT_REFERENCE_NAME is returned directly. Otherwise referenceIndex must be looked up in the header's sequence dictionary.
      header - SAMFileHeader to use when resolving referenceIndex to a name. Must be non null unless the referenceIndex is NO_ALIGNMENT_REFERENCE_INDEX.
      Throws:
      IllegalStateException - if referenceIndex is not equal to NO_ALIGNMENT_REFERENCE_INDEX and the header is null
      IllegalArgumentException - if referenceIndex does not appear in header's sequence dictionary. Does not mutate the SAMRecord.
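
      The contract of the two static resolvers above can be sketched with a plain list standing in for the header's sequence dictionary. This is not the htsjdk implementation: the class name is hypothetical, a null list models a null SAMFileHeader, and only the values "*" and -1 are taken from this page.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of name/index resolution; a List<String> plays the sequence dictionary. */
final class ReferenceResolverSketch {
    static final String NO_ALIGNMENT_REFERENCE_NAME = "*";
    static final int NO_ALIGNMENT_REFERENCE_INDEX = -1;

    static int resolveIndexFromName(String name, List<String> dictionary, boolean strict) {
        if (NO_ALIGNMENT_REFERENCE_NAME.equals(name)) {
            return NO_ALIGNMENT_REFERENCE_INDEX; // no header needed for "*"
        }
        if (dictionary == null) {
            throw new IllegalStateException("A header is required to resolve " + name);
        }
        int index = dictionary.indexOf(name); // -1 when absent, same value as NO_ALIGNMENT_REFERENCE_INDEX
        if (index < 0 && strict) {
            throw new IllegalArgumentException(name + " is not in the sequence dictionary");
        }
        return index;
    }

    static String resolveNameFromIndex(int index, List<String> dictionary) {
        if (index == NO_ALIGNMENT_REFERENCE_INDEX) {
            return NO_ALIGNMENT_REFERENCE_NAME; // no header needed for -1
        }
        if (dictionary == null) {
            throw new IllegalStateException("A header is required to resolve index " + index);
        }
        if (index < 0 || index >= dictionary.size()) {
            throw new IllegalArgumentException("Index " + index + " is not in the sequence dictionary");
        }
        return dictionary.get(index);
    }

    public static void main(String[] args) {
        List<String> dictionary = Arrays.asList("chr1", "chr2");
        System.out.println(resolveIndexFromName("chr2", dictionary, true));
        System.out.println(resolveNameFromIndex(0, dictionary));
    }
}
```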
    • getAlignmentStart

      public int getAlignmentStart()
      Returns:
      1-based inclusive leftmost position of the sequence remaining after clipping, or 0 if there is no position, e.g. for unmapped read.
    • setAlignmentStart

      public void setAlignmentStart(int value)
      Parameters:
      value - 1-based inclusive leftmost position of the sequence remaining after clipping or 0 if there is no position, e.g. for unmapped read.
    • getAlignmentEnd

      public int getAlignmentEnd()
      Returns:
      1-based inclusive rightmost position of the sequence remaining after clipping or 0 if there is no position, e.g. for unmapped read.
    • getUnclippedStart

      public int getUnclippedStart()
      Returns:
      the alignment start (1-based, inclusive) adjusted for clipped bases. For example if the read has an alignment start of 100 but the first 4 bases were clipped (hard or soft clipped) then this method will return 96. Invalid to call on an unmapped read.
    • getUnclippedEnd

      public int getUnclippedEnd()
      Returns:
      the alignment end (1-based, inclusive) adjusted for clipped bases. For example if the read has an alignment end of 100 but the last 7 bases were clipped (hard or soft clipped) then this method will return 107. Invalid to call on an unmapped read.
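
      The clipping adjustment described for getUnclippedStart() and getUnclippedEnd() can be sketched directly from a CIGAR string. The class and method names below are hypothetical, and the CIGAR is parsed with a simple regular expression rather than with htsjdk's Cigar class.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not htsjdk code: adjusts alignment coordinates for clipped bases. */
final class UnclippedSketch {
    private static final Pattern ELEMENT = Pattern.compile("(\\d+)([MIDNSHP=X])");

    /** Subtracts the leading soft/hard clipped bases from the alignment start. */
    static int unclippedStart(int alignmentStart, String cigar) {
        int clipped = 0;
        Matcher m = ELEMENT.matcher(cigar);
        while (m.find()) {
            char op = m.group(2).charAt(0);
            if (op != 'S' && op != 'H') {
                break; // leading clips end at the first non-clip operator
            }
            clipped += Integer.parseInt(m.group(1));
        }
        return alignmentStart - clipped;
    }

    /** Adds the trailing soft/hard clipped bases to the alignment end. */
    static int unclippedEnd(int alignmentEnd, String cigar) {
        int clipped = 0;
        Matcher m = ELEMENT.matcher(cigar);
        while (m.find()) {
            char op = m.group(2).charAt(0);
            if (op == 'S' || op == 'H') {
                clipped += Integer.parseInt(m.group(1));
            } else {
                clipped = 0; // only clips after the last non-clip operator count
            }
        }
        return alignmentEnd + clipped;
    }

    public static void main(String[] args) {
        System.out.println(unclippedStart(100, "4S20M")); // 100 - 4 clipped bases
        System.out.println(unclippedEnd(100, "20M7S"));   // 100 + 7 clipped bases
    }
}
```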
    • getReferencePositionAtReadPosition

      public int getReferencePositionAtReadPosition(int position)
      Non static version of the static function with the same name.
      Parameters:
      position - 1-based location within the unclipped sequence
      Returns:
      1-based reference position of the unclipped sequence at a given read position, or 0 if there is no position.
    • getReferencePositionAtReadPosition

      public static int getReferencePositionAtReadPosition(SAMRecord rec, int position)
      Returns the 1-based reference position for the provided 1-based position in read. For example, given the sequence NNNAAACCCGGG, cigar 3S9M, an alignment start of 1, and a (1-based) position of 10 (start of GGG), it returns 7 (1-based position starting after the soft clip). For example: given the sequence AAACCCGGGTTT, cigar 4M1D6M, and an alignment start of 1, a position of 4 returns reference position 4 and a position of 5 returns reference position 6. Another example: given the sequence AAACCCGGGTTT, cigar 4M1I6M, and an alignment start of 1, a position of 4 returns reference position 4 and a position of 5 returns 0.
      Parameters:
      rec - record to use
      position - 1-based location within the unclipped sequence
      Returns:
      1-based reference position of the unclipped sequence at a given read position, or 0 if there is no position.
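
      The examples above can be reproduced with a short walk over the CIGAR elements. This is a sketch under stated assumptions, not the htsjdk implementation: the class and method names are hypothetical, the record is reduced to a CIGAR string plus an alignment start, and soft clips are counted as read positions while hard clips are not.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not htsjdk code: maps a 1-based read position to a reference position. */
final class RefPosSketch {
    private static final Pattern ELEMENT = Pattern.compile("(\\d+)([MIDNSHP=X])");

    static int referencePositionAtReadPosition(String cigar, int alignmentStart, int readPos) {
        if (readPos <= 0) {
            return 0;
        }
        int ref = alignmentStart; // reference position of the next aligned base
        int read = 1;             // read position of the next read base
        Matcher m = ELEMENT.matcher(cigar);
        while (m.find()) {
            int len = Integer.parseInt(m.group(1));
            switch (m.group(2).charAt(0)) {
                case 'M': case '=': case 'X': // consumes read and reference
                    if (readPos < read + len) {
                        return ref + (readPos - read);
                    }
                    read += len;
                    ref += len;
                    break;
                case 'I': case 'S':           // consumes read only: no reference position
                    if (readPos < read + len) {
                        return 0;
                    }
                    read += len;
                    break;
                case 'D': case 'N':           // consumes reference only
                    ref += len;
                    break;
                default:                      // H and P consume neither
                    break;
            }
        }
        return 0; // read position lies beyond the end of the read
    }

    public static void main(String[] args) {
        System.out.println(referencePositionAtReadPosition("3S9M", 1, 10));
        System.out.println(referencePositionAtReadPosition("4M1D6M", 1, 5));
        System.out.println(referencePositionAtReadPosition("4M1I6M", 1, 5));
    }
}
```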
    • getReadPositionAtReferencePosition

      public int getReadPositionAtReferencePosition(int pos)
      Returns the 1-based position in the read of the 1-based reference position provided.
      Parameters:
      pos - 1-based reference position
      Returns:
      1-based (to match getReferencePositionAtReadPosition behavior) inclusive position into the unclipped sequence at a given reference position, or 0 if there is no such position. See examples in the static version below
    • getReadPositionAtReferencePosition

      public int getReadPositionAtReferencePosition(int pos, boolean returnLastBaseIfDeleted)
      Non-static version of static function with the same name. See examples below.
      Parameters:
      pos - 1-based reference position
      returnLastBaseIfDeleted - if true, and the reference position matches a deleted base in the read, the function will return the position of the last non-deleted base
      Returns:
      1-based (to match getReferencePositionAtReadPosition behavior) inclusive position into the unclipped sequence at a given reference position, or 0 if there is no such position. If returnLastBaseIfDeleted is true deletions are assumed to "live" on the last read base in the preceding block.
    • getReadPositionAtReferencePosition

      public static int getReadPositionAtReferencePosition(SAMRecord rec, int pos, boolean returnLastBaseIfDeleted)
      Returns the 1-based position in the read of the provided reference position, or 0 if no such position exists. For example, given the sequence NNNAAACCCGGG, cigar 3S9M, an alignment start of 1, and a (1-based) pos of 7 (start of GGG), it returns 10 (1-based position including the soft clip). For example: given the sequence AAACCCGGGT, cigar 4M1D6M, and an alignment start of 1, a reference position of 4 returns read position 4, and a reference position of 5 also returns a read position of 4 if returnLastBaseIfDeleted is true and 0 otherwise. For example: given the sequence AAACtCGGGTT, cigar 4M1I6M, and an alignment start of 1, a position of 4 returns a position of 4, a position of 5 returns 6 (the inserted base is the 5th read position), and a position of 11 returns 0 since that position in the reference doesn't overlap the read at all.
      Parameters:
      rec - record to use
      pos - 1-based reference position
      returnLastBaseIfDeleted - if true, and the reference position matches a deleted base in the read, the function will return the position of the last non-deleted base
      Returns:
      1-based (to match getReferencePositionAtReadPosition behavior) inclusive position into the unclipped sequence at a given reference position, or 0 if there is no such position. If returnLastBaseIfDeleted is true deletions are assumed to "live" on the last read base in the preceding block.
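
      The reverse mapping, including the returnLastBaseIfDeleted behaviour, can be sketched in the same way. Again this is a hypothetical standalone helper rather than the htsjdk implementation; the record is reduced to a CIGAR string and an alignment start, and skipped regions (N) are treated like deletions as an assumption of the sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not htsjdk code: maps a 1-based reference position to a read position. */
final class ReadPosSketch {
    private static final Pattern ELEMENT = Pattern.compile("(\\d+)([MIDNSHP=X])");

    static int readPositionAtReferencePosition(String cigar, int alignmentStart, int refPos,
                                               boolean returnLastBaseIfDeleted) {
        if (refPos <= 0 || refPos < alignmentStart) {
            return 0;
        }
        int ref = alignmentStart; // reference position of the next aligned base
        int read = 1;             // read position of the next read base
        int lastAligned = 0;      // last read position of the preceding aligned block
        Matcher m = ELEMENT.matcher(cigar);
        while (m.find()) {
            int len = Integer.parseInt(m.group(1));
            switch (m.group(2).charAt(0)) {
                case 'M': case '=': case 'X': // consumes read and reference
                    if (refPos < ref + len) {
                        return read + (refPos - ref);
                    }
                    lastAligned = read + len - 1;
                    read += len;
                    ref += len;
                    break;
                case 'I': case 'S':           // consumes read only
                    read += len;
                    break;
                case 'D': case 'N':           // consumes reference only
                    if (refPos < ref + len) {
                        // The deletion "lives" on the last base of the preceding block.
                        return returnLastBaseIfDeleted ? lastAligned : 0;
                    }
                    ref += len;
                    break;
                default:                      // H and P consume neither
                    break;
            }
        }
        return 0; // reference position does not overlap the read
    }

    public static void main(String[] args) {
        System.out.println(readPositionAtReferencePosition("3S9M", 1, 7, false));
        System.out.println(readPositionAtReferencePosition("4M1D6M", 1, 5, true));
        System.out.println(readPositionAtReferencePosition("4M1I6M", 1, 5, false));
    }
}
```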
    • getMateAlignmentStart

      public int getMateAlignmentStart()
      Returns:
      1-based inclusive leftmost position of the clipped mate sequence, or 0 if there is no position.
    • setMateAlignmentStart

      public void setMateAlignmentStart(int mateAlignmentStart)
    • getInferredInsertSize

      public int getInferredInsertSize()
      Returns:
      insert size (difference between the 5' end of the read and the 5' end of the mate), if possible, else 0. Negative if the mate maps to a lower position than the read.
    • setInferredInsertSize

      public void setInferredInsertSize(int inferredInsertSize)
    • getMappingQuality

      public int getMappingQuality()
      Returns:
      phred scaled mapping quality. 255 implies valid mapping but quality is hard to compute.
    • setMappingQuality

      public void setMappingQuality(int value)
    • getCigarString

      public String getCigarString()
    • setCigarString

      public void setCigarString(String value)
    • getCigar

      public Cigar getCigar()
      Do not modify the value returned by this method. If you want to change the Cigar, create a new Cigar and call setCigar() or call setCigarString()
      Returns:
      Cigar object for the read, or null if there is none.
    • getCigarLength

      public int getCigarLength()
      This method is preferred over getCigar().getNumElements(), because for BAMRecord it may be faster.
      Returns:
      number of cigar elements (number + operator) in the cigar string.
    • setCigar

      public void setCigar(Cigar cigar)
      For setting the Cigar string when changed. Note that this nulls the indexing bin, which would need to be recomputed on write (if needed). To avoid clobbering the indexing bin, use initializeCigar(htsjdk.samtools.Cigar)
    • initializeCigar

      protected void initializeCigar(Cigar cigar)
      For setting the Cigar string when BAMRecord has decoded it. Use this rather than setCigar(htsjdk.samtools.Cigar) so that indexing bin doesn't get clobbered.
    • getReadGroup

      public SAMReadGroupRecord getReadGroup()
      Get the SAMReadGroupRecord for this SAMRecord.
      Returns:
      The SAMReadGroupRecord from the SAMFileHeader for this SAMRecord, or null if 1) this record has no RG tag, 2) the header doesn't contain the read group with the given ID, or 3) this record has no SAMFileHeader.
      Throws:
      ClassCastException - if RG tag does not have a String value.
    • getFlags

      public int getFlags()
      It is preferable to use the get*Flag() methods that handle the flag word symbolically.
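
      To show what handling the flag word symbolically amounts to, here is a standalone sketch that decodes the integer into named flags. The enum and class are hypothetical and are not htsjdk types; the bit values follow the FLAG field definition in the SAM specification.

```java
import java.util.EnumSet;

/** Hypothetical decoder, not htsjdk code: turns the raw flag word into a set of named flags. */
final class SamFlagSketch {
    enum Flag {
        READ_PAIRED(0x1), PROPER_PAIR(0x2), READ_UNMAPPED(0x4), MATE_UNMAPPED(0x8),
        READ_NEGATIVE_STRAND(0x10), MATE_NEGATIVE_STRAND(0x20),
        FIRST_OF_PAIR(0x40), SECOND_OF_PAIR(0x80), SECONDARY_ALIGNMENT(0x100),
        READ_FAILS_VENDOR_QUALITY_CHECK(0x200), DUPLICATE_READ(0x400),
        SUPPLEMENTARY_ALIGNMENT(0x800);

        final int mask;

        Flag(int mask) { this.mask = mask; }
    }

    /** Returns every flag whose bit is set in the flag word. */
    static EnumSet<Flag> decode(int flags) {
        EnumSet<Flag> set = EnumSet.noneOf(Flag.class);
        for (Flag f : Flag.values()) {
            if ((flags & f.mask) != 0) {
                set.add(f);
            }
        }
        return set;
    }

    /** True if either the secondary (0x100) or the supplementary (0x800) bit is set. */
    static boolean isSecondaryOrSupplementary(int flags) {
        int mask = Flag.SECONDARY_ALIGNMENT.mask | Flag.SUPPLEMENTARY_ALIGNMENT.mask;
        return (flags & mask) != 0;
    }

    public static void main(String[] args) {
        // 99 = 0x1 + 0x2 + 0x20 + 0x40
        System.out.println(decode(99));
        System.out.println(isSecondaryOrSupplementary(2048));
    }
}
```

      On a real record the get*Flag() methods listed below answer the same questions without exposing the bit layout.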
    • setFlags

      public void setFlags(int value)
    • getReadPairedFlag

      public boolean getReadPairedFlag()
      the read is paired in sequencing, no matter whether it is mapped in a pair.
    • getProperPairFlag

      public boolean getProperPairFlag()
      the read is mapped in a proper pair (depends on the protocol, normally inferred during alignment).
    • getReadUnmappedFlag

      public boolean getReadUnmappedFlag()
      the query sequence itself is unmapped.
    • getMateUnmappedFlag

      public boolean getMateUnmappedFlag()
      the mate is unmapped.
    • getReadNegativeStrandFlag

      public boolean getReadNegativeStrandFlag()
      strand of the query (false for forward; true for reverse strand).
    • getMateNegativeStrandFlag

      public boolean getMateNegativeStrandFlag()
      strand of the mate (false for forward; true for reverse strand).
    • getFirstOfPairFlag

      public boolean getFirstOfPairFlag()
      the read is the first read in a pair.
    • getSecondOfPairFlag

      public boolean getSecondOfPairFlag()
      the read is the second read in a pair.
    • getNotPrimaryAlignmentFlag

      @Deprecated public boolean getNotPrimaryAlignmentFlag()
      Deprecated.
      the alignment is not primary (a read having split hits may have multiple primary alignment records).
    • isSecondaryAlignment

      public boolean isSecondaryAlignment()
      Returns:
      whether the alignment is secondary (an alternative alignment of the read).
    • getSupplementaryAlignmentFlag

      public boolean getSupplementaryAlignmentFlag()
      Returns:
      whether the alignment is supplementary (a split alignment such as a chimeric alignment).
    • getReadFailsVendorQualityCheckFlag

      public boolean getReadFailsVendorQualityCheckFlag()
      the read fails platform/vendor quality checks.
    • getDuplicateReadFlag

      public boolean getDuplicateReadFlag()
      the read is either a PCR duplicate or an optical duplicate.
    • setReadPairedFlag

      public void setReadPairedFlag(boolean flag)
      the read is paired in sequencing, no matter whether it is mapped in a pair.
    • setProperPairFlag

      public void setProperPairFlag(boolean flag)
      the read is mapped in a proper pair (depends on the protocol, normally inferred during alignment).
    • setReadUmappedFlag

      @Deprecated public void setReadUmappedFlag(boolean flag)
      Deprecated.
      the query sequence itself is unmapped. This method name is misspelled. Use setReadUnmappedFlag(boolean) instead.
    • setReadUnmappedFlag

      public void setReadUnmappedFlag(boolean flag)
      the query sequence itself is unmapped.
    • setMateUnmappedFlag

      public void setMateUnmappedFlag(boolean flag)
      the mate is unmapped.
    • setReadNegativeStrandFlag

      public void setReadNegativeStrandFlag(boolean flag)
      strand of the query (false for forward; true for reverse strand).
    • setMateNegativeStrandFlag

      public void setMateNegativeStrandFlag(boolean flag)
      strand of the mate (false for forward; true for reverse strand).
    • setFirstOfPairFlag

      public void setFirstOfPairFlag(boolean flag)
      the read is the first read in a pair.
    • setSecondOfPairFlag

      public void setSecondOfPairFlag(boolean flag)
      the read is the second read in a pair.
    • setNotPrimaryAlignmentFlag

      @Deprecated public void setNotPrimaryAlignmentFlag(boolean flag)
      Deprecated.
      the alignment is not primary (a read having split hits may have multiple primary alignment records).
    • setSecondaryAlignment

      public void setSecondaryAlignment(boolean flag)
      set whether this alignment is secondary (an alternative alignment of the read).
    • setSupplementaryAlignmentFlag

      public void setSupplementaryAlignmentFlag(boolean flag)
      set whether this alignment is supplementary (a split alignment such as a chimeric alignment).
    • setReadFailsVendorQualityCheckFlag

      public void setReadFailsVendorQualityCheckFlag(boolean flag)
      the read fails platform/vendor quality checks.
    • setDuplicateReadFlag

      public void setDuplicateReadFlag(boolean flag)
      the read is either a PCR duplicate or an optical duplicate.
    • isSecondaryOrSupplementary

      public boolean isSecondaryOrSupplementary()
      Tests if this record is a secondary and/or supplementary alignment; equivalent to (getNotPrimaryAlignmentFlag() || getSupplementaryAlignmentFlag()).
    • getValidationStringency

      public ValidationStringency getValidationStringency()
    • setValidationStringency

      public void setValidationStringency(ValidationStringency validationStringency)
      Control validation of lazily-decoded elements.
    • hasAttribute

      public boolean hasAttribute(String tag)
      Returns:
      true if the SAM record has the requested attribute set, false otherwise.
    • hasAttribute

      public boolean hasAttribute(SAMTag tag)
      Returns:
      true if the SAM record has the requested attribute set, false otherwise.
    • getAttribute

      public Object getAttribute(String tag)
      Get the value for a SAM tag. WARNING: Some value types (e.g. byte[]) are mutable. It is dangerous to change one of these values in place, because some SAMRecord implementations keep track of when attributes have been changed. If you want to change an attribute value, call setAttribute() to replace the value.
      Parameters:
      tag - Two-character tag name.
      Returns:
      Appropriately typed tag value, or null if the requested tag is not present.
    • getAttribute

      public Object getAttribute(SAMTag tag)
      Get the value for a SAM tag. WARNING: Some value types (e.g. byte[]) are mutable. It is dangerous to change one of these values in place, because some SAMRecord implementations keep track of when attributes have been changed. If you want to change an attribute value, call setAttribute() to replace the value.
      Parameters:
      tag - the SAM tag.
      Returns:
      Appropriately typed tag value, or null if the requested tag is not present.
    • getIntegerAttribute

      public Integer getIntegerAttribute(SAMTag tag)
      Get the tag value and attempt to coerce it into the requested type.
      Parameters:
      tag - The requested tag.
      Returns:
      The value of a tag, converted into a signed Integer if possible.
      Throws:
      RuntimeException - If the value is not an integer type, or will not fit in a signed Integer.
    • getIntegerAttribute

      public Integer getIntegerAttribute(String tag)
      Get the tag value and attempt to coerce it into the requested type.
      Parameters:
      tag - The requested tag.
      Returns:
      The value of a tag, converted into a signed Integer if possible.
      Throws:
      RuntimeException - If the value is not an integer type, or will not fit in a signed Integer.
    • getUnsignedIntegerAttribute

      public Long getUnsignedIntegerAttribute(String tag) throws SAMException
      A convenience method that will return a valid unsigned integer as a Long, or fail with an exception if the tag value is invalid.
      Parameters:
      tag - Two-character tag name.
      Returns:
      valid unsigned integer associated with the tag, as a Long
      Throws:
      SAMException - if the value is out of range for a 32-bit unsigned value, or not a Number
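
      The range check behind this method can be sketched as follows. This is a standalone illustration with hypothetical names: it throws IllegalArgumentException where the real method throws SAMException, it accepts only the boxed integral types as an assumption of the sketch, and it does not model an absent tag.

```java
/** Hypothetical helper, not htsjdk code: validates a tag value as a 32-bit unsigned integer. */
final class UnsignedIntSketch {
    static final long MAX_UNSIGNED_INT = 0xFFFFFFFFL; // 2^32 - 1

    static Long toUnsignedInt(Object value) {
        boolean integral = value instanceof Byte || value instanceof Short
                || value instanceof Integer || value instanceof Long;
        if (!integral) {
            throw new IllegalArgumentException("Not an integral Number: " + value);
        }
        long v = ((Number) value).longValue();
        if (v < 0 || v > MAX_UNSIGNED_INT) {
            throw new IllegalArgumentException("Out of range for a 32-bit unsigned value: " + v);
        }
        return v;
    }

    public static void main(String[] args) {
        // Too large for a signed int, but valid as an unsigned 32-bit value.
        System.out.println(toUnsignedInt(4000000000L));
    }
}
```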
    • getUnsignedIntegerAttribute

      public Long getUnsignedIntegerAttribute(SAMTag tag) throws SAMException
      A convenience method that will return a valid unsigned integer as a Long, or fail with an exception if the tag value is invalid.
      Parameters:
      tag - The requested tag.
      Returns:
      valid unsigned integer associated with the tag, as a Long
      Throws:
      SAMException - if the value is out of range for a 32-bit unsigned value, or not a Number
    • getUnsignedIntegerAttribute

      public Long getUnsignedIntegerAttribute(short tag) throws SAMException
      A convenience method that will return a valid unsigned integer as a Long, or fail with an exception if the tag value is invalid.
      Parameters:
      tag - Binary representation of a 2-char String tag as created by SAMTagUtil.
      Returns:
      valid unsigned integer associated with the tag, as a Long
      Throws:
      SAMException - if the value is out of range for a 32-bit unsigned value, or not a Number
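      As a rough illustration of the range rule shared by the three overloads above, the following standalone sketch checks whether a value fits a 32-bit unsigned integer and returns it as a Long. UnsignedTagValue and toUnsigned32 are hypothetical names, IllegalArgumentException stands in for SAMException, and restricting the input to integral boxed types is an assumption of this sketch rather than documented library behaviour.

```java
// Hypothetical helper; a sketch of the documented range rule, not htsjdk code.
final class UnsignedTagValue {
    // Largest value representable as a 32-bit unsigned integer (4294967295).
    static final long MAX_UNSIGNED_32 = 0xFFFFFFFFL;

    static Long toUnsigned32(Object value) {
        // Assumption: only integral boxed types are treated as candidates.
        if (value instanceof Byte || value instanceof Short
                || value instanceof Integer || value instanceof Long) {
            long v = ((Number) value).longValue();
            if (v < 0 || v > MAX_UNSIGNED_32) {
                throw new IllegalArgumentException("Out of range for unsigned 32-bit: " + v);
            }
            return v;
        }
        throw new IllegalArgumentException("Not an integral Number: " + value);
    }
}
```

      A Long is needed for the result because values above Integer.MAX_VALUE (2147483647) do not fit in a signed Java int.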
    • getShortAttribute

      public Short getShortAttribute(SAMTag tag)
      Get the tag value and attempt to coerce it into the requested type.
      Parameters:
      tag - The requested tag.
      Returns:
      The value of a tag, converted into a Short if possible.
      Throws:
      RuntimeException - If the value is not an integer type, or will not fit in a Short.
    • getShortAttribute

      public Short getShortAttribute(String tag)
      Get the tag value and attempt to coerce it into the requested type.
      Parameters:
      tag - The requested tag.
      Returns:
      The value of a tag, converted into a Short if possible.
      Throws:
      RuntimeException - If the value is not an integer type, or will not fit in a Short.
    • getByteAttribute

      public Byte getByteAttribute(SAMTag tag)
      Get the tag value and attempt to coerce it into the requested type.
      Parameters:
      tag - The requested tag.
      Returns:
      The value of a tag, converted into a Byte if possible.
      Throws:
      RuntimeException - If the value is not an integer type, or will not fit in a Byte.
    • getByteAttribute

      public Byte getByteAttribute(String tag)
      Get the tag value and attempt to coerce it into the requested type.
      Parameters:
      tag - The requested tag.
      Returns:
      The value of a tag, converted into a Byte if possible.
      Throws:
      RuntimeException - If the value is not an integer type, or will not fit in a Byte.
    • getStringAttribute

      public String getStringAttribute(SAMTag tag)
    • getStringAttribute

      public String getStringAttribute(String tag)
    • getCharacterAttribute

      public Character getCharacterAttribute(SAMTag tag)
    • getCharacterAttribute

      public Character getCharacterAttribute(String tag)
    • getFloatAttribute

      public Float getFloatAttribute(SAMTag tag)
    • getFloatAttribute

      public Float getFloatAttribute(String tag)
    • getByteArrayAttribute

      public byte[] getByteArrayAttribute(SAMTag tag)
      Will work for signed byte array, unsigned byte array, or old-style hex array
    • getByteArrayAttribute

      public byte[] getByteArrayAttribute(String tag)
      Will work for signed byte array, unsigned byte array, or old-style hex array
    • getUnsignedByteArrayAttribute

      public byte[] getUnsignedByteArrayAttribute(SAMTag tag)
    • getUnsignedByteArrayAttribute

      public byte[] getUnsignedByteArrayAttribute(String tag)
    • getSignedByteArrayAttribute

      public byte[] getSignedByteArrayAttribute(SAMTag tag)
      Will work for signed byte array or old-style hex array
    • getSignedByteArrayAttribute

      public byte[] getSignedByteArrayAttribute(String tag)
      Will work for signed byte array or old-style hex array
    • getUnsignedShortArrayAttribute

      public short[] getUnsignedShortArrayAttribute(SAMTag tag)
    • getUnsignedShortArrayAttribute

      public short[] getUnsignedShortArrayAttribute(String tag)
    • getSignedShortArrayAttribute

      public short[] getSignedShortArrayAttribute(SAMTag tag)
    • getSignedShortArrayAttribute

      public short[] getSignedShortArrayAttribute(String tag)
    • getUnsignedIntArrayAttribute

      public int[] getUnsignedIntArrayAttribute(SAMTag tag)
    • getUnsignedIntArrayAttribute

      public int[] getUnsignedIntArrayAttribute(String tag)
    • getSignedIntArrayAttribute

      public int[] getSignedIntArrayAttribute(SAMTag tag)
    • getSignedIntArrayAttribute

      public int[] getSignedIntArrayAttribute(String tag)
    • getFloatArrayAttribute

      public float[] getFloatArrayAttribute(SAMTag tag)
    • getFloatArrayAttribute

      public float[] getFloatArrayAttribute(String tag)
    • isUnsignedArrayAttribute

      public boolean isUnsignedArrayAttribute(String tag)
      Returns:
      True if this tag is an unsigned array, else false.
      Throws:
      SAMException - if the tag is not present.
    • getAttribute

      public Object getAttribute(short tag)
      Parameters:
      tag - Binary representation of a 2-char String tag as created by SAMTagUtil.
    • setAttribute

      public void setAttribute(String tag, Object value)
      Set a named attribute onto the SAMRecord. Passing a null value causes the attribute to be cleared.
      Parameters:
      tag - two-character tag name. See http://samtools.sourceforge.net/SAM1.pdf for standard and user-defined tags.
      value - Supported types are String, Character, Integer, Float, byte[], short[], int[] and float[]. If value == null, the tag is cleared. Byte and Short are allowed but discouraged: if written to a SAM file they are converted to Integer, whereas if written to BAM, getAttribute() returns them as Byte or Short, respectively. Long is allowed, but discouraged, and only for values that fit into a signed or unsigned 32-bit integer. To set an unsigned byte[], short[] or int[] (which is discouraged because of poor Java language support), setUnsignedArrayAttribute() must be used instead of this method. String values are not validated to ensure that they conform to the SAM spec.
    • setUnsignedArrayAttribute

      public void setUnsignedArrayAttribute(String tag, Object value)
      Because Java does not support unsigned integer types, we think it is a bad idea to encode them in SAM files. If you must do so, however, you must call this method rather than setAttribute, because calling this method is the way to indicate that, e.g. a short array should be interpreted as unsigned shorts.
      Parameters:
      value - must be one of byte[], short[], int[]
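      Because the same short[] can be read as signed or unsigned, the interpretation has to be carried separately from the array. The standalone sketch below shows how unsigned 16-bit values map onto Java's signed short and back, using only java.lang. UnsignedShorts, widen and narrow are hypothetical names, not part of htsjdk.

```java
// Hypothetical helper showing unsigned 16-bit values stored in a signed short[].
final class UnsignedShorts {
    // Reinterpret each signed short as an unsigned value in the range 0..65535.
    static int[] widen(short[] raw) {
        int[] out = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            out[i] = Short.toUnsignedInt(raw[i]);
        }
        return out;
    }

    // Pack values in 0..65535 into shorts; values above 32767 become negative shorts.
    static short[] narrow(int[] values) {
        short[] out = new short[values.length];
        for (int i = 0; i < values.length; i++) {
            if (values[i] < 0 || values[i] > 0xFFFF) {
                throw new IllegalArgumentException("Not an unsigned 16-bit value: " + values[i]);
            }
            out[i] = (short) values[i];
        }
        return out;
    }
}
```

      For example, the unsigned value 65535 is stored as the short -1, which is why an array intended as unsigned must be flagged as such when it is set.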
    • setAttribute

      protected void setAttribute(short tag, Object value)
      Parameters:
      tag - Binary representation of a 2-char String tag as created by SAMTagUtil.
    • setAttribute

      public void setAttribute(SAMTag tag, Object value)
      Parameters:
      tag - the SAM tag.
    • isAllowedAttributeValue

      @Deprecated protected static boolean isAllowedAttributeValue(Object value)
      Deprecated.
      The attribute type and value checks have been moved directly into SAMBinaryTagAndValue.
      Checks if the value is allowed as an attribute value.
      Parameters:
      value - the value to be checked
      Returns:
      true if the value is valid and false otherwise
    • setAttribute

      protected void setAttribute(SAMTag tag, Object value, boolean isUnsignedArray)
    • setAttribute

      protected void setAttribute(short tag, Object value, boolean isUnsignedArray)
    • clearAttributes

      public void clearAttributes()
      Removes all attributes.
    • setAttributes

      protected void setAttributes(SAMBinaryTagAndValue attributes)
      Replace any existing attributes with the given linked item. NOTE: this method is intended to only be called from subclasses.
    • getBinaryAttributes

      protected SAMBinaryTagAndValue getBinaryAttributes()
      Returns:
      Pointer to the first of the tags. Returns null if there are no tags.
    • getContig

      public String getContig()
      Description copied from interface: Locatable
      Gets the contig name for the contig this is mapped to. May return null if there is no unique mapping.
      Specified by:
      getContig in interface Locatable
      Returns:
      reference name, null if this is unmapped
    • getStart

      public int getStart()
      Specified by:
      getStart in interface Locatable
      Returns:
      1-based inclusive leftmost position of the clipped sequence, or 0 if there is no position.
    • getEnd

      public int getEnd()
      An alias of getAlignmentEnd().
      Specified by:
      getEnd in interface Locatable
      Returns:
      1-based inclusive rightmost position of the clipped sequence, or 0 if the read is unmapped.
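      Since getStart() and getEnd() are both 1-based and inclusive, span arithmetic needs a +1. The standalone sketch below shows the length calculation and a conversion to 0-based half-open coordinates. IntervalSketch and its methods are hypothetical names, and the sketch assumes the read is mapped, so the 0 sentinel is not handled.

```java
// Hypothetical helper for 1-based inclusive coordinates; assumes a mapped read.
final class IntervalSketch {
    // Number of reference positions covered by [start, end], both ends inclusive.
    static int length(int start, int end) {
        return end - start + 1;
    }

    // Convert to 0-based half-open coordinates: {start - 1, end}.
    static int[] toZeroBasedHalfOpen(int start, int end) {
        return new int[] {start - 1, end};
    }
}
```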
    • getAttributes

      public List<SAMRecord.SAMTagAndValue> getAttributes()
      Returns:
      list of {tag, value} tuples
    • computeIndexingBinIfAbsent

      @Deprecated public int computeIndexingBinIfAbsent(SAMRecord alignment)
      Deprecated.
      Use computeIndexingBin() if accessible or GenomicIndexUtil.regionToBin() otherwise.
    • getHeader

      public SAMFileHeader getHeader()
      Returns:
      the SAMFileHeader for this record. If the header is null, the following SAMRecord methods may throw exceptions:

      • getReferenceIndex
      • setReferenceIndex
      • getMateReferenceIndex
      • setMateReferenceIndex

      Record comparators (i.e. SAMRecordCoordinateComparator and SAMRecordDuplicateComparator) require records with non-null header values.

      A record with a null header may be validated by the isValid method, but the reference and mate reference indices, read group, sequence dictionary, and alignment start will not be fully validated unless a header is present.

      SAMTextWriter, BAMFileWriter, and CRAMFileWriter all require records to have a valid header in order to be written. Any record that does not have a header at the time it is added to the writer will be updated to use the header associated with the writer.

    • setHeader

      public void setHeader(SAMFileHeader header)
      Sets the SAMFileHeader for this record. Setting the header into SAMRecord facilitates conversion between reference sequence names and indices.

      NOTE: If the record has a reference or mate reference name, the corresponding reference and mate reference indices are resolved and updated using the sequence dictionary in the new header. setHeader does not throw an exception if either the reference or mate reference name does not appear in the new header's sequence dictionary.

      When the SAMFileHeader is set to null, the reference and mate reference indices are cleared. Therefore, calls to the following SAMRecord methods on records with a null header may throw IllegalArgumentExceptions:

      • getReferenceIndex
      • setReferenceIndex
      • getMateReferenceIndex
      • setMateReferenceIndex

      Record comparators (i.e. SAMRecordCoordinateComparator and SAMRecordDuplicateComparator) require records with non-null header values.

      A record with a null header may be validated by the isValid method, but the reference and mate reference indices, read group, sequence dictionary, and alignment start will not be fully validated unless a header is present.

      SAMTextWriter, BAMFileWriter, and CRAMFileWriter all require records to have a valid header in order to be written. Any record that does not have a header at the time it is added to the writer will be updated to use the header associated with the writer.

      Parameters:
      header - contains sequence dictionary for this SAMRecord
    • setHeaderStrict

      public void setHeaderStrict(SAMFileHeader header)
      Establishes the SAMFileHeader for this record and forces resolution of the record's reference and mate reference names against the header using the sequence dictionary in the new header. If either the reference or mate reference name does not appear in the new header's sequence dictionary, an IllegalArgumentException is thrown.
      Parameters:
      header - new header for this record. May be null.
      Throws:
      IllegalArgumentException - if the record has reference or mate reference names that cannot be resolved to indices using the new header.
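      The difference between setHeader and setHeaderStrict comes down to what happens when a reference name is missing from the sequence dictionary. The standalone sketch below models the dictionary as a plain list of names. ReferenceResolver and the UNRESOLVED sentinel are hypothetical, chosen for illustration only, and do not reproduce htsjdk's internals.

```java
import java.util.List;

// Hypothetical model of name-to-index resolution against a sequence dictionary.
final class ReferenceResolver {
    // Illustrative sentinel for "name not found"; an assumption of this sketch.
    static final int UNRESOLVED = -1;

    // Lenient resolution: a missing name is tolerated and reported via the sentinel.
    static int resolveLenient(List<String> dictionary, String name) {
        int index = dictionary.indexOf(name);
        return index >= 0 ? index : UNRESOLVED;
    }

    // Strict resolution: a missing name is an error.
    static int resolveStrict(List<String> dictionary, String name) {
        int index = dictionary.indexOf(name);
        if (index < 0) {
            throw new IllegalArgumentException("Reference name not in dictionary: " + name);
        }
        return index;
    }
}
```

      Strict resolution surfaces a mismatched header immediately, whereas lenient resolution defers the problem to the first call that needs the index.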
    • getVariableBinaryRepresentation

      public byte[] getVariableBinaryRepresentation()
      If this record has a valid binary representation of the variable-length portion of a binary record stored, return that byte array, otherwise return null. This will never be true for SAMRecords. It will be true for BAMRecords that have not been eagerDecoded(), and for which none of the data in the variable-length portion has been changed.
    • getAttributesBinarySize

      public int getAttributesBinarySize()
      Depending on the concrete implementation, the binary file size of attributes may be known without computing them all.
      Returns:
      binary file size of attribute, if known, else -1
    • format

      @Deprecated public String format()
      Deprecated.
      This method is not guaranteed to return a valid SAM text representation of the SAMRecord. To get the standard SAM text representation, use getSAMString().
      Returns:
      String representation of this.
    • eagerDecode

      protected void eagerDecode()
      Force all lazily-initialized data members to be initialized. If a subclass overrides this method, typically it should also call super method.
    • getAlignmentBlocks

      public List<AlignmentBlock> getAlignmentBlocks()
      Returns blocks of the read sequence that have been aligned directly to the reference sequence. Note that clipped portions of the read and inserted and deleted bases (vs. the reference) are not represented in the alignment blocks.
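      To make the notion of an alignment block concrete, the standalone sketch below walks a CIGAR string and emits one block per M, = or X operator, skipping clips, insertions and deletions. AlignmentBlockSketch is a hypothetical class that works on a plain CIGAR string; representing each block as {readStart, referenceStart, length} with 1-based starts is an assumption of this sketch, not a description of the AlignmentBlock class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical CIGAR walker; each block is {readStart, referenceStart, length}, starts 1-based.
final class AlignmentBlockSketch {
    static List<int[]> blocks(String cigar, int alignmentStart) {
        List<int[]> result = new ArrayList<>();
        int readPos = 1;              // next read base, 1-based
        int refPos = alignmentStart;  // next reference base, 1-based
        int length = 0;               // operator length being parsed
        for (int i = 0; i < cigar.length(); i++) {
            char c = cigar.charAt(i);
            if (Character.isDigit(c)) {
                length = length * 10 + (c - '0');
                continue;
            }
            switch (c) {
                case 'M': case '=': case 'X':
                    // Aligned bases: consume both read and reference, emit a block.
                    result.add(new int[] {readPos, refPos, length});
                    readPos += length;
                    refPos += length;
                    break;
                case 'I': case 'S':
                    // Insertion or soft clip: consumes read bases only.
                    readPos += length;
                    break;
                case 'D': case 'N':
                    // Deletion or skipped region: consumes reference bases only.
                    refPos += length;
                    break;
                case 'H': case 'P':
                    // Hard clip or padding: consumes neither.
                    break;
                default:
                    throw new IllegalArgumentException("Unknown CIGAR operator: " + c);
            }
            length = 0;
        }
        return result;
    }
}
```

      For the CIGAR 5S10M2I3M4D6M at alignment start 100, this yields three blocks: {6, 100, 10}, {18, 110, 3} and {21, 117, 6}.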
    • validateCigar

      public List<SAMValidationError> validateCigar(long recordNumber)
      Run all validations of CIGAR. These include validation that the CIGAR makes sense independent of placement, plus validation that CIGAR + placement yields all bases with M operator within the range of the reference.
      Parameters:
      recordNumber - For error reporting, the record number in the SAM/BAM file. -1 if not known.
      Returns:
      List of errors, or null if no errors.
    • equals

      public boolean equals(Object o)
      Overrides:
      equals in class Object
    • hashCode

      public int hashCode()
      Overrides:
      hashCode in class Object
    • isValid

      public List<SAMValidationError> isValid()
      Perform various validations of SAMRecord. Note that this method deliberately returns null rather than Collections.emptyList() if there are no validation errors, because callers tend to assume that if a non-null list is returned, it is modifiable. A record with a null header may be validated, but the reference and mate reference indices, read group, sequence dictionary, and alignment start will not be fully validated unless a header is present.
      Returns:
      null if valid. If invalid, returns a list of error messages.
    • isValid

      public List<SAMValidationError> isValid(boolean firstOnly)
      Perform various validations of SAMRecord. Note that this method deliberately returns null rather than Collections.emptyList() if there are no validation errors, because callers tend to assume that if a non-null list is returned, it is modifiable. A record with a null header may be validated, but the reference and mate reference indices, read group, sequence dictionary, and alignment start will not be fully validated unless a header is present.
      Parameters:
      firstOnly - return only the first error if true, false otherwise
      Returns:
      null if valid. If invalid, returns a list of error messages.
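      Because a valid record yields null rather than an empty list, callers that iterate over the result need a null check first. The standalone sketch below shows one way to normalise such a result. ValidationResults and orEmpty are hypothetical names, and the helper is generic so that it does not depend on any htsjdk type.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical helper for APIs that return null to mean "no errors".
final class ValidationResults {
    // Returns the given list, or an immutable empty list when it is null.
    static <T> List<T> orEmpty(List<T> errorsOrNull) {
        return errorsOrNull == null ? Collections.<T>emptyList() : errorsOrNull;
    }
}
```

      Note that the substituted empty list is immutable, so this pattern suits read-only use such as iteration or counting.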
    • getFileSource

      public SAMFileSource getFileSource()
      Gets the source of this SAM record -- both the reader that retrieved the record and the position on disk from whence it came.
      Returns:
      The file source. Note that the reader will be null if the reader source has not been set.
    • setFileSource

      public void setFileSource(SAMFileSource fileSource)
      Sets a marker providing the source reader for this file and the position in the file from which the read originated.
      Parameters:
      fileSource - source of the given file.
    • clone

      public Object clone() throws CloneNotSupportedException
      Note that this does a shallow copy of everything, except for the attribute list, for which a copy of the list is made, but the attributes themselves are copied by reference. This should be safe because callers should never modify a mutable value returned by any of the get() methods anyway. If one of the cloned record's SEQ or QUAL needs to be modified, a deeper copy should be made (e.g. Reverse Complement).
      Overrides:
      clone in class Object
      Throws:
      CloneNotSupportedException
    • deepCopy

      public SAMRecord deepCopy()
      Returns a deep copy of the SAM record, with the following exceptions:

      • The header field, which shares the header reference with the original record
      • The file source field, which is always set to null in the copy
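      The practical difference between a shallow clone and a deep copy shows up as soon as a shared array is modified. The standalone sketch below uses a hypothetical CopySketch class with a single byte[] field to illustrate this; it is not a model of SAMRecord's actual fields.

```java
// Hypothetical class with one mutable field, used to contrast shallow and deep copies.
final class CopySketch implements Cloneable {
    byte[] bases;

    CopySketch(byte[] bases) {
        this.bases = bases;
    }

    // Shallow copy: the clone shares the same bases array as the original.
    @Override
    public CopySketch clone() {
        try {
            return (CopySketch) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen, the class is Cloneable
        }
    }

    // Deep copy: the array itself is duplicated, so later edits do not leak.
    CopySketch deepCopy() {
        return new CopySketch(bases.clone());
    }
}
```

      Editing the array through the shallow clone also changes the original, while the deep copy stays independent.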
    • toString

      public String toString()
      Simple toString() that gives a little bit of useful info about the read.
      Overrides:
      toString in class Object
    • getSAMString

      public String getSAMString()
      Returns the record in the SAM line-based text format. Fields are separated by '\t' characters, and the String is terminated by '\n'.
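      Given that layout, a line can be split back into its fields by dropping the trailing newline and splitting on tabs. The standalone sketch below does this with plain String operations. SamLineFields is a hypothetical helper and performs no validation of the field contents.

```java
// Hypothetical helper that splits one SAM text line into its tab-separated fields.
final class SamLineFields {
    static String[] split(String samLine) {
        // Drop the terminating newline, if present.
        String line = samLine.endsWith("\n")
                ? samLine.substring(0, samLine.length() - 1)
                : samLine;
        // A limit of -1 keeps trailing empty fields instead of discarding them.
        return line.split("\t", -1);
    }
}
```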
    • getPairedReadName

      public String getPairedReadName()
    • getSAMFlags

      public final Set<SAMFlag> getSAMFlags()
      shortcut to
      SAMFlag.getFlags( this.getFlags() );
    • getTransientAttribute

      public final Object getTransientAttribute(Object key)
      Fetches the value of a transient attribute on the SAMRecord, or null if not set. The intended use for transient attributes is to store values that are 1-to-1 with the SAMRecord, may be needed many times and are expensive to compute. These values can be computed lazily and then stored as transient attributes to avoid frequent re-computation.
    • setTransientAttribute

      public final Object setTransientAttribute(Object key, Object value)
      Sets the value of a transient attribute, and returns the previous value if defined. The intended use for transient attributes is to store values that are 1-to-1 with the SAMRecord, may be needed many times and are expensive to compute. These values can be computed lazily and then stored as transient attributes to avoid frequent re-computation.
    • removeTransientAttribute

      public final Object removeTransientAttribute(Object key)
      Removes a transient attribute if it is stored, and returns the stored value. If there is not a stored value, will return null.
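      The lazy compute-and-store usage described for transient attributes can be sketched with a plain map. TransientCache below is a hypothetical stand-in that mirrors the get/set/remove return conventions documented above; getOrCompute is an illustrative helper and not part of SAMRecord.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for a record's transient attribute store.
final class TransientCache {
    private final Map<Object, Object> transientAttributes = new HashMap<>();

    // Returns the stored value, or null if none is set.
    Object get(Object key) {
        return transientAttributes.get(key);
    }

    // Stores the value and returns the previous one, or null if there was none.
    Object set(Object key, Object value) {
        return transientAttributes.put(key, value);
    }

    // Removes the value and returns it, or null if there was none.
    Object remove(Object key) {
        return transientAttributes.remove(key);
    }

    // Illustrative helper: compute once, then reuse the cached value.
    Object getOrCompute(Object key, Supplier<Object> expensive) {
        Object cached = get(key);
        if (cached == null) {
            cached = expensive.get();
            set(key, cached);
        }
        return cached;
    }
}
```

      Such a cache is only safe while the value stays consistent with the record; if the underlying data changes, the stale entry should be removed.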
    • reverseComplement

      public void reverseComplement()
      Reverse-complement bases and reverse quality scores along with known optional attributes that need the same treatment. Changes are made after making a copy of the bases, qualities, and any attributes that will be altered. If an in-place update is needed, use reverseComplement(boolean). See TAGS_TO_REVERSE_COMPLEMENT and TAGS_TO_REVERSE for the default set of tags that are handled.
    • reverseComplement

      public void reverseComplement(boolean inplace)
      Reverse-complement bases and reverse quality scores along with known optional attributes that need the same treatment. Optionally makes a copy of the bases, qualities or attributes instead of altering them in-place. See TAGS_TO_REVERSE_COMPLEMENT and TAGS_TO_REVERSE for the default set of tags that are handled.
      Parameters:
      inplace - Setting this to false will clone all attributes, bases and qualities before changing the values.
    • reverseComplement

      public void reverseComplement(Collection<String> tagsToRevcomp, Collection<String> tagsToReverse, boolean inplace)
      Reverse complement bases and reverse quality scores. In addition, reverse complement any non-null attributes specified by tagsToRevcomp and reverse any non-null attributes specified by tagsToReverse.
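      The two underlying operations, reverse-complementing bases and reversing qualities, can be sketched on plain byte arrays. ReverseComplementSketch below is a hypothetical standalone helper that works in place and leaves bases other than A, C, G and T (for example N) unchanged; it does not touch tags and is not htsjdk's implementation.

```java
// Hypothetical in-place helpers for base and quality arrays; not htsjdk code.
final class ReverseComplementSketch {
    // Complement a single base, preserving case; other symbols (e.g. N) are unchanged.
    static byte complement(byte base) {
        switch (base) {
            case 'A': return 'T';
            case 'T': return 'A';
            case 'C': return 'G';
            case 'G': return 'C';
            case 'a': return 't';
            case 't': return 'a';
            case 'c': return 'g';
            case 'g': return 'c';
            default:  return base;
        }
    }

    // Reverse the array and complement every base, in place.
    static void reverseComplementInPlace(byte[] bases) {
        int i = 0;
        int j = bases.length - 1;
        while (i < j) {
            byte tmp = complement(bases[i]);
            bases[i] = complement(bases[j]);
            bases[j] = tmp;
            i++;
            j--;
        }
        if (i == j) {
            bases[i] = complement(bases[i]); // middle base of an odd-length read
        }
    }

    // Reverse the array in place; used for qualities, which are not complemented.
    static void reverseInPlace(byte[] values) {
        for (int i = 0, j = values.length - 1; i < j; i++, j--) {
            byte tmp = values[i];
            values[i] = values[j];
            values[j] = tmp;
        }
    }
}
```

      To get the copying behaviour instead of the in-place one, a caller of this sketch would clone each array before passing it in.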