Class SAMRecord
- All Implemented Interfaces:
HtsRecord, Locatable, Serializable, Cloneable
- Direct Known Subclasses:
BAMRecord, SRALazyRecord
The presence of a reference name/reference index and alignment start does not necessarily mean that a read is aligned. Those values may merely be set to force a SAMRecord to appear in a certain place in the sort order. The readUnmappedFlag must be checked to determine whether or not a read is mapped. Only if the readUnmappedFlag is false can the reference name/index and alignment start be interpreted as indicating an actual alignment position.
Likewise, the presence of a mate reference name/index and mate alignment start does not necessarily mean that the mate is aligned. These may be set for an unaligned mate if the mate has been forced into a particular place in the sort order per the above paragraph. Only if the mateUnmappedFlag is false can the mate reference name/index and mate alignment start be interpreted as indicating the actual alignment position of the mate.
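The rule in the two paragraphs above can be sketched against a plain integer flag word, so that the example runs without htsjdk on the classpath. The bit values 0x4 (read unmapped) and 0x8 (mate unmapped) are taken from the SAM format specification; the class name and method names are hypothetical and are not part of SAMRecord.

```java
public class MappedCheckSketch {
    // Flag bits as defined by the SAM format specification.
    static final int READ_UNMAPPED = 0x4;
    static final int MATE_UNMAPPED = 0x8;

    /** True only when the unmapped bit is clear; position fields are not consulted. */
    static boolean isReadMapped(int flags) {
        return (flags & READ_UNMAPPED) == 0;
    }

    /** Same test for the mate, using the mate-unmapped bit. */
    static boolean isMateMapped(int flags) {
        return (flags & MATE_UNMAPPED) == 0;
    }

    public static void main(String[] args) {
        // A paired read that is unmapped but has been given its mate's
        // coordinates so that it sorts next to the mate.
        int flags = 0x1 | READ_UNMAPPED;
        int alignmentStart = 1000; // set only for sort order, not a real alignment

        // The position is non-zero, yet the read must still be treated as unmapped.
        System.out.println("alignmentStart=" + alignmentStart
                + " readMapped=" + isReadMapped(flags)
                + " mateMapped=" + isMateMapped(flags));
    }
}
```

With an actual record, the equivalent checks are the getReadUnmappedFlag() and getMateUnmappedFlag() accessors rather than manual bit tests.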
Note also that there are a number of getters & setters that are linked, i.e. they present different representations of the same underlying data. In these cases there is typically a representation that is preferred because it ought to be faster than some other representation. The following are the preferred representations:
- getReadNameLength() is preferred to getReadName().length()
- get/setReadBases() is preferred to get/setReadString()
- get/setBaseQualities() is preferred to get/setBaseQualityString()
- get/setReferenceIndex() is preferred to get/setReferenceName() for records with valid SAMFileHeaders
- get/setMateReferenceIndex() is preferred to get/setMateReferenceName() for records with valid SAMFileHeaders
- getCigarLength() is preferred to getCigar().getNumElements()
- get/setCigar() is preferred to get/setCigarString()
setHeader() is called by the SAM reading code, so the get/setReferenceIndex() and get/setMateReferenceIndex() methods will have access to the sequence dictionary to resolve reference and mate reference names to dictionary indices.
setHeader() need not be called explicitly when writing SAMRecords; however, the writers require each record to have a header in order to call get/setReferenceIndex() and get/setMateReferenceIndex(). Adding records to a writer therefore has a side effect: any record that does not have an assigned header at the time it is added to a writer will be updated and assigned the header associated with the writer.
Some of the get() methods return values that are mutable, due to the limitations of Java. A caller should never change the value returned by a get() method. If you want to change the value of some attribute of a SAMRecord, create a new value object and call the appropriate set() method.
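The copy-then-set pattern described above might look like the following sketch. The nested Record class is a hypothetical stand-in that only mimics a getter exposing its internal array; it is not the htsjdk class, and maskFirstBase is an invented helper used for illustration.

```java
import java.util.Arrays;

public class DefensiveCopySketch {
    /** Hypothetical stand-in for a record whose getter returns its internal array. */
    static final class Record {
        private byte[] bases;
        Record(byte[] bases) { this.bases = bases; }
        byte[] getReadBases() { return bases; }
        void setReadBases(byte[] value) { this.bases = value; }
    }

    /** Replaces the first base with N without touching the array returned by the getter. */
    static void maskFirstBase(Record rec) {
        byte[] current = rec.getReadBases();
        if (current.length == 0) {
            return;
        }
        // Create a new value object instead of writing into the returned array.
        byte[] updated = Arrays.copyOf(current, current.length);
        updated[0] = 'N';
        rec.setReadBases(updated);
    }

    public static void main(String[] args) {
        byte[] original = {'A', 'C', 'G', 'T'};
        Record rec = new Record(original);
        maskFirstBase(rec);
        // The array obtained before the change is left as it was.
        System.out.println(new String(original) + " -> " + new String(rec.getReadBases()));
    }
}
```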
Note that setIndexingBin() need not be called when writing SAMRecords. It will be computed as necessary. It is only present as an optimization in the event that the value is already known and need not be computed.
By default, extensive validation of SAMRecords is done when they are read. Very limited validation is done when values are set onto SAMRecords.
Notes on Headerless SAMRecords
If the header is null, the following SAMRecord methods may throw exceptions:
- getReferenceIndex
- setReferenceIndex
- getMateReferenceIndex
- setMateReferenceIndex
Record comparators (i.e. SAMRecordCoordinateComparator and SAMRecordDuplicateComparator) require records with non-null header values.
A record with a null header may be validated by the isValid method, but the reference and mate reference indices, read group, sequence dictionary, and alignment start will not be fully validated unless a header is present.
Also, SAMTextWriter, BAMFileWriter, and CRAMFileWriter all require the reference and mate reference names to be valid in order to be written. At the time a record is added to a writer it will be updated to use the header associated with the writer and the reference and mate reference names must be valid for that header. If the names cannot be resolved using the writer's header, an exception will be thrown.
-
Nested Class Summary
Modifier and TypeClassDescriptionstatic class
Tag name and value of an attribute, for getAttributes() method. -
Field Summary
Field (Modifier and Type): Description
- MAX_INSERT_SIZE (static final int): abs(insertSize) must be <= this
- mMateReferenceIndex (protected Integer)
- mReferenceIndex (protected Integer)
- NO_ALIGNMENT_CIGAR (static final String): Cigar string for an unaligned read.
- NO_ALIGNMENT_REFERENCE_INDEX (static final int): If a read has this reference index, it is unaligned, but not all unaligned reads have this reference index (see above).
- NO_ALIGNMENT_REFERENCE_NAME (static final String): If a read has this reference name, it is unaligned, but not all unaligned reads have this reference name (see above).
- NO_ALIGNMENT_START (static final int): If a read has reference name "*", it will have this value for position.
- NO_MAPPING_QUALITY (static final int): Alignment score for an unaligned read.
- NULL_QUALS (static final byte[]): This should rarely be used, since all reads should have quality scores.
- NULL_QUALS_STRING (static final String)
- NULL_SEQUENCE (static final byte[]): This should rarely be used, since a read with no sequence doesn't make much sense.
- NULL_SEQUENCE_STRING (static final String)
- serialVersionUID (static final long)
- TAGS_TO_REVERSE: Tags that are known to need the reverse if the read is reverse complemented.
- TAGS_TO_REVERSE_COMPLEMENT: Tags that are known to need the reverse complement if the read is reverse complemented.
- UNKNOWN_MAPPING_QUALITY (static final int): Alignment score for a good alignment, but where computing a Phred-score is not feasible.
-
Constructor Summary
-
Method Summary
Modifier and TypeMethodDescriptionvoid
Removes all attributes.clone()
Note that this does a shallow copy of everything, except for the attribute list, for which a copy of the list is made, but the attributes themselves are copied by reference.int
computeIndexingBinIfAbsent
(SAMRecord alignment) Deprecated.Use computeIndexingBin() if accessible or GenomicIndexUtil.regionToBin() otherwise.deepCopy()
Returns a deep copy of the SAM record, with the following exceptions: - The header field, which shares the header reference with the original record - The file source field, which will always be set to null in the copy
protected void
Force all lazily-initialized data members to be initialized.boolean
format()
Deprecated.This method is not guaranteed to return a valid SAM text representation of the SAMRecord.Returns blocks of the read sequence that have been aligned directly to the reference sequence.int
int
getAttribute
(short tag) getAttribute
(SAMTag tag) Get the value for a SAM tag.getAttribute
(String tag) Get the value for a SAM tag.int
Depending on the concrete implementation, the binary file size of attributes may be known without computing them all.byte[]
Do not modify the value returned by this method.protected SAMBinaryTagAndValue
byte[]
Will work for signed byte array, unsigned byte array, or old-style hex arraybyte[]
Will work for signed byte array, unsigned byte array, or old-style hex arraygetByteAttribute
(SAMTag tag) Get the tag value and attempt to coerce it into the requested type.getByteAttribute
(String tag) Get the tag value and attempt to coerce it into the requested type.getCigar()
Do not modify the value returned by this method.int
This method is preferred over getCigar().getNumElements(), because for BAMRecord it may be faster.Gets the contig name for the contig this is mapped to.boolean
the read is either a PCR duplicate or an optical duplicate.int
getEnd()
an alias ofgetAlignmentEnd()
Gets the source of this SAM record -- both the reader that retrieved the record and the position on disk from whence it came.boolean
the read is the first read in a pair.int
getFlags()
It is preferable to use the get*Flag() methods that handle the flag word symbolically.float[]
float[]
getFloatAttribute
(SAMTag tag) getFloatAttribute
(String tag) int
Get the tag value and attempt to coerce it into the requested type.Get the tag value and attempt to coerce it into the requested type.int
int
boolean
strand of the mate (false for forward; true for reverse strand).Returns the mate reference index for this record.boolean
the mate is unmapped.boolean
Deprecated.useisSecondaryAlignment()
instead.byte[]
If the original base quality scores have been stored in the "OQ" tag, returns the numeric scores as a byte[].
boolean
the read is mapped in a proper pair (depends on the protocol, normally inferred during alignment).byte[]
Do not modify the value returned by this method.boolean
the read fails platform/vendor quality checks.Get the SAMReadGroupRecord for this SAMRecord.int
This method is preferred over getReadBases().length, because for BAMRecord it may be faster.int
This method is preferred over getReadName().length(), because for BAMRecord it may be faster.boolean
strand of the query (false for forward; true for reverse strand).boolean
the read is paired in sequencing, no matter whether it is mapped in a pair.int
getReadPositionAtReferencePosition
(int pos) Returns the 1-based position in the read of the 1-based reference position provided.int
getReadPositionAtReferencePosition
(int pos, boolean returnLastBaseIfDeleted) Non-static version of static function with the same name.static int
getReadPositionAtReferencePosition
(SAMRecord rec, int pos, boolean returnLastBaseIfDeleted) Returns the 1-based position in the read of the provided reference position, or 0 if no such position exists.boolean
the query sequence itself is unmapped.Returns the reference index for this record.int
getReferencePositionAtReadPosition
(int position) Non static version of the static function with the same name.static int
getReferencePositionAtReadPosition
(SAMRecord rec, int position) Returns the 1-based reference position for the provided 1-based position in read.shortcut toReturns the record in the SAM line-based text format.boolean
the read is the second read in a pair.getShortAttribute
(SAMTag tag) Get the tag value and attempt to coerce it into the requested type.getShortAttribute
(String tag) Get the tag value and attempt to coerce it into the requested type.byte[]
Will work for signed byte array or old-style hex arraybyte[]
Will work for signed byte array or old-style hex arrayint[]
int[]
short[]
short[]
int
getStart()
an alias ofgetAlignmentStart()
getStringAttribute
(SAMTag tag) getStringAttribute
(String tag) boolean
final Object
Fetches the value of a transient attribute on the SAMRecord, or null if not set.
int
int
byte[]
byte[]
int[]
int[]
getUnsignedIntegerAttribute
(short tag) A convenience method that will return a valid unsigned integer as a Long, or fail with an exception if the tag value is invalid.A convenience method that will return a valid unsigned integer as a Long, or fail with an exception if the tag value is invalid.A convenience method that will return a valid unsigned integer as a Long, or fail with an exception if the tag value is invalid.short[]
short[]
byte[]
If this record has a valid binary representation of the variable-length portion of a binary record stored, return that byte array, otherwise return null.boolean
hasAttribute
(SAMTag tag) boolean
hasAttribute
(String tag) int
hashCode()
protected void
initializeCigar
(Cigar cigar) For setting the Cigar string when BAMRecord has decoded it.protected static boolean
isAllowedAttributeValue
(Object value) Deprecated.The attribute type and value checks have been moved directly intoSAMBinaryTagAndValue
.boolean
boolean
Tests if this record is either a secondary and/or supplementary alignment; equivalent to(getNotPrimaryAlignmentFlag() || getSupplementaryAlignmentFlag())
.boolean
isValid()
Perform various validations of SAMRecord.isValid
(boolean firstOnly) Perform various validations of SAMRecord.final Object
Removes a transient attribute if it is stored, and returns the stored value.protected static Integer
resolveIndexFromName
(String referenceName, SAMFileHeader header, boolean strict) Static method that resolves and returns the reference index corresponding to a given reference name.protected static String
resolveNameFromIndex
(int referenceIndex, SAMFileHeader header) Static method that resolves and returns the reference name corresponding to a given reference index.void
Reverse-complement bases and reverse quality scores along with known optional attributes that need the same treatment.void
reverseComplement
(boolean inplace) Reverse-complement bases and reverse quality scores along with known optional attributes that need the same treatment.void
reverseComplement
(Collection<String> tagsToRevcomp, Collection<String> tagsToReverse, boolean inplace) Reverse complement bases and reverse quality scores.void
setAlignmentStart
(int value) protected void
setAttribute
(short tag, Object value) protected void
setAttribute
(short tag, Object value, boolean isUnsignedArray) void
setAttribute
(SAMTag tag, Object value) protected void
setAttribute
(SAMTag tag, Object value, boolean isUnsignedArray) void
setAttribute
(String tag, Object value) Set a named attribute onto the SAMRecord.protected void
setAttributes
(SAMBinaryTagAndValue attributes) Replace any existing attributes with the given linked item.void
setBaseQualities
(byte[] value) void
setBaseQualityString
(String value) void
For setting the Cigar string when changed.void
setCigarString
(String value) void
setDuplicateReadFlag
(boolean flag) the read is either a PCR duplicate or an optical duplicate.void
setFileSource
(SAMFileSource fileSource) Sets a marker providing the source reader for this file and the position in the file from which the read originated.void
setFirstOfPairFlag
(boolean flag) the read is the first read in a pair.void
setFlags
(int value) void
setHeader
(SAMFileHeader header) Sets the SAMFileHeader for this record.void
setHeaderStrict
(SAMFileHeader header) Establishes the SAMFileHeader for this record and forces resolution of the record's reference and mate reference names against the header using the sequence dictionary in the new header.void
setInferredInsertSize
(int inferredInsertSize) void
setMappingQuality
(int value) void
setMateAlignmentStart
(int mateAlignmentStart) void
setMateNegativeStrandFlag
(boolean flag) strand of the mate (false for forward; true for reverse strand).void
setMateReferenceIndex
(int mateReferenceIndex) Updates the mate reference index.void
setMateReferenceName
(String mateReferenceName) Sets the mate reference name for this record.void
setMateUnmappedFlag
(boolean flag) the mate is unmapped.void
setNotPrimaryAlignmentFlag
(boolean flag) Deprecated.usesetSecondaryAlignment(boolean)
instead.void
setOriginalBaseQualities
(byte[] oq) Sets the original base quality scores into the "OQ" tag as a String.void
setProperPairFlag
(boolean flag) the read is mapped in a proper pair (depends on the protocol, normally inferred during alignment).void
setReadBases
(byte[] value) void
setReadFailsVendorQualityCheckFlag
(boolean flag) the read fails platform/vendor quality checks.void
setReadName
(String value) void
setReadNegativeStrandFlag
(boolean flag) strand of the query (false for forward; true for reverse strand).void
setReadPairedFlag
(boolean flag) the read is paired in sequencing, no matter whether it is mapped in a pair.void
setReadString
(String value) void
setReadUmappedFlag
(boolean flag) Deprecated.void
setReadUnmappedFlag
(boolean flag) the query sequence itself is unmapped.void
setReferenceIndex
(int referenceIndex) Updates the reference index.void
setReferenceName
(String referenceName) Sets the reference name for this record.void
setSecondaryAlignment
(boolean flag) set whether this alignment is secondary (an alternative alignment of the read).void
setSecondOfPairFlag
(boolean flag) the read is the second read in a pair.void
setSupplementaryAlignmentFlag
(boolean flag) set whether this alignment is supplementary (a split alignment such as a chimeric alignment).final Object
setTransientAttribute
(Object key, Object value) Sets the value of a transient attribute, and returns the previous value if defined.void
setUnsignedArrayAttribute
(String tag, Object value) Because Java does not support unsigned integer types, we think it is a bad idea to encode them in SAM files.void
setValidationStringency
(ValidationStringency validationStringency) Control validation of lazily-decoded elements.toString()
Simple toString() that gives a little bit of useful info about the read.validateCigar
(long recordNumber) Run all validations of CIGAR.Methods inherited from class java.lang.Object
finalize, getClass, notify, notifyAll, wait, wait, wait
Methods inherited from interface htsjdk.samtools.util.Locatable
contains, contigsMatch, getLengthOnReference, overlaps, withinDistanceOf
-
Field Details
-
serialVersionUID
public static final long serialVersionUID
-
UNKNOWN_MAPPING_QUALITY
public static final int UNKNOWN_MAPPING_QUALITY
Alignment score for a good alignment, but where computing a Phred-score is not feasible.
-
NO_MAPPING_QUALITY
public static final int NO_MAPPING_QUALITY
Alignment score for an unaligned read.
-
NO_ALIGNMENT_REFERENCE_NAME
If a read has this reference name, it is unaligned, but not all unaligned reads have this reference name (see above).
-
NO_ALIGNMENT_REFERENCE_INDEX
public static final int NO_ALIGNMENT_REFERENCE_INDEX
If a read has this reference index, it is unaligned, but not all unaligned reads have this reference index (see above).
-
NO_ALIGNMENT_CIGAR
Cigar string for an unaligned read.
-
NO_ALIGNMENT_START
public static final int NO_ALIGNMENT_START
If a read has reference name "*", it will have this value for position.
-
NULL_SEQUENCE
public static final byte[] NULL_SEQUENCE
This should rarely be used, since a read with no sequence doesn't make much sense.
-
NULL_SEQUENCE_STRING
-
NULL_QUALS
public static final byte[] NULL_QUALS
This should rarely be used, since all reads should have quality scores.
-
NULL_QUALS_STRING
-
MAX_INSERT_SIZE
public static final int MAX_INSERT_SIZE
abs(insertSize) must be <= this
-
TAGS_TO_REVERSE_COMPLEMENT
Tags that are known to need the reverse complement if the read is reverse complemented.
-
TAGS_TO_REVERSE
Tags that are known to need the reverse if the read is reverse complemented.
-
mReferenceIndex
-
mMateReferenceIndex
-
-
Constructor Details
-
SAMRecord
-
-
Method Details
-
getReadName
-
getReadNameLength
public int getReadNameLength()
This method is preferred over getReadName().length(), because for BAMRecord it may be faster.
- Returns:
- length not including a null terminator.
-
setReadName
-
getReadString
- Returns:
- read sequence as a string of ACGTN=.
-
setReadString
-
getReadBases
public byte[] getReadBases()
Do not modify the value returned by this method. If you want to change the bases, create a new byte[] and call setReadBases() or call setReadString().
- Returns:
- read sequence as ASCII bytes ACGTN=.
-
setReadBases
public void setReadBases(byte[] value) -
getReadLength
public int getReadLength()
This method is preferred over getReadBases().length, because for BAMRecord it may be faster.
- Returns:
- number of bases in the read.
-
getBaseQualityString
- Returns:
- Base qualities, encoded as a FASTQ string.
-
setBaseQualityString
-
getBaseQualities
public byte[] getBaseQualities()
Do not modify the value returned by this method. If you want to change the qualities, create a new byte[] and call setBaseQualities() or call setBaseQualityString().
- Returns:
- Base qualities, as binary phred scores (not ASCII).
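The two quality representations above (a FASTQ string versus binary phred scores) can be related with a short sketch. It assumes the Phred+33 offset that SAM text and FASTQ use for quality characters; the class and method names are hypothetical and are not htsjdk API.

```java
public class QualityEncodingSketch {
    // Assumption: qualities are written as (phred score + 33) ASCII characters.
    static final int PHRED_OFFSET = 33;

    /** Converts binary phred scores to their FASTQ-style string form. */
    static String phredToFastq(byte[] quals) {
        StringBuilder sb = new StringBuilder(quals.length);
        for (byte q : quals) {
            sb.append((char) (q + PHRED_OFFSET));
        }
        return sb.toString();
    }

    /** Converts a FASTQ-style quality string back to binary phred scores. */
    static byte[] fastqToPhred(String fastq) {
        byte[] quals = new byte[fastq.length()];
        for (int i = 0; i < quals.length; i++) {
            quals[i] = (byte) (fastq.charAt(i) - PHRED_OFFSET);
        }
        return quals;
    }

    public static void main(String[] args) {
        byte[] quals = {0, 30, 40};
        System.out.println(phredToFastq(quals)); // prints "!?I"
    }
}
```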
-
setBaseQualities
public void setBaseQualities(byte[] value) -
getOriginalBaseQualities
public byte[] getOriginalBaseQualities()
If the original base quality scores have been stored in the "OQ" tag, returns the numeric scores as a byte[].
-
setOriginalBaseQualities
public void setOriginalBaseQualities(byte[] oq)
Sets the original base quality scores into the "OQ" tag as a String. The supplied value should be phred-scaled numeric qualities.
-
getReferenceName
- Returns:
- Reference name, or NO_ALIGNMENT_REFERENCE_NAME (*) if the record has no reference name
-
setReferenceName
Sets the reference name for this record. If the record has a valid SAMFileHeader and the reference name is present in the associated sequence dictionary, the record's reference index will also be updated with the corresponding sequence index. If referenceName is NO_ALIGNMENT_REFERENCE_NAME, sets the reference index to NO_ALIGNMENT_REFERENCE_INDEX.
- Parameters:
referenceName
- must not be null
- Throws:
IllegalArgumentException
- if referenceName is null
-
getReferenceIndex
Returns the reference index for this record. If the reference name for this record has previously been resolved against the sequence dictionary, the corresponding index is returned directly. Otherwise, the record must have a non-null SAMFileHeader that can be used to resolve the index for the record's current reference name, unless the reference name is NO_ALIGNMENT_REFERENCE_NAME. If the record has a header, and the name does not appear in the header's sequence dictionary, the value NO_ALIGNMENT_REFERENCE_INDEX (-1) will be returned. If the record does not have a header, an IllegalStateException is thrown.
- Returns:
- Index in the sequence dictionary of the reference sequence. If the read has no reference sequence, or if the reference name is not found in the sequence index, NO_ALIGNMENT_REFERENCE_INDEX (-1) is returned.
- Throws:
IllegalStateException
- if the reference index must be resolved but cannot be because the SAMFileHeader for the record is null.
-
setReferenceIndex
public void setReferenceIndex(int referenceIndex)
Updates the reference index. The record must have a valid SAMFileHeader unless the referenceIndex parameter equals NO_ALIGNMENT_REFERENCE_INDEX, and the reference index must appear in the header's sequence dictionary. If the reference index is valid, the reference name will also be resolved and updated to the name for the sequence dictionary entry corresponding to the index.
- Parameters:
referenceIndex
- Must either equal NO_ALIGNMENT_REFERENCE_INDEX (-1) indicating no reference, or the record must have a SAMFileHeader and the index must exist in the associated sequence dictionary.
- Throws:
IllegalStateException
- if referenceIndex is not equal to NO_ALIGNMENT_REFERENCE_INDEX and the SAMFileHeader is null for this record
IllegalArgumentException
- if referenceIndex is not found in the sequence dictionary in the header for this record.
-
getMateReferenceName
- Returns:
- Mate reference name, or NO_ALIGNMENT_REFERENCE_NAME (*) if the record has no mate reference name
-
setMateReferenceName
Sets the mate reference name for this record. If the record has a valid SAMFileHeader and the mate reference name is present in the associated sequence dictionary, the record's mate reference index will also be updated with the corresponding sequence index. If mateReferenceName is NO_ALIGNMENT_REFERENCE_NAME, sets the mate reference index to NO_ALIGNMENT_REFERENCE_INDEX.
- Parameters:
mateReferenceName
- must not be null
- Throws:
IllegalArgumentException
- if mateReferenceName is null
-
getMateReferenceIndex
Returns the mate reference index for this record. If the mate reference name for this record has previously been resolved against the sequence dictionary, the corresponding index is returned directly. Otherwise, the record must have a non-null SAMFileHeader that can be used to resolve the index for the record's current mate reference name, unless the mate reference name is NO_ALIGNMENT_REFERENCE_NAME. If the record has a header, and the name does not appear in the header's sequence dictionary, the value NO_ALIGNMENT_REFERENCE_INDEX (-1) will be returned. If the record does not have a header, an IllegalStateException is thrown.
- Returns:
- Index in the sequence dictionary of the mate reference sequence. If the read has no mate reference sequence, or if the mate reference name is not found in the sequence index, NO_ALIGNMENT_REFERENCE_INDEX (-1) is returned.
- Throws:
IllegalStateException
- if the mate reference index must be resolved but cannot be because the SAMFileHeader for the record is null.
-
setMateReferenceIndex
public void setMateReferenceIndex(int mateReferenceIndex)
Updates the mate reference index. The record must have a valid SAMFileHeader, and the mate reference index must appear in the header's sequence dictionary, unless the mateReferenceIndex parameter equals NO_ALIGNMENT_REFERENCE_INDEX. If the mate reference index is valid, the mate reference name will also be resolved and updated to the name for the sequence dictionary entry corresponding to the index.
- Parameters:
mateReferenceIndex
- Must either equal NO_ALIGNMENT_REFERENCE_INDEX (-1) indicating no reference, or the record must have a SAMFileHeader and the index must exist in the associated sequence dictionary.
- Throws:
IllegalStateException
- if the SAMFileHeader is null for this record
IllegalArgumentException
- if the mate reference index is not found in the sequence dictionary in the header for this record.
-
resolveIndexFromName
protected static Integer resolveIndexFromName(String referenceName, SAMFileHeader header, boolean strict)
Static method that resolves and returns the reference index corresponding to a given reference name.
- Parameters:
referenceName
- If referenceName is NO_ALIGNMENT_REFERENCE_NAME, the value NO_ALIGNMENT_REFERENCE_INDEX is returned directly. Otherwise referenceName must be looked up in the header's sequence dictionary.
header
- SAMFileHeader to use when resolving referenceName to an index. Must be non-null if referenceName is not NO_ALIGNMENT_REFERENCE_NAME.
strict
- if true, throws if referenceName does not appear in the header's sequence dictionary
- Throws:
IllegalStateException
- if referenceName is not equal to NO_ALIGNMENT_REFERENCE_NAME and the header is null
IllegalArgumentException
- if strict is true and the name does not appear in header's sequence dictionary. Does not mutate the SAMRecord.
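The resolution rules listed above can be sketched with an ordered list of sequence names standing in for the header's sequence dictionary (a null list plays the role of a null header). This is an illustration only, not the htsjdk implementation. In particular, the text above does not say what is returned for an unknown name when strict is false, so returning null in that case is an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.List;

public class ResolveIndexSketch {
    static final String NO_ALIGNMENT_REFERENCE_NAME = "*";
    static final int NO_ALIGNMENT_REFERENCE_INDEX = -1;

    /**
     * @param dictionary hypothetical stand-in for a header's sequence dictionary;
     *                   null models a record without a header
     */
    static Integer resolveIndexFromName(String referenceName, List<String> dictionary, boolean strict) {
        // "*" resolves directly and needs no header at all.
        if (NO_ALIGNMENT_REFERENCE_NAME.equals(referenceName)) {
            return NO_ALIGNMENT_REFERENCE_INDEX;
        }
        if (dictionary == null) {
            throw new IllegalStateException("A header is required to resolve " + referenceName);
        }
        int index = dictionary.indexOf(referenceName);
        if (index >= 0) {
            return index;
        }
        if (strict) {
            throw new IllegalArgumentException(referenceName + " is not in the sequence dictionary");
        }
        return null; // assumption: non-strict lookups leave the index unresolved
    }

    public static void main(String[] args) {
        List<String> dictionary = Arrays.asList("chr1", "chr2");
        System.out.println(resolveIndexFromName("chr2", dictionary, true)); // prints 1
        System.out.println(resolveIndexFromName("*", null, true));          // prints -1
    }
}
```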
-
resolveNameFromIndex
Static method that resolves and returns the reference name corresponding to a given reference index.
- Parameters:
referenceIndex
- If referenceIndex is NO_ALIGNMENT_REFERENCE_INDEX, the value NO_ALIGNMENT_REFERENCE_NAME is returned directly. Otherwise referenceIndex must be looked up in the header's sequence dictionary.
header
- SAMFileHeader to use when resolving referenceIndex to a name. Must be non-null unless referenceIndex is NO_ALIGNMENT_REFERENCE_INDEX.
- Throws:
IllegalStateException
- if referenceIndex is not equal to NO_ALIGNMENT_REFERENCE_INDEX and the header is null
IllegalArgumentException
- if referenceIndex does not appear in header's sequence dictionary. Does not mutate the SAMRecord.
-
getAlignmentStart
public int getAlignmentStart()
- Returns:
- 1-based inclusive leftmost position of the sequence remaining after clipping, or 0 if there is no position, e.g. for unmapped read.
-
setAlignmentStart
public void setAlignmentStart(int value)
- Parameters:
value
- 1-based inclusive leftmost position of the sequence remaining after clipping or 0 if there is no position, e.g. for unmapped read.
-
getAlignmentEnd
public int getAlignmentEnd()
- Returns:
- 1-based inclusive rightmost position of the sequence remaining after clipping or 0 if there is no position, e.g. for unmapped read.
-
getUnclippedStart
public int getUnclippedStart()
- Returns:
- the alignment start (1-based, inclusive) adjusted for clipped bases. For example if the read has an alignment start of 100 but the first 4 bases were clipped (hard or soft clipped) then this method will return 96. Invalid to call on an unmapped read.
-
getUnclippedEnd
public int getUnclippedEnd()
- Returns:
- the alignment end (1-based, inclusive) adjusted for clipped bases. For example if the read has an alignment end of 100 but the last 7 bases were clipped (hard or soft clipped) then this method will return 107. Invalid to call on an unmapped read.
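The clipping adjustment described for getUnclippedStart() and getUnclippedEnd() can be sketched on a plain CIGAR string. This is a standalone illustration with hypothetical names, not the htsjdk code: leading soft (S) or hard (H) clips are subtracted from the alignment start, and trailing clips are added to the alignment end.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnclippedSketch {
    // One CIGAR element: a length followed by an operator character.
    private static final Pattern ELEMENT = Pattern.compile("(\\d+)([MIDNSHP=X])");

    /** Alignment start minus the total length of the leading S/H elements. */
    static int unclippedStart(int alignmentStart, String cigar) {
        int pos = alignmentStart;
        Matcher m = ELEMENT.matcher(cigar);
        while (m.find()) {
            char op = m.group(2).charAt(0);
            if (op != 'S' && op != 'H') {
                break; // first non-clip element ends the leading clip run
            }
            pos -= Integer.parseInt(m.group(1));
        }
        return pos;
    }

    /** Alignment end plus the total length of the trailing S/H elements. */
    static int unclippedEnd(int alignmentEnd, String cigar) {
        int trailingClip = 0;
        Matcher m = ELEMENT.matcher(cigar);
        while (m.find()) {
            int len = Integer.parseInt(m.group(1));
            char op = m.group(2).charAt(0);
            // Any non-clip element resets the running total, so only the final clip run counts.
            trailingClip = (op == 'S' || op == 'H') ? trailingClip + len : 0;
        }
        return alignmentEnd + trailingClip;
    }

    public static void main(String[] args) {
        System.out.println(unclippedStart(100, "4S10M")); // prints 96
        System.out.println(unclippedEnd(100, "10M7S"));   // prints 107
    }
}
```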
-
getReferencePositionAtReadPosition
public int getReferencePositionAtReadPosition(int position)
Non-static version of the static function with the same name.
- Parameters:
position
- 1-based location within the unclipped sequence
- Returns:
- 1-based reference position of the unclipped sequence at a given read position, or 0 if there is no position.
-
getReferencePositionAtReadPosition
Returns the 1-based reference position for the provided 1-based position in the read. For example, given the sequence NNNAAACCCGGG, cigar 3S9M, an alignment start of 1, and a (1-based) position of 10 (start of GGG), it returns 7 (1-based position starting after the soft clip). For example: given the sequence AAACCCGGGTTT, cigar 4M1D6M, and an alignment start of 1, a position of 4 returns reference position 4 and a position of 5 returns reference position 6. Another example: given the sequence AAACCCGGGTTT, cigar 4M1I6M, and an alignment start of 1, a position of 4 returns reference position 4 and a position of 5 returns 0.
- Parameters:
rec
- record to use
position
- 1-based location within the unclipped sequence
- Returns:
- 1-based reference position of the unclipped sequence at a given read position, or 0 if there is no position.
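The read-to-reference mapping above can be reproduced by walking the CIGAR elements while tracking a read cursor and a reference cursor. The sketch below works on a CIGAR string and an alignment start instead of a SAMRecord; the class and method names are hypothetical, and it is an illustration of the documented behaviour, not the htsjdk implementation.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReferencePositionSketch {
    private static final Pattern ELEMENT = Pattern.compile("(\\d+)([MIDNSHP=X])");

    /** Returns the 1-based reference position for a 1-based read position, or 0 if there is none. */
    static int referencePositionAtReadPosition(int alignmentStart, String cigar, int position) {
        if (position <= 0) {
            return 0;
        }
        int readPos = 1;              // next read base to be consumed (1-based)
        int refPos = alignmentStart;  // next reference base to be consumed (1-based)
        Matcher m = ELEMENT.matcher(cigar);
        while (m.find()) {
            int len = Integer.parseInt(m.group(1));
            char op = m.group(2).charAt(0);
            switch (op) {
                case 'M': case '=': case 'X':
                    // Aligned block: read and reference advance together.
                    if (position < readPos + len) {
                        return refPos + (position - readPos);
                    }
                    readPos += len;
                    refPos += len;
                    break;
                case 'I': case 'S':
                    // Read bases with no reference counterpart.
                    if (position < readPos + len) {
                        return 0;
                    }
                    readPos += len;
                    break;
                case 'D': case 'N':
                    refPos += len; // reference bases skipped by the read
                    break;
                default:
                    break; // H and P consume neither read bases nor reference bases here
            }
        }
        return 0; // position lies beyond the end of the read
    }

    public static void main(String[] args) {
        System.out.println(referencePositionAtReadPosition(1, "3S9M", 10));  // prints 7
        System.out.println(referencePositionAtReadPosition(1, "4M1D6M", 5)); // prints 6
        System.out.println(referencePositionAtReadPosition(1, "4M1I6M", 5)); // prints 0
    }
}
```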
-
getReadPositionAtReferencePosition
public int getReadPositionAtReferencePosition(int pos)
Returns the 1-based position in the read of the 1-based reference position provided.
- Parameters:
pos
- 1-based reference position
- Returns:
- 1-based (to match getReferencePositionAtReadPosition behavior) inclusive position into the unclipped sequence at a given reference position, or 0 if there is no such position. See examples in the static version below
-
getReadPositionAtReferencePosition
public int getReadPositionAtReferencePosition(int pos, boolean returnLastBaseIfDeleted)
Non-static version of the static function with the same name. See examples below.
- Parameters:
pos
- 1-based reference position
returnLastBaseIfDeleted
- if true, and the reference position matches a deleted base in the read, the function returns the position of the last non-deleted base
- Returns:
- 1-based (to match getReferencePositionAtReadPosition behavior) inclusive position into the unclipped sequence at a given reference position, or 0 if there is no such position. If returnLastBaseIfDeleted is true deletions are assumed to "live" on the last read base in the preceding block.
-
getReadPositionAtReferencePosition
public static int getReadPositionAtReferencePosition(SAMRecord rec, int pos, boolean returnLastBaseIfDeleted)
Returns the 1-based position in the read of the provided reference position, or 0 if no such position exists. For example, given the sequence NNNAAACCCGGG, cigar 3S9M, an alignment start of 1, and a (1-based) pos of 7 (start of GGG), it returns 10 (1-based position including the soft clip). For example: given the sequence AAACCCGGGT, cigar 4M1D6M, and an alignment start of 1, a reference position of 4 returns read position 4, and a reference position of 5 also returns a read position of 4 if returnLastBaseIfDeleted is true and 0 otherwise. For example: given the sequence AAACtCGGGTT, cigar 4M1I6M, and an alignment start of 1, a reference position of 4 returns a read position of 4, a reference position of 5 returns 6 (the inserted base is the 5th read position), and a reference position of 11 returns 0 since that position in the reference doesn't overlap the read at all.
- Parameters:
rec
- record to use
pos
- 1-based reference position
returnLastBaseIfDeleted
- if true, and the reference position matches a deleted base in the read, the function returns the position of the last non-deleted base
- Returns:
- 1-based (to match getReferencePositionAtReadPosition behavior) inclusive position into the unclipped sequence at a given reference position, or 0 if there is no such position. If returnLastBaseIfDeleted is true deletions are assumed to "live" on the last read base in the preceding block.
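The reference-to-read direction, including the returnLastBaseIfDeleted behaviour, can be sketched the same way on a CIGAR string and alignment start. The names are hypothetical and the code is an illustration rather than the htsjdk implementation; it remembers the last read base of the most recent aligned block so that a deleted reference position can "live" on that base.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReadPositionSketch {
    private static final Pattern ELEMENT = Pattern.compile("(\\d+)([MIDNSHP=X])");

    /** Returns the 1-based read position for a 1-based reference position, or 0 if there is none. */
    static int readPositionAtReferencePosition(int alignmentStart, String cigar, int pos,
                                               boolean returnLastBaseIfDeleted) {
        if (pos <= 0 || pos < alignmentStart) {
            return 0;
        }
        int readPos = 1;             // next read base to be consumed (1-based)
        int refPos = alignmentStart; // next reference base to be consumed (1-based)
        int lastAlignedReadPos = 0;  // last read base of the preceding aligned block
        Matcher m = ELEMENT.matcher(cigar);
        while (m.find()) {
            int len = Integer.parseInt(m.group(1));
            char op = m.group(2).charAt(0);
            switch (op) {
                case 'M': case '=': case 'X':
                    if (pos < refPos + len) {
                        return readPos + (pos - refPos);
                    }
                    lastAlignedReadPos = readPos + len - 1;
                    readPos += len;
                    refPos += len;
                    break;
                case 'I': case 'S':
                    readPos += len; // read-only bases shift later read positions
                    break;
                case 'D': case 'N':
                    if (pos < refPos + len) {
                        // The reference base is absent from the read.
                        return returnLastBaseIfDeleted ? lastAlignedReadPos : 0;
                    }
                    refPos += len;
                    break;
                default:
                    break; // H and P consume neither read bases nor reference bases here
            }
        }
        return 0; // reference position lies past the end of the alignment
    }

    public static void main(String[] args) {
        System.out.println(readPositionAtReferencePosition(1, "3S9M", 7, false));   // prints 10
        System.out.println(readPositionAtReferencePosition(1, "4M1D6M", 5, true));  // prints 4
        System.out.println(readPositionAtReferencePosition(1, "4M1D6M", 5, false)); // prints 0
    }
}
```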
-
getMateAlignmentStart
public int getMateAlignmentStart()
- Returns:
- 1-based inclusive leftmost position of the clipped mate sequence, or 0 if there is no position.
-
setMateAlignmentStart
public void setMateAlignmentStart(int mateAlignmentStart) -
getInferredInsertSize
public int getInferredInsertSize()
- Returns:
- insert size (difference between the 5' end of the read and the 5' end of the mate), if possible, else 0. Negative if the mate maps to a lower position than the read.
-
setInferredInsertSize
public void setInferredInsertSize(int inferredInsertSize) -
getMappingQuality
public int getMappingQuality()- Returns:
- phred scaled mapping quality. 255 implies valid mapping but quality is hard to compute.
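As a reminder of what "phred scaled" means, a quality Q corresponds to an error probability of 10^(-Q/10). The sketch below shows that conversion; the class and method names are hypothetical, and returning NaN for 255 is a choice made for this illustration only.

```java
// Sketch only: converts a Phred-scaled mapping quality to the probability
// that the mapping is wrong. Hypothetical helper, not part of htsjdk.
final class MappingQualitySketch {

    /** Value the documentation above describes as "valid mapping but quality is hard to compute". */
    static final int UNAVAILABLE = 255;

    static double errorProbability(int mappingQuality) {
        if (mappingQuality == UNAVAILABLE) {
            return Double.NaN; // no meaningful probability can be derived
        }
        return Math.pow(10.0, -mappingQuality / 10.0);
    }
}
```

A mapping quality of 20 therefore corresponds to roughly a 1 in 100 chance that the reported position is wrong.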
-
setMappingQuality
public void setMappingQuality(int value) -
getCigarString
-
setCigarString
-
getCigar
Do not modify the value returned by this method. If you want to change the Cigar, create a new Cigar and call setCigar(), or call setCigarString().
- Returns:
- Cigar object for the read, or null if there is none.
-
getCigarLength
public int getCigarLength()
This method is preferred over getCigar().getNumElements(), because for BAMRecord it may be faster.
- Returns:
- number of cigar elements (number + operator) in the cigar string.
-
setCigar
For setting the Cigar string when changed. Note that this nulls the indexing bin, which would need to be recomputed on write (if needed). To avoid clobbering the indexing bin, use initializeCigar(htsjdk.samtools.Cigar).
-
initializeCigar
For setting the Cigar string when BAMRecord has decoded it. Use this rather than setCigar(htsjdk.samtools.Cigar) so that the indexing bin doesn't get clobbered. -
getReadGroup
Get the SAMReadGroupRecord for this SAMRecord.- Returns:
- The SAMReadGroupRecord from the SAMFileHeader for this SAMRecord, or null if 1) this record has no RG tag, 2) the header doesn't contain the read group with the given ID, or 3) this record has no SAMFileHeader.
- Throws:
ClassCastException
- if RG tag does not have a String value.
-
getFlags
public int getFlags()
It is preferable to use the get*Flag() methods that handle the flag word symbolically. -
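For readers who do need to interpret the raw flag word, the sketch below lists the bit masks defined by the SAM specification and tests them with a bitwise AND. The class and constant names are hypothetical and chosen for this illustration; they are not htsjdk identifiers.

```java
// Sketch only: decoding the SAM flag word by hand. Hypothetical helper;
// the mask values are those defined by the SAM specification.
final class FlagBitsSketch {
    static final int READ_PAIRED = 0x1;
    static final int PROPER_PAIR = 0x2;
    static final int READ_UNMAPPED = 0x4;
    static final int MATE_UNMAPPED = 0x8;
    static final int READ_REVERSE_STRAND = 0x10;
    static final int MATE_REVERSE_STRAND = 0x20;
    static final int FIRST_OF_PAIR = 0x40;
    static final int SECOND_OF_PAIR = 0x80;
    static final int SECONDARY_ALIGNMENT = 0x100;
    static final int FAILS_VENDOR_QUALITY_CHECK = 0x200;
    static final int DUPLICATE_READ = 0x400;
    static final int SUPPLEMENTARY_ALIGNMENT = 0x800;

    /** True if every bit in mask is set in flags. */
    static boolean isSet(int flags, int mask) {
        return (flags & mask) == mask;
    }
}
```

For example, a flag word of 99 (0x63) has the paired, proper-pair, mate-reverse-strand, and first-of-pair bits set.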
setFlags
public void setFlags(int value) -
getReadPairedFlag
public boolean getReadPairedFlag()
The read is paired in sequencing, regardless of whether it is mapped in a pair. -
getProperPairFlag
public boolean getProperPairFlag()
The read is mapped in a proper pair (depends on the protocol, normally inferred during alignment). -
getReadUnmappedFlag
public boolean getReadUnmappedFlag()
The query sequence itself is unmapped. -
getMateUnmappedFlag
public boolean getMateUnmappedFlag()
The mate is unmapped. -
getReadNegativeStrandFlag
public boolean getReadNegativeStrandFlag()
Strand of the query (false for forward; true for reverse strand). -
getMateNegativeStrandFlag
public boolean getMateNegativeStrandFlag()
Strand of the mate (false for forward; true for reverse strand). -
getFirstOfPairFlag
public boolean getFirstOfPairFlag()
The read is the first read in a pair. -
getSecondOfPairFlag
public boolean getSecondOfPairFlag()
The read is the second read in a pair. -
getNotPrimaryAlignmentFlag
Deprecated. Use isSecondaryAlignment() instead.
The alignment is not primary (a read having split hits may have multiple primary alignment records). -
isSecondaryAlignment
public boolean isSecondaryAlignment()- Returns:
- whether the alignment is secondary (an alternative alignment of the read).
-
getSupplementaryAlignmentFlag
public boolean getSupplementaryAlignmentFlag()- Returns:
- whether the alignment is supplementary (a split alignment such as a chimeric alignment).
-
getReadFailsVendorQualityCheckFlag
public boolean getReadFailsVendorQualityCheckFlag()
The read fails platform/vendor quality checks. -
getDuplicateReadFlag
public boolean getDuplicateReadFlag()
The read is either a PCR duplicate or an optical duplicate. -
setReadPairedFlag
public void setReadPairedFlag(boolean flag)
The read is paired in sequencing, regardless of whether it is mapped in a pair. -
setProperPairFlag
public void setProperPairFlag(boolean flag)
The read is mapped in a proper pair (depends on the protocol, normally inferred during alignment). -
setReadUmappedFlag
Deprecated. This method name is misspelled. Use setReadUnmappedFlag(boolean) instead.
The query sequence itself is unmapped. -
setReadUnmappedFlag
public void setReadUnmappedFlag(boolean flag)
The query sequence itself is unmapped. -
setMateUnmappedFlag
public void setMateUnmappedFlag(boolean flag)
The mate is unmapped. -
setReadNegativeStrandFlag
public void setReadNegativeStrandFlag(boolean flag)
Strand of the query (false for forward; true for reverse strand). -
setMateNegativeStrandFlag
public void setMateNegativeStrandFlag(boolean flag)
Strand of the mate (false for forward; true for reverse strand). -
setFirstOfPairFlag
public void setFirstOfPairFlag(boolean flag)
The read is the first read in a pair. -
setSecondOfPairFlag
public void setSecondOfPairFlag(boolean flag)
The read is the second read in a pair. -
setNotPrimaryAlignmentFlag
Deprecated. Use setSecondaryAlignment(boolean) instead.
The alignment is not primary (a read having split hits may have multiple primary alignment records). -
setSecondaryAlignment
public void setSecondaryAlignment(boolean flag)
Sets whether this alignment is secondary (an alternative alignment of the read). -
setSupplementaryAlignmentFlag
public void setSupplementaryAlignmentFlag(boolean flag)
Sets whether this alignment is supplementary (a split alignment such as a chimeric alignment). -
setReadFailsVendorQualityCheckFlag
public void setReadFailsVendorQualityCheckFlag(boolean flag)
The read fails platform/vendor quality checks. -
setDuplicateReadFlag
public void setDuplicateReadFlag(boolean flag)
The read is either a PCR duplicate or an optical duplicate. -
isSecondaryOrSupplementary
public boolean isSecondaryOrSupplementary()
Tests if this record is a secondary and/or supplementary alignment; equivalent to (getNotPrimaryAlignmentFlag() || getSupplementaryAlignmentFlag()). -
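The flag setters and the combined secondary/supplementary test all reduce to simple bit operations on the flag word. The following sketch shows one way to express them; `FlagUpdateSketch` and its methods are hypothetical names, and the masks 0x100 (secondary) and 0x800 (supplementary) come from the SAM specification rather than from the text above.

```java
// Sketch only: setting, clearing, and combining flag bits.
// Hypothetical helper, not part of htsjdk.
final class FlagUpdateSketch {
    static final int SECONDARY_ALIGNMENT = 0x100;
    static final int SUPPLEMENTARY_ALIGNMENT = 0x800;

    /** Returns flags with the bits in mask switched on or off. */
    static int withFlag(int flags, int mask, boolean value) {
        return value ? (flags | mask) : (flags & ~mask);
    }

    /** True if either the secondary or the supplementary bit is set. */
    static boolean isSecondaryOrSupplementary(int flags) {
        return (flags & (SECONDARY_ALIGNMENT | SUPPLEMENTARY_ALIGNMENT)) != 0;
    }
}
```

Because each setter touches only its own bit, the order in which the individual flags are set does not matter.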
getValidationStringency
-
setValidationStringency
Control validation of lazily-decoded elements. -
hasAttribute
- Returns:
- true if the SAM record has the requested attribute set, false otherwise.
-
hasAttribute
- Returns:
- true if the SAM record has the requested attribute set, false otherwise.
-
getAttribute
Get the value for a SAM tag. WARNING: Some value types (e.g. byte[]) are mutable. It is dangerous to change one of these values in place, because some SAMRecord implementations keep track of when attributes have been changed. If you want to change an attribute value, call setAttribute() to replace the value.- Parameters:
tag
- Two-character tag name.- Returns:
- Appropriately typed tag value, or null if the requested tag is not present.
-
getAttribute
Get the value for a SAM tag. WARNING: Some value types (e.g. byte[]) are mutable. It is dangerous to change one of these values in place, because some SAMRecord implementations keep track of when attributes have been changed. If you want to change an attribute value, call setAttribute() to replace the value.- Parameters:
tag
- the SAM tag.- Returns:
- Appropriately typed tag value, or null if the requested tag is not present.
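The warning above amounts to "copy, edit the copy, then replace". The sketch below demonstrates the pattern on a tiny stand-in class; `AttributeReplaceSketch`, its map-backed store, and the modification counter are all hypothetical and only mimic the idea that a record may track changes made through setAttribute().

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: a hypothetical stand-in for a record's tag store, used to show
// why mutable values should be replaced rather than edited in place.
final class AttributeReplaceSketch {
    private final Map<String, Object> attributes = new HashMap<>();
    private int modifications = 0; // stands in for the record's change tracking

    void setAttribute(String tag, Object value) {
        if (value == null) {
            attributes.remove(tag); // a null value clears the tag
        } else {
            attributes.put(tag, value);
        }
        modifications++;
    }

    Object getAttribute(String tag) {
        return attributes.get(tag);
    }

    int modificationCount() {
        return modifications;
    }

    /** Copies the stored array, edits the copy, and stores the copy back. */
    static void setFirstByte(AttributeReplaceSketch rec, String tag, byte newValue) {
        byte[] copy = ((byte[]) rec.getAttribute(tag)).clone();
        copy[0] = newValue;
        rec.setAttribute(tag, copy); // the change is visible to the tracking logic
    }
}
```

Editing the array returned by getAttribute() directly would change the stored value without the record noticing, which is the hazard the warning describes.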
-
getIntegerAttribute
Get the tag value and attempt to coerce it into the requested type.- Parameters:
tag
- The requested tag.- Returns:
- The value of a tag, converted into a signed Integer if possible.
- Throws:
RuntimeException
- If the value is not an integer type, or will not fit in a signed Integer.
-
getIntegerAttribute
Get the tag value and attempt to coerce it into the requested type.- Parameters:
tag
- The requested tag.- Returns:
- The value of a tag, converted into a signed Integer if possible.
- Throws:
RuntimeException
- If the value is not an integer type, or will not fit in a signed Integer.
-
getUnsignedIntegerAttribute
A convenience method that will return a valid unsigned integer as a Long, or fail with an exception if the tag value is invalid.- Parameters:
tag
- Two-character tag name.- Returns:
- valid unsigned integer associated with the tag, as a Long
- Throws:
SAMException
- if the value is out of range for a 32-bit unsigned value, or not a Number
-
getUnsignedIntegerAttribute
A convenience method that will return a valid unsigned integer as a Long, or fail with an exception if the tag value is invalid.- Parameters:
tag
- Two-character tag name.- Returns:
- valid unsigned integer associated with the tag, as a Long
- Throws:
SAMException
- if the value is out of range for a 32-bit unsigned value, or not a Number
-
getUnsignedIntegerAttribute
A convenience method that will return a valid unsigned integer as a Long, or fail with an exception if the tag value is invalid.- Parameters:
tag
- Binary representation of a 2-char String tag as created by SAMTagUtil.- Returns:
- valid unsigned integer associated with the tag, as a Long
- Throws:
SAMException
- if the value is out of range for a 32-bit unsigned value, or not a Number
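The range rule described above can be sketched as follows. This is an illustration only: `UnsignedIntSketch` is a hypothetical helper, IllegalArgumentException stands in for the SAMException named above, and returning null for a null input is an assumption rather than something the text states.

```java
// Sketch only: validating that a tag value fits a 32-bit unsigned integer.
// Hypothetical helper, not part of htsjdk.
final class UnsignedIntSketch {
    static final long MAX_UINT32 = 0xFFFFFFFFL; // 4294967295

    static Long toUnsignedInt32(Object value) {
        if (value == null) {
            return null; // assumption: an absent tag yields null
        }
        boolean integral = value instanceof Byte || value instanceof Short
                || value instanceof Integer || value instanceof Long;
        if (!integral) {
            throw new IllegalArgumentException("Not an integer value: " + value);
        }
        long v = ((Number) value).longValue();
        if (v < 0 || v > MAX_UINT32) {
            throw new IllegalArgumentException("Out of range for 32-bit unsigned: " + v);
        }
        return v;
    }
}
```

Returning a Long is what lets values above Integer.MAX_VALUE, such as 4000000000, survive without wrapping to a negative int.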
-
getShortAttribute
Get the tag value and attempt to coerce it into the requested type.- Parameters:
tag
- The requested tag.- Returns:
- The value of a tag, converted into a Short if possible.
- Throws:
RuntimeException
- If the value is not an integer type, or will not fit in a Short.
-
getShortAttribute
Get the tag value and attempt to coerce it into the requested type.- Parameters:
tag
- The requested tag.- Returns:
- The value of a tag, converted into a Short if possible.
- Throws:
RuntimeException
- If the value is not an integer type, or will not fit in a Short.
-
getByteAttribute
Get the tag value and attempt to coerce it into the requested type.- Parameters:
tag
- The requested tag.- Returns:
- The value of a tag, converted into a Byte if possible.
- Throws:
RuntimeException
- If the value is not an integer type, or will not fit in a Byte.
-
getByteAttribute
Get the tag value and attempt to coerce it into the requested type.- Parameters:
tag
- The requested tag.- Returns:
- The value of a tag, converted into a Byte if possible.
- Throws:
RuntimeException
- If the value is not an integer type, or will not fit in a Byte.
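The Integer, Short, and Byte getters above share one idea: accept any integral value and fail if it does not fit the requested width. A minimal sketch of that coercion is shown below; the class `NarrowingSketch` is hypothetical, and returning null for a null input is an assumption made for this illustration.

```java
// Sketch only: range-checked narrowing of an integral tag value.
// Hypothetical helper, not part of htsjdk.
final class NarrowingSketch {

    static Integer toInteger(Object value) {
        return value == null ? null : (Integer) (int) checked(value, Integer.MIN_VALUE, Integer.MAX_VALUE);
    }

    static Short toShort(Object value) {
        return value == null ? null : (Short) (short) checked(value, Short.MIN_VALUE, Short.MAX_VALUE);
    }

    static Byte toByte(Object value) {
        return value == null ? null : (Byte) (byte) checked(value, Byte.MIN_VALUE, Byte.MAX_VALUE);
    }

    /** Returns the value as a long, or throws if it is non-integral or outside [min, max]. */
    private static long checked(Object value, long min, long max) {
        boolean integral = value instanceof Byte || value instanceof Short
                || value instanceof Integer || value instanceof Long;
        if (!integral) {
            throw new RuntimeException("Value is not an integer type: " + value.getClass());
        }
        long v = ((Number) value).longValue();
        if (v < min || v > max) {
            throw new RuntimeException("Value " + v + " does not fit the requested type");
        }
        return v;
    }
}
```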
-
getStringAttribute
-
getStringAttribute
-
getCharacterAttribute
-
getCharacterAttribute
-
getFloatAttribute
-
getFloatAttribute
-
getByteArrayAttribute
Will work for signed byte array, unsigned byte array, or old-style hex array -
getByteArrayAttribute
Will work for signed byte array, unsigned byte array, or old-style hex array -
getUnsignedByteArrayAttribute
-
getUnsignedByteArrayAttribute
-
getSignedByteArrayAttribute
Will work for signed byte array or old-style hex array -
getSignedByteArrayAttribute
Will work for signed byte array or old-style hex array -
getUnsignedShortArrayAttribute
-
getUnsignedShortArrayAttribute
-
getSignedShortArrayAttribute
-
getSignedShortArrayAttribute
-
getUnsignedIntArrayAttribute
-
getUnsignedIntArrayAttribute
-
getSignedIntArrayAttribute
-
getSignedIntArrayAttribute
-
getFloatArrayAttribute
-
getFloatArrayAttribute
-
isUnsignedArrayAttribute
- Returns:
- True if this tag is an unsigned array, else false.
- Throws:
SAMException
- if the tag is not present.
-
getAttribute
- Parameters:
tag
- Binary representation of a 2-char String tag as created by SAMTagUtil.- See Also:
-
setAttribute
Set a named attribute onto the SAMRecord. Passing a null value causes the attribute to be cleared.- Parameters:
tag
- two-character tag name. See http://samtools.sourceforge.net/SAM1.pdf for standard and user-defined tags.
value
- Supported types are String, Character, Integer, Float, Long (for values that fit into a signed or unsigned 32-bit integer only), byte[], short[], int[], float[]. If value == null, the tag is cleared. Byte and Short are allowed but discouraged. If written to a SAM file, these will be converted to Integer, whereas if written to BAM, getAttribute() will return them as Byte or Short, respectively. Long is allowed for values that fit into a signed or unsigned 32-bit integer only, but discouraged. To set unsigned byte[], unsigned short[] or unsigned int[] (which is discouraged because of poor Java language support), setUnsignedArrayAttribute() must be used instead of this method. String values are not validated to ensure that they conform to the SAM spec.
-
setUnsignedArrayAttribute
Because Java does not support unsigned integer types, we think it is a bad idea to encode them in SAM files. If you must do so, however, you must call this method rather than setAttribute, because calling this method is the way to indicate that, e.g. a short array should be interpreted as unsigned shorts.- Parameters:
value
- must be one of byte[], short[], int[]
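To see why the signed/unsigned distinction has to be carried alongside the array, the sketch below reinterprets Java's signed primitives as unsigned values using standard library helpers. The class name `UnsignedArraySketch` is hypothetical; the conversion methods are from java.lang.

```java
// Sketch only: reading signed Java arrays as unsigned values.
// Hypothetical helper, not part of htsjdk.
final class UnsignedArraySketch {

    static int[] asUnsigned(byte[] raw) {
        int[] out = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            out[i] = Byte.toUnsignedInt(raw[i]);   // 0..255
        }
        return out;
    }

    static int[] asUnsigned(short[] raw) {
        int[] out = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            out[i] = Short.toUnsignedInt(raw[i]);  // 0..65535
        }
        return out;
    }

    static long[] asUnsigned(int[] raw) {
        long[] out = new long[raw.length];
        for (int i = 0; i < raw.length; i++) {
            out[i] = Integer.toUnsignedLong(raw[i]); // 0..4294967295
        }
        return out;
    }
}
```

The same short value -1 reads as 65535 when treated as unsigned, so the array contents alone cannot tell a reader which interpretation was intended.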
-
setAttribute
- Parameters:
tag
- Binary representation of a 2-char String tag as created by SAMTagUtil.- See Also:
-
setAttribute
- Parameters:
tag
- Binary representation of a 2-char String tag as created by SAMTagUtil.- See Also:
-
isAllowedAttributeValue
Deprecated. The attribute type and value checks have been moved directly into SAMBinaryTagAndValue.
Checks if the value is allowed as an attribute value.
- Parameters:
value
- the value to be checked- Returns:
- true if the value is valid and false otherwise
-
setAttribute
-
setAttribute
-
clearAttributes
public void clearAttributes()Removes all attributes. -
setAttributes
Replace any existing attributes with the given linked item. NOTE: this method is intended to only be called from subclasses. -
getBinaryAttributes
- Returns:
- Pointer to the first of the tags. Returns null if there are no tags.
-
getContig
Description copied from interface: Locatable
Gets the contig name for the contig this is mapped to. May return null if there is no unique mapping. -
getStart
public int getStart()
An alias of getAlignmentStart().
-
getEnd
public int getEnd()
An alias of getAlignmentEnd().
-
getAttributes
- Returns:
- list of {tag, value} tuples
-
computeIndexingBinIfAbsent
Deprecated.Use computeIndexingBin() if accessible or GenomicIndexUtil.regionToBin() otherwise. -
getHeader
- Returns:
- the SAMFileHeader for this record. If the header is null, the following SAMRecord methods may throw
exceptions:
- getReferenceIndex
- setReferenceIndex
- getMateReferenceIndex
- setMateReferenceIndex
Record comparators (i.e. SAMRecordCoordinateComparator and SAMRecordDuplicateComparator) require records with non-null header values.
A record with a null header may be validated by the isValid method, but the reference and mate reference indices, read group, sequence dictionary, and alignment start will not be fully validated unless a header is present.
SAMTextWriter, BAMFileWriter, and CRAMFileWriter all require records to have a valid header in order to be written. Any record that does not have a header at the time it is added to the writer will be updated to use the header associated with the writer.
-
setHeader
Sets the SAMFileHeader for this record. Setting the header into SAMRecord facilitates conversion between reference sequence names and indices.NOTE: If the record has a reference or mate reference name, the corresponding reference and mate reference indices are resolved and updated using the sequence dictionary in the new header. setHeader does not throw an exception if either the reference or mate reference name does not appear in the new header's sequence dictionary.
When the SAMFileHeader is set to null, the reference and mate reference indices are cleared. Therefore, calls to the following SAMRecord methods on records with a null header may throw IllegalArgumentExceptions:
- getReferenceIndex
- setReferenceIndex
- getMateReferenceIndex
- setMateReferenceIndex
Record comparators (i.e. SAMRecordCoordinateComparator and SAMRecordDuplicateComparator) require records with non-null header values.
A record with a null header may be validated by the isValid method, but the reference and mate reference indices, read group, sequence dictionary, and alignment start will not be fully validated unless a header is present.
SAMTextWriter, BAMFileWriter, and CRAMFileWriter all require records to have a valid header in order to be written. Any record that does not have a header at the time it is added to the writer will be updated to use the header associated with the writer.
- Parameters:
header
- contains sequence dictionary for this SAMRecord
-
setHeaderStrict
Establishes the SAMFileHeader for this record and forces resolution of the record's reference and mate reference names against the header using the sequence dictionary in the new header. If either the reference or mate reference name does not appear in the new header's sequence dictionary, an IllegalArgumentException is thrown.- Parameters:
header
- new header for this record. May be null.- Throws:
IllegalArgumentException
- if the record has reference or mate reference names that cannot be resolved to indices using the new header.
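The difference between setHeader and setHeaderStrict can be pictured as lenient versus strict name lookup against a sequence dictionary. The sketch below models the dictionary as a plain list of names; `ReferenceResolutionSketch` is hypothetical, and the use of -1 for "no index" and "*" for "no reference name" are conventions assumed for this illustration.

```java
import java.util.List;

// Sketch only: lenient versus strict resolution of a reference name to a
// dictionary index. Hypothetical helper, not part of htsjdk.
final class ReferenceResolutionSketch {
    static final int NO_INDEX = -1;   // assumed marker for an unresolved index
    static final String NO_NAME = "*"; // assumed marker for an unset reference name

    /** Returns the index of name in the dictionary, or NO_INDEX if it is unset or unknown. */
    static int resolveLenient(List<String> dictionary, String name) {
        if (NO_NAME.equals(name)) {
            return NO_INDEX;
        }
        return dictionary.indexOf(name); // List.indexOf returns -1 when absent
    }

    /** Like resolveLenient, but rejects names that are set and not in the dictionary. */
    static int resolveStrict(List<String> dictionary, String name) {
        int index = resolveLenient(dictionary, name);
        if (index == NO_INDEX && !NO_NAME.equals(name)) {
            throw new IllegalArgumentException("Reference name not in dictionary: " + name);
        }
        return index;
    }
}
```

The lenient form defers the failure to a later index lookup, while the strict form surfaces a mismatched header immediately.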
-
getVariableBinaryRepresentation
public byte[] getVariableBinaryRepresentation()If this record has a valid binary representation of the variable-length portion of a binary record stored, return that byte array, otherwise return null. This will never be true for SAMRecords. It will be true for BAMRecords that have not been eagerDecoded(), and for which none of the data in the variable-length portion has been changed. -
getAttributesBinarySize
public int getAttributesBinarySize()Depending on the concrete implementation, the binary file size of attributes may be known without computing them all.- Returns:
- binary file size of attribute, if known, else -1
-
format
Deprecated. This method is not guaranteed to return a valid SAM text representation of the SAMRecord. To get the standard SAM text representation, use getSAMString().
- Returns:
- String representation of this.
-
eagerDecode
protected void eagerDecode()Force all lazily-initialized data members to be initialized. If a subclass overrides this method, typically it should also call super method. -
getAlignmentBlocks
Returns blocks of the read sequence that have been aligned directly to the reference sequence. Note that clipped portions of the read and inserted and deleted bases (vs. the reference) are not represented in the alignment blocks. -
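To make the notion of an alignment block concrete, the sketch below derives blocks from a CIGAR string: each M, = or X element yields one block, while clips, insertions, and deletions only advance the read or reference coordinate. The class name and the int[] triple representation are hypothetical and chosen for brevity; htsjdk returns its own block type.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: derives aligned blocks from a CIGAR string.
// Hypothetical helper, not the htsjdk implementation.
final class AlignmentBlockSketch {

    /** Returns {readStart, referenceStart, length} triples; both starts are 1-based. */
    static List<int[]> alignmentBlocks(String cigar, int alignmentStart) {
        List<int[]> blocks = new ArrayList<>();
        int readPos = 1;             // next read position, soft clips included
        int refPos = alignmentStart; // next reference position
        int length = 0;              // length of the CIGAR element being parsed
        for (char c : cigar.toCharArray()) {
            if (Character.isDigit(c)) {
                length = length * 10 + (c - '0');
                continue;
            }
            switch (c) {
                case 'M': case '=': case 'X': // aligned: emit a block
                    blocks.add(new int[] {readPos, refPos, length});
                    readPos += length;
                    refPos += length;
                    break;
                case 'I': case 'S':           // read only: skip read bases
                    readPos += length;
                    break;
                case 'D': case 'N':           // reference only: skip reference bases
                    refPos += length;
                    break;
                default:                      // H and P consume neither
                    break;
            }
            length = 0;
        }
        return blocks;
    }
}
```

For cigar 3S4M1D6M with an alignment start of 100, this yields two blocks: read 4 / reference 100 / length 4, and read 8 / reference 105 / length 6.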
validateCigar
Run all validations of CIGAR. These include validation that the CIGAR makes sense independent of placement, plus validation that CIGAR + placement yields all bases with M operator within the range of the reference.- Parameters:
recordNumber
- For error reporting, the record number in the SAM/BAM file. -1 if not known.- Returns:
- List of errors, or null if no errors.
-
equals
-
hashCode
public int hashCode() -
isValid
Perform various validations of SAMRecord. Note that this method deliberately returns null rather than Collections.emptyList() if there are no validation errors, because callers tend to assume that if a non-null list is returned, it is modifiable. A record with a null header may be validated by the isValid method, but the reference and mate reference indices, read group, sequence dictionary, and alignment start will not be fully validated unless a header is present.
- Returns:
- null if valid. If invalid, returns a list of error messages.
-
isValid
Perform various validations of SAMRecord. Note that this method deliberately returns null rather than Collections.emptyList() if there are no validation errors, because callers tend to assume that if a non-null list is returned, it is modifiable. A record with a null header may be validated by the isValid method, but the reference and mate reference indices, read group, sequence dictionary, and alignment start will not be fully validated unless a header is present.
- Parameters:
firstOnly
- return only the first error if true, false otherwise- Returns:
- null if valid. If invalid, returns a list of error messages.
-
getFileSource
Gets the source of this SAM record -- both the reader that retrieved the record and the position on disk from whence it came.- Returns:
- The file source. Note that the reader will be null if the reader source has not been set.
-
setFileSource
Sets a marker providing the source reader for this file and the position in the file from which the read originated.- Parameters:
fileSource
- source of the given file.
-
clone
Note that this does a shallow copy of everything, except for the attribute list, for which a copy of the list is made, but the attributes themselves are copied by reference. This should be safe because callers should never modify a mutable value returned by any of the get() methods anyway. If one of the cloned record's SEQ or QUAL needs to be modified, a deeper copy should be made (e.g. Reverse Complement).
- Overrides:
clone
in class Object
- Throws:
CloneNotSupportedException
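The consequence of a shallow copy is easiest to see on a small example. The class below is a hypothetical stand-in holding only a base array; it is not SAMRecord, and `cloneWithOwnBases` is an illustrative name for the "deeper copy" the note above recommends before modifying SEQ or QUAL.

```java
// Sketch only: shallow clone versus a clone with its own array.
// Hypothetical stand-in class, not part of htsjdk.
final class ShallowCloneSketch implements Cloneable {
    byte[] bases;

    ShallowCloneSketch(byte[] bases) {
        this.bases = bases;
    }

    /** Shallow copy: the clone shares the same bases array as the original. */
    @Override
    public ShallowCloneSketch clone() {
        try {
            return (ShallowCloneSketch) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: the class is Cloneable
        }
    }

    /** Deeper copy: the clone gets its own array, so edits do not leak back. */
    ShallowCloneSketch cloneWithOwnBases() {
        ShallowCloneSketch copy = clone();
        copy.bases = bases.clone();
        return copy;
    }
}
```

Modifying the array of a shallow clone would silently change the original record's bases as well, which is why in-place edits call for the deeper copy.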
-
deepCopy
Returns a deep copy of the SAM record, with the following exceptions: the header field, which shares the header reference with the original record; and the file source field, which will always be set to null in the copy. -
toString
Simple toString() that gives a little bit of useful info about the read. -
getSAMString
Returns the record in the SAM line-based text format. Fields are separated by '\t' characters, and the String is terminated by '\n'. -
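The layout described above (tab-separated fields, newline-terminated) can be sketched in a few lines. `SamLineSketch` is a hypothetical helper that simply joins whatever field strings it is given; it performs no validation and is not how htsjdk builds the string.

```java
// Sketch only: assembling a SAM text line from already-formatted fields.
// Hypothetical helper, not part of htsjdk.
final class SamLineSketch {

    /** Joins the fields with tabs and terminates the line with a newline. */
    static String samLine(String... fields) {
        return String.join("\t", fields) + "\n";
    }
}
```

A call such as `SamLineSketch.samLine("r1", "0", "chr1", "100", "60", "4M", "*", "0", "0", "ACGT", "IIII")` supplies the eleven mandatory SAM columns in order (QNAME, FLAG, RNAME, POS, MAPQ, CIGAR, RNEXT, PNEXT, TLEN, SEQ, QUAL); optional tags would follow as further tab-separated fields.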
getPairedReadName
-
getSAMFlags
Shortcut to SAMFlag.getFlags(this.getFlags()).
-
getTransientAttribute
Fetches the value of a transient attribute on the SAMRecord, or null if not set. The intended use for transient attributes is to store values that are 1-to-1 with the SAMRecord, may be needed many times and are expensive to compute. These values can be computed lazily and then stored as transient attributes to avoid frequent re-computation. -
setTransientAttribute
Sets the value of a transient attribute, and returns the previous value if defined. The intended use for transient attributes is to store values that are 1-to-1 with the SAMRecord, may be needed many times and are expensive to compute. These values can be computed lazily and then stored as transient attributes to avoid frequent re-computation. -
removeTransientAttribute
Removes a transient attribute if it is stored, and returns the stored value. If there is not a stored value, will return null. -
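The lazy-compute-then-cache usage described for transient attributes might look like the sketch below. The class is a hypothetical, map-backed stand-in that mirrors the three method names above; `getOrCompute` is an illustrative convenience and not an htsjdk method.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Sketch only: caching an expensive per-record value as a transient attribute.
// Hypothetical stand-in class, not part of htsjdk.
final class TransientAttributeSketch {
    private final Map<Object, Object> transientAttributes = new HashMap<>();

    Object getTransientAttribute(Object key) {
        return transientAttributes.get(key); // null if not set
    }

    Object setTransientAttribute(Object key, Object value) {
        return transientAttributes.put(key, value); // returns the previous value, if any
    }

    Object removeTransientAttribute(Object key) {
        return transientAttributes.remove(key); // returns the stored value, or null
    }

    /** Computes the value on first use and reuses the stored result afterwards. */
    Object getOrCompute(Object key, Supplier<Object> expensiveComputation) {
        Object cached = getTransientAttribute(key);
        if (cached == null) {
            cached = expensiveComputation.get();
            setTransientAttribute(key, cached);
        }
        return cached;
    }
}
```

Because the cache lives on the record itself, the stored value goes away with the record and never needs separate invalidation.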
reverseComplement
public void reverseComplement()
Reverse-complement bases and reverse quality scores along with known optional attributes that need the same treatment. Changes are made after making a copy of the bases, qualities, and any attributes that will be altered. If an in-place update is needed, use reverseComplement(boolean). See TAGS_TO_REVERSE_COMPLEMENT and TAGS_TO_REVERSE for the default set of tags that are handled. -
reverseComplement
public void reverseComplement(boolean inplace)
Reverse-complement bases and reverse quality scores along with known optional attributes that need the same treatment. Optionally makes a copy of the bases, qualities or attributes instead of altering them in-place. See TAGS_TO_REVERSE_COMPLEMENT and TAGS_TO_REVERSE for the default set of tags that are handled.
- Parameters:
inplace
- Setting this to false will clone all attributes, bases and qualities before changing the values.
-
reverseComplement
public void reverseComplement(Collection<String> tagsToRevcomp, Collection<String> tagsToReverse, boolean inplace)
Reverse-complement bases and reverse quality scores. In addition, reverse-complement any non-null attributes specified by tagsToRevcomp and reverse any non-null attributes specified by tagsToReverse.
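The two array operations involved can be sketched as follows: bases are reversed and complemented, while qualities are only reversed. `ReverseComplementSketch` is a hypothetical helper covering just the in-place case for bases and qualities; tag handling is omitted, and leaving unrecognised base codes unchanged is an assumption of this illustration.

```java
// Sketch only: in-place reverse complement of bases and reversal of qualities.
// Hypothetical helper, not the htsjdk implementation.
final class ReverseComplementSketch {

    static byte complement(byte base) {
        switch (base) {
            case 'A': return 'T';
            case 'C': return 'G';
            case 'G': return 'C';
            case 'T': return 'A';
            case 'a': return 't';
            case 'c': return 'g';
            case 'g': return 'c';
            case 't': return 'a';
            default:  return base; // N and other codes are left unchanged (assumption)
        }
    }

    /** Reverses the array and complements every base, without allocating a copy. */
    static void reverseComplementInPlace(byte[] bases) {
        for (int i = 0, j = bases.length - 1; i <= j; i++, j--) {
            byte left = complement(bases[i]);
            bases[i] = complement(bases[j]);
            bases[j] = left;
        }
    }

    /** Reverses the array; used for qualities, which are not complemented. */
    static void reverseInPlace(byte[] values) {
        for (int i = 0, j = values.length - 1; i < j; i++, j--) {
            byte tmp = values[i];
            values[i] = values[j];
            values[j] = tmp;
        }
    }
}
```

Calling these on clones of the original arrays gives the copy-first behaviour, while calling them on the record's own arrays corresponds to the in-place variant.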
-