Package htsjdk.samtools
Class SRAIndex
java.lang.Object
htsjdk.samtools.SRAIndex
- All Implemented Interfaces:
BAMIndex, BrowseableBAMIndex, Closeable, AutoCloseable
Emulates a BAM index so that we can request chunks of records from SRAFileReader.
Here is how it works:
SRA allows fast reading of alignments by reference position, so we define our "file" range for alignments as
the total length of all references. Reading unaligned reads is also fast if we use read positions for lookup and (internally)
filter out aligned fragments.
The total SRA "file" range is calculated as the sum of all reference lengths plus the number of reads (both aligned and unaligned)
in the SRA archive.
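The layout described above can be sketched as follows. This is a minimal illustration, not the htsjdk implementation: the class and method names are hypothetical, and placing the references back to back in header order is an assumption.

```java
import java.util.List;

/** Hypothetical sketch of the virtual "file" range; not part of htsjdk. */
final class SraFileRangeSketch {

    /** Sum of all reference lengths: the part of the range that addresses aligned fragments. */
    static long alignedRangeLength(List<Integer> referenceLengths) {
        long total = 0;
        for (int length : referenceLengths) {
            total += length;
        }
        return total;
    }

    /** Total range: all reference lengths plus the number of reads (aligned and unaligned). */
    static long totalRangeLength(List<Integer> referenceLengths, long numberOfReads) {
        return alignedRangeLength(referenceLengths) + numberOfReads;
    }

    /**
     * Offset of a 0-based position on one reference within the virtual range,
     * assuming references are laid out one after another (an assumption).
     */
    static long offsetOf(List<Integer> referenceLengths, int referenceIndex, int position) {
        long offset = 0;
        for (int i = 0; i < referenceIndex; i++) {
            offset += referenceLengths.get(i);
        }
        return offset + position;
    }
}
```

With two references of 1000 and 500 bases and 30 reads, the aligned part of the range spans 1500 positions and the total range spans 1530.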
Now we can use Chunks to look up aligned and unaligned fragments.
We emulate BAM index bins by mapping SRA reference positions to bin numbers.
We then map each bin number to a list of chunks, which represent SRA "file" positions (which are simply reference
positions).
We only emulate the last level of BAM index bins (each of which refers to a portion of the reference SRA_BIN_SIZE bases long).
For all other bins a RuntimeException will be thrown (but since nobody except the SRAIndex class creates bins,
that is fine).
But since the last level of bins was not meant to refer to fragments that only partially overlap the bin's reference
positions, we also return a chunk that extends 5000 bases to the left of the beginning of the bin, to ensure that fragments that
start before the bin positions but still overlap it can be retrieved by the SRA reader.
Later we will add support to the NGS API to get the maximum number of bases that we need to go left to retrieve such fragments.
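The left extension described above can be sketched like this. The 5000-base value comes from the description; the class and method names are hypothetical, and clamping the start at zero is an assumption about how the beginning of a reference would be handled.

```java
/** Hypothetical sketch of extending a bin's read range to the left; not part of htsjdk. */
final class SraOverlapChunkSketch {

    /** Number of bases to go left of the bin start, as stated in the description. */
    static final int LEFT_EXTENSION = 5000;

    /**
     * Returns {start, end} of the reference range to read for a bin covering
     * [binStart, binEnd). The start is moved left so that fragments beginning
     * before the bin but overlapping it are included. Clamping at 0 is an assumption.
     */
    static long[] rangeToRead(long binStart, long binEnd) {
        long start = Math.max(0L, binStart - LEFT_EXTENSION);
        return new long[] {start, binEnd};
    }
}
```

A fixed extension only catches fragments that start at most 5000 bases before the bin, which is why the description mentions a future API for the true maximum.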
Created by andrii.nikitiuk on 9/4/15.
-
Field Summary
Modifier and Type, Field, Description
static final int SRA_BIN_SIZE
- Number of reference bases that bins in the last level can represent
static final int SRA_CHUNK_SIZE
- Chunks of that size will be created when using the SRA index
Fields inherited from interface htsjdk.samtools.BAMIndex
BAI_INDEX_SUFFIX, BAMIndexSuffix, CSI_INDEX_SUFFIX
-
Constructor Summary
-
Method Summary
Modifier and Type, Method, Description
void close()
- Close the index and release any associated resources.
getBinsOverlapping(int referenceIndex, int startPos, int endPos)
- Provides a list of bins that contain bases at requested positions
int getFirstLocusInBin(Bin bin)
- Gets the first locus that this bin can index into.
int getLastLocusInBin(Bin bin)
- Gets the last locus that this bin can index into.
int getLevelForBin(Bin bin)
- SRA only operates on bins from the last level
int getLevelSize(int levelNumber)
- Gets the size (number of bins in) a given level of a BAM index.
getMetaData(int reference)
- Gets meta data for the given reference including information about number of aligned, unaligned, and noCoordinate records
getSpanOverlapping(int referenceIndex, int startPos, int endPos)
- Gets the compressed chunks which should be searched for the contents of records contained by the span referenceIndex:startPos-endPos, inclusive.
getSpanOverlapping(Bin bin)
- Perform an overlapping query of all bins bounding the given location.
long getStartOfLastLinearBin()
- Gets the start of the last linear bin in the index.
-
Field Details
-
SRA_BIN_SIZE
public static final int SRA_BIN_SIZE
Number of reference bases that bins in the last level can represent
-
SRA_CHUNK_SIZE
public static final int SRA_CHUNK_SIZE
Chunks of that size will be created when using the SRA index
-
-
Constructor Details
-
SRAIndex
- Parameters:
header
- SAM header
recordRangeInfo
- info about record ranges within the SRA archive
-
-
Method Details
-
getLevelSize
public int getLevelSize(int levelNumber)
Gets the size (number of bins in) a given level of a BAM index.
- Specified by:
getLevelSize
in interface BrowseableBAMIndex
- Parameters:
levelNumber
- Level for which to inspect the size.
- Returns:
- Size of the given level.
-
getLevelForBin
SRA only operates on bins from the last level.
- Specified by:
getLevelForBin
in interface BrowseableBAMIndex
- Parameters:
bin
- The bin for which to determine the level.
- Returns:
- bin level
-
getFirstLocusInBin
Gets the first locus that this bin can index into.
- Specified by:
getFirstLocusInBin
in interface BrowseableBAMIndex
- Parameters:
bin
- The bin to test.
- Returns:
- first position associated with the given bin number
-
getLastLocusInBin
Gets the last locus that this bin can index into.
- Specified by:
getLastLocusInBin
in interface BrowseableBAMIndex
- Parameters:
bin
- The bin to test.
- Returns:
- last position associated with the given bin number
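As a rough illustration of how a last-level bin number relates to its first and last locus, consider the sketch below. It is not the htsjdk code: the names are hypothetical, the bin size of 16 * 1024 bases is an assumed value for SRA_BIN_SIZE, the 1-based inclusive convention is an assumption, and 4681 is the number of the first 16 kb bin in the standard BAM binning scheme.

```java
/** Hypothetical sketch of first/last locus arithmetic for last-level bins; not part of htsjdk. */
final class SraBinLocusSketch {

    /** Assumed number of reference bases per last-level bin. */
    static final int BIN_SIZE = 16 * 1024;

    /** First bin number of the last (16 kb) level in the standard BAM binning scheme. */
    static final int LAST_LEVEL_START = 4681;

    /** First 1-based position covered by the given last-level bin. */
    static int firstLocusInBin(int binNumber) {
        return (binNumber - LAST_LEVEL_START) * BIN_SIZE + 1;
    }

    /** Last 1-based position (inclusive) covered by the given last-level bin. */
    static int lastLocusInBin(int binNumber) {
        return (binNumber - LAST_LEVEL_START + 1) * BIN_SIZE;
    }
}
```

Under these assumptions, bin 4681 covers positions 1 to 16384 and bin 4682 covers positions 16385 to 32768.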
-
getBinsOverlapping
Provides a list of bins that contain bases at requested positions.
- Specified by:
getBinsOverlapping
in interface BrowseableBAMIndex
- Parameters:
referenceIndex
- sequence of desired SAMRecords
startPos
- 1-based start of the desired interval, inclusive
endPos
- 1-based end of the desired interval, inclusive
- Returns:
- a list of bins that contain relevant data
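The mapping from a 1-based inclusive interval to last-level bin numbers might look like the sketch below. It is only an illustration under stated assumptions: the names are hypothetical, it returns plain bin numbers rather than htsjdk bin objects, the bin size of 16 * 1024 bases is an assumed value for SRA_BIN_SIZE, and 4681 is the first 16 kb bin number in the standard BAM binning scheme.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of selecting last-level bins for an interval; not part of htsjdk. */
final class SraBinsOverlappingSketch {

    /** Assumed number of reference bases per last-level bin. */
    static final int BIN_SIZE = 16 * 1024;

    /** First bin number of the last (16 kb) level in the standard BAM binning scheme. */
    static final int LAST_LEVEL_START = 4681;

    /** Bin numbers whose bases intersect the 1-based inclusive interval [startPos, endPos]. */
    static List<Integer> binNumbersOverlapping(int startPos, int endPos) {
        if (startPos < 1 || endPos < startPos) {
            throw new IllegalArgumentException("expected 1 <= startPos <= endPos");
        }
        // Convert to 0-based before dividing so position BIN_SIZE still falls in the first bin.
        int firstBin = LAST_LEVEL_START + (startPos - 1) / BIN_SIZE;
        int lastBin = LAST_LEVEL_START + (endPos - 1) / BIN_SIZE;
        List<Integer> bins = new ArrayList<>();
        for (int bin = firstBin; bin <= lastBin; bin++) {
            bins.add(bin);
        }
        return bins;
    }
}
```

Under these assumptions, the interval 16000 to 33000 touches three bins: 4681, 4682 and 4683.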
-
getSpanOverlapping
Description copied from interface: BrowseableBAMIndex
Perform an overlapping query of all bins bounding the given location.
- Specified by:
getSpanOverlapping
in interface BrowseableBAMIndex
- Parameters:
bin
- The bin over which to perform an overlapping query.
- Returns:
- The file pointers
-
getSpanOverlapping
Description copied from interface: BAMIndex
Gets the compressed chunks which should be searched for the contents of records contained by the span referenceIndex:startPos-endPos, inclusive. See the BAM spec for more information on how a chunk is represented.
- Specified by:
getSpanOverlapping
in interface BAMIndex
- Parameters:
referenceIndex
- The contig.
startPos
- Genomic start of query.
endPos
- Genomic end of query.
- Returns:
- A file span listing the chunks in the BAM file.
-
getStartOfLastLinearBin
public long getStartOfLastLinearBin()
Description copied from interface: BAMIndex
Gets the start of the last linear bin in the index.
- Specified by:
getStartOfLastLinearBin
in interface BAMIndex
- Returns:
- a position where aligned fragments end
-
getMetaData
Description copied from interface: BAMIndex
Gets meta data for the given reference including information about number of aligned, unaligned, and noCoordinate records.
- Specified by:
getMetaData
in interface BAMIndex
- Parameters:
reference
- the reference of interest
- Returns:
- meta data for the reference
-
close
public void close()
Description copied from interface: BAMIndex
Close the index and release any associated resources.
-