Class IndexedFastaSequenceFile

java.lang.Object
htsjdk.samtools.reference.IndexedFastaSequenceFile
All Implemented Interfaces:
ReferenceSequenceFile, Closeable, AutoCloseable

public class IndexedFastaSequenceFile extends Object
A FASTA file driven by an index for fast, concurrent lookups. Supports two styles of access: the ReferenceSequenceFile interface for old-style, stateful iteration (nextSequence and reset), and direct getters (getSequence and getSubsequenceAt).
  • Constructor Details

    • IndexedFastaSequenceFile

      public IndexedFastaSequenceFile(File file, FastaSequenceIndex index)
      Open the given indexed fasta sequence file. Throw an exception if the file cannot be opened.
      Parameters:
      file - The file to open.
      index - Pre-built FastaSequenceIndex, for the case in which one does not exist on disk.
      Throws:
      FileNotFoundException - If the fasta or any of its supporting files cannot be found.
    • IndexedFastaSequenceFile

      public IndexedFastaSequenceFile(File file) throws FileNotFoundException
      Open the given indexed fasta sequence file. Throw an exception if the file cannot be opened.
      Parameters:
      file - The file to open.
      Throws:
      FileNotFoundException - If the fasta or any of its supporting files cannot be found.
    • IndexedFastaSequenceFile

      public IndexedFastaSequenceFile(Path path, FastaSequenceIndex index)
      Open the given indexed fasta sequence file. Throw an exception if the file cannot be opened.
      Parameters:
      path - The file to open.
      index - Pre-built FastaSequenceIndex, for the case in which one does not exist on disk.
    • IndexedFastaSequenceFile

      public IndexedFastaSequenceFile(Path path) throws FileNotFoundException
      Open the given indexed fasta sequence file. Throw an exception if the file cannot be opened.
      Parameters:
      path - The file to open.
      Throws:
      FileNotFoundException - If the fasta or any of its supporting files cannot be found.
    • IndexedFastaSequenceFile

      public IndexedFastaSequenceFile(String source, SeekableStream in, FastaSequenceIndex index, SAMSequenceDictionary dictionary)
      Initialise the given indexed fasta sequence file stream.
      Parameters:
      source - The named source of the reference file (used in error messages).
      in - The input stream to read the fasta file from.
      index - The fasta index.
      dictionary - The sequence dictionary, or null if there isn't one.
  • Method Details

    • canCreateIndexedFastaReader

      @Deprecated public static boolean canCreateIndexedFastaReader(File fastaFile)
    • canCreateIndexedFastaReader

      @Deprecated public static boolean canCreateIndexedFastaReader(Path fastaFile)
    • readFromPosition

      protected int readFromPosition(ByteBuffer buffer, long position) throws IOException
      Reads a sequence of bytes from this channel into the given buffer, starting at the given file position.
      Parameters:
      buffer - the buffer into which bytes are to be transferred
      position - the position to start reading at
      Returns:
      the number of bytes read
      Throws:
      IOException - if an I/O error occurs while reading
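
      The positional read described above can be illustrated with the standard library alone. The following is a minimal sketch, not the library's implementation: it assumes the underlying channel is a java.nio FileChannel, and the class name PositionalReadSketch and its demo helper are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration; assumes a FileChannel backs the reference file.
public class PositionalReadSketch {

    /** Reads into the buffer starting at the given file position; returns the bytes read. */
    public static int readFromPosition(FileChannel channel, ByteBuffer buffer, long position)
            throws IOException {
        int total = 0;
        while (buffer.hasRemaining()) {
            // Absolute read: the channel's own position is left unchanged.
            int n = channel.read(buffer, position + total);
            if (n < 0) {
                break; // end of file reached before the buffer was full
            }
            total += n;
        }
        return total;
    }

    /** Writes a tiny FASTA to a temp file and reads 4 bytes starting at byte 6. */
    public static String demo() {
        try {
            Path tmp = Files.createTempFile("sketch", ".fa");
            try {
                Files.write(tmp, ">chr1\nACGTACGT\n".getBytes(StandardCharsets.US_ASCII));
                try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
                    ByteBuffer buf = ByteBuffer.allocate(4);
                    int n = readFromPosition(ch, buf, 6L); // byte 6 is the first base
                    return new String(buf.array(), 0, n, StandardCharsets.US_ASCII);
                }
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

      Because the read takes an explicit position rather than moving a shared cursor, several callers can read from the same channel without coordinating seeks, which fits the class's stated goal of concurrent lookups.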
    • close

      public void close() throws IOException
      Throws:
      IOException
    • findRequiredFastaIndexFile

      protected static Path findRequiredFastaIndexFile(Path fastaFile) throws FileNotFoundException
      Throws:
      FileNotFoundException
    • findFastaIndex

      protected static Path findFastaIndex(Path fastaFile)
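
      The two lookup methods above differ in how they report a missing index. The sketch below shows that contrast using only the standard library. It assumes the common convention that the index sits next to the FASTA with ".fai" appended to the file name, and that the non-required variant returns null when nothing is found; both are assumptions, and FastaIndexLocatorSketch and its methods are hypothetical names.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration; the ".fai" suffix convention is an assumption.
public class FastaIndexLocatorSketch {

    /** Candidate index location: same directory, file name plus ".fai". */
    public static Path candidateIndexPath(Path fasta) {
        return fasta.resolveSibling(fasta.getFileName() + ".fai");
    }

    /** Lenient lookup: returns the index path, or null if it does not exist. */
    public static Path findIndex(Path fasta) {
        Path candidate = candidateIndexPath(fasta);
        return Files.exists(candidate) ? candidate : null;
    }

    /** Strict lookup: throws if the index does not exist. */
    public static Path findRequiredIndex(Path fasta) throws FileNotFoundException {
        Path found = findIndex(fasta);
        if (found == null) {
            throw new FileNotFoundException("No index found for " + fasta);
        }
        return found;
    }

    /** Exercises both lookups in a temp directory; returns three comma-separated flags. */
    public static String demo() {
        try {
            Path dir = Files.createTempDirectory("fai-sketch");
            Path fasta = Files.createFile(dir.resolve("ref.fa"));
            boolean nullWhenMissing = findIndex(fasta) == null;
            boolean threwWhenMissing = false;
            try {
                findRequiredIndex(fasta);
            } catch (FileNotFoundException expected) {
                threwWhenMissing = true;
            }
            Path fai = Files.createFile(dir.resolve("ref.fa.fai"));
            boolean foundWhenPresent = fai.equals(findIndex(fasta));
            Files.delete(fai);
            Files.delete(fasta);
            Files.delete(dir);
            return nullWhenMissing + "," + threwWhenMissing + "," + foundWhenPresent;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```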
    • sanityCheckDictionaryAgainstIndex

      protected static void sanityCheckDictionaryAgainstIndex(String fastaFile, SAMSequenceDictionary sequenceDictionary, FastaSequenceIndex index)
      Do some basic checking to make sure the dictionary and the index match.
      Parameters:
      fastaFile - Used for error reporting only.
      sequenceDictionary - sequence dictionary to check against the index.
      index - index to check against the dictionary.
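
      As a rough picture of what "basic checking" between a dictionary and an index could involve, the sketch below compares sequence count, names, and lengths in order. Which properties the library actually compares is not stated here, so treat the checks as assumptions. Plain ordered maps from sequence name to length stand in for SAMSequenceDictionary and FastaSequenceIndex, and DictionaryIndexCheckSketch is a hypothetical name.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; ordered name-to-length maps stand in for the real types.
public class DictionaryIndexCheckSketch {

    /** Throws IllegalStateException if the two listings disagree. */
    public static void check(String source, Map<String, Long> dictionary, Map<String, Long> index) {
        if (dictionary.size() != index.size()) {
            throw new IllegalStateException(source + ": dictionary has " + dictionary.size()
                    + " sequences but index has " + index.size());
        }
        Iterator<Map.Entry<String, Long>> dictIt = dictionary.entrySet().iterator();
        Iterator<Map.Entry<String, Long>> indexIt = index.entrySet().iterator();
        while (dictIt.hasNext()) {
            Map.Entry<String, Long> d = dictIt.next();
            Map.Entry<String, Long> i = indexIt.next();
            if (!d.getKey().equals(i.getKey())) {
                throw new IllegalStateException(source + ": name mismatch, "
                        + d.getKey() + " vs " + i.getKey());
            }
            if (!d.getValue().equals(i.getValue())) {
                throw new IllegalStateException(source + ": length mismatch for " + d.getKey());
            }
        }
    }

    /** Convenience wrapper: true if check passes, false if it throws. */
    public static boolean matches(String source, Map<String, Long> dictionary, Map<String, Long> index) {
        try {
            check(source, dictionary, index);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Map<String, Long> dict = new LinkedHashMap<>();
        dict.put("chr1", 8L);
        Map<String, Long> idx = new LinkedHashMap<>();
        idx.put("chr1", 9L); // deliberately wrong length
        System.out.println(matches("ref.fa", dict, idx));
    }
}
```

      As in the documented method, the source name here is used only to make error messages readable.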
    • getIndex

      public FastaSequenceIndex getIndex()
    • nextSequence

      public ReferenceSequence nextSequence()
      Gets the next sequence if available, or null if not present.
      Returns:
      next sequence if available, or null if not present.
    • reset

      public void reset()
      Reset the iterator over the index.
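
      The nextSequence and reset pair forms the stateful interface mentioned in the class description. The sketch below imitates that contract with contig names in place of ReferenceSequence objects: a cursor advances on each call, null signals exhaustion, and reset rewinds. StatefulIterationSketch is a hypothetical name, and the class is a stand-in rather than the library's code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration of the nextSequence/reset contract using plain strings.
public final class StatefulIterationSketch {
    private final List<String> contigs;
    private Iterator<String> cursor;

    public StatefulIterationSketch(List<String> contigs) {
        this.contigs = new ArrayList<>(contigs);
        this.cursor = this.contigs.iterator();
    }

    /** Returns the next entry, or null once all entries have been returned. */
    public String nextSequence() {
        return cursor.hasNext() ? cursor.next() : null;
    }

    /** Rewinds the cursor so iteration starts again from the first entry. */
    public void reset() {
        cursor = contigs.iterator();
    }

    public static void main(String[] args) {
        StatefulIterationSketch ref = new StatefulIterationSketch(Arrays.asList("chr1", "chr2"));
        String name;
        while ((name = ref.nextSequence()) != null) {
            System.out.println(name);
        }
        ref.reset();
        System.out.println("after reset: " + ref.nextSequence());
    }
}
```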
    • isIndexed

      public final boolean isIndexed()
      Reports whether indexed lookups are supported. The method is final in this class, so it cannot be overridden.
      Specified by:
      isIndexed in interface ReferenceSequenceFile
      Returns:
      true if getSequence and getSubsequenceAt methods are allowed.
    • getSequence

      public ReferenceSequence getSequence(String contig)
      Retrieves the complete sequence described by this contig.
      Specified by:
      getSequence in interface ReferenceSequenceFile
      Parameters:
      contig - contig whose data should be returned.
      Returns:
      The full sequence associated with this contig.
    • getSubsequenceAt

      public ReferenceSequence getSubsequenceAt(String contig, long start, long stop)
      Gets the subsequence of the contig in the closed range [start, stop], using 1-based coordinates.
      Specified by:
      getSubsequenceAt in interface ReferenceSequenceFile
      Parameters:
      contig - Contig whose subsequence to retrieve.
      start - inclusive, 1-based start of region.
      stop - inclusive, 1-based stop of region.
      Returns:
      The partial reference sequence associated with this range.
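
      To make the 1-based, inclusive coordinates concrete, the sketch below maps a base position to a byte offset using the three per-contig values found in a .fai index (offset of the first base, bases per line, bytes per line), then extracts a range from an in-memory FASTA. This is the standard faidx arithmetic, shown as an illustration under those assumptions and not as the library's code; FaiOffsetSketch and its method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of faidx-style offset arithmetic.
public class FaiOffsetSketch {

    /** Byte offset of the 1-based base position within the FASTA file. */
    public static long byteOffset(long contigStart, int basesPerLine, int bytesPerLine, long position) {
        long zeroBased = position - 1;
        long fullLines = zeroBased / basesPerLine;   // complete lines before this base
        long withinLine = zeroBased % basesPerLine;  // column within its line
        return contigStart + fullLines * bytesPerLine + withinLine;
    }

    /** Extracts bases in the closed 1-based range [start, stop], skipping line terminators. */
    public static String subsequence(byte[] fasta, long contigStart, int basesPerLine,
                                     int bytesPerLine, long start, long stop) {
        StringBuilder bases = new StringBuilder((int) (stop - start + 1));
        for (long pos = start; pos <= stop; pos++) {
            bases.append((char) fasta[(int) byteOffset(contigStart, basesPerLine, bytesPerLine, pos)]);
        }
        return bases.toString();
    }

    public static void main(String[] args) {
        // Header ">chr1\n" occupies bytes 0..5, so the first base is at byte 6.
        // Each line holds 4 bases plus a newline, i.e. 5 bytes.
        byte[] fasta = ">chr1\nACGT\nTTGA\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(subsequence(fasta, 6L, 4, 5, 3L, 6L));
    }
}
```

      With both ends inclusive, a range [start, stop] contains stop - start + 1 bases, so the range [3, 6] in the example yields four bases and crosses one line break.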
    • findAndLoadSequenceDictionary

      protected SAMSequenceDictionary findAndLoadSequenceDictionary(Path fasta)
      Attempts to find and load the sequence dictionary if present.
    • findSequenceDictionary

      @Deprecated protected static File findSequenceDictionary(File file)
      Deprecated. Use findSequenceDictionary(Path) instead.
    • findSequenceDictionary

      protected static Path findSequenceDictionary(Path fastaPath)
      Attempts to locate the sequence dictionary file adjacent to the reference fasta file.
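
      One way to picture "adjacent to the reference fasta file" is the sketch below. It assumes the dictionary shares the FASTA's directory and base name, with the final extension replaced by ".dict" (for example ref.fasta and ref.dict). That naming rule is an assumption made for illustration, compound extensions such as .fa.gz are not handled, and DictionaryLocatorSketch is a hypothetical name.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration; the ".dict" naming rule is an assumption.
public class DictionaryLocatorSketch {

    /** Candidate dictionary path: same directory, last extension swapped for ".dict". */
    public static Path candidateDictionaryPath(Path fasta) {
        String name = fasta.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String base = dot > 0 ? name.substring(0, dot) : name;
        return fasta.resolveSibling(base + ".dict");
    }

    /** Returns the dictionary path if it exists on disk, otherwise null. */
    public static Path findDictionary(Path fasta) {
        Path candidate = candidateDictionaryPath(fasta);
        return Files.exists(candidate) ? candidate : null;
    }

    public static void main(String[] args) {
        System.out.println(candidateDictionaryPath(Paths.get("data", "ref.fasta")));
    }
}
```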
    • getPath

      protected Path getPath()
      Returns the path to the reference file.
    • getSource

      protected String getSource()
      Returns the named source of the reference file.
    • getSequenceDictionary

      public SAMSequenceDictionary getSequenceDictionary()
      Returns the list of sequence records associated with the reference sequence, or null if no sequence dictionary was found.
      Specified by:
      getSequenceDictionary in interface ReferenceSequenceFile
      Returns:
      a list of sequence records representing the sequences in this reference file, or null if none was found
    • getAbsolutePath

      protected String getAbsolutePath()
      Returns the full path to the reference file.
    • toString

      public String toString()
      Returns the full path to the reference file, or the source if no path was specified.
      Specified by:
      toString in interface ReferenceSequenceFile
      Overrides:
      toString in class Object
      Returns:
      Reference name, file name, or some other human-readable representation.