Class FastaSequenceFile

java.lang.Object
htsjdk.samtools.reference.FastaSequenceFile
All Implemented Interfaces:
ReferenceSequenceFile, Closeable, AutoCloseable

public class FastaSequenceFile extends Object
Implementation of ReferenceSequenceFile for reading from FASTA files.
  • Constructor Details

    • FastaSequenceFile

      public FastaSequenceFile(File file, boolean truncateNamesAtWhitespace)
      Constructs a FastaSequenceFile that reads from the specified file.
    • FastaSequenceFile

      public FastaSequenceFile(Path path, boolean truncateNamesAtWhitespace)
      Constructs a FastaSequenceFile that reads from the specified path.
    • FastaSequenceFile

      public FastaSequenceFile(String source, SeekableStream seekableStream, SAMSequenceDictionary dictionary, boolean truncateNamesAtWhitespace)
      Constructs a FastaSequenceFile that reads from the specified stream (which must not be compressed, i.e. the caller is responsible for decompressing the stream).
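
All three constructors take a truncateNamesAtWhitespace flag, which this page does not describe further. The following is a minimal sketch of what such truncation plausibly means for a FASTA header line. It assumes the name is everything after the leading '>' and that truncation cuts at the first whitespace character. FastaHeaderNames and parseName are hypothetical names used for illustration, not htsjdk code.

```java
// Hypothetical helper, not part of htsjdk: illustrates name truncation on a FASTA header line.
final class FastaHeaderNames {
    private FastaHeaderNames() {}

    /**
     * Extracts the sequence name from a header line such as ">chr1 description".
     * When truncateAtWhitespace is true, everything from the first whitespace on is dropped.
     */
    static String parseName(String headerLine, boolean truncateAtWhitespace) {
        if (headerLine == null || !headerLine.startsWith(">")) {
            throw new IllegalArgumentException("Not a FASTA header line: " + headerLine);
        }
        String name = headerLine.substring(1).trim();
        if (!truncateAtWhitespace) {
            return name;
        }
        // Assumption: truncation stops at the first whitespace character.
        for (int i = 0; i < name.length(); i++) {
            if (Character.isWhitespace(name.charAt(i))) {
                return name.substring(0, i);
            }
        }
        return name;
    }
}
```

Under these assumptions, the header ">chr1 Homo sapiens chromosome 1" gives "chr1" with the flag set and the full "chr1 Homo sapiens chromosome 1" without it.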
  • Method Details

    • close

      public void close()
      Closes this file and frees the memory it holds. Call it once you have finished reading.
    • nextSequence

      public ReferenceSequence nextSequence()
      Description copied from interface: ReferenceSequenceFile
      Retrieves the next whole sequence from the file.
      Returns:
      a ReferenceSequence or null if at the end of the file
    • reset

      public void reset()
      Description copied from interface: ReferenceSequenceFile
      Resets the ReferenceSequenceFile so that the next call to nextSequence() will return the first sequence in the file.
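
The close, nextSequence and reset methods together describe a sequential-read contract: read until null, rewind with reset, and close when finished. The sketch below mimics that contract on an in-memory string so it can run without the library. SequentialFastaSketch and its nested Record type are hypothetical stand-ins, and the parsing is deliberately simplistic. It is not how FastaSequenceFile is implemented.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in, not the htsjdk class: mimics the
// nextSequence()/reset()/close() contract described above.
final class SequentialFastaSketch implements AutoCloseable {
    /** Simple name/bases pair standing in for a reference sequence. */
    static final class Record {
        final String name;
        final String bases;
        Record(String name, String bases) { this.name = name; this.bases = bases; }
    }

    private List<Record> records;
    private int cursor = 0;

    SequentialFastaSketch(String fastaText) {
        records = parse(fastaText);
    }

    /** Returns the next whole sequence, or null at the end of the input. */
    Record nextSequence() {
        if (records == null) throw new IllegalStateException("closed");
        return cursor < records.size() ? records.get(cursor++) : null;
    }

    /** Rewinds so that the next call to nextSequence() returns the first sequence. */
    void reset() { cursor = 0; }

    /** Drops the parsed data so it can be garbage collected. */
    @Override
    public void close() { records = null; }

    private static List<Record> parse(String text) {
        List<Record> out = new ArrayList<>();
        String name = null;
        StringBuilder bases = new StringBuilder();
        for (String line : text.split("\\R")) {
            if (line.startsWith(">")) {
                // A new header ends the previous record, if any.
                if (name != null) out.add(new Record(name, bases.toString()));
                name = line.substring(1).trim();
                bases.setLength(0);
            } else if (name != null) {
                bases.append(line.trim());
            }
        }
        if (name != null) out.add(new Record(name, bases.toString()));
        return out;
    }

    /** Counts sequences by looping until nextSequence() returns null. */
    static int countSequences(SequentialFastaSketch file) {
        int n = 0;
        while (file.nextSequence() != null) n++;
        return n;
    }
}
```

Because the sketch implements AutoCloseable, as FastaSequenceFile does, it can be used in a try-with-resources statement so that close() runs even if reading fails.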
    • findAndLoadSequenceDictionary

      protected SAMSequenceDictionary findAndLoadSequenceDictionary(Path fasta)
      Attempts to find and load the sequence dictionary if present.
    • findSequenceDictionary

      @Deprecated protected static File findSequenceDictionary(File file)
      Deprecated.
      Use findSequenceDictionary(Path) instead.
    • findSequenceDictionary

      protected static Path findSequenceDictionary(Path fastaPath)
      Attempts to locate the sequence dictionary file adjacent to the reference fasta file.
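
This page does not say how the adjacent dictionary file is named. As an illustration only, the sketch below looks for a ".dict" file next to the FASTA file under two assumed conventions: the FASTA extension replaced by ".dict", and ".dict" appended to the full file name. DictionaryLocatorSketch and findAdjacentDictionary are hypothetical names, and both conventions are assumptions, not the documented lookup rules of findSequenceDictionary.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch, not the htsjdk implementation: looks for a ".dict"
// file next to a FASTA file under two assumed naming conventions.
final class DictionaryLocatorSketch {
    private DictionaryLocatorSketch() {}

    /** Returns the adjacent dictionary path if one exists, otherwise null. */
    static Path findAdjacentDictionary(Path fasta) {
        if (fasta == null || fasta.getFileName() == null) return null;
        String fileName = fasta.getFileName().toString();

        // Assumed convention 1: replace the final extension ("ref.fasta" -> "ref.dict").
        int dot = fileName.lastIndexOf('.');
        String base = dot > 0 ? fileName.substring(0, dot) : fileName;
        Path replaced = fasta.resolveSibling(base + ".dict");
        if (Files.exists(replaced)) return replaced;

        // Assumed convention 2: append the extension ("ref.fasta" -> "ref.fasta.dict").
        Path appended = fasta.resolveSibling(fileName + ".dict");
        return Files.exists(appended) ? appended : null;
    }
}
```

Returning null when nothing is found matches the "if present" wording above and the null result documented for getSequenceDictionary.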
    • getPath

      protected Path getPath()
      Returns the path to the reference file.
    • getSource

      protected String getSource()
      Returns the named source of the reference file.
    • getSequenceDictionary

      public SAMSequenceDictionary getSequenceDictionary()
      Returns the list of sequence records associated with the reference sequence if found; otherwise returns null.
      Specified by:
      getSequenceDictionary in interface ReferenceSequenceFile
      Returns:
      a list of sequence records representing the sequences in this reference file
    • getAbsolutePath

      protected String getAbsolutePath()
      Returns the full path to the reference file.
    • toString

      public String toString()
      Returns the full path to the reference file, or the source if no path was specified.
      Specified by:
      toString in interface ReferenceSequenceFile
      Overrides:
      toString in class Object
      Returns:
      Reference name, file name, or some other human-readable representation.
    • isIndexed

      public boolean isIndexed()
      Default implementation; override if an index is supported.
      Specified by:
      isIndexed in interface ReferenceSequenceFile
      Returns:
      true if getSequence and getSubsequenceAt methods are allowed.
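
Since isIndexed reports whether getSequence and getSubsequenceAt are allowed, a caller can branch on it and fall back to a sequential scan when random access is unavailable. The sketch below shows that pattern with a small hypothetical Reference interface, not the real ReferenceSequenceFile. The non-indexed demo object throwing UnsupportedOperationException from getSequence is an assumption made for the example.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-ins, not htsjdk types: show how a caller might branch on isIndexed().
final class IndexedAccessSketch {
    private IndexedAccessSketch() {}

    /** Minimal subset of the methods described on this page. */
    interface Reference {
        boolean isIndexed();
        String getSequence(String contig);   // random access by contig name
        String[] nextSequence();             // {name, bases}, or null at end of file
        void reset();
    }

    /** Uses random access when available; otherwise rescans from the start. */
    static String fetch(Reference ref, String contig) {
        if (ref.isIndexed()) {
            return ref.getSequence(contig);
        }
        ref.reset();
        for (String[] rec = ref.nextSequence(); rec != null; rec = ref.nextSequence()) {
            if (rec[0].equals(contig)) return rec[1];
        }
        return null; // contig not present
    }

    /** Non-indexed in-memory reference used only for demonstration. */
    static Reference sequentialOnly(LinkedHashMap<String, String> contigs) {
        return new Reference() {
            private Iterator<Map.Entry<String, String>> it = contigs.entrySet().iterator();

            public boolean isIndexed() { return false; }

            public String getSequence(String contig) {
                // Assumption for this sketch: random access is refused without an index.
                throw new UnsupportedOperationException("no index available");
            }

            public String[] nextSequence() {
                if (!it.hasNext()) return null;
                Map.Entry<String, String> e = it.next();
                return new String[] { e.getKey(), e.getValue() };
            }

            public void reset() { it = contigs.entrySet().iterator(); }
        };
    }
}
```

The sequential fallback reads every sequence up to the requested contig, so it suits occasional lookups far better than repeated ones.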
    • getSequence

      public ReferenceSequence getSequence(String contig)
      Default implementation; override if an index is supported.
      Specified by:
      getSequence in interface ReferenceSequenceFile
      Parameters:
      contig - contig whose data should be returned.
      Returns:
      The full sequence associated with this contig.
    • getSubsequenceAt

      public ReferenceSequence getSubsequenceAt(String contig, long start, long stop)
      Default implementation; override if an index is supported.
      Specified by:
      getSubsequenceAt in interface ReferenceSequenceFile
      Parameters:
      contig - Contig whose subsequence to retrieve.
      start - inclusive, 1-based start of region.
      stop - inclusive, 1-based stop of region.
      Returns:
      The partial reference sequence associated with this range.
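
The start and stop parameters of getSubsequenceAt are 1-based and inclusive, whereas Java arrays are 0-based with exclusive end indices. The sketch below shows the conversion on a plain byte array. SubsequenceCoordinates is a hypothetical helper for illustration and does not call into htsjdk.

```java
import java.util.Arrays;

// Hypothetical helper, not part of htsjdk: maps 1-based inclusive coordinates
// onto Java's 0-based, end-exclusive array ranges.
final class SubsequenceCoordinates {
    private SubsequenceCoordinates() {}

    /** Returns the bases from start to stop, where both are 1-based and inclusive. */
    static byte[] slice(byte[] bases, long start, long stop) {
        if (start < 1 || stop < start || stop > bases.length) {
            throw new IllegalArgumentException(
                "Invalid range " + start + "-" + stop + " for length " + bases.length);
        }
        // 1-based inclusive [start, stop] -> 0-based half-open [start - 1, stop)
        return Arrays.copyOfRange(bases, (int) (start - 1), (int) stop);
    }

    /** Number of bases covered by a 1-based inclusive range. */
    static long length(long start, long stop) {
        return stop - start + 1;
    }
}
```

For the bases ACGTACGT, the range start = 2, stop = 4 covers three bases and yields CGT.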