Class IntervalList

java.lang.Object
htsjdk.samtools.util.IntervalList
All Implemented Interfaces:
Iterable<Interval>

public class IntervalList extends Object implements Iterable<Interval>
Represents a list of intervals against a reference sequence that can be written to and read from a file. The file format is relatively simple and loosely mirrors the SAM alignment format. The file must begin with a SAM-style header that lists the sequence records against which the intervals are described. After the header, the file contains one record per line, in text format, with the following tab-separated values:
  • Sequence name
  • Start position (1-based)
  • End position (1-based, end inclusive)
  • Strand (either + or -)
  • Interval name (an, ideally unique, name for the interval)
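To make the record layout concrete, here is a minimal plain-Java sketch (no htsjdk dependency) that splits one record line into the five tab-separated fields listed above. The class `IntervalRecordSketch` and its `parse` method are hypothetical names used only for illustration; they are not part of htsjdk, and the real parser additionally validates records against the sequence dictionary in the header.

```java
final class IntervalRecordSketch {
    final String sequence;
    final int start;              // 1-based
    final int end;                // 1-based, inclusive
    final boolean negativeStrand; // true for "-", false for "+"
    final String name;

    IntervalRecordSketch(String sequence, int start, int end,
                         boolean negativeStrand, String name) {
        this.sequence = sequence;
        this.start = start;
        this.end = end;
        this.negativeStrand = negativeStrand;
        this.name = name;
    }

    /** Parses one tab-separated record line (hypothetical helper). */
    static IntervalRecordSketch parse(String line) {
        String[] fields = line.split("\t", -1);
        if (fields.length != 5) {
            throw new IllegalArgumentException(
                "Expected 5 tab-separated fields, got " + fields.length);
        }
        String strand = fields[3];
        if (!strand.equals("+") && !strand.equals("-")) {
            throw new IllegalArgumentException("Strand must be + or -: " + strand);
        }
        return new IntervalRecordSketch(
            fields[0],
            Integer.parseInt(fields[1]),
            Integer.parseInt(fields[2]),
            strand.equals("-"),
            fields[4]);
    }

    /** Number of bases covered; both coordinates are inclusive. */
    int length() {
        return end - start + 1;
    }

    public static void main(String[] args) {
        IntervalRecordSketch r = parse("chr1\t100\t200\t+\ttarget_1");
        System.out.println(r.sequence + ":" + r.start + "-" + r.end
            + " length=" + r.length());
    }
}
```

Because the end position is inclusive, an interval from 100 to 200 covers 101 bases, not 100.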
  • Field Details

  • Constructor Details

    • IntervalList

      public IntervalList(SAMFileHeader header)
      Constructs a new interval list using the supplied header information.
    • IntervalList

      public IntervalList(SAMSequenceDictionary dict)
      Constructs a new interval list using the supplied sequence dictionary.
  • Method Details

    • getHeader

      public SAMFileHeader getHeader()
      Gets the header (if there is one) for the interval list.
    • iterator

      public Iterator<Interval> iterator()
      Returns an iterator over the intervals.
      Specified by:
      iterator in interface Iterable<Interval>
    • add

      public void add(Interval interval)
      Adds an interval to the list of intervals.
    • addall

      public void addall(Collection<Interval> intervals)
      Adds a Collection of intervals to the list of intervals.
    • sort

      @Deprecated public void sort()
      Deprecated.
      Use sorted() instead.
      Sorts the internal collection of intervals by coordinate. Note: this method modifies the object in place and is therefore difficult to work with.
    • padded

      public IntervalList padded(int before, int after)
      Returns a new IntervalList where each interval is padded by the specified number of bases: 'before' bases ahead of its start and 'after' bases past its end.
    • padded

      public IntervalList padded(int padding)
      Returns a new IntervalList where each interval is padded by 'padding' bases on each side.
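The padding arithmetic can be sketched in plain Java as follows. This is an illustration only, not htsjdk code: `PaddingSketch` and `pad` are hypothetical names, and the clamping of the result to the range [1, sequence length] is an assumption about sensible behaviour that the descriptions above do not state.

```java
final class PaddingSketch {
    /**
     * Returns {start, end} after padding. Clamping to [1, sequenceLength]
     * is an assumption made for this sketch.
     */
    static int[] pad(int start, int end, int before, int after, int sequenceLength) {
        if (before < 0 || after < 0) {
            throw new IllegalArgumentException("Padding values must be >= 0");
        }
        int paddedStart = Math.max(1, start - before);
        int paddedEnd = Math.min(sequenceLength, end + after);
        return new int[] {paddedStart, paddedEnd};
    }

    public static void main(String[] args) {
        int[] p = pad(50, 100, 10, 20, 1000);
        System.out.println(p[0] + "-" + p[1]); // prints "40-120"
    }
}
```

Under this sketch, a single-argument `padded(padding)` corresponds to calling `pad` with the same value for both `before` and `after`.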
    • sorted

      public IntervalList sorted()
      Returns an independent, sorted IntervalList.
    • uniqued

      public IntervalList uniqued()
      Returns an independent IntervalList that is sorted and uniquified.
    • uniqued

      public IntervalList uniqued(boolean concatenateNames)
      Returns an independent IntervalList that is sorted and uniquified.
      Parameters:
      concatenateNames - If false, interval names are not concatenated when intervals are merged, which saves space.
    • unique

      @Deprecated public void unique()
      Deprecated.
      Use uniqued() instead.
      Sorts and uniques the list of intervals held within this interval list. Note: this method modifies the object in place and is therefore difficult to work with.
    • unique

      @Deprecated public void unique(boolean concatenateNames)
      Deprecated.
      Use uniqued(boolean) instead.
      Sorts and uniques the list of intervals held within this interval list. Note: this method modifies the object in place and is therefore difficult to work with.
      Parameters:
      concatenateNames - If false, interval names are not concatenated when intervals are merged, which saves space.
    • getIntervals

      public List<Interval> getIntervals()
      Gets the set of intervals as held internally.
    • getUniqueIntervals

      @Deprecated public List<Interval> getUniqueIntervals()
      Deprecated.
      Use uniqued().getIntervals() instead.
      Merges the list of intervals and then reduces them down where regions overlap or are directly adjacent to one another. During this process the "merged" interval will retain the strand and name of the 5' most interval merged. Note: has the side effect of sorting the stored intervals in coordinate order if not already sorted. Note: this method modifies the object in place and is therefore difficult to work with.
      Returns:
      the set of unique intervals condensed from the contained intervals
    • getUniqueIntervals

      public static List<Interval> getUniqueIntervals(IntervalList list, boolean concatenateNames)
      Merges the list of intervals and reduces them in the same way as getUniqueIntervals().
      Parameters:
      concatenateNames - If false, the merged interval takes the name of the earlier interval, which keeps names shorter.
    • getUniqueIntervals

      public static List<Interval> getUniqueIntervals(IntervalList list, boolean concatenateNames, boolean enforceSameStrands)
      Merges the list of intervals and reduces them in the same way as getUniqueIntervals().
      Parameters:
      concatenateNames - If false, the merged interval takes the name of the earlier interval, which keeps names shorter.
      enforceSameStrands - If true, enforces that merged intervals have the same strand; otherwise strand is ignored.
    • getUniqueIntervals

      public static List<Interval> getUniqueIntervals(IntervalList list, boolean combineAbuttingIntervals, boolean concatenateNames, boolean enforceSameStrands)
      Merges the list of intervals and reduces them in the same way as getUniqueIntervals().
      Parameters:
      combineAbuttingIntervals - If true, abutting intervals are combined into one interval.
      concatenateNames - If false, the merged interval takes the name of the earlier interval, which keeps names shorter.
      enforceSameStrands - If true, enforces that merged intervals have the same strand; otherwise strand is ignored.
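The merging rule described for the getUniqueIntervals variants (overlapping intervals are collapsed, and abutting ones too when combineAbuttingIntervals is true) can be illustrated with a small plain-Java sketch. It works on {start, end} pairs for a single sequence and ignores names and strands. `MergeSketch` and `merge` are hypothetical names, not htsjdk code, and the library's own implementation may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class MergeSketch {
    /** Merges overlapping (and optionally abutting) {start, end} pairs. */
    static List<int[]> merge(List<int[]> intervals, boolean combineAbutting) {
        // Sort a copy by start, then end, so the input list is left untouched.
        List<int[]> sorted = new ArrayList<>(intervals);
        sorted.sort(Comparator.<int[]>comparingInt(a -> a[0])
                              .thenComparingInt(a -> a[1]));

        List<int[]> merged = new ArrayList<>();
        for (int[] iv : sorted) {
            if (!merged.isEmpty()) {
                int[] last = merged.get(merged.size() - 1);
                // Abutting means the next start is exactly last end + 1.
                int limit = combineAbutting ? last[1] + 1 : last[1];
                if (iv[0] <= limit) {
                    last[1] = Math.max(last[1], iv[1]);
                    continue;
                }
            }
            merged.add(new int[] {iv[0], iv[1]});
        }
        return merged;
    }

    public static void main(String[] args) {
        List<int[]> input = Arrays.asList(
            new int[] {1, 10}, new int[] {5, 20},
            new int[] {21, 30}, new int[] {40, 50});
        for (int[] iv : merge(input, true)) {
            System.out.println(iv[0] + "-" + iv[1]);
        }
    }
}
```

With abutting intervals combined, the four inputs collapse to 1-30 and 40-50; without it, 21-30 stays separate from 1-20.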
    • getUniqueIntervals

      @Deprecated public List<Interval> getUniqueIntervals(boolean concatenateNames)
      Deprecated.
      Use uniqued(boolean).getIntervals() or getUniqueIntervals(IntervalList, boolean) instead.
      Merges the list of intervals and reduces them in the same way as getUniqueIntervals(). Note: this method modifies the object in place and is therefore difficult to work with.
      Parameters:
      concatenateNames - If false, the merged interval takes the name of the earlier interval, which keeps names shorter.
    • breakIntervalsAtBandMultiples

      public static List<Interval> breakIntervalsAtBandMultiples(List<Interval> intervals, int bandMultiple)
      Given a list of Intervals and a band multiple, returns a list of Intervals in which no interval straddles an integer multiple of that band. For example, if there is an interval (7200-9300) and the bandMultiple is 1000, the interval will be split into (7200-7999, 8000-8999, 9000-9300).
      Parameters:
      intervals - a list of Intervals
      bandMultiple - a positive integer (> 0); intervals in the list are broken up at integer multiples of this value
      Returns:
      list of intervals that are broken up
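The splitting rule can be reproduced with a short plain-Java sketch for a single {start, end} interval. `BandSplitSketch` and `split` are hypothetical names used only to illustrate the arithmetic; this is not the htsjdk implementation, and it assumes positive coordinates.

```java
import java.util.ArrayList;
import java.util.List;

final class BandSplitSketch {
    /** Splits [start, end] so that no piece straddles a multiple of bandMultiple. */
    static List<int[]> split(int start, int end, int bandMultiple) {
        if (bandMultiple <= 0) {
            throw new IllegalArgumentException("bandMultiple must be > 0");
        }
        List<int[]> pieces = new ArrayList<>();
        int pieceStart = start;
        while (pieceStart <= end) {
            // First multiple of bandMultiple strictly greater than pieceStart.
            int nextBoundary = (pieceStart / bandMultiple + 1) * bandMultiple;
            int pieceEnd = Math.min(end, nextBoundary - 1);
            pieces.add(new int[] {pieceStart, pieceEnd});
            pieceStart = pieceEnd + 1;
        }
        return pieces;
    }

    public static void main(String[] args) {
        // Prints 7200-7999, 8000-8999 and 9000-9300, one per line.
        for (int[] p : split(7200, 9300, 1000)) {
            System.out.println(p[0] + "-" + p[1]);
        }
    }
}
```

An interval that does not cross any multiple of the band, such as 100-200 with a band of 1000, comes back unchanged as a single piece.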
    • getBaseCount

      public long getBaseCount()
      Gets the (potentially redundant) sum of the length of the intervals in the list.
    • getUniqueBaseCount

      public long getUniqueBaseCount()
      Gets the count of unique bases represented by the intervals in the list.
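The distinction between the two counts matters when intervals overlap. The following plain-Java sketch, which uses {start, end} pairs on a single sequence, shows one way the two numbers can differ. `BaseCountSketch` and its methods are hypothetical names for illustration and do not reproduce htsjdk's code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class BaseCountSketch {
    /** Sum of interval lengths; overlapping bases are counted more than once. */
    static long baseCount(List<int[]> intervals) {
        long total = 0;
        for (int[] iv : intervals) {
            total += iv[1] - iv[0] + 1;
        }
        return total;
    }

    /** Number of distinct positions covered by at least one interval. */
    static long uniqueBaseCount(List<int[]> intervals) {
        List<int[]> sorted = new ArrayList<>(intervals);
        sorted.sort(Comparator.<int[]>comparingInt(a -> a[0]));
        long total = 0;
        int coveredEnd = 0; // positions are 1-based, so 0 means nothing covered yet
        for (int[] iv : sorted) {
            int from = Math.max(iv[0], coveredEnd + 1);
            if (iv[1] >= from) {
                total += iv[1] - from + 1;
            }
            coveredEnd = Math.max(coveredEnd, iv[1]);
        }
        return total;
    }

    public static void main(String[] args) {
        List<int[]> ivs = Arrays.asList(new int[] {1, 10}, new int[] {5, 20});
        System.out.println(baseCount(ivs));       // prints 26
        System.out.println(uniqueBaseCount(ivs)); // prints 20
    }
}
```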
    • size

      public int size()
      Returns the count of intervals in the list.
    • copyOf

      public static IntervalList copyOf(IntervalList list)
      Creates an independent copy of the given IntervalList.
    • fromFile

      public static IntervalList fromFile(File file)
      Parses an interval list from a file.
      Parameters:
      file - the file containing the intervals
      Returns:
      an IntervalList object that contains the headers and intervals from the file
    • fromPath

      public static IntervalList fromPath(Path path)
      Parses an interval list from a path.
      Parameters:
      path - the path containing the intervals
      Returns:
      an IntervalList object that contains the headers and intervals from the path
    • fromName

      public static IntervalList fromName(SAMFileHeader header, String sequenceName)
      Creates an IntervalList from the given sequence name
      Parameters:
      header - header to use to create IntervalList
      sequenceName - name of sequence in header
      Returns:
      a new intervalList with given header that contains the reference name
    • fromFiles

      public static IntervalList fromFiles(Collection<File> intervalListFiles)
      Calls fromFile(File) on each of the provided files and returns the result of passing the resulting lists to concatenate(Collection).
    • fromReader

      public static IntervalList fromReader(BufferedReader in)
      Parses an interval list from a reader in a stream based fashion.
      Parameters:
      in - a BufferedReader that can be read from. The caller is responsible for closing the reader as needed.
      Returns:
      an IntervalList object that contains the headers and intervals from the reader
      Throws:
      IllegalArgumentException - if start or end are less than 1 or greater than the length of the sequence
    • write

      public void write(Path path)
      Writes out the list of intervals to the supplied path.
      Parameters:
      path - a path to write to. If it exists, it will be overwritten.
    • write

      public void write(File file)
      Writes out the list of intervals to the supplied file.
      Parameters:
      file - a file to write to. If it exists, it will be overwritten.
    • intersection

      public static IntervalList intersection(IntervalList list1, IntervalList list2)
      A utility function for generating the intersection of two IntervalLists. Checks for equal dictionaries.
      Parameters:
      list1 - the first IntervalList
      list2 - the second IntervalList
      Returns:
      the intersection of list1 and list2.
    • intersection

      public static IntervalList intersection(Collection<IntervalList> lists)
      A utility function for intersecting a collection of IntervalLists. Checks for equal dictionaries.
      Parameters:
      lists - the collection of IntervalLists
      Returns:
      the intersection of all the IntervalLists in lists.
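As an illustration of what intersecting two lists means at the coordinate level, the plain-Java sketch below intersects two lists of {start, end} pairs on one sequence with a two-pointer sweep. It assumes each input is already sorted and free of internal overlaps. `IntersectionSketch` and `intersect` are hypothetical names, not the htsjdk implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class IntersectionSketch {
    /** Intersects two sorted, non-overlapping lists of {start, end} pairs. */
    static List<int[]> intersect(List<int[]> a, List<int[]> b) {
        List<int[]> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < a.size() && j < b.size()) {
            int start = Math.max(a.get(i)[0], b.get(j)[0]);
            int end = Math.min(a.get(i)[1], b.get(j)[1]);
            if (start <= end) {
                out.add(new int[] {start, end});
            }
            // Advance whichever interval finishes first.
            if (a.get(i)[1] < b.get(j)[1]) {
                i++;
            } else {
                j++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<int[]> a = Arrays.asList(new int[] {1, 10}, new int[] {20, 30});
        List<int[]> b = Arrays.asList(new int[] {5, 25});
        for (int[] iv : intersect(a, b)) {
            System.out.println(iv[0] + "-" + iv[1]);
        }
    }
}
```

Intersecting a whole collection can then be expressed as folding this pairwise operation over the lists.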
    • concatenate

      public static IntervalList concatenate(IntervalList list1, IntervalList list2)
      A utility function for concatenating two IntervalLists. Checks for equal dictionaries. Concatenating does not look for overlapping intervals or uniquify them.
      Parameters:
      list1 - the first list
      list2 - the second list
      Returns:
      the concatenation of list1 and list2.
    • addOther

      public IntervalList addOther(IntervalList other)
      A method for concatenating the intervals from another list onto this one. Checks for equal dictionaries. Does not look for overlapping intervals or uniquify them.
      Parameters:
      other - the other list
      Returns:
      the modified this
    • concatenate

      public static IntervalList concatenate(Collection<IntervalList> lists)
      A utility function for concatenating a collection of IntervalLists. Checks for equal dictionaries. Concatenating does not look for overlapping intervals or uniquify the intervals.
      Parameters:
      lists - a collection of IntervalLists
      Returns:
      the concatenation of all the IntervalLists in lists.
    • union

      public static IntervalList union(Collection<IntervalList> lists)
      A utility function for finding the union of a collection of IntervalLists. Checks for equal dictionaries. Also looks for overlapping intervals, uniquifies, and sorts (by coordinate).
      Parameters:
      lists - the collection of IntervalLists
      Returns:
      the union of all the IntervalLists in lists.
    • union

      public static IntervalList union(IntervalList list1, IntervalList list2)
    • invert

      public static IntervalList invert(IntervalList list)
      Inverts an IntervalList, returning one that contains exactly those bases in the dictionary that the original does not.
      Parameters:
      list - an IntervalList
      Returns:
      an IntervalList that is complementary to list
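For a single sequence, inversion amounts to collecting the gaps between intervals, plus the stretches before the first and after the last one. The plain-Java sketch below shows this for sorted {start, end} pairs and a known sequence length. `InvertSketch` and `invert` are hypothetical names for illustration; the library version works across every sequence in the dictionary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class InvertSketch {
    /** Complements sorted {start, end} pairs within [1, sequenceLength]. */
    static List<int[]> invert(List<int[]> sortedIntervals, int sequenceLength) {
        List<int[]> out = new ArrayList<>();
        int nextUncovered = 1; // first position not yet known to be covered
        for (int[] iv : sortedIntervals) {
            if (iv[0] > nextUncovered) {
                out.add(new int[] {nextUncovered, iv[0] - 1});
            }
            nextUncovered = Math.max(nextUncovered, iv[1] + 1);
        }
        if (nextUncovered <= sequenceLength) {
            out.add(new int[] {nextUncovered, sequenceLength});
        }
        return out;
    }

    public static void main(String[] args) {
        List<int[]> ivs = Arrays.asList(new int[] {10, 20}, new int[] {50, 60});
        // Prints 1-9, 21-49 and 61-100, one per line.
        for (int[] iv : invert(ivs, 100)) {
            System.out.println(iv[0] + "-" + iv[1]);
        }
    }
}
```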
    • subtract

      public static IntervalList subtract(IntervalList lhs, IntervalList rhs)
      A utility function for subtracting one IntervalList from another. Resulting loci are those that are in the first but not the second.
      Parameters:
      lhs - the IntervalList from which to subtract intervals
      rhs - the IntervalList to subtract
      Returns:
      an IntervalList comprising all loci that are in the first IntervalList but not the second (lhs - rhs).
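Subtraction operates on loci rather than on whole intervals, so an interval in lhs can be trimmed or split by rhs. The plain-Java sketch below demonstrates this on one sequence by marking covered positions in a `java.util.BitSet`. This is a deliberately simple technique chosen for clarity, not how htsjdk implements it, and `SubtractSketch` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

final class SubtractSketch {
    /** Marks every position covered by the {start, end} pairs. */
    static BitSet toBits(List<int[]> intervals) {
        BitSet bits = new BitSet();
        for (int[] iv : intervals) {
            bits.set(iv[0], iv[1] + 1); // BitSet.set's upper bound is exclusive
        }
        return bits;
    }

    /** Converts runs of set bits back into {start, end} pairs. */
    static List<int[]> toIntervals(BitSet bits) {
        List<int[]> out = new ArrayList<>();
        int start = bits.nextSetBit(0);
        while (start >= 0) {
            int firstClear = bits.nextClearBit(start);
            out.add(new int[] {start, firstClear - 1});
            start = bits.nextSetBit(firstClear);
        }
        return out;
    }

    /** Loci in lhs that are not in rhs. */
    static List<int[]> subtract(List<int[]> lhs, List<int[]> rhs) {
        BitSet result = toBits(lhs);
        result.andNot(toBits(rhs));
        return toIntervals(result);
    }

    public static void main(String[] args) {
        List<int[]> lhs = Arrays.asList(new int[] {1, 100});
        List<int[]> rhs = Arrays.asList(new int[] {20, 30}, new int[] {90, 120});
        // Prints 1-19 and 31-89, one per line.
        for (int[] iv : subtract(lhs, rhs)) {
            System.out.println(iv[0] + "-" + iv[1]);
        }
    }
}
```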
    • subtract

      public static IntervalList subtract(Collection<IntervalList> lhs, Collection<IntervalList> rhs)
      A utility function for subtracting one collection of IntervalLists from another. Resulting loci are those that are in the first collection but not the second.
      Parameters:
      lhs - the collection of IntervalLists from which to subtract intervals
      rhs - the collection of IntervalLists to subtract
      Returns:
      an IntervalList comprising all loci that are in the first collection but not the second (lhs - rhs).
    • difference

      public static IntervalList difference(Collection<IntervalList> lists1, Collection<IntervalList> lists2)
      A utility function for finding the difference between two collections of IntervalLists.
      Parameters:
      lists1 - the first collection of IntervalLists
      lists2 - the second collection of IntervalLists
      Returns:
      the difference between the two collections, i.e. the loci that are in one collection but not both
    • difference

      public static IntervalList difference(IntervalList list1, IntervalList list2)
      A utility function for finding the difference between two IntervalLists.
      Parameters:
      list1 - the first IntervalList
      list2 - the second IntervalList
      Returns:
      the difference between the two IntervalLists, i.e. the loci that are in one IntervalList but not both
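Note that "difference" here is symmetric: it keeps loci found in exactly one of the two inputs, unlike subtract, which is one-sided. The plain-Java sketch below shows this on one sequence using `java.util.BitSet.xor`. The BitSet approach is a simplification chosen for this example rather than the library's method, and `DifferenceSketch` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

final class DifferenceSketch {
    /** Marks every position covered by the {start, end} pairs. */
    static BitSet toBits(List<int[]> intervals) {
        BitSet bits = new BitSet();
        for (int[] iv : intervals) {
            bits.set(iv[0], iv[1] + 1); // upper bound is exclusive
        }
        return bits;
    }

    /** Converts runs of set bits back into {start, end} pairs. */
    static List<int[]> toIntervals(BitSet bits) {
        List<int[]> out = new ArrayList<>();
        int start = bits.nextSetBit(0);
        while (start >= 0) {
            int firstClear = bits.nextClearBit(start);
            out.add(new int[] {start, firstClear - 1});
            start = bits.nextSetBit(firstClear);
        }
        return out;
    }

    /** Loci that are in exactly one of the two lists. */
    static List<int[]> difference(List<int[]> list1, List<int[]> list2) {
        BitSet result = toBits(list1);
        result.xor(toBits(list2));
        return toIntervals(result);
    }

    public static void main(String[] args) {
        List<int[]> a = Arrays.asList(new int[] {1, 10});
        List<int[]> b = Arrays.asList(new int[] {5, 20});
        // Prints 1-4 and 11-20, one per line.
        for (int[] iv : difference(a, b)) {
            System.out.println(iv[0] + "-" + iv[1]);
        }
    }
}
```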
    • overlaps

      public static IntervalList overlaps(IntervalList lhs, IntervalList rhs)
      A utility function for finding the intervals in the first list that have at least 1bp overlap with any interval in the second list.
      Parameters:
      lhs - the first IntervalList
      rhs - the second IntervalList
      Returns:
      an IntervalList comprising all intervals in the first IntervalList that have at least 1bp overlap with any interval in the second.
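Unlike intersection, overlaps returns whole intervals from the first list rather than the shared loci. The plain-Java sketch below illustrates that filtering behaviour for {start, end} pairs on one sequence, using a simple quadratic scan. `OverlapsSketch` and `overlaps` are hypothetical names for illustration, not the htsjdk implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class OverlapsSketch {
    /** Keeps each lhs interval that shares at least one base with some rhs interval. */
    static List<int[]> overlaps(List<int[]> lhs, List<int[]> rhs) {
        List<int[]> out = new ArrayList<>();
        for (int[] a : lhs) {
            for (int[] b : rhs) {
                // Inclusive coordinates: the two overlap if neither ends before the other starts.
                if (a[0] <= b[1] && b[0] <= a[1]) {
                    out.add(a);
                    break;
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<int[]> lhs = Arrays.asList(
            new int[] {1, 10}, new int[] {20, 30}, new int[] {40, 50});
        List<int[]> rhs = Arrays.asList(new int[] {10, 12}, new int[] {45, 100});
        // Prints 1-10 and 40-50, one per line; the intervals are not clipped.
        for (int[] iv : overlaps(lhs, rhs)) {
            System.out.println(iv[0] + "-" + iv[1]);
        }
    }
}
```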
    • overlaps

      public static IntervalList overlaps(Collection<IntervalList> lists1, Collection<IntervalList> lists2)
      A utility function for finding the intervals in the first collection of lists that have at least 1bp overlap with any interval in the second collection of lists.
      Parameters:
      lists1 - the first collection of IntervalLists
      lists2 - the second collection of IntervalLists
      Returns:
      an IntervalList comprising all intervals in the first collection of lists that have at least 1bp overlap with any interval in the second collection.
    • equals

      public boolean equals(Object o)
      Overrides:
      equals in class Object
    • hashCode

      public int hashCode()
      Overrides:
      hashCode in class Object