Class ReadNameParser

java.lang.Object
picard.sam.util.ReadNameParser
All Implemented Interfaces:
Serializable
Direct Known Subclasses:
OpticalDuplicateFinder

public class ReadNameParser extends Object implements Serializable
Provides access to the physical location information about a cluster. All values should default to -1 if unavailable. ReadGroup and Tile should only allow non-zero positive integers; x and y coordinates may be negative.
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    static final String
    DEFAULT_READ_NAME_REGEX
    The read name regular expression (regex) is used to extract three pieces of information from the read name: tile, x location, and y location.
    protected final String
    readNameRegex
     
  • Constructor Summary

    Constructors
    Constructor
    Description
    ReadNameParser()
    Creates a read name parser using the default read name regex and optical duplicate distance.
    ReadNameParser(String readNameRegex)
    Creates a read name parser using the given read name regex.
    ReadNameParser(String readNameRegex, htsjdk.samtools.util.Log log)
    Creates a read name parser using the given read name regex.
  • Method Summary

    Modifier and Type
    Method
    Description
    boolean
    addLocationInformation(String readName, PhysicalLocation loc)
     
    static int
    getLastThreeFields(String readName, char delim, int[] tokens)
    Given a string, splits the string by the delimiter and returns the last three fields parsed as integers.
    static int
    rapidParseInt(String input)
    Very specialized method to rapidly parse a sequence of digits from a String up until the first non-digit character.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • DEFAULT_READ_NAME_REGEX

      public static final String DEFAULT_READ_NAME_REGEX
      The read name regular expression (regex) is used to extract three pieces of information from the read name: tile, x location, and y location. Any read name regex should parse the read name to produce these and only these values. An example regex is: (?:.*:)?([0-9]+)[^:]*:([0-9]+)[^:]*:([0-9]+)[^:]*$ which assumes that fields in the read name are delimited by ':' and that the last three fields correspond to the tile, x, and y locations, ignoring any trailing non-digit characters. The default regex is optimized for fast parsing (see getLastThreeFields(String, char, int[])): it searches for the last three fields, assuming the delimiter ':' and ignoring any trailing non-digit characters. This should correctly handle read names that have 5 or 7 fields with the last three fields being tile/x/y, as is the case for the majority of read names produced by Illumina technology.
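      To illustrate how the example regex above captures tile, x, and y, the following is a minimal sketch using only java.util.regex. The class name ReadNameRegexDemo, the helper parse, and the sample read name are assumptions made for illustration; they are not part of Picard.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: applies the example regex from the field description.
final class ReadNameRegexDemo {
    // The example regex quoted in the description of DEFAULT_READ_NAME_REGEX.
    static final Pattern EXAMPLE_REGEX =
            Pattern.compile("(?:.*:)?([0-9]+)[^:]*:([0-9]+)[^:]*:([0-9]+)[^:]*$");

    // Hypothetical helper: returns {tile, x, y}, or null if the name does not match.
    static int[] parse(String readName) {
        Matcher m = EXAMPLE_REGEX.matcher(readName);
        if (!m.matches()) {
            return null;
        }
        return new int[] {
            Integer.parseInt(m.group(1)), // tile
            Integer.parseInt(m.group(2)), // x
            Integer.parseInt(m.group(3))  // y
        };
    }

    public static void main(String[] args) {
        // A made-up 5-field read name with a trailing non-digit suffix on the last field.
        int[] loc = parse("H0164ALXX140820:2:1101:10003:23460/1");
        System.out.println("tile=" + loc[0] + " x=" + loc[1] + " y=" + loc[2]);
        // prints "tile=1101 x=10003 y=23460"
    }
}
```

      The three capture groups are the only groups in the pattern, which matches the requirement that a read name regex produce tile, x, and y and nothing else.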
    • readNameRegex

      protected final String readNameRegex
  • Constructor Details

    • ReadNameParser

      public ReadNameParser()
      Creates a read name parser using the default read name regex and optical duplicate distance. See DEFAULT_READ_NAME_REGEX for an explanation of how the read name is parsed.
    • ReadNameParser

      public ReadNameParser(String readNameRegex)
      Creates a read name parser using the given read name regex. See DEFAULT_READ_NAME_REGEX for an explanation of how to format the regular expression (regex) string.
      Parameters:
      readNameRegex - the read name regular expression string to parse read names, null to never parse location information.
    • ReadNameParser

      public ReadNameParser(String readNameRegex, htsjdk.samtools.util.Log log)
      Creates a read name parser using the given read name regex. See DEFAULT_READ_NAME_REGEX for an explanation of how to format the regular expression (regex) string.
      Parameters:
      readNameRegex - the read name regular expression string to parse read names, null to never parse location information.
      log - the log to which to write messages.
  • Method Details

    • addLocationInformation

      public boolean addLocationInformation(String readName, PhysicalLocation loc)
    • getLastThreeFields

      public static int getLastThreeFields(String readName, char delim, int[] tokens) throws NumberFormatException
      Given a string, splits the string by the delimiter and returns the last three fields parsed as integers. Parsing a field considers only the sequence of digits up until the first non-digit character. The three values are stored in the passed-in array.
      Throws:
      NumberFormatException - if any of the tokens that should contain numbers do not start with parsable numbers
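      The behavior described above can be approximated without a regex by scanning for delimiters from the end of the string. The sketch below is an independent illustration, not Picard's implementation: the class LastThreeFieldsSketch and the helpers lastThree and leadingInt are hypothetical names, and because the meaning of the int returned by getLastThreeFields is not described here, the sketch simply returns the filled array and throws IllegalArgumentException when fewer than three fields exist.

```java
// Illustrative only: extracts the last three delimited fields as integers.
final class LastThreeFieldsSketch {
    // Hypothetical helper: parses an optional '-' and the leading ASCII digits of a field.
    // Integer.parseInt throws NumberFormatException if no digits are present.
    static int leadingInt(String field) {
        int end = 0;
        if (end < field.length() && field.charAt(end) == '-') {
            end++;
        }
        while (end < field.length() && field.charAt(end) >= '0' && field.charAt(end) <= '9') {
            end++;
        }
        return Integer.parseInt(field.substring(0, end));
    }

    // Hypothetical helper: walks backwards over the delimiters, filling tokens[2], [1], [0].
    static int[] lastThree(String readName, char delim) {
        int[] tokens = new int[3];
        int end = readName.length();
        for (int t = 2; t >= 0; t--) {
            int start = readName.lastIndexOf(delim, end - 1); // -1 when no delimiter remains
            if (start < 0 && t > 0) {
                throw new IllegalArgumentException("fewer than three fields: " + readName);
            }
            tokens[t] = leadingInt(readName.substring(start + 1, end));
            end = start;
        }
        return tokens;
    }

    public static void main(String[] args) {
        int[] t = lastThree("H0164ALXX140820:2:1101:10003:23460/1", ':');
        System.out.println(t[0] + " " + t[1] + " " + t[2]); // prints "1101 10003 23460"
    }
}
```

      Scanning from the end means the number of leading fields does not matter, which is why names with 5 or 7 fields are handled the same way.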
    • rapidParseInt

      public static int rapidParseInt(String input) throws NumberFormatException
      Very specialized method to rapidly parse a sequence of digits from a String up until the first non-digit character.
      Throws:
      NumberFormatException - if the String does not start with an optional - followed by at least one digit
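      As a rough illustration of the contract described above (an optional leading '-', then digits, stopping at the first non-digit), a parser along these lines might look like the following. The class RapidParseSketch and the method parseLeadingInt are hypothetical names, and the code is a sketch rather than Picard's source; in particular, it assumes inputs that fit in an int and does no overflow checking.

```java
// Illustrative only: parses the leading integer of a string in a single pass.
final class RapidParseSketch {
    // Hypothetical helper mirroring the documented contract of rapidParseInt.
    static int parseLeadingInt(String input) {
        final int len = input.length();
        int i = 0;
        boolean negative = false;
        if (len > 0 && input.charAt(0) == '-') {
            negative = true;
            i = 1;
        }
        int value = 0;
        boolean sawDigit = false;
        for (; i < len; i++) {
            char c = input.charAt(i);
            if (c < '0' || c > '9') {
                break; // stop at the first non-digit character
            }
            value = value * 10 + (c - '0'); // assumption: no overflow check
            sawDigit = true;
        }
        if (!sawDigit) {
            throw new NumberFormatException("'" + input + "' does not start with a parsable number");
        }
        return negative ? -value : value;
    }

    public static void main(String[] args) {
        System.out.println(parseLeadingInt("23460/1")); // prints "23460"
        System.out.println(parseLeadingInt("-12abc"));  // prints "-12"
    }
}
```

      Avoiding substring creation and general-purpose number parsing is the kind of shortcut that makes such a method fast when it is called once per field for every read name.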