public enum Nucleotide extends java.lang.Enum<Nucleotide>
This enumeration contains not only the standard (non-ambiguous) nucleotides but also the ambiguous nucleotides, as well as a code X (a.k.a. INVALID) for invalid nucleotide calls.
You can query whether a value refers to a non-ambiguous nucleotide with isStandard() or isAmbiguous(), whichever is more convenient. Note that the special value X is neither of those.
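
The three-way split between standard, ambiguous and invalid codes can be pictured with a small bit-mask model. This is only a sketch under stated assumptions: the class IupacMaskSketch, its 4-bit masks and its char-based methods are hypothetical stand-ins for illustration, not the way this enum is implemented.

```java
// Hypothetical stand-in, not the library's implementation: each IUPAC code is
// modelled as a 4-bit mask over the concrete bases A, C, G and T.
final class IupacMaskSketch {
    static final int A = 1, C = 2, G = 4, T = 8;

    /** Maps an IUPAC letter (either case) to its mask; 0 plays the role of X. */
    static int mask(char code) {
        switch (Character.toUpperCase(code)) {
            case 'A': return A;
            case 'C': return C;
            case 'G': return G;
            case 'T': case 'U': return T;
            case 'R': return A | G;
            case 'Y': return C | T;
            case 'S': return C | G;
            case 'W': return A | T;
            case 'K': return G | T;
            case 'M': return A | C;
            case 'B': return C | G | T;
            case 'D': return A | G | T;
            case 'H': return A | C | T;
            case 'V': return A | C | G;
            case 'N': return A | C | G | T;
            default:  return 0;
        }
    }

    /** Exactly one concrete base. */
    static boolean isStandard(char code) { return Integer.bitCount(mask(code)) == 1; }

    /** Two or more concrete bases. */
    static boolean isAmbiguous(char code) { return Integer.bitCount(mask(code)) > 1; }

    public static void main(String[] args) {
        for (char c : new char[] {'A', 'R', 'N', 'X'}) {
            System.out.println(c + ": standard=" + isStandard(c) + ", ambiguous=" + isAmbiguous(c));
        }
    }
}
```

In this model the empty mask answers false to both queries, which mirrors the statement that X is neither standard nor ambiguous.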
Querying the X value for its complement, transition or transversion, or using it in other operations such as intersect(org.broadinstitute.hellbender.utils.Nucleotide), will return X, similar to Double.NaN in double arithmetic.
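
The NaN analogy can be made concrete with a minimal sketch. It assumes a hypothetical 4-bit mask model in which the empty mask stands in for X; the class and method names are illustrative and are not part of this enum.

```java
// Hypothetical model: codes are 4-bit masks over A, C, G, T and the empty
// mask stands in for X. Not the library's implementation.
final class InvalidPropagationSketch {
    static final int A = 1, C = 2, G = 4, T = 8, X = 0;

    /** Intersection of two codes: the bases allowed by both. */
    static int intersect(int a, int b) {
        return a & b;
    }

    /** Complement: swaps A with T and C with G, base by base. */
    static int complement(int m) {
        int out = 0;
        if ((m & A) != 0) out |= T;
        if ((m & T) != 0) out |= A;
        if ((m & C) != 0) out |= G;
        if ((m & G) != 0) out |= C;
        return out;
    }

    public static void main(String[] args) {
        // NaN propagates through double arithmetic.
        System.out.println(Double.isNaN(Double.NaN * 2.0));
        // The empty code propagates through the operations in the same way.
        System.out.println(complement(X) == X);
        System.out.println(intersect(X, A | G) == X);
    }
}
```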
For naming consistency, it is recommended to use the decode(byte), encodeAsByte(boolean) and encodeAsString() methods to translate byte/char and string encodings from and into values of this enum, rather than the inherited Enum.toString(), Enum.name() or valueOf(java.lang.String).
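
One practical reason for this recommendation is that Enum.valueOf only accepts the exact constant name and throws on anything else. The sketch below illustrates the difference with a hypothetical miniature enum named Base; its decode and encodeAsByte methods are simplified stand-ins written for this example, not the methods of this class.

```java
final class DecodeVsValueOfSketch {
    // Hypothetical miniature enum; the documented class has many more constants.
    enum Base {
        A, C, G, T, X;

        /** Case-insensitive decoding; unknown bytes map to X instead of throwing. */
        static Base decode(byte b) {
            switch (Character.toUpperCase((char) b)) {
                case 'A': return A;
                case 'C': return C;
                case 'G': return G;
                case 'T': case 'U': return T;
                default:  return X;
            }
        }

        /** Single-letter encoding in the requested case. */
        byte encodeAsByte(boolean upperCase) {
            char c = name().charAt(0);
            return (byte) (upperCase ? c : Character.toLowerCase(c));
        }
    }

    /** Returns true if Enum.valueOf rejects the given name. */
    static boolean valueOfRejects(String name) {
        try {
            Base.valueOf(name);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(Base.decode((byte) 'a'));            // lower case is accepted
        System.out.println(Base.decode((byte) 'u'));            // uracil folds into T
        System.out.println(valueOfRejects("a"));                // valueOf is case-sensitive
        System.out.println((char) Base.A.encodeAsByte(false));  // round trip to lower case
    }
}
```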
Although the canonical names for values use the single-letter IUPAC encodings, this class provides convenient longer-form constant aliases (e.g. ADENINE for A, PURINE for R, etc.).
Uracil and Thymine are considered equivalent in this enum, with T as the canonical name.
Finally, note that there is no code for the "gap nucleotide" that may appear in aligned sequences, since a gap is in fact not a nucleotide. A base encoding that uses a typical gap representation such as '.' or '-' would be interpreted as an INVALID (i.e. X) call, which is probably not what you want. Code that needs to support gaps must therefore handle them outside this enum.
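
One way to handle gaps outside the enum is to test for them before any base decoding takes place. The following is a minimal sketch; the class GapAwareSketch and its methods are hypothetical helpers, and the only gap symbols it recognizes are the two mentioned above.

```java
// Hypothetical helper that deals with gap symbols before bases are decoded.
final class GapAwareSketch {
    /** Treats '.' and '-' as gaps. */
    static boolean isGap(byte b) {
        return b == '.' || b == '-';
    }

    /** Counts gap columns in an aligned sequence. */
    static int countGaps(byte[] aligned) {
        int gaps = 0;
        for (byte b : aligned) {
            if (isGap(b)) gaps++;
        }
        return gaps;
    }

    /** Returns the sequence with gaps stripped, ready to be decoded base by base. */
    static byte[] stripGaps(byte[] aligned) {
        byte[] out = new byte[aligned.length - countGaps(aligned)];
        int i = 0;
        for (byte b : aligned) {
            if (!isGap(b)) out[i++] = b;
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] aligned = "AC--GT.A".getBytes(java.nio.charset.StandardCharsets.US_ASCII);
        System.out.println(countGaps(aligned));
        System.out.println(new String(stripGaps(aligned), java.nio.charset.StandardCharsets.US_ASCII));
    }
}
```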
Modifier and Type | Class and Description |
---|---|
static class | Nucleotide.Counter - Helper class to count the number of occurrences of each nucleotide code in a sequence. |
Modifier and Type | Field and Description |
---|---|
static Nucleotide | ADENINE |
static Nucleotide | AMINO |
static Nucleotide | ANY |
static Nucleotide | CYTOSINE |
static Nucleotide | GUANINE |
static Nucleotide | INVALID |
static Nucleotide | KETO |
static Nucleotide | PURINE |
static Nucleotide | PYRIMIDINE |
static java.util.List<Nucleotide> | STANDARD_BASES - List of the standard (non-redundant) nucleotide values in their preferred alphabetical order. |
static Nucleotide | STRONG |
static Nucleotide | THYMINE |
static Nucleotide | U |
static Nucleotide | URACIL |
static Nucleotide | WEAK |
Modifier and Type | Method and Description |
---|---|
Nucleotide | complement() - Returns the complement nucleotide code for this one. |
static byte | complement(byte b) - Returns the complement for a base code. |
static byte | complement(byte b, boolean upperCase) - Returns the complement for a base code. |
static Nucleotide | decode(byte base) - Returns the nucleotide that corresponds to a particular byte typed base code. |
static Nucleotide | decode(char ch) - Returns the nucleotide that corresponds to a particular char typed base code. |
static Nucleotide | decode(java.lang.CharSequence seq) - Transforms a single-letter character string into the corresponding nucleotide. |
byte | encodeAsByte() - Returns this nucleotide's exclusive upper-case byte encoding. |
byte | encodeAsByte(boolean upperCase) - Returns the byte typed encoding that corresponds to this nucleotide. |
char | encodeAsChar() - Returns the nucleotide's exclusive upper-case char encoding. |
char | encodeAsChar(boolean upperCase) - Returns the char typed encoding that corresponds to this nucleotide. |
java.lang.String | encodeAsString() - Returns the nucleotide's exclusive upper-case String encoding. |
java.lang.String | encodeAsString(boolean upperCase) - Returns the nucleotide's exclusive String typed encoding. |
boolean | includes(byte b) - Checks whether this nucleotide code encloses all possible nucleotides for another code. |
boolean | includes(Nucleotide other) - Checks whether this nucleotide code encloses all possible nucleotides for another code. |
static boolean | intersect(byte a, byte b) - Checks whether two nucleotides intersect given their byte encodings. |
Nucleotide | intersect(Nucleotide other) - Returns the nucleotide code that includes all and only the nucleotides that are included by both this and another code. |
boolean | isAmbiguous() - Checks whether the nucleotide refers to an ambiguous base. |
boolean | isStandard() - Checks whether the nucleotide refers to a concrete (rather than ambiguous) base. |
boolean | isValid() - Whether this nucleotide code is valid or not. |
static boolean | same(byte a, byte b) - Checks whether two base encodings make reference to the same Nucleotide instance regardless of their case. |
boolean | same(Nucleotide other) - Checks whether this and another Nucleotide make reference to the same nucleotide(s). |
Nucleotide | transition() - Returns the instance that would include all possible transition mutations from this one. |
Nucleotide | transversion() - Returns the instance that would include all possible transversion mutations from nucleotides included in this one. |
Nucleotide | transversion(boolean strong) - Transversion mutation toward a strong or a weak base. |
static Nucleotide | valueOf(java.lang.String name) - Returns the enum constant of this type with the specified name. |
static Nucleotide[] | values() - Returns an array containing the constants of this enum type, in the order they are declared. |
public static final Nucleotide A
public static final Nucleotide C
public static final Nucleotide G
public static final Nucleotide T
public static final Nucleotide R
public static final Nucleotide Y
public static final Nucleotide S
public static final Nucleotide W
public static final Nucleotide K
public static final Nucleotide M
public static final Nucleotide B
public static final Nucleotide D
public static final Nucleotide H
public static final Nucleotide V
public static final Nucleotide N
public static final Nucleotide X
public static final Nucleotide U
public static final Nucleotide ADENINE
public static final Nucleotide CYTOSINE
public static final Nucleotide GUANINE
public static final Nucleotide THYMINE
public static final Nucleotide URACIL
public static final Nucleotide STRONG
public static final Nucleotide WEAK
public static final Nucleotide PURINE
public static final Nucleotide PYRIMIDINE
public static final Nucleotide AMINO
public static final Nucleotide KETO
public static final Nucleotide ANY
public static final Nucleotide INVALID
public static final java.util.List<Nucleotide> STANDARD_BASES
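public static Nucleotide[] values()
Returns an array containing the constants of this enum type, in the order they are declared. This method may be used to iterate over the constants as follows:
    for (Nucleotide c : Nucleotide.values()) System.out.println(c);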
public static Nucleotide valueOf(java.lang.String name)
Returns the enum constant of this type with the specified name.
Parameters:
name - the name of the enum constant to be returned.
Throws:
java.lang.IllegalArgumentException - if this enum type has no constant with the specified name
java.lang.NullPointerException - if the argument is null

public byte encodeAsByte(boolean upperCase)
Returns the byte typed encoding that corresponds to this nucleotide.
Parameters:
upperCase - whether to return the upper- or lower-case byte representation.
Returns:
the byte representation for a nucleotide.

public char encodeAsChar(boolean upperCase)
Returns the char typed encoding that corresponds to this nucleotide.
Parameters:
upperCase - whether to return the upper- or lower-case char representation.
Returns:
the char representation for a nucleotide.

public byte encodeAsByte()
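Returns this nucleotide's exclusive upper-case byte encoding.

public char encodeAsChar()
Returns the nucleotide's exclusive upper-case char encoding.

public java.lang.String encodeAsString()
Returns the nucleotide's exclusive upper-case String encoding.

public java.lang.String encodeAsString(boolean upperCase)
Returns the nucleotide's exclusive String typed encoding.
Parameters:
upperCase - whether the upper- or lower-case representation should be returned.
Returns:
the String representation for this nucleotide.

public static Nucleotide decode(byte base)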
Returns the nucleotide that corresponds to a particular byte typed base code.
Parameters:
base - the query base code.
Returns:
never null, but INVALID if the base code does not correspond to a valid nucleotide specification.

public static Nucleotide decode(char ch)
Returns the nucleotide that corresponds to a particular char typed base code.
Parameters:
ch - the query base code.
Returns:
never null, but INVALID if the base code does not correspond to a valid nucleotide specification.

public static Nucleotide decode(java.lang.CharSequence seq)
Transforms a single-letter character string into the corresponding nucleotide.
Null, empty or multi-letter input will result in an IllegalArgumentException. Such inputs are not treated as merely invalid encodings, because the fact that they are not a single character indicates a probable bug.
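
The distinction between a malformed argument (an exception) and an unrecognized single letter (the INVALID value) can be sketched as follows. DecodeSequenceSketch is a hypothetical stand-in that returns a char rather than an enum value, with 'X' playing the role of INVALID; it is not this method's implementation.

```java
// Hypothetical single-letter decoder; 'X' plays the role of INVALID.
final class DecodeSequenceSketch {
    static char decode(CharSequence seq) {
        if (seq == null || seq.length() != 1) {
            // Wrong shape: most likely a caller bug, so fail loudly.
            throw new IllegalArgumentException("expected exactly one character");
        }
        char c = Character.toUpperCase(seq.charAt(0));
        if (c == 'U') return 'T'; // uracil and thymine share one code
        // A single unknown letter is a data problem, reported as a value.
        return "ACGTRYSWKMBDHVN".indexOf(c) >= 0 ? c : 'X';
    }

    /** Returns true if decode throws IllegalArgumentException for the input. */
    static boolean rejects(CharSequence seq) {
        try {
            decode(seq);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(decode("a"));   // recognized letter
        System.out.println(decode("?"));   // single unknown letter
        System.out.println(rejects("AC")); // multi-letter input
        System.out.println(rejects(""));   // empty input
    }
}
```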
Parameters:
seq - the input character sequence to transform.
Returns:
never null, perhaps INVALID to indicate that the input is not a valid single-letter encoding.

public boolean isStandard()
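Checks whether the nucleotide refers to a concrete (rather than ambiguous) base.
Returns:
true iff this is a concrete nucleotide.

public boolean isAmbiguous()
Checks whether the nucleotide refers to an ambiguous base.
Returns:
true iff this is an ambiguous nucleotide.

public boolean isValid()
Whether this nucleotide code is valid or not.
Returns:
true iff valid.

public boolean includes(Nucleotide other)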
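Checks whether this nucleotide code encloses all possible nucleotides for another code.
Parameters:
other - the other nucleotide to compare to.
Returns:
true iff every nucleotide in other is enclosed in this code.

public boolean includes(byte b)
Checks whether this nucleotide code encloses all possible nucleotides for another code.
Parameters:
b - the other nucleotide to compare to, encoded as a byte.
Returns:
true iff every nucleotide in b is enclosed in this code.

public Nucleotide intersect(Nucleotide other)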
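Returns the nucleotide code that includes all and only the nucleotides that are included by both this and another code.
Parameters:
other - the other nucleotide code.
Returns:
never null. Returns INVALID if the intersection does not contain any nucleotide.
Throws:
java.lang.IllegalArgumentException - if other is null.

public static boolean intersect(byte a, byte b)
Checks whether two nucleotides intersect given their byte encodings.
Parameters:
a - first nucleotide.
b - second nucleotide.
Returns:
true iff the input nucleotides intersect.

public static boolean same(byte a, byte b)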
Checks whether two base encodings make reference to the same Nucleotide instance regardless of their case.
This method is a shorthand for:
    decode(a).same(decode(b))
The order of the inputs is not relevant, therefore same(a, b) == same(b, a) for any given a and b.
Notice that if either or both input bases make reference to an invalid nucleotide (i.e. decode(x) == INVALID), this method will return false even if a == b.
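
The behavior described here (case-insensitive, symmetric, and never true for invalid input) can be sketched in a few lines. SameBasesSketch is a hypothetical stand-in that decodes to a char, with 'X' playing the role of INVALID; it only illustrates the contract and is not this method's implementation.

```java
// Hypothetical stand-in illustrating the contract of same(byte, byte).
final class SameBasesSketch {
    /** Upper-cased IUPAC letter, U folded into T, anything else 'X'. */
    static char decode(byte b) {
        char c = Character.toUpperCase((char) b);
        if (c == 'U') return 'T';
        return "ACGTRYSWKMBDHVN".indexOf(c) >= 0 ? c : 'X';
    }

    /** Case-insensitive comparison in which an invalid base never matches anything. */
    static boolean same(byte a, byte b) {
        char x = decode(a);
        char y = decode(b);
        return x != 'X' && x == y;
    }

    public static void main(String[] args) {
        System.out.println(same((byte) 'a', (byte) 'A')); // case is ignored
        System.out.println(same((byte) '?', (byte) '?')); // invalid input never matches
    }
}
```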
Parameters:
a - the first base to compare (however order is not relevant).
b - the second base to compare (however order is not relevant).
Returns:
true iff decode(a).same(decode(b)).

public boolean same(Nucleotide other)
Checks whether this and another Nucleotide make reference to the same nucleotide(s).
In contrast with Enum.equals(java.lang.Object), this method will return false if either of the two, this or the input nucleotide, is the INVALID enum value. So even INVALID.same(INVALID) will return false.
Parameters:
other - the other nucleotide.
Returns:
true iff this and the input nucleotide make reference to the same nucleotides.

public Nucleotide complement()
Returns the complement nucleotide code for this one.
For ambiguous nucleotide codes, this will return the ambiguous code that encloses the complement of each possible nucleotide in this code.
The complement of the INVALID nucleotide is itself.
Returns:
never null.

public static byte complement(byte b, boolean upperCase)
Returns the complement for a base code.
When an invalid base is provided this method will return the input byte (lower- or upper-cased depending on that flag value).
Parameters:
b - the input base
upperCase - whether to return the uppercase (true) or the lower case (false) byte encoding.

public static byte complement(byte b)
Returns the complement for a base code.
The case of the output will match the case of the input.
When an invalid base is provided this method will return the input base byte.
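
A case-preserving complement with pass-through for invalid bytes might look like the sketch below. ComplementByteSketch and its two lookup strings are hypothetical and written for this example; the pairings follow the standard IUPAC complement table, and U is treated like T as stated earlier.

```java
// Hypothetical table-driven complement; not the library's implementation.
final class ComplementByteSketch {
    // Position i of COMPLEMENTS holds the complement of position i of CODES.
    private static final String CODES       = "ACGTURYSWKMBDHVN";
    private static final String COMPLEMENTS = "TGCAAYRSWMKVHDBN";

    static byte complement(byte b) {
        char c = (char) b;
        int i = CODES.indexOf(Character.toUpperCase(c));
        if (i < 0) {
            return b; // invalid input is returned unchanged
        }
        char comp = COMPLEMENTS.charAt(i);
        // The output case follows the input case.
        return (byte) (Character.isLowerCase(c) ? Character.toLowerCase(comp) : comp);
    }

    public static void main(String[] args) {
        for (char c : new char[] {'a', 'G', 'r', '-'}) {
            System.out.println(c + " -> " + (char) complement((byte) c));
        }
    }
}
```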
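Parameters:
b - the input base

public Nucleotide transition()
Returns the instance that would include all possible transition mutations from this one.
Returns:
never null.

public Nucleotide transversion()
Returns the instance that would include all possible transversion mutations from nucleotides included in this one.
Returns:
never null.

public Nucleotide transversion(boolean strong)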
Transversion mutation toward a strong or a weak base.
This method provides a non-ambiguous alternative to transversion() for concrete nucleotides.
Parameters:
strong - whether the result should be a strong (S: G, C) or weak (W: A, T) nucleotide(s).
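
The relationship between transition(), transversion() and transversion(boolean) can be illustrated with a small mask-based sketch. MutationClassSketch and its 4-bit masks are hypothetical and model only the usual definitions: a transition stays within purines or within pyrimidines, and a transversion crosses between them. It is not this class's implementation.

```java
// Hypothetical 4-bit mask model over the concrete bases A, C, G and T.
final class MutationClassSketch {
    static final int A = 1, C = 2, G = 4, T = 8;
    static final int PURINES = A | G, PYRIMIDINES = C | T;
    static final int STRONG = C | G, WEAK = A | T;

    /** Transition: A and G swap, C and T swap. */
    static int transition(int m) {
        int out = 0;
        if ((m & A) != 0) out |= G;
        if ((m & G) != 0) out |= A;
        if ((m & C) != 0) out |= T;
        if ((m & T) != 0) out |= C;
        return out;
    }

    /** Transversion: a purine may become either pyrimidine, and vice versa. */
    static int transversion(int m) {
        int out = 0;
        if ((m & PURINES) != 0) out |= PYRIMIDINES;
        if ((m & PYRIMIDINES) != 0) out |= PURINES;
        return out;
    }

    /** Restricts the transversion to strong (G, C) or weak (A, T) targets. */
    static int transversion(int m, boolean strong) {
        return transversion(m) & (strong ? STRONG : WEAK);
    }

    public static void main(String[] args) {
        System.out.println(transition(A) == G);          // A -> G
        System.out.println(transversion(A) == (C | T));  // A -> C or T, an ambiguous result
        System.out.println(transversion(A, true) == C);  // strong target only
        System.out.println(transversion(A, false) == T); // weak target only
    }
}
```

For a concrete base the two-argument form always yields a single base in this model, which is the sense in which it is a non-ambiguous alternative.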