package umi
Type Members
-
class
AnnotateBamWithUmis
extends FgBioTool with LazyLogging
- Annotations
- @ClpAnnotation()
-
class
CallDuplexConsensusReads
extends FgBioTool with LazyLogging
- Annotations
- @ClpAnnotation()
-
class
CallMolecularConsensusReads
extends FgBioTool with LazyLogging
- Annotations
- @ClpAnnotation()
-
class
CollectDuplexSeqMetrics
extends FgBioTool with LazyLogging
- Annotations
- @ClpAnnotation()
-
class
ConsensusCaller
extends AnyRef
Generic consensus caller class that can be used to produce consensus base calls and qualities from a pileup of raw base calls and qualities.
The consensus caller sees the process of going from a DNA source molecule in its original, pristine state to a sequenced base as having three phases, each with its own distinct error profile:
1) The phase in which the source molecule is harvested (e.g. cells extracted and lysed), up to the point where some kind of molecular identifier has been attached that allows identification of replicates generated from the same original source molecule. Errors in this phase will be present in all copies of the molecule that are prepared for sequencing (except where a second error reverts the change).
2) Everything between phases 1 and 3. This generally includes any sample preparation activities after a molecular identifier has been attached but prior to sequencing. Errors introduced in this phase will be present in some fraction of the molecules available at sequencing.
3) Sequencing of the molecule (or clonal cluster of molecules) on a sequencer, e.g. the process of base-by-base resynthesis and sequencing on an Illumina sequencer _after_ cluster amplification. Errors in this phase are captured by the raw base quality scores from the sequencer.
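As a rough illustration of the phased error model, the sketch below converts Phred-scaled error rates to probabilities and folds two independent error processes into one error probability. The object and method names are hypothetical, and the combination rule is an assumption made for illustration (when both processes err, the second error is taken to restore the original base one time in three); it is not presented as what ConsensusCaller actually implements.

```scala
object ErrorModelSketch {
  /** Converts a Phred-scaled score to an error probability. */
  def phredToProb(q: Double): Double = math.pow(10.0, -q / 10.0)

  /** Converts an error probability to a Phred-scaled score. */
  def probToPhred(p: Double): Double = -10.0 * math.log10(p)

  /** Probability that a base is wrong after two independent error processes.
    * Assumption: with four bases, an error on top of an error lands back on the
    * original base 1/3 of the time, so only 2/3 of double errors remain errors.
    */
  def combine(p1: Double, p2: Double): Double =
    p1 * (1 - p2) + (1 - p1) * p2 + p1 * p2 * (2.0 / 3.0)
}
```

For example, `ErrorModelSketch.combine(ErrorModelSketch.phredToProb(45), ErrorModelSketch.phredToProb(40))` would give a combined probability for a pre-identifier and a post-identifier error rate; the values 45 and 40 are arbitrary here.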
-
class
ConsensusCallingIterator
extends Iterator[SamRecord]
An iterator that consumes from an incoming iterator of SAMRecords and generates consensus read SAMRecords using the supplied caller.
-
class
CorrectUmis
extends FgBioTool with LazyLogging
- Annotations
- @ClpAnnotation()
-
class
DuplexConsensusCaller
extends UmiConsensusCaller[DuplexConsensusRead] with LazyLogging
Creates duplex consensus reads from SamRecords that have been grouped by their source molecule but not yet by source strand.
Filters incoming bases by quality before building the duplex.
Output reads and bases are constructed only if there is at least one read from each source molecule strand. Otherwise no filtering is performed.
Note that a consequence of the above is that the output reads can be shorter than _some_ of the input reads if the input reads are of varying length; they will be the length at which there is coverage from both source strands.
-
case class
DuplexConsensusPerBaseValues
(abStrand: Boolean, bases: String, depths: Array[Short], errors: Array[Short]) extends Product with Serializable
A small class that stores the per-base values for an AB or BA strand.
-
class
ExtractUmisFromBam
extends FgBioTool with LazyLogging
- Annotations
- @ClpAnnotation()
-
class
FilterConsensusReads
extends FgBioTool with LazyLogging
- Annotations
- @ClpAnnotation()
-
class
GroupReadsByUmi
extends FgBioTool with LazyLogging
- Annotations
- @ClpAnnotation()
-
class
ReviewConsensusVariants
extends FgBioTool with LazyLogging
- Annotations
- @ClpAnnotation()
-
sealed
trait
Strategy
extends EnumEntry
The strategies implemented by GroupReadsByUmi to identify reads from the same source molecule.
-
case class
TagFamilySizeMetric
(family_size: Int, count: Count, fraction: Proportion = 0, fraction_gt_or_eq_family_size: Proportion = 0) extends Metric with Product with Serializable
Metrics produced by GroupReadsByUmi to describe the distribution of tag family sizes observed during grouping.
- family_size
The family size, or number of templates/read-pairs belonging to the family.
- count
The number of families (or source molecules) observed with family_size observations.
- fraction
The fraction of all families of all sizes that have this specific family_size.
- fraction_gt_or_eq_family_size
The fraction of all families that have >= family_size.
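The relationship between these fields can be sketched as follows. This is a minimal illustration, not the GroupReadsByUmi implementation: `Row` is a hypothetical stand-in for TagFamilySizeMetric that uses plain `Long` and `Double` in place of the `Count` and `Proportion` types, and `fromFamilySizes` is an assumed helper name.

```scala
object FamilySizeSketch {
  // Hypothetical stand-in for TagFamilySizeMetric, using plain Long/Double fields.
  final case class Row(familySize: Int, count: Long, fraction: Double, fractionGtOrEq: Double)

  /** Builds one row per distinct family size from a list of per-family sizes. */
  def fromFamilySizes(sizes: Seq[Int]): Seq[Row] = {
    // Number of families observed at each size, ordered by ascending size.
    val counts = sizes.groupBy(identity).map { case (s, xs) => s -> xs.size.toLong }.toSeq.sortBy(_._1)
    val total  = counts.map(_._2).sum.toDouble
    // For each size, the number of families of that size or larger.
    val atLeast = counts.map(_._2).scanRight(0L)(_ + _).init
    counts.zip(atLeast).map { case ((size, n), ge) => Row(size, n, n / total, ge / total) }
  }
}
```

Calling `FamilySizeSketch.fromFamilySizes(Seq(1, 1, 2, 3))` describes four families: two of size 1, one of size 2, and one of size 3.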
-
trait
UmiConsensusCaller
[C <: SimpleRead] extends AnyRef
A trait that can be mixed in by any consensus caller that works at the read level, mapping incoming SamRecords into consensus SamRecords.
- C
Internally, the type of lightweight consensus read that is used prior to rebuilding SamRecords.
-
case class
VanillaConsensusRead
(id: String, bases: Array[Byte], quals: Array[Byte], depths: Array[Short], errors: Array[Short]) extends SimpleRead with Product with Serializable
Stores information about a consensus read. All four arrays are of equal length.
Depths and errors that have values exceeding Short.MaxValue (32767) will be capped at Short.MaxValue.
- bases
the base calls of the consensus read
- quals
the calculated phred-scaled quality scores of the bases
- depths
the number of raw reads that contributed to the consensus call at each position
- errors
the number of contributing raw reads that disagree with the final consensus base at each position
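The handling of depth and error counts larger than Short.MaxValue can be illustrated with a small helper. The object and method names below are hypothetical and only demonstrate saturating an `Int` count into a `Short`; they are not part of the documented API.

```scala
object DepthCapSketch {
  /** Converts a count to a Short, saturating at Short.MaxValue (32767) instead of overflowing. */
  def toCappedShort(value: Int): Short = math.min(value, Short.MaxValue.toInt).toShort
}
```

Without the `math.min`, a plain `.toShort` on a value such as 40000 would wrap around to a negative number.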
-
class
VanillaUmiConsensusCaller
extends UmiConsensusCaller[VanillaConsensusRead] with LazyLogging
Calls consensus reads by grouping consecutive reads with the same SAM tag.
Consecutive reads with the same SAM tag value are partitioned into fragment, first-of-pair, and second-of-pair reads, and a consensus read is created for each partition. A consensus read for a given partition may not be returned if any of the required conditions is not met (e.g. minimum number of reads, minimum mean consensus base quality, ...).
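The idea of grouping consecutive reads that share a tag value can be sketched generically. This is an illustration under stated assumptions rather than the caller's actual code: `groupConsecutive` is a hypothetical helper, records are represented by any type `A` with a key function, and the later partitioning into fragment, first-of-pair, and second-of-pair reads is omitted.

```scala
object ConsecutiveGroupingSketch {
  /** Lazily groups adjacent items that share the same key. Items with the same key
    * that are not adjacent in the input end up in separate groups.
    */
  def groupConsecutive[A, K](items: Iterator[A])(key: A => K): Iterator[Seq[A]] =
    new Iterator[Seq[A]] {
      private val buffered = items.buffered

      def hasNext: Boolean = buffered.hasNext

      def next(): Seq[A] = {
        val current = key(buffered.head) // throws NoSuchElementException when exhausted
        val group   = Vector.newBuilder[A]
        while (buffered.hasNext && key(buffered.head) == current) group += buffered.next()
        group.result()
      }
    }
}
```

Because grouping is by adjacency only, input that is not already sorted or grouped by the tag would yield several small groups for the same tag value.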
-
case class
VanillaUmiConsensusCallerOptions
(tag: String = DefaultTag, errorRatePreUmi: PhredScore = DefaultErrorRatePreUmi, errorRatePostUmi: PhredScore = DefaultErrorRatePostUmi, minInputBaseQuality: PhredScore = DefaultMinInputBaseQuality, qualityTrim: Boolean = DefaultQualityTrim, minConsensusBaseQuality: PhredScore = DefaultMinConsensusBaseQuality, minReads: Int = DefaultMinReads, maxReads: Int = DefaultMaxReads, producePerBaseTags: Boolean = DefaultProducePerBaseTags) extends Product with Serializable
Holds the parameters/options for consensus calling.
Value Members
-
object
CollectDuplexSeqMetrics
Companion object for CollectDuplexSeqMetrics that contains various constants and types, including all the various Metric sub-types produced by the program.
- object ConsensusCaller
-
object
ConsensusTags
Object that encapsulates the various consensus related tags that are added to consensus reads at both the per-read and per-base level.
Currently only contains tags for single-strand consensus reads, but with a view to using the following names for consistency if/when we add duplex calling:
Value                 AB  BA  Final
=====                 ==  ==  =====
per-read-depth        aD  bD  cD
per-read-min-depth    aM  bM  cM
per-read-error-rate   aE  bE  cE
per-base-depth        ad  bd  cd
per-base-error-count  ae  be  ce
per-base-bases        ac  bc  bases
per-base-quals        aq  bq  quals
The second letter in the tag is lower case if it is per-base, upper case if it is per-read.
- object CorrectUmis
-
object
DuplexConsensusCaller
Container for constant values and types used by the DuplexConsensusCaller
- object DuplexConsensusPerBaseValues extends Serializable
- object ExtractUmisFromBam
- object GroupReadsByUmi
- object ReviewConsensusVariants
- object Strategy extends FgBioEnum[Strategy]
-
object
UmiConsensusCaller
Contains shared types and functions used when writing UMI-driven consensus callers that take in SamRecords and emit SamRecords.
-
object
VanillaUmiConsensusCallerOptions
extends Serializable
Holds the defaults for consensus caller options.