object CollectDuplexSeqMetrics

Companion object for CollectDuplexSeqMetrics containing constants and types, including all of the Metric sub-types produced by the program.


Type Members

  1. case class DuplexFamilySizeMetric (ab_size: Int, ba_size: Int, count: Count = 0, fraction: Proportion = 0, fraction_gt_or_eq_size: Proportion = 0) extends Metric with Ordered[DuplexFamilySizeMetric] with Product with Serializable

    Metrics produced by CollectDuplexSeqMetrics to describe the distribution of double-stranded (duplex) tag families in terms of the number of reads observed on each strand.

    We refer to the two strands as ab and ba because we identify them by observing the same pair of UMIs (A and B) in opposite order (A->B vs. B->A). Which strand is ab and which is ba is largely arbitrary. To simplify interpretation of the metrics, we define ab, for a given tag family, as the sub-family with more reads and ba as the sub-family with fewer reads.

    ab_size

    The number of reads in the ab sub-family (the larger sub-family) for this double-strand tag family.

    ba_size

    The number of reads in the ba sub-family (the smaller sub-family) for this double-strand tag family.

    count

    The number of double-stranded tag families whose ab and ba single-strand sub-families have sizes ab_size and ba_size, respectively.

    fraction

    The fraction of all double-stranded tag families that have ab_size and ba_size.

    fraction_gt_or_eq_size

    The fraction of all double-stranded tag families that have ab reads >= ab_size and ba reads >= ba_size.
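    The field definitions above can be illustrated with a minimal sketch. It assumes each family is available as a pair of per-strand read counts; `DuplexFamilySizeSketch`, `Row`, `toAbBa` and `rows` are hypothetical names used only for illustration and are not part of CollectDuplexSeqMetrics, and whether the real program emits rows for families with an empty ba sub-family is not stated here.

```scala
object DuplexFamilySizeSketch {
  /** Hypothetical stand-in for one output row; not the real DuplexFamilySizeMetric class. */
  case class Row(abSize: Int, baSize: Int, count: Long, fraction: Double, fractionGtOrEqSize: Double)

  /** Orders two per-strand read counts so that ab is the larger and ba the smaller. */
  def toAbBa(strand1Reads: Int, strand2Reads: Int): (Int, Int) =
    (math.max(strand1Reads, strand2Reads), math.min(strand1Reads, strand2Reads))

  /** Builds one row per distinct (ab_size, ba_size) combination. */
  def rows(families: Seq[(Int, Int)]): Seq[Row] = {
    val normalized = families.map { case (x, y) => toAbBa(x, y) }
    val total      = normalized.size.toDouble
    val counts     = normalized.groupBy(identity).map { case (k, v) => k -> v.size.toLong }
    counts.toSeq.sortBy(_._1).map { case ((ab, ba), n) =>
      // Families with at least `ab` reads on the larger strand and `ba` on the smaller one.
      val atLeast = normalized.count { case (a, b) => a >= ab && b >= ba }
      Row(ab, ba, n, n / total, atLeast / total)
    }
  }
}
```

    Note that fraction_gt_or_eq_size is cumulative in both dimensions at once, so it is not simply a running sum over sorted rows.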

  2. case class DuplexUmiMetric (umi: String, raw_observations: Count = 0, raw_observations_with_errors: Count = 0, unique_observations: Count = 0, fraction_raw_observations: Proportion = 0, fraction_unique_observations: Proportion = 0, fraction_unique_observations_expected: Proportion = 0) extends Metric with Product with Serializable

    Metrics produced by CollectDuplexSeqMetrics describing the set of observed duplex UMI sequences and the frequency of their observations. The UMI sequences reported may have been corrected using information within a double-stranded tag family. For example, if a tag family consists of three read pairs with UMIs ACGT-TGGT, ACGT-TGGT, and ACGT-TGGG, then a consensus UMI of ACGT-TGGT will be generated.

    UMI pairs are normalized within a tag family so that observations are always reported as if they came from a read pair with read 1 on the positive strand (F1R2). Put another way, for FR or RF read pairs the duplex UMI reported is the UMI from the positive strand read followed by the UMI from the negative strand read. For example, a read pair with UMI AAAA-GGGG, with R1 on the negative strand and R2 on the positive strand, will be reported as GGGG-AAAA.

    umi

    The duplex UMI sequence, possibly corrected.

    raw_observations

    The number of read pairs in the input BAM that observe the duplex UMI (after correction).

    raw_observations_with_errors

    The subset of raw observations that underwent any correction.

    unique_observations

    The number of double-stranded tag families (i.e. unique double-stranded molecules) that observed the duplex UMI.

    fraction_raw_observations

    The fraction of all raw observations that the duplex UMI accounts for.

    fraction_unique_observations

    The fraction of all unique observations that the duplex UMI accounts for.

    fraction_unique_observations_expected

    The fraction of all unique observations that are expected to be attributed to the duplex UMI based on the fraction_unique_observations of the two individual UMIs.
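    The normalization and correction described for this metric can be sketched as follows. This is an illustration under stated assumptions, not the program's implementation: `DuplexUmiSketch` and its methods are hypothetical names, the per-position majority vote is one plausible correction rule, and multiplying the two individual fractions assumes the UMIs pair independently.

```scala
object DuplexUmiSketch {
  /** Reports the positive-strand UMI first, followed by the negative-strand UMI.
    * `r1Positive` is a hypothetical flag: true when read 1 maps to the positive strand. */
  def normalize(duplexUmi: String, r1Positive: Boolean): String =
    duplexUmi.split("-", 2) match {
      case Array(r1Umi, r2Umi) => if (r1Positive) s"$r1Umi-$r2Umi" else s"$r2Umi-$r1Umi"
      case _ => throw new IllegalArgumentException(s"Not a duplex UMI: $duplexUmi")
    }

  /** Per-position majority vote across equal-length UMIs (an assumed correction rule). */
  def consensus(umis: Seq[String]): String = {
    require(umis.nonEmpty && umis.forall(_.length == umis.head.length))
    umis.head.indices.map { i =>
      umis.map(_.charAt(i)).groupBy(identity).maxBy(_._2.size)._1
    }.mkString
  }

  /** Expected duplex fraction, assuming the two individual UMIs pair independently. */
  def expectedFraction(fractionA: Double, fractionB: Double): Double = fractionA * fractionB
}
```

    Comparing fraction_unique_observations against a value like `expectedFraction` is one way to spot duplex UMIs that occur more or less often than their component UMIs would suggest.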

  3. case class DuplexYieldMetric (fraction: Proportion, read_pairs: Count, cs_families: Count, ss_families: Count, ds_families: Count, ds_duplexes: Count, ds_fraction_duplexes: Proportion, ds_fraction_duplexes_ideal: Proportion) extends Metric with Product with Serializable

    Metrics produced by CollectDuplexSeqMetrics that are sampled at various levels of coverage, via random downsampling, during the construction of duplex metrics. The downsampling is done in such a way that the fractions are approximate rather than exact. The fraction field should therefore be interpreted only as a guide, and the read_pairs field used to quantify how much data was used.

    See FamilySizeMetric for detailed definitions of CS, SS and DS as used below.

    fraction

    The approximate fraction of the full dataset that was used to generate the remaining values.

    read_pairs

    The number of read pairs upon which the remaining metrics are based.

    cs_families

    The number of _CS_ (Coordinate & Strand) families present in the data.

    ss_families

    The number of _SS_ (Single-Strand by UMI) families present in the data.

    ds_families

    The number of _DS_ (Double-Strand by UMI) families present in the data.

    ds_duplexes

    The number of _DS_ families that had the minimum number of observations on both strands to be called duplexes (default = 1 read on each strand).

    ds_fraction_duplexes

    The fraction of _DS_ families that are duplexes (ds_duplexes / ds_families).

    ds_fraction_duplexes_ideal

    The fraction of _DS_ families that should be duplexes under an idealized model where each strand, A and B, has equal probability of being sampled, given the observed distribution of _DS_ family sizes.
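    One way to read the two duplex fractions above is sketched below. The observed fraction is the ratio given in the field description. The ideal fraction is modelled here with a fair binomial, which is an assumption about what "idealized model" means; `DuplexYieldSketch`, `probDuplex` and `idealFraction` are hypothetical names, and `minReads` mirrors the default of 1 read per strand.

```scala
object DuplexYieldSketch {
  /** Observed duplex fraction: ds_duplexes / ds_families. */
  def fractionDuplexes(dsDuplexes: Long, dsFamilies: Long): Double =
    if (dsFamilies == 0) 0.0 else dsDuplexes.toDouble / dsFamilies

  /** Probability that a family of n reads has at least minReads on each strand,
    * assuming every read comes from strand A or B with probability 0.5 (an assumption). */
  def probDuplex(n: Int, minReads: Int = 1): Double = {
    // coeffs(k) holds the binomial coefficient C(n, k), built iteratively to avoid factorials.
    val coeffs = (1 to n).scanLeft(1.0) { (c, k) => c * (n - k + 1) / k }
    (minReads to (n - minReads)).map(k => coeffs(k) * math.pow(0.5, n)).sum
  }

  /** Averages probDuplex over the observed distribution of DS family sizes (size -> count). */
  def idealFraction(dsSizeCounts: Map[Int, Long], minReads: Int = 1): Double = {
    val total = dsSizeCounts.values.sum.toDouble
    if (total == 0) 0.0
    else dsSizeCounts.map { case (n, count) => count * probDuplex(n, minReads) }.sum / total
  }
}
```

    Under this model a family of a single read can never be a duplex, so a dataset dominated by small families has a low ideal fraction regardless of library quality.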

  4. case class FamilySizeMetric (family_size: Int, cs_count: Count = 0, cs_fraction: Proportion = 0, cs_fraction_gt_or_eq_size: Proportion = 0, ss_count: Count = 0, ss_fraction: Proportion = 0, ss_fraction_gt_or_eq_size: Proportion = 0, ds_count: Count = 0, ds_fraction: Proportion = 0, ds_fraction_gt_or_eq_size: Proportion = 0) extends Metric with Product with Serializable

    Metrics produced by CollectDuplexSeqMetrics to quantify the distribution of different kinds of read family sizes. Three kinds of families are described:

    1. _CS_ or _Coordinate & Strand_: families of reads that are grouped together by their unclipped 5' genomic positions and strands, just as they are in traditional PCR duplicate marking.
    2. _SS_ or _Single Strand_: single-strand families that are each subsets of a CS family, created by also using the UMIs to partition the larger family, but not linking up families that are created from opposing strands of the same source molecule.
    3. _DS_ or _Double Strand_: families that are created by combining single-strand families that are from opposite strands of the same source molecule. This does **not** imply that all DS families are composed of reads from both strands; where only one strand of a source molecule is observed, a DS family is still counted.

    family_size

    The family size, i.e. the number of read pairs grouped together into a family.

    cs_count

    The count of families with size == family_size when grouping just by coordinates and strand information.

    cs_fraction

    The fraction of all _CS_ families where size == family_size.

    cs_fraction_gt_or_eq_size

    The fraction of all _CS_ families where size >= family_size.

    ss_count

    The count of families with size == family_size when also grouping by UMI to create single-strand families.

    ss_fraction

    The fraction of all _SS_ families where size == family_size.

    ss_fraction_gt_or_eq_size

    The fraction of all _SS_ families where size >= family_size.

    ds_count

    The count of families with size == family_size when also grouping by UMI and merging single-strand families from opposite strands of the same source molecule.

    ds_fraction

    The fraction of all _DS_ families where size == family_size.

    ds_fraction_gt_or_eq_size

    The fraction of all _DS_ families where size >= family_size.
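    The relationship between the count, fraction and fraction_gt_or_eq_size fields is the same for CS, SS and DS families, and can be sketched for a single family kind as below. `FamilySizeSketch` and `SizeRow` are hypothetical names, and the input is assumed to be a histogram of family size to number of families; whether the real output also includes rows for sizes with zero families is not covered here.

```scala
object FamilySizeSketch {
  /** Hypothetical row for one family kind; the real metric carries CS, SS and DS columns together. */
  case class SizeRow(familySize: Int, count: Long, fraction: Double, fractionGtOrEqSize: Double)

  /** Derives fraction and cumulative fraction from a (family size -> family count) histogram. */
  def rows(sizeCounts: Map[Int, Long]): Seq[SizeRow] = {
    val total  = sizeCounts.values.sum.toDouble
    val sorted = sizeCounts.toSeq.sortBy(_._1)
    // Running totals from the largest size downwards: families with size >= each size.
    val atLeast = sorted.map(_._2).scanRight(0L)(_ + _)
    sorted.zip(atLeast).map { case ((size, count), cumulative) =>
      SizeRow(size, count, count / total, cumulative / total)
    }
  }
}
```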

  5. case class UmiMetric (umi: String, raw_observations: Count = 0, raw_observations_with_errors: Count = 0, unique_observations: Count = 0, fraction_raw_observations: Proportion = 0, fraction_unique_observations: Proportion = 0) extends Metric with Product with Serializable

    Metrics produced by CollectDuplexSeqMetrics describing the set of observed UMI sequences and the frequency of their observations. The UMI sequences reported may have been corrected using information within a double-stranded tag family. For example, if a tag family consists of three read pairs with UMIs ACGT-TGGT, ACGT-TGGT, and ACGT-TGGG, then a consensus UMI of ACGT-TGGT will be generated, three raw observations will be counted for each of ACGT and TGGT, and no observations will be counted for TGGG.

    umi

    The UMI sequence, possibly corrected.

    raw_observations

    The number of read pairs in the input BAM that observe the UMI (after correction).

    raw_observations_with_errors

    The subset of raw observations that underwent any correction.

    unique_observations

    The number of double-stranded tag families (i.e. unique double-stranded molecules) that observed the UMI.

    fraction_raw_observations

    The fraction of all raw observations that the UMI accounts for.

    fraction_unique_observations

    The fraction of all unique observations that the UMI accounts for.
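    The counting rules in the example for this metric can be sketched as follows. This is an illustration only: `UmiCountSketch`, `UmiCounts` and `tallyFamily` are hypothetical names, the consensus duplex UMI is taken as given, and counting one unique observation per UMI position per family is an assumption.

```scala
object UmiCountSketch {
  /** Hypothetical per-UMI tallies mirroring raw_observations, raw_observations_with_errors
    * and unique_observations. */
  case class UmiCounts(raw: Long = 0, rawWithErrors: Long = 0, unique: Long = 0)

  /** Adds one tag family to the running tallies.
    * `observed` holds the raw duplex UMIs; `consensus` is the corrected duplex UMI. */
  def tallyFamily(observed: Seq[String], consensus: String,
                  acc: Map[String, UmiCounts] = Map.empty): Map[String, UmiCounts] = {
    val consensusParts = consensus.split("-")
    // Every raw UMI is credited to the corrected UMI at the same position in the pair.
    val afterRaw = observed.foldLeft(acc) { (m, duplexUmi) =>
      duplexUmi.split("-").zip(consensusParts).foldLeft(m) { case (m2, (raw, corrected)) =>
        val c   = m2.getOrElse(corrected, UmiCounts())
        val err = if (raw != corrected) 1 else 0
        m2.updated(corrected, c.copy(raw = c.raw + 1, rawWithErrors = c.rawWithErrors + err))
      }
    }
    // The family contributes one unique observation for each UMI position in the consensus.
    consensusParts.foldLeft(afterRaw) { (m, umi) =>
      val c = m.getOrElse(umi, UmiCounts())
      m.updated(umi, c.copy(unique = c.unique + 1))
    }
  }
}
```

    Applied to the three read pairs from the example, this yields three raw observations each for ACGT and TGGT, one corrected observation for TGGT, and no entry for TGGG.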

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. val DuplexFamilySizeMetricsExt: String
  5. val DuplexUmiMetricsExt: String
  6. val FamilySizeMetricsExt: String
  7. val PlotsExt: String
  8. val UmiMetricsExt: String
  9. val YieldMetricsExt: String
  10. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  11. def clone(): AnyRef
    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  12. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  13. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  14. def finalize(): Unit
    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  15. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
  16. def hashCode(): Int
    Definition Classes
    AnyRef → Any
  17. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  18. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  19. final def notify(): Unit
    Definition Classes
    AnyRef
  20. final def notifyAll(): Unit
    Definition Classes
    AnyRef
  21. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  22. def toString(): String
    Definition Classes
    AnyRef → Any
  23. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  24. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  25. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
