package org.bdgenomics.qc.cli

Type Members

  1. class CompareADAM extends BDGSparkCommand[CompareADAMArgs] with Serializable

  2. class CompareADAMArgs extends Args4jBase with ParquetArgs with Serializable

  3. class FindReads extends BDGSparkCommand[FindReadsArgs] with Serializable

  4. class FindReadsArgs extends Args4jBase with ParquetArgs with Serializable

  5. class GenotypeConcordance extends BDGSparkCommand[GenotypeConcordanceArgs] with Logging

  6. class GenotypeConcordanceArgs extends Args4jBase with ParquetArgs

  7. class SummarizeGenotypes extends BDGSparkCommand[SummarizeGenotypesArgs] with Logging

  8. class SummarizeGenotypesArgs extends Args4jBase with ParquetArgs

Value Members

  1. object CompareADAM extends BDGCommandCompanion with Serializable

    CompareADAM is a tool for pairwise comparison of ADAM files (or merged sets of ADAM files, see the note on the -recurse{1,2} optional parameters, below).

    The canonical use-case for CompareADAM involves a single input file run through (for example) two different implementations of the same pipeline, producing two comparable ADAM files at the end.

    CompareADAM loads these ADAM files and performs an equi-join on read name. It then computes one or more metrics (embodied as BucketComparisons values), as specified on the command line, across the joined records. Each metric is aggregated into a histogram (other aggregations could be added in the future), and the resulting histograms are written as text files to a specified directory.

    There is an R script in the adam-scripts module to process those outputs into a figure.

    The available metrics to be calculated are defined, by name, in the DefaultComparisons object.

    A subsequent tool like FindReads can be used to track down which reads give rise to particular aggregated bins in the output histograms, if further diagnosis is needed.
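The join-and-aggregate flow described for CompareADAM can be illustrated with a minimal sketch on plain Scala collections. This is not the tool's implementation: the `Read` case class, the `positionDelta` metric, and the `CompareSketch` object are hypothetical names introduced here, and the real command runs on Spark against ADAM records with metrics defined as BucketComparisons.

```scala
// Hypothetical, simplified read record; this is NOT the ADAM schema.
final case class Read(name: String, contig: String, start: Long)

object CompareSketch {
  // Equi-join two read sets on read name, keeping only names present in both.
  def joinByName(a: Seq[Read], b: Seq[Read]): Seq[(Read, Read)] = {
    val bByName: Map[String, Seq[Read]] = b.groupBy(_.name)
    for {
      ra <- a
      rb <- bByName.getOrElse(ra.name, Seq.empty)
    } yield (ra, rb)
  }

  // Hypothetical metric: absolute difference in start position,
  // or -1 when the two reads are aligned to different contigs.
  def positionDelta(pair: (Read, Read)): Long = {
    val (x, y) = pair
    if (x.contig == y.contig) math.abs(x.start - y.start) else -1L
  }

  // Aggregate a metric over the joined pairs into a histogram (value -> count).
  def histogram[T](pairs: Seq[(Read, Read)])(metric: ((Read, Read)) => T): Map[T, Int] =
    pairs.groupBy(metric).map { case (value, group) => value -> group.size }
}
```

Calling `CompareSketch.histogram(CompareSketch.joinByName(left, right))(CompareSketch.positionDelta)` on two small read sets yields one count per distinct metric value, which mirrors the per-metric histograms that CompareADAM writes out.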

  2. object FindReads extends BDGCommandCompanion with Serializable

    FindReads is an auxiliary command to CompareADAM. CompareADAM takes two ADAM files (which presumably contain the same reads, processed differently), joins them on read name, and computes aggregates of one or more metrics (BucketComparisons) across the joined reads. FindReads performs the same join and metric computation but takes a second argument as well: a boolean condition on the value(s) of the computed metric(s). FindReads then outputs just the names of the reads whose metric value(s) meet the given condition.

    So, for example, CompareADAM might be used to find out that the same reads processed through two different pipelines are aligned to different locations 3% of the time.

    FindReads would then allow you to output the names of those 3% of the reads which are aligned differently, using a filter expression like "positions!=0".
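To make the idea of a filter expression such as "positions!=0" concrete, here is a small sketch of parsing a condition and selecting read names by it. The grammar accepted by FindReads is not documented on this page, so the `metric op value` form, the supported operators, and the `FilterSketch`, `Filter`, `parse`, and `matchingReads` names are all assumptions made for illustration.

```scala
object FilterSketch {
  // Hypothetical parsed form of an expression such as "positions!=0".
  final case class Filter(metric: String, op: String, value: Long) {
    // True when the observed metric value satisfies the condition.
    def matches(observed: Long): Boolean = op match {
      case "!=" => observed != value
      case "="  => observed == value
      case "<"  => observed < value
      case ">"  => observed > value
    }
  }

  // Assumed grammar: <metric name> <operator> <integer>, with optional whitespace.
  private val Pattern = """\s*(\w+)\s*(!=|=|<|>)\s*(-?\d+)\s*""".r

  // Returns None when the expression does not fit the assumed grammar.
  def parse(expr: String): Option[Filter] = expr match {
    case Pattern(metric, op, value) => Some(Filter(metric, op, value.toLong))
    case _                          => None
  }

  // metricValues maps read name -> (metric name -> computed value).
  // Returns the sorted names of reads whose value meets the condition.
  def matchingReads(filter: Filter,
                    metricValues: Map[String, Map[String, Long]]): Seq[String] =
    metricValues.collect {
      case (name, metrics) if metrics.get(filter.metric).exists(filter.matches) => name
    }.toSeq.sorted
}
```

With per-read values for a "positions" metric, parsing "positions!=0" and passing the result to `matchingReads` keeps only the reads whose value is nonzero, which corresponds to the differently aligned reads in the example above.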

  3. object GenotypeConcordance extends BDGCommandCompanion

  4. object QCMain extends Logging

  5. object SummarizeGenotypes extends BDGCommandCompanion
