@DocumentedFeature @BetaFeature public class CollectIndependentReplicateMetrics extends CommandLineProgram
The estimation is based on duplicate-sets of size 2 and 3 and gives separate estimates from each. The assumption is that the duplication rate (biological or otherwise) is independent of the duplicate-set size. A significant difference between the two rates may be an indication that this assumption is incorrect.
The duplicate sets are found using the mate-cigar tag (MC), which is added by MergeBamAlignment or FixMateInformation.
This program will not work without the MC tag.
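Because the tool depends on the MC tag, it can be useful to check a few records before running it. The following is a minimal sketch that works on plain-text SAM lines only; the class and method names (`MateCigarCheck`, `hasMateCigar`) are hypothetical and are not part of Picard. It relies only on the SAM convention that optional `TAG:TYPE:VALUE` fields start at column 12 and that MC is a string (`Z`) tag.

```java
import java.util.Arrays;

public class MateCigarCheck {

    /**
     * Returns true if a tab-separated SAM record line carries an MC:Z: optional field.
     * Hypothetical helper for illustration; not part of Picard.
     */
    public static boolean hasMateCigar(String samLine) {
        String[] fields = samLine.split("\t");
        // The first 11 columns are mandatory; optional fields start at column 12.
        return Arrays.stream(fields).skip(11).anyMatch(f -> f.startsWith("MC:Z:"));
    }

    public static void main(String[] args) {
        String withMc = "r1\t99\tchr1\t100\t60\t5M\t=\t200\t105\tACGTA\tIIIII\tMC:Z:5M";
        String withoutMc = "r2\t99\tchr1\t100\t60\t5M\t=\t200\t105\tACGTA\tIIIII\tNM:i:0";
        System.out.println(hasMateCigar(withMc));
        System.out.println(hasMateCigar(withoutMc));
    }
}
```

If records lack the tag, running MergeBamAlignment or FixMateInformation first, as noted above, should add it.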
An explanation of the calculation behind the estimation can be found in the IndependentReplicateMetric class.
The calculation assumes a diploid organism (more accurately, it assumes that only two alleles can appear at a HET site and that these two alleles appear with equal probability). It requires as input a VCF with genotypes for the sample in question. NOTE: This class is very much in alpha stage and still under heavy development (feel free to join!).
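To illustrate why the equal-probability assumption matters, here is a simplified sketch for duplicate sets of size 2. It is an assumption-based illustration, not necessarily the exact calculation in IndependentReplicateMetric, and it ignores sequencing error. If a fraction p of the size-2 sets are independent replicates, each such set shows both alleles at a HET site with probability 1/2, while true duplicates always show the same allele. The observed fraction h of sets showing both alleles is then h = p/2, so p can be estimated as 2h. The class and method names below are hypothetical.

```java
public class ReplicateRateSketch {

    /**
     * Estimates the independent-replicate rate from duplicate sets of size 2
     * observed at HET sites, under the two-allele, equal-probability assumption.
     * Hypothetical helper for illustration; not part of Picard.
     *
     * @param totalPairs           number of size-2 duplicate sets examined
     * @param pairsWithBothAlleles number of those sets whose two reads carry different alleles
     */
    public static double independentRateFromPairs(long totalPairs, long pairsWithBothAlleles) {
        if (totalPairs <= 0) {
            throw new IllegalArgumentException("totalPairs must be positive");
        }
        if (pairsWithBothAlleles < 0 || pairsWithBothAlleles > totalPairs) {
            throw new IllegalArgumentException("pairsWithBothAlleles must be in [0, totalPairs]");
        }
        // Observed fraction of sets in which both alleles appear.
        double heterogeneity = (double) pairsWithBothAlleles / totalPairs;
        // Independent replicates differ only half the time, so double the observed fraction.
        return 2.0 * heterogeneity;
    }

    public static void main(String[] args) {
        System.out.println(independentRateFromPairs(1000, 50));
    }
}
```

The estimate is not clamped, so an observed fraction above one half yields a value greater than 1, which would itself suggest that the assumptions do not hold for the data.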
Modifier and Type | Field and Description |
---|---|
java.lang.String | BARCODE_BQ |
java.lang.String | BARCODE_TAG |
java.io.File | INPUT |
java.io.File | MATRIX_OUTPUT |
java.lang.Integer | MINIMUM_BARCODE_BQ |
java.lang.Integer | MINIMUM_BQ |
java.lang.Integer | MINIMUM_GQ |
java.lang.Integer | MINIMUM_MQ |
java.io.File | OUTPUT |
java.lang.String | SAMPLE |
java.lang.Integer | STOP_AFTER |
java.io.File | VCF |
Fields inherited from class CommandLineProgram: COMPRESSION_LEVEL, CREATE_INDEX, CREATE_MD5_FILE, GA4GH_CLIENT_SECRETS, MAX_RECORDS_IN_RAM, QUIET, REFERENCE_SEQUENCE, referenceSequence, specialArgumentsCollection, TMP_DIR, USE_JDK_DEFLATER, USE_JDK_INFLATER, VALIDATION_STRINGENCY, VERBOSITY
Constructor and Description |
---|
CollectIndependentReplicateMetrics() |
Modifier and Type | Method and Description |
---|---|
protected int | doWork() Do the work after command line has been parsed. |
Methods inherited from class CommandLineProgram: customCommandLineValidation, getCommandLine, getCommandLineParser, getDefaultHeaders, getFaqLink, getMetricsFile, getStandardUsagePreamble, getStandardUsagePreamble, getVersion, hasWebDocumentation, instanceMain, instanceMainWithExit, makeReferenceArgumentCollection, parseArgs, requiresReference, setDefaultHeaders, useLegacyParser
@Argument(shortName="I", doc="Input (indexed) BAM file.") public java.io.File INPUT
@Argument(shortName="O", doc="Write metrics to this file") public java.io.File OUTPUT
@Argument(shortName="MO", doc="Write the confusion matrix (of UMIs) to this file", optional=true) public java.io.File MATRIX_OUTPUT
@Argument(shortName="V", doc="Input VCF file") public java.io.File VCF
@Argument(shortName="GQ", doc="minimal value for the GQ field in the VCF to use variant site.", optional=true) public java.lang.Integer MINIMUM_GQ
@Argument(shortName="MQ", doc="minimal value for the mapping quality of the reads to be used in the estimation.", optional=true) public java.lang.Integer MINIMUM_MQ
@Argument(shortName="BQ", doc="minimal value for the base quality of a base to be used in the estimation.", optional=true) public java.lang.Integer MINIMUM_BQ
@Argument(shortName="ALIAS", doc="Name of sample to look at in VCF. Can be omitted if VCF contains only one sample.", optional=true) public java.lang.String SAMPLE
@Argument(doc="Number of sets to examine before stopping.", optional=true) public java.lang.Integer STOP_AFTER
@Argument(doc="Barcode SAM tag.", optional=true) public java.lang.String BARCODE_TAG
@Argument(doc="Barcode Quality SAM tag.", optional=true) public java.lang.String BARCODE_BQ
@Argument(shortName="MBQ", doc="minimal value for the base quality of all the bases in a molecular barcode, for it to be used.", optional=true) public java.lang.Integer MINIMUM_BARCODE_BQ
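The MINIMUM_BARCODE_BQ description says that all bases of a molecular barcode must meet the threshold for the barcode to be used. A minimal sketch of one reading of that rule follows. It assumes the barcode quality string is Phred+33 encoded, as SAM quality strings are, and that "all the bases" means every base must be at or above the threshold. The class and method names are hypothetical, and the tool's actual filtering may differ.

```java
public class BarcodeQualitySketch {

    /**
     * Returns true if every base quality in a Phred+33 encoded string is at least
     * the given minimum. Hypothetical helper for illustration; not part of Picard.
     * An empty quality string passes trivially.
     */
    public static boolean passesMinimumQuality(String phred33Qualities, int minimumBarcodeBq) {
        for (int i = 0; i < phred33Qualities.length(); i++) {
            // Phred+33: the quality score is the character code minus 33.
            int quality = phred33Qualities.charAt(i) - 33;
            if (quality < minimumBarcodeBq) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // 'I' encodes Q40 and '#' encodes Q2.
        System.out.println(passesMinimumQuality("IIII", 30));
        System.out.println(passesMinimumQuality("II#I", 30));
    }
}
```

Under this reading, a single low-quality base is enough to exclude the whole barcode.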
protected int doWork()
Specified by: doWork in class CommandLineProgram