@DocumentedFeature public class MarkDuplicatesWithMateCigar extends AbstractMarkDuplicatesCommandLineProgram
Nested classes/interfaces inherited from class AbstractMarkDuplicatesCommandLineProgram: AbstractMarkDuplicatesCommandLineProgram.SamHeaderAndIterator
Modifier and Type | Field and Description |
---|---|
int | BLOCK_SIZE |
int | MINIMUM_DISTANCE |
Fields inherited from class AbstractMarkDuplicatesCommandLineProgram: ASSUME_SORT_ORDER, ASSUME_SORTED, COMMENT, DUPLICATE_SCORING_STRATEGY, INPUT, METRICS_FILE, OUTPUT, pgIdsSeen, pgTagArgumentCollection, PROGRAM_GROUP_COMMAND_LINE, PROGRAM_GROUP_NAME, PROGRAM_GROUP_VERSION, PROGRAM_RECORD_ID, REMOVE_DUPLICATES
Fields inherited from class AbstractOpticalDuplicateFinderCommandLineProgram: LOG, MAX_OPTICAL_DUPLICATE_SET_SIZE, OPTICAL_DUPLICATE_PIXEL_DISTANCE, opticalDuplicateFinder, READ_NAME_REGEX
Fields inherited from class CommandLineProgram: COMPRESSION_LEVEL, CREATE_INDEX, CREATE_MD5_FILE, GA4GH_CLIENT_SECRETS, MAX_ALLOWABLE_ONE_LINE_SUMMARY_LENGTH, MAX_RECORDS_IN_RAM, QUIET, REFERENCE_SEQUENCE, referenceSequence, specialArgumentsCollection, TMP_DIR, USE_JDK_DEFLATER, USE_JDK_INFLATER, VALIDATION_STRINGENCY, VERBOSITY
Constructor and Description |
---|
MarkDuplicatesWithMateCigar() |
Modifier and Type | Method and Description |
---|---|
protected int | doWork() Main work method. |
Methods inherited from class AbstractMarkDuplicatesCommandLineProgram: addDuplicateReadToMetrics, addReadToLibraryMetrics, addSingletonToCount, finalizeAndWriteMetrics, getChainedPgIds, openInputs, trackOpticalDuplicates
Methods inherited from class AbstractOpticalDuplicateFinderCommandLineProgram: customCommandLineValidation, setupOpticalDuplicateFinder
Methods inherited from class CommandLineProgram: checkRInstallation, getCommandLine, getCommandLineParser, getCommandLineParserForArgs, getDefaultHeaders, getFaqLink, getMetricsFile, getPGRecord, getStandardUsagePreamble, getStandardUsagePreamble, getVersion, hasWebDocumentation, instanceMain, instanceMainWithExit, makeReferenceArgumentCollection, parseArgs, requiresReference, setDefaultHeaders, useLegacyParser
@Argument(doc="The minimum distance to buffer records to account for clipping on the 5\' end of the records. For a given alignment, this parameter controls the width of the window to search for duplicates of that alignment. Due to 5\' read clipping, duplicates do not necessarily have the same 5\' alignment coordinates, so the algorithm needs to search around the neighborhood. For single end sequencing data, the neighborhood is only determined by the amount of clipping (assuming no split reads), thus setting MINIMUM_DISTANCE to twice the sequencing read length should be sufficient. For paired end sequencing, the neighborhood is also determined by the fragment insert size, so you may want to set MINIMUM_DISTANCE to something like twice the 99.5% percentile of the fragment insert size distribution (see CollectInsertSizeMetrics). Or you can set this number to -1 to use either a) twice the first read\'s read length, or b) 100, whichever is smaller. Note that the larger the window, the greater the RAM requirements, so you could run into performance limitations if you use a value that is unnecessarily large.", optional=true) public int MINIMUM_DISTANCE
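The sizing guidance above can be sketched as a small helper. This is an illustration only, not part of Picard: the class `MinimumDistanceSketch` and its methods are hypothetical names, and the nearest-rank percentile is an assumed stand-in for the value you would normally read from the CollectInsertSizeMetrics output.

```java
import java.util.Arrays;

// Hypothetical helper (not a Picard class): suggests a MINIMUM_DISTANCE value
// following the rules of thumb in the argument documentation.
public class MinimumDistanceSketch {

    // Single-end data: twice the sequencing read length.
    static int forSingleEnd(int readLength) {
        return 2 * readLength;
    }

    // Paired-end data: twice the 99.5th percentile of the insert sizes.
    // Assumption: the percentile is computed with the nearest-rank method.
    static int forPairedEnd(int[] insertSizes) {
        if (insertSizes.length == 0) {
            throw new IllegalArgumentException("no insert sizes given");
        }
        int[] sorted = insertSizes.clone();
        Arrays.sort(sorted);
        // 1-based nearest rank, clamped so the index stays in bounds.
        int rank = (int) Math.ceil(0.995 * sorted.length);
        rank = Math.min(Math.max(rank, 1), sorted.length);
        return 2 * sorted[rank - 1];
    }

    public static void main(String[] args) {
        // Example values are made up for illustration.
        System.out.println("MINIMUM_DISTANCE=" + forSingleEnd(150));
        int[] inserts = {280, 300, 310, 295, 305, 900};
        System.out.println("MINIMUM_DISTANCE=" + forPairedEnd(inserts));
    }
}
```

The result would then be passed to the tool as the MINIMUM_DISTANCE argument. Because a larger window increases RAM use, it may be worth checking whether a few extreme insert sizes are inflating the percentile before adopting the suggested value.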
@Argument(doc="The block size for use in the coordinate-sorted record buffer.", optional=true) public int BLOCK_SIZE
protected int doWork()
Specified by: doWork in class CommandLineProgram