public abstract class DuplicateSetWalker extends ReadWalker
SAMTag.MI tag to be specific) with FGBio GroupReadsByUmi:
http://fulcrumgenomics.github.io/fgbio/tools/latest/GroupReadsByUmi.html

Modifier and Type | Field and Description |
---|---|
protected ReadsWithSameUMI | currentReadsWithSameUMI |
static java.lang.String | MIN_REQUIRED_READS_NAME |
static java.lang.String | MIN_REQUIRED_READS_PER_STRAND_NAME |
FEATURE_CACHE_LOOKAHEAD
addOutputSAMProgramRecord, addOutputVCFCommandLine, cloudIndexPrefetchBuffer, cloudPrefetchBuffer, createOutputBamIndex, createOutputBamMD5, createOutputVariantIndex, createOutputVariantMD5, disableBamIndexCaching, features, intervalArgumentCollection, lenientVCFProcessing, outputSitesOnlyVCFs, progressMeter, readArguments, referenceArguments, SECONDS_BETWEEN_PROGRESS_UPDATES_NAME, seqValidationArguments
GATK_CONFIG_FILE, logger, NIO_MAX_REOPENS, NIO_PROJECT_FOR_REQUESTER_PAYS, QUIET, specialArgumentsCollection, tmpDir, useJdkDeflater, useJdkInflater, VERBOSITY
Constructor and Description |
---|
DuplicateSetWalker() |
Modifier and Type | Method and Description |
---|---|
void | apply(GATKRead read, ReferenceContext referenceContext, FeatureContext featureContext) - FGBio GroupReadsByUmi returns reads sorted by molecule ID: For example, the input bam may look like read1: ... |
abstract void | apply(ReadsWithSameUMI readsWithSameUMI, ReferenceContext referenceContext, FeatureContext featureContext) - A subclass must specify how to process the duplicate sets by overriding this method. |
java.util.List<ReadFilter> | getDefaultReadFilters() - Returns the default list of CommandLineReadFilters that are used for this tool. |
protected boolean | rejectSet(ReadsWithSameUMI readsWithSameUMI) - Returns true for duplicate sets that do not meet the required criteria for further processing. |
void | traverse() - A complete traversal from start to finish. |
getProgressMeterRecordLabel, onShutdown, onStartup, requiresReads, resetReadsDataSource
directlyAccessEngineFeatureManager, directlyAccessEngineReadsDataSource, directlyAccessEngineReferenceDataSource
addFeatureInputsAfterInitialization, bamIndexCachingShouldBeEnabled, closeTool, createSAMWriter, createVCFWriter, createVCFWriter, createVCFWriter, disableProgressMeter, doWork, getBestAvailableSequenceDictionary, getDefaultCloudIndexPrefetchBufferSize, getDefaultCloudPrefetchBufferSize, getDefaultToolVCFHeaderLines, getDefaultVariantAnnotationGroups, getDefaultVariantAnnotations, getGenomicsDBOptions, getHeaderForFeatures, getHeaderForReads, getHeaderForSAMWriter, getMasterSequenceDictionary, getPluginDescriptors, getReferenceDictionary, getSequenceDictionaryValidationArgumentCollection, getToolName, getTransformedReadStream, getTraversalIntervals, hasFeatures, hasReads, hasReference, hasUserSuppliedIntervals, initializeProgressMeter, makePostReadFilterTransformer, makePreReadFilterTransformer, makeReadFilter, makeSamReaderFactory, makeVariantAnnotations, onTraversalStart, onTraversalSuccess, requiresFeatures, requiresIntervals, requiresReference, transformTraversalIntervals, useVariantAnnotations
customCommandLineValidation, getCommandLine, getCommandLineParser, getDefaultHeaders, getMetricsFile, getSupportInformation, getToolkitName, getToolkitShortName, getToolStatusWarning, getUsage, getVersion, instanceMain, instanceMainPostParseArgs, isBetaFeature, isExperimentalFeature, parseArgs, printLibraryVersions, printSettings, printStartupMessage, runTool, setDefaultHeaders, warnOnToolStatus
public static final java.lang.String MIN_REQUIRED_READS_NAME
public static final java.lang.String MIN_REQUIRED_READS_PER_STRAND_NAME
protected ReadsWithSameUMI currentReadsWithSameUMI
public final void traverse()
Description copied from class: ReadWalker
Creates filters using GATKTool.makeReadFilter() and transformers using GATKTool.makePreReadFilterTransformer() and GATKTool.makePostReadFilterTransformer(), then iterates over all reads, applies the pre-filter transformer, the filter, then the post-filter transformer, and hands the resulting reads to the ReadWalker.apply(org.broadinstitute.hellbender.utils.read.GATKRead, org.broadinstitute.hellbender.engine.ReferenceContext, org.broadinstitute.hellbender.engine.FeatureContext) function of the walker (along with additional contextual information, if present, such as reference bases).
NOTE: You should only override ReadWalker.traverse() if you are writing a new walker base class in the engine package that extends this class. It is not meant to be overridden by tools outside of the engine package.
Overrides: traverse in class ReadWalker
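The order of operations described for traverse() (pre-filter transformer, then filter, then post-filter transformer, then apply) can be illustrated with a minimal, self-contained sketch. It does not use the GATK classes: `SketchRead`, `TraversalSketch`, and the functional-interface parameters are hypothetical stand-ins chosen for this example, and the real engine also supplies reference and feature context that is omitted here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

// Hypothetical illustration only; not the GATK implementation.
final class TraversalSketch {
    // Applies the pre-filter transformer, the filter, then the post-filter
    // transformer, and hands each surviving read to the apply callback.
    static void traverse(List<SketchRead> reads,
                         UnaryOperator<SketchRead> preFilterTransformer,
                         Predicate<SketchRead> filter,
                         UnaryOperator<SketchRead> postFilterTransformer,
                         Consumer<SketchRead> apply) {
        for (SketchRead read : reads) {
            SketchRead transformed = preFilterTransformer.apply(read);
            if (!filter.test(transformed)) {
                continue; // filtered reads never reach apply
            }
            apply.accept(postFilterTransformer.apply(transformed));
        }
    }

    // Runs the pipeline on three toy reads and returns the names seen by apply.
    static List<String> demo() {
        List<SketchRead> reads = List.of(
                new SketchRead("read1", 60),
                new SketchRead("read2", 0),
                new SketchRead("read3", 30));
        List<String> seen = new ArrayList<>();
        traverse(reads,
                UnaryOperator.identity(),
                r -> r.mappingQuality > 0,
                r -> new SketchRead(r.name + "-transformed", r.mappingQuality),
                r -> seen.add(r.name));
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}

// Stand-in for GATKRead: just a name and a mapping quality.
final class SketchRead {
    final String name;
    final int mappingQuality;

    SketchRead(String name, int mappingQuality) {
        this.name = name;
        this.mappingQuality = mappingQuality;
    }
}
```

In this sketch `read2` is dropped by the filter, so only `read1` and `read3` reach the callback, each after the post-filter transformer has run.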
public final void apply(GATKRead read, ReferenceContext referenceContext, FeatureContext featureContext)
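Because FGBio GroupReadsByUmi returns reads sorted by molecule ID, this method collects reads until the molecule ID changes, passes the collected set to the other apply method, processes the set based on the child class's implementation of that method, then clears the currentReadsWithSameUMI variable and starts collecting reads again.
Notice there are two apply() methods in this class:
This apply(), inherited from ReadWalker, is marked final so that subclasses cannot override it.
A subclass must override the other apply() method, which takes in the ReadsWithSameUMI.
Overrides: apply in class ReadWalker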
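The relationship between the two apply() methods can be sketched as follows: a per-read method accumulates reads while the molecule ID stays the same, and hands the finished set to a set-level handler when the ID changes. This is a simplified sketch under stated assumptions, not the GATK source. `TaggedRead`, `DuplicateSetSketch`, the plain-string molecule ID, the `finish()` call that flushes the last set, and the single minimum-reads threshold in `rejectSet` are all hypothetical; the actual rejection criteria are only hinted at by the MIN_REQUIRED_READS_NAME and MIN_REQUIRED_READS_PER_STRAND_NAME arguments.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration only; not the GATK implementation.
final class DuplicateSetSketch {
    private final int minRequiredReads;                // assumed rejection threshold
    private final Consumer<List<TaggedRead>> handler;  // stands in for the abstract apply()
    private List<TaggedRead> current = new ArrayList<>();

    DuplicateSetSketch(int minRequiredReads, Consumer<List<TaggedRead>> handler) {
        this.minRequiredReads = minRequiredReads;
        this.handler = handler;
    }

    // Per-read step: input is assumed to be sorted by molecule ID, so a change
    // of ID means the current set is complete.
    void apply(TaggedRead read) {
        if (!current.isEmpty() && !current.get(0).moleculeId.equals(read.moleculeId)) {
            flush();
        }
        current.add(read);
    }

    // Must be called once after the last read so the final set is not lost.
    void finish() {
        flush();
    }

    // Returns true for sets that should be skipped (assumed criterion: too few reads).
    boolean rejectSet(List<TaggedRead> set) {
        return set.size() < minRequiredReads;
    }

    // Hands the completed set to the handler unless rejected, then starts a new set.
    private void flush() {
        if (!current.isEmpty() && !rejectSet(current)) {
            handler.accept(current);
        }
        current = new ArrayList<>();
    }

    // Feeds six toy reads from three molecules through the sketch.
    static List<String> demo() {
        List<String> processed = new ArrayList<>();
        DuplicateSetSketch walker = new DuplicateSetSketch(
                2, set -> processed.add(set.get(0).moleculeId + ":" + set.size()));
        List<TaggedRead> reads = List.of(
                new TaggedRead("read1", "0"), new TaggedRead("read2", "0"),
                new TaggedRead("read3", "1"),
                new TaggedRead("read4", "2"), new TaggedRead("read5", "2"),
                new TaggedRead("read6", "2"));
        for (TaggedRead read : reads) {
            walker.apply(read);
        }
        walker.finish();
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}

// Stand-in for a read carrying a molecule ID (the value of the MI tag).
final class TaggedRead {
    final String name;
    final String moleculeId;

    TaggedRead(String name, String moleculeId) {
        this.name = name;
        this.moleculeId = moleculeId;
    }
}
```

With a threshold of two reads, molecule `1` (a single read) is rejected, while molecules `0` and `2` reach the handler with two and three reads respectively.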
read - current read
referenceContext - Reference bases spanning the current read. Will be an empty, but non-null, context object if there is no backing source of reference data (in which case all queries on it will return an empty array/iterator). Can request extra bases of context around the current read's interval by invoking ReferenceContext.setWindow(int, int) on this object before calling ReferenceContext.getBases()
featureContext - Features spanning the current read. Will be an empty, but non-null, context object if there is no backing source of Feature data (in which case all queries on it will return an empty List).

public abstract void apply(ReadsWithSameUMI readsWithSameUMI, ReferenceContext referenceContext, FeatureContext featureContext)
readsWithSameUMI - A set of reads with matching UMIs and the same fragment start and end
referenceContext - A reference context object over the intervals determined by the duplicate set.
featureContext - Entries from a secondary feature file (e.g. vcf) if provided

protected boolean rejectSet(ReadsWithSameUMI readsWithSameUMI)
public java.util.List<ReadFilter> getDefaultReadFilters()
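Description copied from class: ReadWalker
Returns the default list of CommandLineReadFilters that are used for this tool. The default implementation uses the WellformedReadFilter filter with all default options. Subclasses can override to provide alternative filters.
Note: this method is called before command line parsing begins, and thus before a SAMFileHeader is available through getHeaderForReads().
Overrides: getDefaultReadFilters in class ReadWalker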
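The override pattern described for getDefaultReadFilters() can be sketched as below: a subclass starts from the parent's default list and appends its own filter. The types here are stand-ins, not GATK classes. `BaseWalkerSketch`, `CustomWalkerSketch`, and `FilterListSketch` are hypothetical names, and reads are modelled as plain name strings with `Predicate<String>` in place of `ReadFilter`. Note that neither filter consults a header, in keeping with the note that this method runs before command line parsing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration only; not the GATK implementation.
class BaseWalkerSketch {
    // Default list: a single sanity filter, standing in for WellformedReadFilter.
    List<Predicate<String>> getDefaultReadFilters() {
        List<Predicate<String>> filters = new ArrayList<>();
        filters.add(readName -> !readName.isEmpty());
        return filters;
    }
}

class CustomWalkerSketch extends BaseWalkerSketch {
    // Keeps the parent's defaults and appends one tool-specific filter.
    @Override
    List<Predicate<String>> getDefaultReadFilters() {
        List<Predicate<String>> filters = new ArrayList<>(super.getDefaultReadFilters());
        filters.add(readName -> !readName.startsWith("dup_")); // assumed toy criterion
        return filters;
    }
}

final class FilterListSketch {
    // A read passes only if every filter in the list accepts it.
    static boolean passesAll(List<Predicate<String>> filters, String readName) {
        return filters.stream().allMatch(filter -> filter.test(readName));
    }

    public static void main(String[] args) {
        List<Predicate<String>> filters = new CustomWalkerSketch().getDefaultReadFilters();
        System.out.println(passesAll(filters, "read1"));
        System.out.println(passesAll(filters, "dup_read2"));
    }
}
```

Copying the parent's list into a new `ArrayList` before appending keeps the sketch independent of whether the parent returns a mutable list.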