Class CollectSVEvidence
java.lang.Object
org.broadinstitute.hellbender.cmdline.CommandLineProgram
org.broadinstitute.hellbender.engine.GATKTool
org.broadinstitute.hellbender.engine.WalkerBase
org.broadinstitute.hellbender.engine.ReadWalker
org.broadinstitute.hellbender.tools.walkers.sv.CollectSVEvidence
- All Implemented Interfaces:
org.broadinstitute.barclay.argparser.CommandLinePluginProvider
Creates discordant read pair, split read evidence, site depth, and read depth files for use in the GATK-SV pipeline.
This tool emulates the functionality of the "svtk collect-pesr" command used in v1 of the GATK-SV pipeline.
The first output file, which should be named "*.pe.txt" or "*.pe.txt.gz", is a tab-delimited file
containing information on discordant read pairs in the input CRAM, with the following columns:
- read contig
- read start
- read strand
- mate contig
- mate start
- mate strand
- sample name
The split read evidence file (--sr-file) has the following columns:
- contig
- clipping position
- direction: side of the read that was clipped (either "left" or "right")
- count: the number of reads clipped at this location in this direction
- sample name
The site depth file (--sd-file) has the following columns:
- contig
- position
- sample name
- A observations
- C observations
- G observations
- T observations
The depth evidence file (--depth-evidence-file) has the following columns:
- contig
- starting position
- ending position
- read count
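As an illustration of the seven-column paired-end layout listed above, the following is a minimal sketch of reading one record. The class name PeRecord and its field names are hypothetical and not part of GATK. The sketch treats the strand columns as opaque strings, because their encoding is not stated here, and it makes no assumption about whether start positions are 0-based or 1-based.

```java
/** Hypothetical holder for one row of the paired-end evidence file; not a GATK class. */
final class PeRecord {
    final String readContig;
    final int readStart;
    final String readStrand;   // kept as text: the strand encoding is not specified here
    final String mateContig;
    final int mateStart;
    final String mateStrand;
    final String sampleName;

    private PeRecord(String[] f) {
        readContig = f[0];
        readStart = Integer.parseInt(f[1]);
        readStrand = f[2];
        mateContig = f[3];
        mateStart = Integer.parseInt(f[4]);
        mateStrand = f[5];
        sampleName = f[6];
    }

    /** Splits one tab-delimited line into the seven documented columns. */
    static PeRecord parse(String line) {
        String[] f = line.split("\t", -1); // -1 keeps trailing empty fields
        if (f.length != 7) {
            throw new IllegalArgumentException("expected 7 columns, found " + f.length);
        }
        return new PeRecord(f);
    }
}
```

A caller would pass each line of the decompressed file to PeRecord.parse; a malformed line fails fast with an exception rather than yielding a partial record.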
-
Nested Class Summary
Nested classes/interfaces inherited from class org.broadinstitute.hellbender.cmdline.CommandLineProgram
CommandLineProgram.AutoCloseableNoCheckedExceptions
-
Field Summary
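Fields
Modifier and Type    Field
static final String    COMPRESSION_LEVEL_ARGUMENT_LONG_NAME
static final String    DEPTH_EVIDENCE_INTERVALS_INPUT_FILE_ARGUMENT_LONG_NAME
static final String    DEPTH_EVIDENCE_INTERVALS_INPUT_FILE_ARGUMENT_SHORT_NAME
static final String    DEPTH_EVIDENCE_OUTPUT_FILE_ARGUMENT_LONG_NAME
static final String    DEPTH_EVIDENCE_OUTPUT_FILE_ARGUMENT_SHORT_NAME
static final String    DEPTH_EVIDENCE_SUMMARY_FILE_ARGUMENT_LONG_NAME
static final String    DEPTH_EVIDENCE_SUMMARY_FILE_ARGUMENT_SHORT_NAME
static final String    MIN_DEPTH_EVIDENCE_MAPQ_ARGUMENT_NAME
static final String    MIN_SITE_DEPTH_BASEQ_ARGUMENT_NAME
static final String    MIN_SITE_DEPTH_MAPQ_ARGUMENT_NAME
int    minDepthEvidenceMapQ
int    minMapQ
int    minQ
static final String    PAIRED_END_FILE_ARGUMENT_LONG_NAME
static final String    PAIRED_END_FILE_ARGUMENT_SHORT_NAME
static final String    SAMPLE_NAME_ARGUMENT_LONG_NAME
static final String    SITE_DEPTH_INPUT_ARGUMENT_LONG_NAME
static final String    SITE_DEPTH_INPUT_ARGUMENT_SHORT_NAME
static final String    SITE_DEPTH_OUTPUT_ARGUMENT_LONG_NAME
static final String    SITE_DEPTH_OUTPUT_ARGUMENT_SHORT_NAME
static final String    SPLIT_READ_FILE_ARGUMENT_LONG_NAME
static final String    SPLIT_READ_FILE_ARGUMENT_SHORT_NAME
GATKPath    peFile
GATKPath    srFile
GATKPath    siteDepthOutputFilename
GATKPath    siteDepthInputFilename
GATKPath    depthEvidenceOutputFilename
GATKPath    depthEvidenceSummaryFilename
GATKPath    depthEvidenceInputFilename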
Fields inherited from class org.broadinstitute.hellbender.engine.ReadWalker
FEATURE_CACHE_LOOKAHEAD
Fields inherited from class org.broadinstitute.hellbender.engine.GATKTool
addOutputSAMProgramRecord, addOutputVCFCommandLine, cloudIndexPrefetchBuffer, cloudPrefetchBuffer, createOutputBamIndex, createOutputBamMD5, createOutputVariantIndex, createOutputVariantMD5, disableBamIndexCaching, features, intervalArgumentCollection, lenientVCFProcessing, outputSitesOnlyVCFs, progressMeter, readArguments, referenceArguments, SECONDS_BETWEEN_PROGRESS_UPDATES_NAME, seqValidationArguments
Fields inherited from class org.broadinstitute.hellbender.cmdline.CommandLineProgram
GATK_CONFIG_FILE, logger, NIO_MAX_REOPENS, NIO_PROJECT_FOR_REQUESTER_PAYS, QUIET, specialArgumentsCollection, tmpDir, useJdkDeflater, useJdkInflater, VERBOSITY
-
Constructor Summary
Constructors
CollectSVEvidence()
-
Method Summary
Modifier and Type    Method    Description
void    apply(GATKRead read, ReferenceContext referenceContext, FeatureContext featureContext)    Process an individual read (with optional contextual information).
void    closeTool()    This method is called by the GATK framework at the end of the GATKTool.doWork() template method.
void    countSplitRead(GATKRead read, PriorityQueue<org.broadinstitute.hellbender.tools.walkers.sv.CollectSVEvidence.SplitPos> splitCounts, FeatureSink<SplitReadEvidence> srWriter)    Adds split read information about the current read to the counts in splitCounts.
List<ReadFilter>    getDefaultReadFilters()    Returns the default list of CommandLineReadFilters that are used for this tool.
org.broadinstitute.hellbender.tools.walkers.sv.CollectSVEvidence.DiscordantRead    getReportableDiscordantReadPair(GATKRead read, Set<String> observedDiscordantNamesAtThisLocus, htsjdk.samtools.SAMSequenceDictionary samSequenceDictionary)
void    onTraversalStart()    Operations performed just prior to the start of traversal.
Object    onTraversalSuccess()    Operations performed immediately after a successful traversal (i.e., when no uncaught exceptions were thrown during the traversal).
boolean    requiresReads()    Does this tool require reads? Traversal types and/or tools that do should override to return true.
Methods inherited from class org.broadinstitute.hellbender.engine.ReadWalker
getProgressMeterRecordLabel, onShutdown, onStartup, resetReadsDataSource, traverse
Methods inherited from class org.broadinstitute.hellbender.engine.WalkerBase
directlyAccessEngineFeatureManager, directlyAccessEngineReadsDataSource, directlyAccessEngineReferenceDataSource
Methods inherited from class org.broadinstitute.hellbender.engine.GATKTool
addFeatureInputsAfterInitialization, bamIndexCachingShouldBeEnabled, createSAMWriter, createVCFWriter, createVCFWriter, createVCFWriter, disableProgressMeter, doWork, getBestAvailableSequenceDictionary, getDefaultCloudIndexPrefetchBufferSize, getDefaultCloudPrefetchBufferSize, getDefaultToolVCFHeaderLines, getDefaultVariantAnnotationGroups, getDefaultVariantAnnotations, getGenomicsDBOptions, getHeaderForFeatures, getHeaderForReads, getHeaderForSAMWriter, getMasterSequenceDictionary, getPluginDescriptors, getReferenceDictionary, getSequenceDictionaryValidationArgumentCollection, getToolName, getTransformedReadStream, getTraversalIntervals, getUserSuppliedIntervals, hasFeatures, hasReads, hasReference, hasUserSuppliedIntervals, initializeProgressMeter, makePostReadFilterTransformer, makePreReadFilterTransformer, makeReadFilter, makeSamReaderFactory, makeVariantAnnotations, requiresFeatures, requiresIntervals, requiresReference, transformTraversalIntervals, useVariantAnnotations
Methods inherited from class org.broadinstitute.hellbender.cmdline.CommandLineProgram
customCommandLineValidation, getCommandLine, getCommandLineParser, getDefaultHeaders, getMetricsFile, getSupportInformation, getToolkitName, getToolkitShortName, getToolStatusWarning, getUsage, getVersion, instanceMain, instanceMainPostParseArgs, isBetaFeature, isExperimentalFeature, parseArgs, printLibraryVersions, printSettings, printStartupMessage, runTool, setDefaultHeaders, warnOnToolStatus
-
Field Details
-
PAIRED_END_FILE_ARGUMENT_SHORT_NAME
-
PAIRED_END_FILE_ARGUMENT_LONG_NAME
-
SPLIT_READ_FILE_ARGUMENT_SHORT_NAME
-
SPLIT_READ_FILE_ARGUMENT_LONG_NAME
-
SITE_DEPTH_OUTPUT_ARGUMENT_SHORT_NAME
-
SITE_DEPTH_OUTPUT_ARGUMENT_LONG_NAME
-
SITE_DEPTH_INPUT_ARGUMENT_SHORT_NAME
-
SITE_DEPTH_INPUT_ARGUMENT_LONG_NAME
-
DEPTH_EVIDENCE_OUTPUT_FILE_ARGUMENT_SHORT_NAME
-
DEPTH_EVIDENCE_OUTPUT_FILE_ARGUMENT_LONG_NAME
-
DEPTH_EVIDENCE_SUMMARY_FILE_ARGUMENT_SHORT_NAME
-
DEPTH_EVIDENCE_SUMMARY_FILE_ARGUMENT_LONG_NAME
-
DEPTH_EVIDENCE_INTERVALS_INPUT_FILE_ARGUMENT_SHORT_NAME
-
DEPTH_EVIDENCE_INTERVALS_INPUT_FILE_ARGUMENT_LONG_NAME
-
MIN_DEPTH_EVIDENCE_MAPQ_ARGUMENT_NAME
-
MIN_SITE_DEPTH_MAPQ_ARGUMENT_NAME
-
MIN_SITE_DEPTH_BASEQ_ARGUMENT_NAME
-
SAMPLE_NAME_ARGUMENT_LONG_NAME
-
COMPRESSION_LEVEL_ARGUMENT_LONG_NAME
-
peFile
@Argument(shortName="PE", fullName="pe-file", doc="Output file for paired end evidence", optional=true)
public GATKPath peFile
-
srFile
@Argument(shortName="SR", fullName="sr-file", doc="Output file for split read evidence", optional=true)
public GATKPath srFile
-
siteDepthOutputFilename
@Argument(shortName="SD", fullName="sd-file", doc="Output file for site depth counts", optional=true)
public GATKPath siteDepthOutputFilename
-
siteDepthInputFilename
@Argument(shortName="F", fullName="site-depth-locs-vcf", doc="Input VCF of SNPs marking loci for site depth counts", optional=true)
public GATKPath siteDepthInputFilename
-
depthEvidenceOutputFilename
@Argument(shortName="RD", fullName="depth-evidence-file", doc="Output file for depth evidence", optional=true)
public GATKPath depthEvidenceOutputFilename
-
depthEvidenceSummaryFilename
@Argument(shortName="DS", fullName="depth-summary-file", doc="Output file for depth evidence summary statistics", optional=true)
public GATKPath depthEvidenceSummaryFilename
-
depthEvidenceInputFilename
@Argument(shortName="DI", fullName="depth-evidence-intervals", doc="Input feature file specifying intervals where depth evidence will be gathered", optional=true)
public GATKPath depthEvidenceInputFilename
-
minDepthEvidenceMapQ
@Argument(fullName="depth-evidence-min-mapq", doc="minimum mapping quality for read to be counted as depth evidence", optional=true)
public int minDepthEvidenceMapQ
-
minMapQ
@Argument(fullName="site-depth-min-mapq", doc="minimum mapping quality for read to be counted toward site depth", optional=true)
public int minMapQ
-
minQ
@Argument(fullName="site-depth-min-baseq", doc="minimum base call quality for SNP to be counted toward site depth", optional=true)
public int minQ
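To make the roles of minMapQ and minQ concrete, here is a minimal sketch of how per-locus A/C/G/T observations might be filtered and tallied. It is an illustration under stated assumptions, not the tool's implementation: the class SiteDepthSketch and its methods are hypothetical, the thresholds are assumed to be inclusive, and bases other than A, C, G, and T are assumed to be ignored.

```java
/** Hypothetical per-locus tally of base observations; not a GATK class. */
final class SiteDepthSketch {
    private final int minMapQ;   // stands in for site-depth-min-mapq
    private final int minBaseQ;  // stands in for site-depth-min-baseq
    /** Observation counts in A, C, G, T order. */
    final int[] counts = new int[4];

    SiteDepthSketch(int minMapQ, int minBaseQ) {
        this.minMapQ = minMapQ;
        this.minBaseQ = minBaseQ;
    }

    /** Tallies one base covering the locus; returns true if it was counted. */
    boolean observe(char base, int mapQ, int baseQ) {
        // Assumption: a value equal to the threshold passes.
        if (mapQ < minMapQ || baseQ < minBaseQ) {
            return false;
        }
        int idx = "ACGT".indexOf(Character.toUpperCase(base));
        if (idx < 0) {
            return false; // assumption: N and other symbols are skipped
        }
        counts[idx]++;
        return true;
    }

    /** Formats a row in the documented column order: contig, position, sample, A, C, G, T. */
    String toRow(String contig, int position, String sample) {
        return String.join("\t", contig, String.valueOf(position), sample,
                String.valueOf(counts[0]), String.valueOf(counts[1]),
                String.valueOf(counts[2]), String.valueOf(counts[3]));
    }
}
```

Keeping the two quality checks in one place shows that a read can fail on either mapping quality or base quality independently.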
-
-
Constructor Details
-
CollectSVEvidence
public CollectSVEvidence()
-
-
Method Details
-
requiresReads
public boolean requiresReads()
Description copied from class: GATKTool
Does this tool require reads? Traversal types and/or tools that do should override to return true.
- Overrides:
requiresReads in class ReadWalker
- Returns:
- true if this tool requires reads, otherwise false
-
onTraversalStart
public void onTraversalStart()
Description copied from class: GATKTool
Operations performed just prior to the start of traversal. Should be overridden by tool authors who need to process arguments local to their tool or perform other kinds of local initialization. Default implementation does nothing.
- Overrides:
onTraversalStart in class GATKTool
-
getDefaultReadFilters
public List<ReadFilter> getDefaultReadFilters()
Description copied from class: ReadWalker
Returns the default list of CommandLineReadFilters that are used for this tool. The filters returned by this method are subject to selective enabling/disabling by the user via the command line. The default implementation uses the WellformedReadFilter filter with all default options. Subclasses can override to provide alternative filters. Note: this method is called before command line parsing begins, and thus before a SAMFileHeader is available through getHeaderForReads().
- Overrides:
getDefaultReadFilters in class ReadWalker
- Returns:
- List of individual filters to be applied for this tool.
-
apply
public void apply(GATKRead read, ReferenceContext referenceContext, FeatureContext featureContext)
Description copied from class: ReadWalker
Process an individual read (with optional contextual information). Must be implemented by tool authors. In general, tool authors should simply stream their output from apply(), and maintain as little internal state as possible. TODO: Determine whether and to what degree the GATK engine should provide a reduce operation to complement this operation. At a minimum, we should make apply() return a value to discourage statefulness in walkers, but how this value should be handled is TBD.
- Specified by:
apply in class ReadWalker
- Parameters:
read - current read
referenceContext - Reference bases spanning the current read. Will be an empty, but non-null, context object if there is no backing source of reference data (in which case all queries on it will return an empty array/iterator). Can request extra bases of context around the current read's interval by invoking ReferenceContext.setWindow(int, int) on this object before calling ReferenceContext.getBases()
featureContext - Features spanning the current read. Will be an empty, but non-null, context object if there is no backing source of Feature data (in which case all queries on it will return an empty List).
-
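getReportableDiscordantReadPair
public org.broadinstitute.hellbender.tools.walkers.sv.CollectSVEvidence.DiscordantRead getReportableDiscordantReadPair(GATKRead read, Set<String> observedDiscordantNamesAtThisLocus, htsjdk.samtools.SAMSequenceDictionary samSequenceDictionary)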
-
countSplitRead
public void countSplitRead(GATKRead read, PriorityQueue<org.broadinstitute.hellbender.tools.walkers.sv.CollectSVEvidence.SplitPos> splitCounts, FeatureSink<SplitReadEvidence> srWriter)
Adds split read information about the current read to the counts in splitCounts. Flushes split read counts to srWriter if necessary.
-
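The count-then-flush pattern described for countSplitRead can be sketched with a plain priority queue. This is a simplified model under stated assumptions, not the tool's code: the class SplitCountSketch is hypothetical, reads are assumed to arrive in coordinate order, clip direction is ignored, and a list of strings stands in for the srWriter sink. Under those assumptions, any queued clip position below the start of the newest read can no longer gain observations, so it is safe to emit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical model of buffering clip positions and flushing finished ones. */
final class SplitCountSketch {
    /** Pending clip positions, smallest first. */
    private final PriorityQueue<Integer> pending = new PriorityQueue<>();
    /** Emitted "position<TAB>count" rows; stands in for a writer. */
    final List<String> emitted = new ArrayList<>();

    /** Records one clip position seen on a read that starts at readStart. */
    void add(int readStart, int clipPosition) {
        flushBefore(readStart);
        pending.add(clipPosition);
    }

    /** Emits every pending position strictly below limit, merging duplicates into a count. */
    void flushBefore(int limit) {
        while (!pending.isEmpty() && pending.peek() < limit) {
            int pos = pending.poll();
            int count = 1;
            while (!pending.isEmpty() && pending.peek() == pos) {
                pending.poll();
                count++;
            }
            emitted.add(pos + "\t" + count);
        }
    }

    /** Flushes whatever is still pending once the input is exhausted. */
    void finish() {
        flushBefore(Integer.MAX_VALUE);
    }
}
```

Buffering this way keeps memory bounded by the number of clip positions near the current read rather than by the size of the whole input.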
onTraversalSuccess
public Object onTraversalSuccess()
Description copied from class: GATKTool
Operations performed immediately after a successful traversal (i.e., when no uncaught exceptions were thrown during the traversal). Should be overridden by tool authors who need to close local resources, etc., after traversal. Also allows tools to return a value representing the traversal result, which is printed by the engine. Default implementation does nothing and returns null.
- Overrides:
onTraversalSuccess in class GATKTool
- Returns:
- Object representing the traversal result, or null if a tool does not return a value
-
closeTool
public void closeTool()
Description copied from class: GATKTool
This method is called by the GATK framework at the end of the GATKTool.doWork() template method. It is called regardless of whether GATKTool.traverse() has succeeded or not. It is called after GATKTool.onTraversalSuccess() has completed (successfully or not) but before the GATKTool.doWork() method returns. In other words, on successful runs both GATKTool.onTraversalSuccess() and GATKTool.closeTool() will be called (in this order), while on failed runs (when GATKTool.traverse() causes an exception), only GATKTool.closeTool() will be called. The default implementation does nothing. Subclasses should override this method to close any resources that must be closed regardless of the success of traversal.
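The call ordering described above follows the usual template-method pattern with a finally block. The following is a minimal, self-contained model of that ordering. It is not GATK source: the class LifecycleSketch and its calls list are hypothetical, and only the method names mirror the ones documented here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the doWork() template method's call order. */
abstract class LifecycleSketch {
    /** Records which hooks ran, in order. */
    final List<String> calls = new ArrayList<>();

    void onTraversalStart() {
        calls.add("onTraversalStart");
    }

    /** The traversal itself; may throw. */
    abstract void traverse();

    Object onTraversalSuccess() {
        calls.add("onTraversalSuccess");
        return null;
    }

    void closeTool() {
        calls.add("closeTool");
    }

    /** Runs the hooks; closeTool() runs whether or not traversal succeeded. */
    final Object doWork() {
        try {
            onTraversalStart();
            traverse();
            return onTraversalSuccess();
        } finally {
            closeTool();
        }
    }
}
```

In this model a subclass whose traverse() throws sees only onTraversalStart and closeTool recorded, which is why resources that must always be released belong in closeTool() rather than in onTraversalSuccess().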
-