Class CreateSomaticPanelOfNormals
- All Implemented Interfaces:
org.broadinstitute.barclay.argparser.CommandLinePluginProvider
The tool takes multiple normal sample callsets produced by Mutect2's tumor-only mode and collates sites present in multiple samples
(two by default, set by the --min-sample-count argument) into a sites-only VCF. The resulting panel of normals (PoN) captures common artifacts. Mutect2 then
uses the PoN to filter variants at the site level.
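The collation rule described above can be sketched in a few lines of Java. This is only an illustration of the counting idea, not the tool's implementation: the class name PonSiteCollationSketch, the collate method, and the use of plain "contig:position" strings in place of GATK variant types are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Illustrative sketch only: sites are plain "contig:position" strings,
// not htsjdk/GATK types, and all names here are hypothetical.
class PonSiteCollationSketch {

    // Returns the sites that appear in at least minSampleCount of the per-sample site sets.
    static List<String> collate(List<Set<String>> sitesPerSample, int minSampleCount) {
        // Count, for every site, how many normal samples contain it.
        Map<String, Integer> samplesPerSite = new TreeMap<>();
        for (Set<String> sites : sitesPerSample) {
            for (String site : sites) {
                samplesPerSite.merge(site, 1, Integer::sum);
            }
        }
        // Keep only sites that reach the sample-count threshold.
        List<String> panel = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : samplesPerSite.entrySet()) {
            if (entry.getValue() >= minSampleCount) {
                panel.add(entry.getKey());
            }
        }
        return panel;
    }

    public static void main(String[] args) {
        List<Set<String>> normals = List.of(
                Set.of("chr1:100", "chr1:250"),
                Set.of("chr1:100", "chr2:40"),
                Set.of("chr1:100", "chr2:40", "chr3:7"));
        // With the default threshold of two samples, only sites shared by two or more normals remain.
        System.out.println(collate(normals, 2));
    }
}
```

Raising the threshold makes the panel more conservative: in this toy input, a threshold of 3 would leave only the site shared by all three normals.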
The --max-germline-probability argument sets the threshold for possible germline variants to be included in the PoN. By default this
is set to 0.5, so that likely germline events are excluded. This is usually the correct behavior as germline variants are best handled
by probabilistic modeling via Mutect2's --germline-resource argument. A germline resource, such as gnomAD in the case of humans, is a much
more refined tool for germline filtering than any PoN could be.
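As a rough illustration of how the two thresholds might interact, the Java sketch below skips genotypes whose germline probability exceeds the maximum and then checks whether enough samples remain. It assumes the per-sample germline probabilities are already known and that skipped genotypes do not count toward the sample threshold. The class and method names are hypothetical; only the default values (0.5 and 2) come from the description above.

```java
import java.util.Arrays;

// Illustrative sketch only: names are hypothetical and the probabilities are given as inputs;
// how the tool actually derives a germline probability is not shown here.
class GermlineThresholdSketch {
    static final double DEFAULT_MAX_GERMLINE_PROBABILITY = 0.5;
    static final int DEFAULT_MIN_SAMPLE_COUNT = 2;

    // A genotype is skipped when its germline probability is greater than the threshold.
    static boolean isSkipped(double germlineProbability, double maxGermlineProbability) {
        return germlineProbability > maxGermlineProbability;
    }

    // Assumption: a site enters the panel when enough non-skipped genotypes remain.
    static boolean siteEntersPanel(double[] germlineProbabilities,
                                   double maxGermlineProbability,
                                   int minSampleCount) {
        long counted = Arrays.stream(germlineProbabilities)
                .filter(p -> !isSkipped(p, maxGermlineProbability))
                .count();
        return counted >= minSampleCount;
    }

    public static void main(String[] args) {
        // Two of three samples look like artifacts (low germline probability).
        double[] likelyArtifact = {0.01, 0.02, 0.99};
        // Two of three samples look germline.
        double[] likelyGermline = {0.90, 0.95, 0.10};
        System.out.println(siteEntersPanel(likelyArtifact,
                DEFAULT_MAX_GERMLINE_PROBABILITY, DEFAULT_MIN_SAMPLE_COUNT));
        System.out.println(siteEntersPanel(likelyGermline,
                DEFAULT_MAX_GERMLINE_PROBABILITY, DEFAULT_MIN_SAMPLE_COUNT));
    }
}
```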
This tool is featured in the Somatic Short Mutation calling Best Practice Workflow. See Tutorial#11136 for a step-by-step description of the workflow and Article#11127 for an overview of what traditional somatic calling entails. For the latest pipeline scripts, see the Mutect2 WDL scripts directory.
Example workflow
Step 1. Run Mutect2 in tumor-only mode for each normal sample.
Note that as of May 2019, -max-mnp-distance must be set to zero to avoid a bug in GenomicsDBImport.
gatk Mutect2 -R reference.fasta -I normal1.bam -max-mnp-distance 0 -O normal1.vcf.gz
Step 2. Create a GenomicsDB from the normal Mutect2 calls.
gatk GenomicsDBImport -R reference.fasta -L intervals.interval_list \
    --genomicsdb-workspace-path pon_db \
    -V normal1.vcf.gz \
    -V normal2.vcf.gz \
    -V normal3.vcf.gz
Step 3. Combine the normal calls using CreateSomaticPanelOfNormals.
gatk CreateSomaticPanelOfNormals -R reference.fasta -V gendb://pon_db -O pon.vcf.gz
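When the number of normals is large, the three commands above are usually generated rather than typed. The following Java sketch assembles the same argument lists for an arbitrary set of normals. It uses only the flags shown in the steps above; the class name, method names, and file names are hypothetical, and the sketch only builds and prints the commands instead of running them.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: builds the argument lists for the three workflow steps.
// All class, method, and file names are hypothetical; the flags are the ones shown in the steps above.
class PonWorkflowCommandsSketch {

    // Step 1: Mutect2 in tumor-only mode for one normal sample.
    static List<String> mutect2TumorOnly(String reference, String normalBam, String outVcf) {
        return List.of("gatk", "Mutect2", "-R", reference, "-I", normalBam,
                "-max-mnp-distance", "0", "-O", outVcf);
    }

    // Step 2: import all per-normal VCFs into one GenomicsDB workspace.
    static List<String> genomicsDbImport(String reference, String intervals,
                                         String workspace, List<String> normalVcfs) {
        List<String> cmd = new ArrayList<>(List.of("gatk", "GenomicsDBImport",
                "-R", reference, "-L", intervals,
                "--genomicsdb-workspace-path", workspace));
        for (String vcf : normalVcfs) {
            cmd.add("-V");
            cmd.add(vcf);
        }
        return cmd;
    }

    // Step 3: combine the normal calls into the panel of normals.
    static List<String> createPanel(String reference, String workspace, String outVcf) {
        return List.of("gatk", "CreateSomaticPanelOfNormals", "-R", reference,
                "-V", "gendb://" + workspace, "-O", outVcf);
    }

    public static void main(String[] args) {
        List<String> vcfs = new ArrayList<>();
        for (int i = 1; i <= 3; i++) {
            String vcf = "normal" + i + ".vcf.gz";
            vcfs.add(vcf);
            System.out.println(String.join(" ",
                    mutect2TumorOnly("reference.fasta", "normal" + i + ".bam", vcf)));
        }
        System.out.println(String.join(" ",
                genomicsDbImport("reference.fasta", "intervals.interval_list", "pon_db", vcfs)));
        System.out.println(String.join(" ",
                createPanel("reference.fasta", "pon_db", "pon.vcf.gz")));
    }
}
```

Each returned list could be handed to java.lang.ProcessBuilder to launch the step, provided gatk is on the PATH.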
Nested Class Summary
Nested classes/interfaces inherited from class org.broadinstitute.hellbender.cmdline.CommandLineProgram
CommandLineProgram.AutoCloseableNoCheckedExceptions
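Field Summary
Fields (Modifier and Type, Field, Description)
static final String BETA_SHAPE_INFO_FIELD
static final double DEFAULT_MAX_GERMLINE_PROBABILITY
static final int DEFAULT_MIN_SAMPLE_COUNT
static final String FRACTION_INFO_FIELD
FeatureInput<htsjdk.variant.variantcontext.VariantContext> germlineResource
A resource, such as gnomAD, containing population allele frequencies of common and rare variants.
static final String MAX_GERMLINE_PROBABILITY_LONG_NAME
double maxGermlineProbability
static final String MIN_SAMPLE_COUNT_LONG_NAME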
Fields inherited from class org.broadinstitute.hellbender.engine.VariantWalker
drivingVariantFile
Fields inherited from class org.broadinstitute.hellbender.engine.VariantWalkerBase
DEFAULT_DRIVING_VARIANTS_LOOKAHEAD_BASES, genomicsDBOptions
Fields inherited from class org.broadinstitute.hellbender.engine.GATKTool
addOutputSAMProgramRecord, addOutputVCFCommandLine, cloudIndexPrefetchBuffer, cloudPrefetchBuffer, createOutputBamIndex, createOutputBamMD5, createOutputVariantIndex, createOutputVariantMD5, disableBamIndexCaching, features, intervalArgumentCollection, lenientVCFProcessing, outputSitesOnlyVCFs, progressMeter, readArguments, referenceArguments, SECONDS_BETWEEN_PROGRESS_UPDATES_NAME, seqValidationArguments
Fields inherited from class org.broadinstitute.hellbender.cmdline.CommandLineProgram
GATK_CONFIG_FILE, logger, NIO_MAX_REOPENS, NIO_PROJECT_FOR_REQUESTER_PAYS, QUIET, specialArgumentsCollection, tmpDir, useJdkDeflater, useJdkInflater, VERBOSITY
Constructor Summary
Constructors
CreateSomaticPanelOfNormals()
Method Summary
Methods (Modifier and Type, Method, Description)
void apply(htsjdk.variant.variantcontext.VariantContext vc, ReadsContext rc, ReferenceContext ref, FeatureContext fc)
Process an individual variant.
void closeTool()
This method is called by the GATK framework at the end of the GATKTool.doWork() template method.
protected GenomicsDBOptions getGenomicsDBOptions()
Get the GenomicsDB read settings for the current tool.
void onTraversalStart()
Operations performed just prior to the start of traversal.
Object onTraversalSuccess()
Operations performed immediately after a successful traversal (i.e. when no uncaught exceptions were thrown during the traversal).
Methods inherited from class org.broadinstitute.hellbender.engine.VariantWalker
getDrivingVariantsFeatureInput, getHeaderForVariants, getSequenceDictionaryForDrivingVariants, getSpliteratorForDrivingVariants, initializeDrivingVariants, onShutdown, onStartup, traverse
Methods inherited from class org.broadinstitute.hellbender.engine.VariantWalkerBase
getBestAvailableSequenceDictionary, getDrivingVariantCacheLookAheadBases, getProgressMeterRecordLabel, getTransformedVariantStream, getTransformedVariantStream, makePostVariantFilterTransformer, makePreVariantFilterTransformer, makeVariantFilter, requiresFeatures
Methods inherited from class org.broadinstitute.hellbender.engine.WalkerBase
directlyAccessEngineFeatureManager, directlyAccessEngineReadsDataSource, directlyAccessEngineReferenceDataSource
Methods inherited from class org.broadinstitute.hellbender.engine.GATKTool
addFeatureInputsAfterInitialization, bamIndexCachingShouldBeEnabled, createSAMWriter, createVCFWriter, createVCFWriter, createVCFWriter, disableProgressMeter, doWork, getDefaultCloudIndexPrefetchBufferSize, getDefaultCloudPrefetchBufferSize, getDefaultReadFilters, getDefaultToolVCFHeaderLines, getDefaultVariantAnnotationGroups, getDefaultVariantAnnotations, getHeaderForFeatures, getHeaderForReads, getHeaderForSAMWriter, getMasterSequenceDictionary, getPluginDescriptors, getReferenceDictionary, getSequenceDictionaryValidationArgumentCollection, getToolName, getTransformedReadStream, getTraversalIntervals, getUserSuppliedIntervals, hasFeatures, hasReads, hasReference, hasUserSuppliedIntervals, initializeProgressMeter, makePostReadFilterTransformer, makePreReadFilterTransformer, makeReadFilter, makeSamReaderFactory, makeVariantAnnotations, requiresIntervals, requiresReads, requiresReference, transformTraversalIntervals, useVariantAnnotations
Methods inherited from class org.broadinstitute.hellbender.cmdline.CommandLineProgram
customCommandLineValidation, getCommandLine, getCommandLineParser, getDefaultHeaders, getMetricsFile, getSupportInformation, getToolkitName, getToolkitShortName, getToolStatusWarning, getUsage, getVersion, instanceMain, instanceMainPostParseArgs, isBetaFeature, isExperimentalFeature, parseArgs, printLibraryVersions, printSettings, printStartupMessage, runTool, setDefaultHeaders, warnOnToolStatus
Field Details
MIN_SAMPLE_COUNT_LONG_NAME
public static final String MIN_SAMPLE_COUNT_LONG_NAME
DEFAULT_MIN_SAMPLE_COUNT
public static final int DEFAULT_MIN_SAMPLE_COUNT
MAX_GERMLINE_PROBABILITY_LONG_NAME
public static final String MAX_GERMLINE_PROBABILITY_LONG_NAME
DEFAULT_MAX_GERMLINE_PROBABILITY
public static final double DEFAULT_MAX_GERMLINE_PROBABILITY
FRACTION_INFO_FIELD
public static final String FRACTION_INFO_FIELD
BETA_SHAPE_INFO_FIELD
public static final String BETA_SHAPE_INFO_FIELD
germlineResource
@Argument(fullName="germline-resource", doc="Population vcf of germline sequencing containing allele fractions.", optional=true)
public FeatureInput<htsjdk.variant.variantcontext.VariantContext> germlineResource
A resource, such as gnomAD, containing population allele frequencies of common and rare variants. We use this to remove germline variants from the panel of normals, keeping only technical artifacts.
maxGermlineProbability
@Argument(fullName="max-germline-probability", doc="Skip genotypes with germline probability greater than this value", optional=true)
public double maxGermlineProbability
Constructor Details
CreateSomaticPanelOfNormals
public CreateSomaticPanelOfNormals()
Method Details
getGenomicsDBOptions
protected GenomicsDBOptions getGenomicsDBOptions()
Description copied from class: GATKTool
Get the GenomicsDB read settings for the current tool.
Overrides:
getGenomicsDBOptions in class VariantWalkerBase
Returns:
By default, just return the vanilla options
onTraversalStart
public void onTraversalStart()
Description copied from class: GATKTool
Operations performed just prior to the start of traversal. Should be overridden by tool authors who need to process arguments local to their tool or perform other kinds of local initialization. Default implementation does nothing.
Overrides:
onTraversalStart in class GATKTool
apply
public void apply(htsjdk.variant.variantcontext.VariantContext vc, ReadsContext rc, ReferenceContext ref, FeatureContext fc)
Description copied from class: VariantWalker
Process an individual variant. Must be implemented by tool authors. In general, tool authors should simply stream their output from apply(), and maintain as little internal state as possible.
Specified by:
apply in class VariantWalker
Parameters:
vc - Current variant being processed.
rc - Reads overlapping the current variant. Will be an empty, but non-null, context object if there is no backing source of reads data (in which case all queries on it will return an empty array/iterator).
ref - Reference bases spanning the current variant. Will be an empty, but non-null, context object if there is no backing source of reference data (in which case all queries on it will return an empty array/iterator). Can request extra bases of context around the current variant's interval by invoking ReferenceContext.setWindow(int, int) on this object before calling ReferenceContext.getBases().
fc - Features spanning the current variant. Will be an empty, but non-null, context object if there is no backing source of Feature data (in which case all queries on it will return an empty List).
onTraversalSuccess
public Object onTraversalSuccess()
Description copied from class: GATKTool
Operations performed immediately after a successful traversal (i.e. when no uncaught exceptions were thrown during the traversal). Should be overridden by tool authors who need to close local resources, etc., after traversal. Also allows tools to return a value representing the traversal result, which is printed by the engine. Default implementation does nothing and returns null.
Overrides:
onTraversalSuccess in class GATKTool
Returns:
Object representing the traversal result, or null if a tool does not return a value
closeTool
public void closeTool()
Description copied from class: GATKTool
This method is called by the GATK framework at the end of the GATKTool.doWork() template method. It is called regardless of whether GATKTool.traverse() has succeeded or not. It is called after GATKTool.onTraversalSuccess() has completed (successfully or not) but before the GATKTool.doWork() method returns. In other words, on successful runs both GATKTool.onTraversalSuccess() and GATKTool.closeTool() will be called (in this order), while on failed runs (when GATKTool.traverse() causes an exception), only GATKTool.closeTool() will be called. The default implementation does nothing. Subclasses should override this method to close any resources that must be closed regardless of the success of traversal.
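The call ordering described for closeTool() can be mimicked with a small template-method sketch. The code below is a stand-alone approximation, not the actual GATKTool source: SketchTool, its call log, and the run helper are hypothetical, and the real doWork() performs additional steps that are omitted here.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: SketchTool is a stand-in, NOT the real GATKTool class.
class ToolLifecycleSketch {

    abstract static class SketchTool {
        // Records the order in which lifecycle hooks run.
        final List<String> calls = new ArrayList<>();

        void onTraversalStart() { calls.add("onTraversalStart"); }

        abstract void traverse();

        Object onTraversalSuccess() { calls.add("onTraversalSuccess"); return null; }

        void closeTool() { calls.add("closeTool"); }

        // Template method: closeTool() sits in a finally block, so it runs whether or not
        // traverse() throws; onTraversalSuccess() runs only if traverse() completes normally.
        final Object doWork() {
            try {
                onTraversalStart();
                traverse();
                return onTraversalSuccess();
            } finally {
                closeTool();
            }
        }
    }

    // Runs a toy tool and returns the recorded hook order.
    static List<String> run(boolean failDuringTraversal) {
        SketchTool tool = new SketchTool() {
            @Override
            void traverse() {
                calls.add("traverse");
                if (failDuringTraversal) {
                    throw new IllegalStateException("simulated traversal failure");
                }
            }
        };
        try {
            tool.doWork();
        } catch (IllegalStateException e) {
            // Swallowed here so the hook order can be inspected after a failed run.
        }
        return tool.calls;
    }

    public static void main(String[] args) {
        System.out.println("success: " + run(false));
        System.out.println("failure: " + run(true));
    }
}
```

The design point is that cleanup which must always happen belongs in closeTool(), while anything that only makes sense after a complete traversal, such as returning a result, belongs in onTraversalSuccess().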