@DocumentedFeature @BetaFeature public class CreateSomaticPanelOfNormals extends VariantWalker
The tool takes multiple normal sample callsets produced by Mutect2's tumor-only mode and collates sites present in multiple samples
(two by default, set by the --min-sample-count argument) into a sites-only VCF, the panel of normals (PoN). The PoN captures common artifacts. Mutect2 then
uses the PoN to filter variants at the site level.
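The collation rule described above can be illustrated with a minimal sketch. This is not the tool's implementation: the class `PonSiteCountSketch`, the method `sitesInAtLeast`, and the use of plain "contig:position" strings as site keys are all hypothetical stand-ins chosen for illustration. It only shows the idea of keeping a site when it appears in at least `minSampleCount` normal callsets.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical illustration only; the real tool operates on VariantContext records.
class PonSiteCountSketch {

    // Assumed default, mirroring the "two by default" statement in the text.
    static final int DEFAULT_MIN_SAMPLE_COUNT = 2;

    // Each element of perSampleSites holds the site keys ("contig:position")
    // called in one normal sample. Returns the sites seen in at least
    // minSampleCount samples, sorted lexicographically by key.
    static SortedSet<String> sitesInAtLeast(List<Set<String>> perSampleSites, int minSampleCount) {
        Map<String, Integer> sampleCounts = new HashMap<>();
        for (Set<String> sites : perSampleSites) {
            for (String site : sites) {
                sampleCounts.merge(site, 1, Integer::sum);
            }
        }
        SortedSet<String> kept = new TreeSet<>();
        for (Map.Entry<String, Integer> entry : sampleCounts.entrySet()) {
            if (entry.getValue() >= minSampleCount) {
                kept.add(entry.getKey());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Set<String>> normals = List.of(
                Set.of("chr1:100", "chr2:50"),
                Set.of("chr1:100", "chr1:200", "chr2:50"),
                Set.of("chr2:50"));
        // chr1:200 appears in only one sample, so it is dropped.
        System.out.println(sitesInAtLeast(normals, DEFAULT_MIN_SAMPLE_COUNT));
    }
}
```

Raising `minSampleCount` in this sketch makes the resulting panel stricter, since a site must recur in more normals before it is kept.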
The --max-germline-probability argument sets the threshold for possible germline variants to be included in the PoN. By default this
is set to 0.5, so that likely germline events are excluded. This is usually the correct behavior as germline variants are best handled
by probabilistic modeling via Mutect2's --germline-resource argument. A germline resource, such as gnomAD in the case of humans, is a much
more refined tool for germline filtering than any PoN could be.
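As a rough sketch of how the threshold might interact with the sample count, the example below assumes each genotype at a site already carries a germline probability (how that probability is computed is not described here and is not modeled). The class `GermlineThresholdSketch` and its methods are hypothetical names, not part of GATK. Genotypes whose probability exceeds the threshold are skipped, and the site is kept only if enough genotypes remain.

```java
import java.util.List;

// Hypothetical illustration only; germline probabilities are taken as given inputs.
class GermlineThresholdSketch {

    // Default threshold stated in the text above.
    static final double DEFAULT_MAX_GERMLINE_PROBABILITY = 0.5;

    // Counts genotypes that are NOT skipped, i.e. whose germline probability
    // is at or below the threshold.
    static long countRetained(List<Double> germlineProbabilities, double maxGermlineProbability) {
        return germlineProbabilities.stream()
                .filter(p -> p <= maxGermlineProbability)
                .count();
    }

    // A site is kept when enough non-germline-like genotypes support it.
    static boolean keepSite(List<Double> germlineProbabilities,
                            double maxGermlineProbability,
                            int minSampleCount) {
        return countRetained(germlineProbabilities, maxGermlineProbability) >= minSampleCount;
    }

    public static void main(String[] args) {
        // Two of three genotypes look like artifacts rather than germline variants.
        System.out.println(keepSite(List.of(0.9, 0.1, 0.2), DEFAULT_MAX_GERMLINE_PROBABILITY, 2));
        // Only one genotype survives the threshold, so the site is not kept.
        System.out.println(keepSite(List.of(0.9, 0.8, 0.1), DEFAULT_MAX_GERMLINE_PROBABILITY, 2));
    }
}
```

Under these assumptions, setting the threshold to 1.0 would retain every genotype, so likely germline sites would enter the panel.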
This tool is featured in the Somatic Short Mutation calling Best Practice Workflow. See Tutorial#11136 for a step-by-step description of the workflow and Article#11127 for an overview of what traditional somatic calling entails. For the latest pipeline scripts, see the Mutect2 WDL scripts directory.
Note that as of May 2019, -max-mnp-distance must be set to zero when calling the normal samples with Mutect2, to avoid a bug in GenomicsDBImport.
gatk Mutect2 -R reference.fasta -I normal1.bam -max-mnp-distance 0 -O normal1.vcf.gz
gatk GenomicsDBImport -R reference.fasta -L intervals.interval_list \
    --genomicsdb-workspace-path pon_db \
    -V normal1.vcf.gz \
    -V normal2.vcf.gz \
    -V normal3.vcf.gz
gatk CreateSomaticPanelOfNormals -R reference.fasta -V gendb://pon_db -O pon.vcf.gz
Modifier and Type | Field and Description |
---|---|
static java.lang.String | BETA_SHAPE_INFO_FIELD |
static double | DEFAULT_MAX_GERMLINE_PROBABILITY |
static int | DEFAULT_MIN_SAMPLE_COUNT |
static java.lang.String | FRACTION_INFO_FIELD |
FeatureInput<htsjdk.variant.variantcontext.VariantContext> | germlineResource: A resource, such as gnomAD, containing population allele frequencies of common and rare variants. |
static java.lang.String | MAX_GERMLINE_PROBABILITY_LONG_NAME |
double | maxGermlineProbability |
static java.lang.String | MIN_SAMPLE_COUNT_LONG_NAME |
Fields inherited from class VariantWalker: drivingVariantFile
Fields inherited from class VariantWalkerBase: DEFAULT_DRIVING_VARIANTS_LOOKAHEAD_BASES, genomicsDBOptions
Fields inherited from class GATKTool: addOutputSAMProgramRecord, addOutputVCFCommandLine, cloudIndexPrefetchBuffer, cloudPrefetchBuffer, createOutputBamIndex, createOutputBamMD5, createOutputVariantIndex, createOutputVariantMD5, disableBamIndexCaching, features, intervalArgumentCollection, lenientVCFProcessing, outputSitesOnlyVCFs, progressMeter, readArguments, referenceArguments, SECONDS_BETWEEN_PROGRESS_UPDATES_NAME, seqValidationArguments
Fields inherited from class CommandLineProgram: GATK_CONFIG_FILE, logger, NIO_MAX_REOPENS, NIO_PROJECT_FOR_REQUESTER_PAYS, QUIET, specialArgumentsCollection, tmpDir, useJdkDeflater, useJdkInflater, VERBOSITY
Constructor and Description |
---|
CreateSomaticPanelOfNormals() |
Modifier and Type | Method and Description |
---|---|
void | apply(htsjdk.variant.variantcontext.VariantContext vc, ReadsContext rc, ReferenceContext ref, FeatureContext fc): Process an individual variant. |
void | closeTool(): This method is called by the GATK framework at the end of the GATKTool.doWork() template method. |
void | onTraversalStart(): Operations performed just prior to the start of traversal. |
java.lang.Object | onTraversalSuccess(): Operations performed immediately after a successful traversal (i.e. when no uncaught exceptions were thrown during the traversal). |
Methods inherited from class VariantWalker: getDrivingVariantsFeatureInput, getHeaderForVariants, getSequenceDictionaryForDrivingVariants, getSpliteratorForDrivingVariants, initializeDrivingVariants, onShutdown, onStartup, traverse
Methods inherited from class VariantWalkerBase: getBestAvailableSequenceDictionary, getDrivingVariantCacheLookAheadBases, getGenomicsDBOptions, getProgressMeterRecordLabel, getTransformedVariantStream, getTransformedVariantStream, makePostVariantFilterTransformer, makePreVariantFilterTransformer, makeVariantFilter, requiresFeatures
Methods inherited from class WalkerBase: directlyAccessEngineFeatureManager, directlyAccessEngineReadsDataSource, directlyAccessEngineReferenceDataSource
Methods inherited from class GATKTool: addFeatureInputsAfterInitialization, bamIndexCachingShouldBeEnabled, createSAMWriter, createVCFWriter, createVCFWriter, createVCFWriter, disableProgressMeter, doWork, getDefaultCloudIndexPrefetchBufferSize, getDefaultCloudPrefetchBufferSize, getDefaultReadFilters, getDefaultToolVCFHeaderLines, getDefaultVariantAnnotationGroups, getDefaultVariantAnnotations, getHeaderForFeatures, getHeaderForReads, getHeaderForSAMWriter, getMasterSequenceDictionary, getPluginDescriptors, getReferenceDictionary, getSequenceDictionaryValidationArgumentCollection, getToolName, getTransformedReadStream, getTraversalIntervals, hasFeatures, hasReads, hasReference, hasUserSuppliedIntervals, initializeProgressMeter, makePostReadFilterTransformer, makePreReadFilterTransformer, makeReadFilter, makeSamReaderFactory, makeVariantAnnotations, requiresIntervals, requiresReads, requiresReference, transformTraversalIntervals, useVariantAnnotations
Methods inherited from class CommandLineProgram: customCommandLineValidation, getCommandLine, getCommandLineParser, getDefaultHeaders, getMetricsFile, getSupportInformation, getToolkitName, getToolkitShortName, getToolStatusWarning, getUsage, getVersion, instanceMain, instanceMainPostParseArgs, isBetaFeature, isExperimentalFeature, parseArgs, printLibraryVersions, printSettings, printStartupMessage, runTool, setDefaultHeaders, warnOnToolStatus
public static final java.lang.String MIN_SAMPLE_COUNT_LONG_NAME
public static final int DEFAULT_MIN_SAMPLE_COUNT
public static final java.lang.String MAX_GERMLINE_PROBABILITY_LONG_NAME
public static final double DEFAULT_MAX_GERMLINE_PROBABILITY
public static final java.lang.String FRACTION_INFO_FIELD
public static final java.lang.String BETA_SHAPE_INFO_FIELD
@Argument(fullName="germline-resource", doc="Population vcf of germline sequencing containing allele fractions.", optional=true) public FeatureInput<htsjdk.variant.variantcontext.VariantContext> germlineResource
@Argument(fullName="max-germline-probability", doc="Skip genotypes with germline probability greater than this value", optional=true) public double maxGermlineProbability
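public void onTraversalStart()
Description copied from class: GATKTool
Operations performed just prior to the start of traversal.
Overrides: onTraversalStart in class GATKTool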
public void apply(htsjdk.variant.variantcontext.VariantContext vc, ReadsContext rc, ReferenceContext ref, FeatureContext fc)
Description copied from class: VariantWalker
Process an individual variant.
Specified by: apply in class VariantWalker
Parameters:
vc - Current variant being processed.
rc - Reads overlapping the current variant. Will be an empty, but non-null, context object if there is no backing source of reads data (in which case all queries on it will return an empty array/iterator).
ref - Reference bases spanning the current variant. Will be an empty, but non-null, context object if there is no backing source of reference data (in which case all queries on it will return an empty array/iterator). Can request extra bases of context around the current variant's interval by invoking ReferenceContext.setWindow(int, int) on this object before calling ReferenceContext.getBases().
fc - Features spanning the current variant. Will be an empty, but non-null, context object if there is no backing source of Feature data (in which case all queries on it will return an empty List).

public java.lang.Object onTraversalSuccess()
Description copied from class: GATKTool
Operations performed immediately after a successful traversal (i.e. when no uncaught exceptions were thrown during the traversal).
Overrides: onTraversalSuccess in class GATKTool
public void closeTool()
Description copied from class: GATKTool
This method is called by the GATK framework at the end of the GATKTool.doWork() template method. It is called regardless of whether GATKTool.traverse() has succeeded or not. It is called after GATKTool.onTraversalSuccess() has completed (successfully or not) but before the GATKTool.doWork() method returns.
In other words, on successful runs both GATKTool.onTraversalSuccess() and GATKTool.closeTool() will be called (in this order), while on failed runs (when GATKTool.traverse() causes an exception), only GATKTool.closeTool() will be called.
The default implementation does nothing. Subclasses should override this method to close any resources that must be closed regardless of the success of traversal.
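The call ordering described for closeTool() can be sketched with a small stand-alone template method. This is an illustrative approximation, not the GATK source: `LifecycleSketchTool` and `LifecycleSketchDemo` are hypothetical classes that only reuse the method names from the description above, and the use of try/finally is an assumption that happens to reproduce the documented ordering.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a tool base class; records the order of lifecycle calls.
abstract class LifecycleSketchTool {
    final List<String> calls = new ArrayList<>();

    void onTraversalStart() { calls.add("onTraversalStart"); }

    abstract void traverse();

    Object onTraversalSuccess() { calls.add("onTraversalSuccess"); return null; }

    // Default does nothing beyond recording the call; subclasses would release resources here.
    void closeTool() { calls.add("closeTool"); }

    // Template method: the finally block guarantees closeTool() runs
    // whether or not traverse() or onTraversalSuccess() throws.
    final Object doWork() {
        try {
            onTraversalStart();
            traverse();
            return onTraversalSuccess();
        } finally {
            closeTool();
        }
    }
}

class LifecycleSketchDemo {
    // Runs one simulated tool and returns the recorded call order.
    static List<String> run(boolean failDuringTraversal) {
        LifecycleSketchTool tool = new LifecycleSketchTool() {
            @Override
            void traverse() {
                calls.add("traverse");
                if (failDuringTraversal) {
                    throw new IllegalStateException("simulated traversal failure");
                }
            }
        };
        try {
            tool.doWork();
        } catch (IllegalStateException e) {
            // Expected in the failing case; the call log is what matters here.
        }
        return tool.calls;
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // [onTraversalStart, traverse, onTraversalSuccess, closeTool]
        System.out.println(run(true));  // [onTraversalStart, traverse, closeTool]
    }
}
```

In the failing run, onTraversalSuccess() is never reached but closeTool() still executes, which is why cleanup that must always happen belongs in closeTool().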