@BetaFeature @DocumentedFeature public class GetPileupSummaries extends MultiVariantWalker
Summarizes counts of reads that support reference, alternate and other alleles for given sites. Results can be used with CalculateContamination.
The tool requires a common germline variant sites VCF, e.g. derived from the gnomAD resource, with population allele frequencies (AF) in the INFO field. This resource must contain only biallelic SNPs and can be an eight-column sites-only VCF. The tool ignores the filter status of the variant calls in this germline resource.
This tool is featured in the Somatic Short Mutation calling Best Practice Workflow. See Tutorial#11136 for a step-by-step description of the workflow and Article#11127 for an overview of what traditional somatic calling entails. For the latest pipeline scripts, see the Mutect2 WDL scripts directory. In particular, the mutect_resources.wdl script prepares a suitable resource from a larger dataset. An example excerpt is shown.
    #CHROM  POS       ID           REF  ALT  QUAL      FILTER                      INFO
    chr6    29942512  .            G    C    2974860   VQSRTrancheSNP99.80to99.90  AF=0.063
    chr6    29942517  .            C    A    2975860   VQSRTrancheSNP99.80to99.90  AF=0.062
    chr6    29942525  .            G    C    2975600   VQSRTrancheSNP99.60to99.80  AF=0.063
    chr6    29942547  rs114945359  G    C    15667700  PASS                        AF=0.077
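The resource requirements above (biallelic SNPs only, AF present in INFO, FILTER ignored) can be illustrated with a small pre-check on individual VCF data lines. This is a minimal sketch, not GATK code: the class `GermlineSiteCheck` and its method are hypothetical names, and splitting on whitespace is an assumption that suits the excerpt shown here (real VCF files are tab-delimited).

```java
import java.util.Optional;

/** Hypothetical helper, not part of GATK: checks one sites-only VCF data line. */
final class GermlineSiteCheck {
    private GermlineSiteCheck() {}

    /**
     * Returns the AF value if the line describes a biallelic SNP with a single
     * AF entry in INFO; otherwise returns an empty Optional.
     * The FILTER column is deliberately not inspected.
     */
    static Optional<Double> biallelicSnpAf(String vcfLine) {
        // Assumption: whitespace split, sufficient for the excerpt format.
        String[] fields = vcfLine.trim().split("\\s+");
        if (fields.length < 8) {
            return Optional.empty();            // fewer than eight columns
        }
        String ref = fields[3];
        String alt = fields[4];
        // One base each side; "C,A" (multiallelic) or "GT" (indel) fail this test.
        if (!ref.matches("[ACGT]") || !alt.matches("[ACGT]")) {
            return Optional.empty();
        }
        for (String entry : fields[7].split(";")) {
            if (entry.startsWith("AF=")) {
                try {
                    return Optional.of(Double.parseDouble(entry.substring(3)));
                } catch (NumberFormatException e) {
                    return Optional.empty();    // e.g. a comma-separated AF list
                }
            }
        }
        return Optional.empty();                // no AF entry in INFO
    }
}
```

A line that yields an empty result would need to be removed or split before the file is suitable as the `-V` resource.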
    gatk GetPileupSummaries \
      -I tumor.bam \
      -V common_biallelic.vcf.gz \
      -O pileups.table

    gatk GetPileupSummaries \
      -I normal.bam \
      -V common_biallelic.vcf.gz \
      -L chr1 \
      -O pileups.table
GetPileupSummaries tabulates results into six columns as shown below. The alt_count and allele_frequency correspond to the ALT allele in the germline resource.
    contig  position  ref_count  alt_count  other_alt_count  allele_frequency
    chr6    29942512  9          0          0                0.063
    chr6    29942517  13         1          0                0.062
    chr6    29942525  13         7          0                0.063
    chr6    29942547  36         0          0                0.077
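To show how the six columns relate to each other, the following sketch parses rows of the table and derives the fraction of reads that support the ALT allele at each site. It is an illustration under stated assumptions, not part of GATK: `PileupTable`, `Row` and `altFraction` are hypothetical names, columns are assumed to be whitespace-separated, and any line starting with `#` or with the word `contig` is treated as a header and skipped.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical reader for the six-column summary table; not a GATK class. */
final class PileupTable {
    private PileupTable() {}

    /** One site from the table. */
    static final class Row {
        final String contig;
        final int position;
        final int refCount;
        final int altCount;
        final int otherAltCount;
        final double alleleFrequency;

        Row(String contig, int position, int refCount, int altCount,
            int otherAltCount, double alleleFrequency) {
            this.contig = contig;
            this.position = position;
            this.refCount = refCount;
            this.altCount = altCount;
            this.otherAltCount = otherAltCount;
            this.alleleFrequency = alleleFrequency;
        }

        /** Fraction of counted reads supporting ALT; 0.0 when no reads were counted. */
        double altFraction() {
            int depth = refCount + altCount + otherAltCount;
            return depth == 0 ? 0.0 : (double) altCount / depth;
        }
    }

    /** Parses data lines, skipping assumed header or comment lines. */
    static List<Row> parse(List<String> lines) {
        List<Row> rows = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#") || trimmed.startsWith("contig")) {
                continue;
            }
            String[] f = trimmed.split("\\s+");
            rows.add(new Row(f[0], Integer.parseInt(f[1]), Integer.parseInt(f[2]),
                    Integer.parseInt(f[3]), Integer.parseInt(f[4]), Double.parseDouble(f[5])));
        }
        return rows;
    }
}
```

For the third row of the example, 7 of the 20 counted reads support the ALT allele at a site whose population allele frequency is 0.063.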
Note the default maximum population AF (--maximum-population-allele-frequency or -max-af) is set to 0.2, which limits the sites the tool considers to those in the variants resource file that have AF of 0.2 or less. Likewise, the default minimum population AF (--minimum-population-allele-frequency or -min-af) is set to 0.01, which limits the sites the tool considers to those in the variants resource file that have AF of 0.01 or more.
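The effect of the two thresholds can be sketched as a simple inclusive range check. This mirrors the description above and is not taken from the GATK source; `SiteAfWindow` and its members are hypothetical names, and the inclusive comparison on both ends is an assumption based on the "or less" and "or more" wording.

```java
/** Hypothetical illustration of the population AF window; not a GATK class. */
final class SiteAfWindow {
    // Defaults as stated in the tool description.
    static final double DEFAULT_MIN_AF = 0.01;
    static final double DEFAULT_MAX_AF = 0.2;

    private SiteAfWindow() {}

    /** True when af lies within [minAf, maxAf], both ends inclusive. */
    static boolean inWindow(double af, double minAf, double maxAf) {
        return af >= minAf && af <= maxAf;
    }

    /** Same check using the documented default thresholds. */
    static boolean inDefaultWindow(double af) {
        return inWindow(af, DEFAULT_MIN_AF, DEFAULT_MAX_AF);
    }
}
```

Under the defaults, all four sites in the example excerpt (AF between 0.062 and 0.077) fall inside the window, whereas a site with AF=0.25 or AF=0.005 would be skipped.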
Modifier and Type | Field and Description |
---|---|
static java.lang.String | MAX_SITE_AF_LONG_NAME |
static java.lang.String | MAX_SITE_AF_SHORT_NAME |
static java.lang.String | MIN_MAPPING_QUALITY_LONG_NAME |
static java.lang.String | MIN_MAPPING_QUALITY_SHORT_NAME |
static java.lang.String | MIN_SITE_AF_LONG_NAME |
static java.lang.String | MIN_SITE_AF_SHORT_NAME |
Fields inherited from class MultiVariantWalker: drivingVariantFiles
Fields inherited from class VariantWalkerBase: FEATURE_CACHE_LOOKAHEAD
Fields inherited from class GATKTool: addOutputSAMProgramRecord, addOutputVCFCommandLine, cloudIndexPrefetchBuffer, cloudPrefetchBuffer, createOutputBamIndex, createOutputBamMD5, createOutputVariantIndex, createOutputVariantMD5, disableBamIndexCaching, intervalArgumentCollection, lenientVCFProcessing, outputSitesOnlyVCFs, progressMeter, readArguments, referenceArguments, SECONDS_BETWEEN_PROGRESS_UPDATES_NAME
Fields inherited from class CommandLineProgram: GATK_CONFIG_FILE, logger, NIO_MAX_REOPENS, QUIET, specialArgumentsCollection, TMP_DIR, useJdkDeflater, useJdkInflater, VERBOSITY
Constructor and Description |
---|
GetPileupSummaries() |
Modifier and Type | Method and Description |
---|---|
void | apply(htsjdk.variant.variantcontext.VariantContext vc, ReadsContext readsContext, ReferenceContext referenceContext, FeatureContext featureContext) Process an individual variant. |
java.util.List<ReadFilter> | getDefaultReadFilters() Returns the default list of ReadFilters that are used for this tool. |
void | onTraversalStart() Operations performed just prior to the start of traversal. |
java.lang.Object | onTraversalSuccess() Operations performed immediately after a successful traversal (i.e., when no uncaught exceptions were thrown during the traversal). |
boolean | requiresReads() Does this tool require reads? Traversal types and/or tools that do should override to return true. |
boolean | requiresReference() Does this tool require reference data? Traversal types and/or tools that do should override to return true. |
Methods inherited from class MultiVariantWalker: getDrivingVariantsFeatureInputs, getHeaderForVariants, getSamplesForVariants, getSequenceDictionaryForDrivingVariants, getSpliteratorForDrivingVariants, initializeDrivingVariants, onShutdown, onStartup
Methods inherited from class VariantWalkerBase: getBestAvailableSequenceDictionary, getProgressMeterRecordLabel, makeVariantFilter, requiresFeatures, traverse
Methods inherited from class GATKTool: addFeatureInputsAfterInitialization, closeTool, createSAMWriter, createSAMWriter, createVCFWriter, doWork, getDefaultCloudIndexPrefetchBufferSize, getDefaultCloudPrefetchBufferSize, getDefaultToolVCFHeaderLines, getDefaultVariantAnnotationGroups, getDefaultVariantAnnotations, getHeaderForFeatures, getHeaderForReads, getHeaderForSAMWriter, getMasterSequenceDictionary, getPluginDescriptors, getReferenceDictionary, getSequenceDictionaryValidationArgumentCollection, getToolkitShortName, getToolName, getTransformedReadStream, hasFeatures, hasIntervals, hasReads, hasReference, makePostReadFilterTransformer, makePreReadFilterTransformer, makeReadFilter, makeVariantAnnotations, requiresIntervals, useVariantAnnotations
Methods inherited from class CommandLineProgram: customCommandLineValidation, getCommandLine, getCommandLineParser, getDefaultHeaders, getMetricsFile, getSupportInformation, getToolkitName, getToolStatusWarning, getUsage, getVersion, instanceMain, instanceMainPostParseArgs, isBetaFeature, isExperimentalFeature, parseArgs, printLibraryVersions, printSettings, printStartupMessage, runTool, setDefaultHeaders, warnOnToolStatus
public static final java.lang.String MAX_SITE_AF_LONG_NAME
public static final java.lang.String MIN_SITE_AF_LONG_NAME
public static final java.lang.String MAX_SITE_AF_SHORT_NAME
public static final java.lang.String MIN_SITE_AF_SHORT_NAME
public static final java.lang.String MIN_MAPPING_QUALITY_LONG_NAME
public static final java.lang.String MIN_MAPPING_QUALITY_SHORT_NAME
public boolean requiresReads()
Overrides: requiresReads in class GATKTool
public boolean requiresReference()
Overrides: requiresReference in class GATKTool
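public java.util.List<ReadFilter> getDefaultReadFilters()
Description copied from class: GATKTool
Note: this method is called before command line parsing begins, and thus before a SAMFileHeader is available through GATKTool.getHeaderForReads(). The actual SAMFileHeader is propagated to the read filters by GATKTool.makeReadFilter() after the filters have been merged with command line arguments.
Overrides: getDefaultReadFilters in class GATKTool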
public void onTraversalStart()
Overrides: onTraversalStart in class GATKTool
public void apply(htsjdk.variant.variantcontext.VariantContext vc, ReadsContext readsContext, ReferenceContext referenceContext, FeatureContext featureContext)
Specified by: apply in class VariantWalkerBase
Parameters:
vc - Current variant being processed.
readsContext - Reads overlapping the current variant. Will be an empty, but non-null, context object if there is no backing source of reads data (in which case all queries on it will return an empty array/iterator).
referenceContext - Reference bases spanning the current variant. Will be an empty, but non-null, context object if there is no backing source of reference data (in which case all queries on it will return an empty array/iterator). Can request extra bases of context around the current variant's interval by invoking ReferenceContext.setWindow(int, int) on this object before calling ReferenceContext.getBases().
featureContext - Features spanning the current variant. Will be an empty, but non-null, context object if there is no backing source of Feature data (in which case all queries on it will return an empty List).

public java.lang.Object onTraversalSuccess()
Overrides: onTraversalSuccess in class GATKTool