Class VariantEval
- All Implemented Interfaces:
org.broadinstitute.barclay.argparser.CommandLinePluginProvider
- Direct Known Subclasses:
AlleleFrequencyQC
Given a variant callset, it is common to calculate various quality control metrics. These metrics include raw or filtered SNP counts; the ratio of transition mutations to transversions; concordance of a particular sample's calls to a genotyping chip; the number of singletons per sample; etc. Furthermore, it is often useful to stratify these metrics by various criteria like functional class (missense, nonsense, silent), whether the site is a CpG site, the amino acid degeneracy of the site, etc. VariantEval facilitates these calculations in two ways: by providing several built-in evaluation and stratification modules, and by providing a framework that permits the easy development of new evaluation and stratification modules.
Input
One or more variant sets to evaluate plus any number of comparison sets.
Output
Evaluation tables detailing the results of the eval modules which were applied. For example:
output.eval.grp:
##:GATKReport.v0.1 CountVariants : Counts different classes of variants in the sample
CountVariants  CompFeatureInput  CpG      EvalFeatureInput  JexlExpression  Novelty  nProcessedLoci  nCalledLoci  nRefLoci  nVariantLoci  variantRate  ...
CountVariants  dbsnp             CpG      eval              none            all      65900028        135770       0         135770        0.00206024   ...
CountVariants  dbsnp             CpG      eval              none            known    65900028        47068        0         47068         0.00071423   ...
CountVariants  dbsnp             CpG      eval              none            novel    65900028        88702        0         88702         0.00134601   ...
CountVariants  dbsnp             all      eval              none            all      65900028        330818       0         330818        0.00502000   ...
CountVariants  dbsnp             all      eval              none            known    65900028        120685       0         120685        0.00183133   ...
CountVariants  dbsnp             all      eval              none            novel    65900028        210133       0         210133        0.00318866   ...
CountVariants  dbsnp             non_CpG  eval              none            all      65900028        195048       0         195048        0.00295976   ...
CountVariants  dbsnp             non_CpG  eval              none            known    65900028        73617        0         73617         0.00111710   ...
CountVariants  dbsnp             non_CpG  eval              none            novel    65900028        121431       0         121431        0.00184265   ...
...
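The rows in the example table can be sanity-checked by hand. Within each CpG stratum the known and novel counts sum to the all count (47068 + 88702 = 135770), and each reported variantRate is consistent with nVariantLoci divided by nProcessedLoci (135770 / 65900028 is approximately 0.00206024). The following is a minimal sketch of that check in Java. The class and method names are hypothetical and are not part of GATK, and the whitespace-splitting parser assumes the simple layout of the example rather than the full GATKReport format.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: not part of GATK. Column names are copied from the example table.
class CountVariantsRowCheck {
    static final String HEADER =
        "CountVariants CompFeatureInput CpG EvalFeatureInput JexlExpression Novelty "
        + "nProcessedLoci nCalledLoci nRefLoci nVariantLoci variantRate";

    /** Pairs each whitespace-separated header token with the matching token of one data row. */
    static Map<String, String> parseRow(String header, String row) {
        String[] names = header.trim().split("\\s+");
        String[] values = row.trim().split("\\s+");
        if (names.length != values.length) {
            throw new IllegalArgumentException("Header and row have different column counts");
        }
        Map<String, String> out = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            out.put(names[i], values[i]);
        }
        return out;
    }

    /** Recomputes the rate as nVariantLoci / nProcessedLoci (assumed definition, consistent with the example rows). */
    static double variantRate(Map<String, String> row) {
        return Double.parseDouble(row.get("nVariantLoci"))
            / Double.parseDouble(row.get("nProcessedLoci"));
    }

    public static void main(String[] args) {
        Map<String, String> row = parseRow(HEADER,
            "CountVariants dbsnp CpG eval none all 65900028 135770 0 135770 0.00206024");
        System.out.printf("recomputed=%.8f reported=%s%n", variantRate(row), row.get("variantRate"));
    }
}
```

The trailing `...` columns of the real report are omitted here, so the header constant would need extending before parsing a complete file.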
Usage examples
gatk VariantEval \
    -R reference.fasta \
    -O output.eval.grp \
    --eval set1:set1.vcf \
    --eval set2:set2.vcf \
    [--comp comp.vcf]
Count Mendelian violations for each family in a callset with multiple families (and provided pedigree)
gatk VariantEval \
    -R reference.fasta \
    -O output.MVs.byFamily.table \
    --eval multiFamilyCallset.vcf \
    -no-ev -noST \
    -ST Family \
    -EV MendelianViolationEvaluator
Caveat
Some stratifications and evaluators are incompatible with each other due to their respective memory requirements, such as AlleleCount and VariantSummary, or Sample and VariantSummary. If you specify such a combination, the program will output an error message and ask you to disable one of these options. We do not currently provide an exhaustive list of incompatible combinations, so we recommend trying the combinations you are interested in on a dummy command line to quickly find out whether they work.
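For scripted pipelines, the two combinations named above could be screened before a command line is launched. The sketch below is a hypothetical pre-check, not GATK's own validation logic. It reads AlleleCount and Sample as stratifications and VariantSummary as an evaluator (an assumption based on the wording of the caveat), and it knows only these two pairs, so passing this check does not guarantee that the tool will accept the combination.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Illustrative only: a caller-side screen, not the check VariantEval performs internally.
class IncompatibleComboPrecheck {
    // Key = stratification, value = evaluator. Only the pairs named in the caveat are listed.
    static final List<Map.Entry<String, String>> KNOWN_INCOMPATIBLE = List.of(
        Map.entry("AlleleCount", "VariantSummary"),
        Map.entry("Sample", "VariantSummary"));

    /** Returns the first known-incompatible pair present in the request, if any. */
    static Optional<Map.Entry<String, String>> firstConflict(Set<String> stratifications,
                                                             Set<String> evaluators) {
        return KNOWN_INCOMPATIBLE.stream()
            .filter(p -> stratifications.contains(p.getKey()) && evaluators.contains(p.getValue()))
            .findFirst();
    }

    public static void main(String[] args) {
        Optional<Map.Entry<String, String>> conflict =
            firstConflict(Set.of("Sample", "Family"), Set.of("VariantSummary"));
        conflict.ifPresent(p ->
            System.out.println("Disable one of: -ST " + p.getKey() + " / -EV " + p.getValue()));
    }
}
```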
Nested Class Summary
Nested classes/interfaces inherited from class org.broadinstitute.hellbender.cmdline.CommandLineProgram
CommandLineProgram.AutoCloseableNoCheckedExceptions
Field Summary
Fields
Modifier and Type    Field    Description
protected VariantEvalEngine    engine
protected Boolean    LIST    Note that the --list argument requires a fully resolved and correct command-line to work.
protected File    outFile
protected VariantEvalArgumentCollection    variantEvalArgs
Fields inherited from class org.broadinstitute.hellbender.engine.MultiVariantWalkerGroupedOnStart
COMBINE_VARIANTS_DISTANCE, distanceToCombineVariants, IGNORE_VARIANTS_THAT_START_OUTSIDE_INTERVAL, ignoreIntervalsOutsideStart, MAX_COMBINED_DISTANCE, maxCombinedDistance, REFERENCE_WINDOW_PADDING, referenceWindowPadding
Fields inherited from class org.broadinstitute.hellbender.engine.MultiVariantWalker
multiVariantInputArgumentCollection
Fields inherited from class org.broadinstitute.hellbender.engine.VariantWalkerBase
DEFAULT_DRIVING_VARIANTS_LOOKAHEAD_BASES, genomicsDBOptions
Fields inherited from class org.broadinstitute.hellbender.engine.GATKTool
addOutputSAMProgramRecord, addOutputVCFCommandLine, cloudIndexPrefetchBuffer, cloudPrefetchBuffer, createOutputBamIndex, createOutputBamMD5, createOutputVariantIndex, createOutputVariantMD5, disableBamIndexCaching, features, intervalArgumentCollection, lenientVCFProcessing, outputSitesOnlyVCFs, progressMeter, readArguments, referenceArguments, SECONDS_BETWEEN_PROGRESS_UPDATES_NAME, seqValidationArguments
Fields inherited from class org.broadinstitute.hellbender.cmdline.CommandLineProgram
GATK_CONFIG_FILE, logger, NIO_MAX_REOPENS, NIO_PROJECT_FOR_REQUESTER_PAYS, QUIET, specialArgumentsCollection, tmpDir, useJdkDeflater, useJdkInflater, VERBOSITY
Constructor Summary
Constructors
VariantEval()
Method Summary
Modifier and Type    Method    Description
void    apply(List<htsjdk.variant.variantcontext.VariantContext> variantContexts, ReferenceContext referenceContext, List<ReadsContext> readsContexts)    This method must be implemented by tool authors.
protected MultiVariantInputArgumentCollection    getMultiVariantInputArgumentCollection()    Return an argument collection that provides the driving variants.
List<? extends org.broadinstitute.barclay.argparser.CommandLinePluginDescriptor<?>>    getPluginDescriptors()    Return the list of GATKCommandLinePluginDescriptors to be used for this tool.
protected void    initializeDrivingVariants()    Process the feature inputs that represent the primary driving source(s) of variants for this tool, and perform any necessary header and sequence dictionary validation.
void    listModulesAndExit()    List all of the available evaluation modules, then exit successfully
void    onTraversalStart()    Operations performed just prior to the start of traversal.
Object    onTraversalSuccess()    Operations performed immediately after a successful traversal (i.e., when no uncaught exceptions were thrown during the traversal).
Methods inherited from class org.broadinstitute.hellbender.engine.MultiVariantWalkerGroupedOnStart
apply, apply, defaultDistanceToGroupVariants, defaultMaxGroupedSpan, defaultReferenceWindowPadding, isWithinInterval, requiresReference, traverse
Methods inherited from class org.broadinstitute.hellbender.engine.MultiVariantWalker
doDictionaryCrossValidation, getDrivingVariantsFeatureInputs, getHeaderForVariants, getSamplesForVariants, getSequenceDictionaryForDrivingVariants, getSpliteratorForDrivingVariants, onShutdown, onStartup
Methods inherited from class org.broadinstitute.hellbender.engine.VariantWalkerBase
getBestAvailableSequenceDictionary, getDrivingVariantCacheLookAheadBases, getGenomicsDBOptions, getProgressMeterRecordLabel, getTransformedVariantStream, getTransformedVariantStream, makePostVariantFilterTransformer, makePreVariantFilterTransformer, makeVariantFilter, requiresFeatures
Methods inherited from class org.broadinstitute.hellbender.engine.WalkerBase
directlyAccessEngineFeatureManager, directlyAccessEngineReadsDataSource, directlyAccessEngineReferenceDataSource
Methods inherited from class org.broadinstitute.hellbender.engine.GATKTool
addFeatureInputsAfterInitialization, bamIndexCachingShouldBeEnabled, closeTool, createSAMWriter, createVCFWriter, createVCFWriter, createVCFWriter, disableProgressMeter, doWork, getDefaultCloudIndexPrefetchBufferSize, getDefaultCloudPrefetchBufferSize, getDefaultReadFilters, getDefaultToolVCFHeaderLines, getDefaultVariantAnnotationGroups, getDefaultVariantAnnotations, getHeaderForFeatures, getHeaderForReads, getHeaderForSAMWriter, getMasterSequenceDictionary, getReferenceDictionary, getSequenceDictionaryValidationArgumentCollection, getToolName, getTransformedReadStream, getTraversalIntervals, getUserSuppliedIntervals, hasFeatures, hasReads, hasReference, hasUserSuppliedIntervals, initializeProgressMeter, makePostReadFilterTransformer, makePreReadFilterTransformer, makeReadFilter, makeSamReaderFactory, makeVariantAnnotations, requiresIntervals, requiresReads, transformTraversalIntervals, useVariantAnnotations
Methods inherited from class org.broadinstitute.hellbender.cmdline.CommandLineProgram
customCommandLineValidation, getCommandLine, getCommandLineParser, getDefaultHeaders, getMetricsFile, getSupportInformation, getToolkitName, getToolkitShortName, getToolStatusWarning, getUsage, getVersion, instanceMain, instanceMainPostParseArgs, isBetaFeature, isExperimentalFeature, parseArgs, printLibraryVersions, printSettings, printStartupMessage, runTool, setDefaultHeaders, warnOnToolStatus
Field Details
engine
protected VariantEvalEngine engine
outFile
@Argument(fullName="output", shortName="O", doc="File to which variants should be written")
protected File outFile
variantEvalArgs
protected VariantEvalArgumentCollection variantEvalArgs
LIST
@Argument(fullName="list", shortName="ls", doc="List the available eval modules and exit", optional=true)
protected Boolean LIST
Note that the --list argument requires a fully resolved and correct command-line to work.
Constructor Details
VariantEval
public VariantEval()
Method Details
getMultiVariantInputArgumentCollection
protected MultiVariantInputArgumentCollection getMultiVariantInputArgumentCollection()
Description copied from class: MultiVariantWalker
Return an argument collection that provides the driving variants. This allows subclasses to override and use a different argument pattern besides the default -V.
- Overrides:
getMultiVariantInputArgumentCollection in class MultiVariantWalker
initializeDrivingVariants
protected void initializeDrivingVariants()
Description copied from class: VariantWalkerBase
Process the feature inputs that represent the primary driving source(s) of variants for this tool, and perform any necessary header and sequence dictionary validation. Called by the framework during feature initialization.
- Overrides:
initializeDrivingVariants in class MultiVariantWalker
onTraversalStart
public void onTraversalStart()
Description copied from class: GATKTool
Operations performed just prior to the start of traversal. Should be overridden by tool authors who need to process arguments local to their tool or perform other kinds of local initialization. Default implementation does nothing.
- Overrides:
onTraversalStart in class GATKTool
listModulesAndExit
public void listModulesAndExit()
List all of the available evaluation modules, then exit successfully
apply
public void apply(List<htsjdk.variant.variantcontext.VariantContext> variantContexts, ReferenceContext referenceContext, List<ReadsContext> readsContexts)
Description copied from class: MultiVariantWalkerGroupedOnStart
This method must be implemented by tool authors. This is the primary traversal for any MultiVariantWalkerGroupedOnStart walkers. Will traverse over input variant contexts and call apply() exactly once for each unique reference start position. All variants starting at each locus across source files will be grouped and passed as a list of VariantContext objects.
- Specified by:
apply in class MultiVariantWalkerGroupedOnStart
- Parameters:
variantContexts - VariantContexts from driving variants with matching start position. NOTE: This will never be empty
referenceContext - ReferenceContext object covering the reference of the longest spanning VariantContext
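The grouping contract described for apply (one call per unique start position, with every variant that starts there passed together) can be illustrated with a small stand-alone sketch. This is not the GATK implementation: `Variant` is a hypothetical stand-in for htsjdk's VariantContext that carries only a source label, contig and start, and contigs are ordered lexically here for simplicity rather than by a sequence dictionary.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.function.Consumer;

// Illustrative only: mimics the "group on start" contract with plain Java types.
class GroupOnStartSketch {
    /** Hypothetical stand-in for a VariantContext. */
    static final class Variant {
        final String source;
        final String contig;
        final int start;

        Variant(String source, String contig, int start) {
            this.source = source;
            this.contig = contig;
            this.start = start;
        }
    }

    /** Groups variants from all sources by (contig, start); each group is non-empty. */
    static List<List<Variant>> groupByStart(List<Variant> variants) {
        TreeMap<String, TreeMap<Integer, List<Variant>>> byLocus = new TreeMap<>();
        for (Variant v : variants) {
            byLocus.computeIfAbsent(v.contig, c -> new TreeMap<>())
                   .computeIfAbsent(v.start, s -> new ArrayList<>())
                   .add(v);
        }
        List<List<Variant>> groups = new ArrayList<>();
        for (TreeMap<Integer, List<Variant>> perContig : byLocus.values()) {
            groups.addAll(perContig.values());
        }
        return groups;
    }

    /** Calls the supplied apply callback exactly once per unique start position. */
    static void traverse(List<Variant> variants, Consumer<List<Variant>> apply) {
        for (List<Variant> group : groupByStart(variants)) {
            apply.accept(group);
        }
    }

    public static void main(String[] args) {
        List<Variant> input = List.of(
            new Variant("set1", "chr1", 100),
            new Variant("set2", "chr1", 100),
            new Variant("set1", "chr1", 200));
        traverse(input, group -> System.out.println(
            group.get(0).contig + ":" + group.get(0).start + " -> " + group.size() + " variant(s)"));
    }
}
```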
onTraversalSuccess
public Object onTraversalSuccess()
Description copied from class: GATKTool
Operations performed immediately after a successful traversal (i.e., when no uncaught exceptions were thrown during the traversal). Should be overridden by tool authors who need to close local resources, etc., after traversal. Also allows tools to return a value representing the traversal result, which is printed by the engine. Default implementation does nothing and returns null.
- Overrides:
onTraversalSuccess in class GATKTool
- Returns:
Object representing the traversal result, or null if a tool does not return a value
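The descriptions of onTraversalStart, apply and onTraversalSuccess together imply a simple lifecycle: initialize, process each unit of work, then optionally return a result that the engine prints. The sketch below models that ordering with hypothetical types (`MiniTool`, `CountingTool`, `run`). It is a simplified illustration under those assumptions and does not reproduce the actual GATK engine, which performs considerably more work around these hooks.

```java
import java.util.List;

// Illustrative only: a toy template-method lifecycle, not the GATK engine.
class TraversalLifecycleSketch {
    static abstract class MiniTool {
        /** Default does nothing, mirroring the documented default behavior. */
        void onTraversalStart() {}

        /** Must be implemented by the tool author. */
        abstract void apply(String record);

        /** Default returns null, meaning the tool has no result to report. */
        Object onTraversalSuccess() {
            return null;
        }
    }

    /** Runs the hooks in order and prints the result only when one is returned. */
    static Object run(MiniTool tool, List<String> records) {
        tool.onTraversalStart();
        for (String record : records) {
            tool.apply(record);
        }
        Object result = tool.onTraversalSuccess();
        if (result != null) {
            System.out.println(result);
        }
        return result;
    }

    /** Example tool that counts the records it sees. */
    static final class CountingTool extends MiniTool {
        private int count;

        @Override
        void onTraversalStart() {
            count = 0;
        }

        @Override
        void apply(String record) {
            count++;
        }

        @Override
        Object onTraversalSuccess() {
            return count;
        }
    }

    public static void main(String[] args) {
        run(new CountingTool(), List.of("a", "b", "c"));
    }
}
```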
getPluginDescriptors
public List<? extends org.broadinstitute.barclay.argparser.CommandLinePluginDescriptor<?>> getPluginDescriptors()
Description copied from class: GATKTool
Return the list of GATKCommandLinePluginDescriptors to be used for this tool. Uses the read filter plugin.
- Specified by:
getPluginDescriptors in interface org.broadinstitute.barclay.argparser.CommandLinePluginProvider
- Overrides:
getPluginDescriptors in class GATKTool