Class UpdateVCFSequenceDictionary
- All Implemented Interfaces:
org.broadinstitute.barclay.argparser.CommandLinePluginProvider
This tool updates the sequence dictionary in a variant file using a dictionary from another variant, alignment, dictionary, or reference file. The dictionary must be valid, i.e. it must contain a sequence record for every contig on which the target file has a variant. In a VCF header, the dictionary lines start with '##contig='.
By specifying both --replace and --disable-sequence-dictionary-validation, one can force the replacement of an invalid sequence dictionary in a variant file with a valid sequence dictionary from another file.
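The validity requirement described above can be illustrated with a minimal sketch that works on plain VCF text lines. It is not how GATK performs validation; the class name ContigCoverageCheck and both method names are hypothetical, and the sketch assumes uncompressed, tab-separated VCF lines already read into memory.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of GATK: checks contig coverage on plain VCF text lines. */
class ContigCoverageCheck {

    // Captures the ID value of a header line such as ##contig=<ID=chr1,length=248956422>
    private static final Pattern CONTIG_ID = Pattern.compile("^##contig=<(?:.*,)?ID=([^,>]+)");

    /** Collects the contig names declared by the ##contig= header lines. */
    static Set<String> dictionaryContigs(List<String> vcfLines) {
        Set<String> names = new LinkedHashSet<>();
        for (String line : vcfLines) {
            Matcher m = CONTIG_ID.matcher(line);
            if (m.find()) {
                names.add(m.group(1));
            }
        }
        return names;
    }

    /** Returns the contigs used by variant records that have no sequence record in the dictionary. */
    static Set<String> missingContigs(Set<String> dictionary, List<String> vcfLines) {
        Set<String> missing = new LinkedHashSet<>();
        for (String line : vcfLines) {
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip header lines; only data lines carry a CHROM value
            }
            String chrom = line.split("\t", 2)[0]; // CHROM is the first tab-separated column
            if (!dictionary.contains(chrom)) {
                missing.add(chrom);
            }
        }
        return missing;
    }
}
```

Under these assumptions, a dictionary is valid for a file when missingContigs returns an empty set for that file's lines.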
Usage example
Use the contig dictionary from a BAM (SQ lines) to replace an existing dictionary in the header of a VCF.
gatk UpdateVCFSequenceDictionary \
    -V cohort.vcf.gz \
    --source-dictionary sample.bam \
    --output cohort_replacedcontiglines.vcf.gz \
    --replace=true
Use a reference dictionary to add reference contig lines to a VCF without any.
gatk UpdateVCFSequenceDictionary \
    -V resource.vcf.gz \
    --source-dictionary reference.dict \
    --output resource_newcontiglines.vcf.gz
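A .dict file is text in SAM header format, so its @SQ records carry the sequence names (SN) and lengths (LN) that end up in '##contig=' lines. The following sketch shows that mapping on plain text. It is an illustration only: the class SqToContigLines and its method are hypothetical, the tool's real header writing goes through htsjdk, and the actual output may carry additional attributes that this sketch omits.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of GATK or htsjdk. */
class SqToContigLines {

    /** Converts the @SQ records of SAM-style header text into VCF ##contig= header lines. */
    static List<String> toContigLines(List<String> samHeaderLines) {
        List<String> contigLines = new ArrayList<>();
        for (String line : samHeaderLines) {
            if (!line.startsWith("@SQ\t")) {
                continue; // ignore @HD, @RG, @PG and other record types
            }
            String name = null;
            String length = null;
            for (String field : line.split("\t")) {
                if (field.startsWith("SN:")) {
                    name = field.substring(3);   // sequence name
                } else if (field.startsWith("LN:")) {
                    length = field.substring(3); // sequence length
                }
            }
            if (name == null || length == null) {
                throw new IllegalArgumentException("@SQ record lacks SN or LN: " + line);
            }
            // Assumption: only ID and length are emitted; other @SQ tags are dropped.
            contigLines.add("##contig=<ID=" + name + ",length=" + length + ">");
        }
        return contigLines;
    }
}
```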
Use the reference set to add contig lines to a VCF without any.
gatk UpdateVCFSequenceDictionary \
    -V resource.vcf.gz \
    -R reference.fasta \
    --output resource_newcontiglines.vcf.gz
The -O (--output) argument specifies the name of the updated file. The --source-dictionary argument specifies the file that supplies the input sequence dictionary. The --replace argument is optional and forces replacement of the dictionary when the input file already has one.
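The effect of --replace on a header can be sketched on plain header text as follows. This is a simplified model, not the tool's implementation: the class HeaderDictionaryUpdate and its method are hypothetical, the header is assumed to end with the '#CHROM' column line, and rejecting an existing dictionary with an exception when replacement is not requested is an assumption about how that case is handled.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of GATK. */
class HeaderDictionaryUpdate {

    /** Returns a copy of the header whose ##contig= lines are the supplied ones. */
    static List<String> updateHeader(List<String> header, List<String> newContigLines, boolean replace) {
        boolean hasDictionary = header.stream().anyMatch(l -> l.startsWith("##contig="));
        if (hasDictionary && !replace) {
            // Assumption: an existing dictionary without the replace flag is treated as an error.
            throw new IllegalStateException("header already has a sequence dictionary; replacement not requested");
        }
        List<String> updated = new ArrayList<>();
        for (String line : header) {
            if (line.startsWith("##contig=")) {
                continue; // drop the old dictionary
            }
            if (line.startsWith("#CHROM")) {
                updated.addAll(newContigLines); // meta-information lines precede the column header
            }
            updated.add(line);
        }
        return updated;
    }
}
```

Calling updateHeader with replace set to false on a header that has no '##contig=' lines simply adds the new lines, which corresponds to the examples that add contig lines to a VCF without any.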
-
Nested Class Summary
Nested classes/interfaces inherited from class org.broadinstitute.hellbender.cmdline.CommandLineProgram
CommandLineProgram.AutoCloseableNoCheckedExceptions
-
Field Summary
Fields inherited from class org.broadinstitute.hellbender.engine.VariantWalker
drivingVariantFile
Fields inherited from class org.broadinstitute.hellbender.engine.VariantWalkerBase
DEFAULT_DRIVING_VARIANTS_LOOKAHEAD_BASES, genomicsDBOptions
Fields inherited from class org.broadinstitute.hellbender.engine.GATKTool
addOutputSAMProgramRecord, addOutputVCFCommandLine, cloudIndexPrefetchBuffer, cloudPrefetchBuffer, createOutputBamIndex, createOutputBamMD5, createOutputVariantIndex, createOutputVariantMD5, disableBamIndexCaching, features, intervalArgumentCollection, lenientVCFProcessing, outputSitesOnlyVCFs, progressMeter, readArguments, referenceArguments, SECONDS_BETWEEN_PROGRESS_UPDATES_NAME, seqValidationArguments
Fields inherited from class org.broadinstitute.hellbender.cmdline.CommandLineProgram
GATK_CONFIG_FILE, NIO_MAX_REOPENS, NIO_PROJECT_FOR_REQUESTER_PAYS, QUIET, specialArgumentsCollection, tmpDir, useJdkDeflater, useJdkInflater, VERBOSITY
-
Constructor Summary
Constructors
UpdateVCFSequenceDictionary()
-
Method Summary
Modifier and Type | Method | Description
void | apply(htsjdk.variant.variantcontext.VariantContext vc, ReadsContext readsContext, ReferenceContext ref, FeatureContext featureContext) | Process an individual variant.
void | closeTool() | Close out the new variants file.
htsjdk.samtools.SAMSequenceDictionary | getBestAvailableSequenceDictionary() | Overriding the superclass method to preferentially choose the sequence dictionary from the driving source of variants.
void | onTraversalStart() | Operations performed just prior to the start of traversal.
Methods inherited from class org.broadinstitute.hellbender.engine.VariantWalker
getDrivingVariantsFeatureInput, getHeaderForVariants, getSequenceDictionaryForDrivingVariants, getSpliteratorForDrivingVariants, initializeDrivingVariants, onShutdown, onStartup, traverse
Methods inherited from class org.broadinstitute.hellbender.engine.VariantWalkerBase
getDrivingVariantCacheLookAheadBases, getGenomicsDBOptions, getProgressMeterRecordLabel, getTransformedVariantStream, getTransformedVariantStream, makePostVariantFilterTransformer, makePreVariantFilterTransformer, makeVariantFilter, requiresFeatures
Methods inherited from class org.broadinstitute.hellbender.engine.WalkerBase
directlyAccessEngineFeatureManager, directlyAccessEngineReadsDataSource, directlyAccessEngineReferenceDataSource
Methods inherited from class org.broadinstitute.hellbender.engine.GATKTool
addFeatureInputsAfterInitialization, bamIndexCachingShouldBeEnabled, createSAMWriter, createVCFWriter, createVCFWriter, createVCFWriter, disableProgressMeter, doWork, getDefaultCloudIndexPrefetchBufferSize, getDefaultCloudPrefetchBufferSize, getDefaultReadFilters, getDefaultToolVCFHeaderLines, getDefaultVariantAnnotationGroups, getDefaultVariantAnnotations, getHeaderForFeatures, getHeaderForReads, getHeaderForSAMWriter, getMasterSequenceDictionary, getPluginDescriptors, getReferenceDictionary, getSequenceDictionaryValidationArgumentCollection, getToolName, getTransformedReadStream, getTraversalIntervals, getUserSuppliedIntervals, hasFeatures, hasReads, hasReference, hasUserSuppliedIntervals, initializeProgressMeter, makePostReadFilterTransformer, makePreReadFilterTransformer, makeReadFilter, makeSamReaderFactory, makeVariantAnnotations, onTraversalSuccess, requiresIntervals, requiresReads, requiresReference, transformTraversalIntervals, useVariantAnnotations
Methods inherited from class org.broadinstitute.hellbender.cmdline.CommandLineProgram
customCommandLineValidation, getCommandLine, getCommandLineParser, getDefaultHeaders, getMetricsFile, getSupportInformation, getToolkitName, getToolkitShortName, getToolStatusWarning, getUsage, getVersion, instanceMain, instanceMainPostParseArgs, isBetaFeature, isExperimentalFeature, parseArgs, printLibraryVersions, printSettings, printStartupMessage, runTool, setDefaultHeaders, warnOnToolStatus
-
Field Details
-
outFile
@Argument(fullName="output", shortName="O", doc="File to which updated variants should be written")
public GATKPath outFile
-
DICTIONARY_ARGUMENT_NAME
-
REPLACE_ARGUMENT_NAME
-
-
Constructor Details
-
UpdateVCFSequenceDictionary
public UpdateVCFSequenceDictionary()
-
-
Method Details
-
onTraversalStart
public void onTraversalStart()
Description copied from class: GATKTool
Operations performed just prior to the start of traversal. Should be overridden by tool authors who need to process arguments local to their tool or perform other kinds of local initialization. Default implementation does nothing.
- Overrides:
onTraversalStart in class GATKTool
-
apply
public void apply(htsjdk.variant.variantcontext.VariantContext vc, ReadsContext readsContext, ReferenceContext ref, FeatureContext featureContext)
Description copied from class: VariantWalker
Process an individual variant. Must be implemented by tool authors. In general, tool authors should simply stream their output from apply(), and maintain as little internal state as possible.
- Specified by:
apply in class VariantWalker
- Parameters:
vc - Current variant being processed.
readsContext - Reads overlapping the current variant. Will be an empty, but non-null, context object if there is no backing source of reads data (in which case all queries on it will return an empty array/iterator).
ref - Reference bases spanning the current variant. Will be an empty, but non-null, context object if there is no backing source of reference data (in which case all queries on it will return an empty array/iterator). Can request extra bases of context around the current variant's interval by invoking ReferenceContext.setWindow(int, int) on this object before calling ReferenceContext.getBases().
featureContext - Features spanning the current variant. Will be an empty, but non-null, context object if there is no backing source of Feature data (in which case all queries on it will return an empty List).
-
closeTool
public void closeTool()
Close out the new variants file.
-
getBestAvailableSequenceDictionary
public htsjdk.samtools.SAMSequenceDictionary getBestAvailableSequenceDictionary()
Description copied from class: VariantWalkerBase
Overriding the superclass method to preferentially choose the sequence dictionary from the driving source of variants.
- Overrides:
getBestAvailableSequenceDictionary in class VariantWalkerBase
- Returns:
best available sequence dictionary given our inputs, or null if no one dictionary is the best one.
-