Class CollectReadCounts
java.lang.Object
org.broadinstitute.hellbender.cmdline.CommandLineProgram
org.broadinstitute.hellbender.engine.GATKTool
org.broadinstitute.hellbender.engine.WalkerBase
org.broadinstitute.hellbender.engine.ReadWalker
org.broadinstitute.hellbender.tools.copynumber.CollectReadCounts
- All Implemented Interfaces:
org.broadinstitute.barclay.argparser.CommandLinePluginProvider
Collects read counts at specified intervals. The count for each interval is calculated by counting
the number of read starts that lie in the interval.
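The counting rule above can be illustrated with a small, self-contained sketch. It is not the tool's implementation: the Interval and Read classes and the countReadStarts method are hypothetical stand-ins, and the sketch assumes 1-based inclusive coordinates and non-overlapping intervals. A read is credited only to the interval that contains its start, even when its alignment extends into the next interval.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only: counts reads by the interval that contains their start position. */
final class ReadStartCounter {

    /** Hypothetical interval on a contig, 1-based with inclusive start and end. */
    static final class Interval {
        final String contig;
        final int start;
        final int end;

        Interval(String contig, int start, int end) {
            this.contig = contig;
            this.start = start;
            this.end = end;
        }

        boolean contains(String otherContig, int position) {
            return contig.equals(otherContig) && start <= position && position <= end;
        }
    }

    /** Hypothetical aligned read; the end coordinate is carried but never used for counting. */
    static final class Read {
        final String contig;
        final int start;
        final int end;

        Read(String contig, int start, int end) {
            this.contig = contig;
            this.start = start;
            this.end = end;
        }
    }

    /** Returns one count per interval, in the same order as the input intervals. */
    static int[] countReadStarts(List<Interval> intervals, List<Read> reads) {
        int[] counts = new int[intervals.size()];
        for (Read read : reads) {
            for (int i = 0; i < intervals.size(); i++) {
                if (intervals.get(i).contains(read.contig, read.start)) {
                    counts[i]++;
                    break; // intervals are assumed not to overlap
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Interval> intervals = Arrays.asList(
                new Interval("chr1", 1, 100),
                new Interval("chr1", 101, 200));
        List<Read> reads = Arrays.asList(
                new Read("chr1", 10, 60),    // starts and ends in the first interval
                new Read("chr1", 90, 140),   // starts in the first, extends into the second
                new Read("chr1", 150, 199),  // starts in the second interval
                new Read("chr2", 50, 99));   // different contig, not counted
        System.out.println(Arrays.toString(countReadStarts(intervals, reads))); // prints [2, 1]
    }
}
```

The linear scan over intervals keeps the sketch short; a real tool working with many intervals would more plausibly use an indexed lookup.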
Inputs
- SAM format read data
- Intervals at which counts will be collected. The argument interval-merging-rule must be set to IntervalMergingRule.OVERLAPPING_ONLY, and all other common arguments for interval padding or merging must be set to their defaults.
- Output file format. This can be used to select TSV or HDF5 output.
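One way to see why the merging rule matters: count bins often abut one another, and a rule that also merges abutting intervals would collapse them into a single large interval. The sketch below contrasts the two behaviors on plain integer ranges. The enum constants mirror the names of GATK's IntervalMergingRule values, but the merge method and its data layout are hypothetical and are not the engine's code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only: how abutting intervals fare under two merging rules. */
final class MergingRuleSketch {

    /** Names chosen to mirror IntervalMergingRule; the logic below is a simplification. */
    enum Rule { ALL, OVERLAPPING_ONLY }

    /**
     * Merges intervals given as {start, end} pairs (inclusive) on a single contig.
     * The input is assumed to be sorted by start position.
     */
    static List<int[]> merge(List<int[]> sorted, Rule rule) {
        List<int[]> merged = new ArrayList<>();
        for (int[] interval : sorted) {
            if (!merged.isEmpty()) {
                int[] last = merged.get(merged.size() - 1);
                boolean overlaps = interval[0] <= last[1];
                boolean abuts = interval[0] == last[1] + 1;
                if (overlaps || (rule == Rule.ALL && abuts)) {
                    last[1] = Math.max(last[1], interval[1]); // extend the previous interval
                    continue;
                }
            }
            merged.add(new int[] {interval[0], interval[1]});
        }
        return merged;
    }

    public static void main(String[] args) {
        List<int[]> bins = Arrays.asList(new int[] {1, 1000}, new int[] {1001, 2000});
        System.out.println(merge(bins, Rule.OVERLAPPING_ONLY).size()); // prints 2
        System.out.println(merge(bins, Rule.ALL).size());              // prints 1
    }
}
```

With OVERLAPPING_ONLY the two abutting bins stay separate, so each keeps its own count.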
Outputs
- Counts file. By default, the tool produces HDF5 format results. This can be changed to TSV format with the format option. Using HDF5 files with CreateReadCountPanelOfNormals can decrease runtime by reducing time spent on IO, so this is the default output format. The HDF5 format contains information in the paths defined in HDF5SimpleCountCollection. HDF5 files may be viewed using hdfview or loaded in Python using PyTables or h5py. The TSV format has a SAM-style header containing a read group sample name, a sequence dictionary, a row specifying the column headers contained in SimpleCountCollection.SimpleCountTableColumn, and the corresponding entry rows.
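As a rough illustration of the TSV layout described above, the sketch below skips the SAM-style header lines (assumed to start with '@'), reads the column-header row, and parses the remaining rows. The column names CONTIG, START, END, and COUNT are an assumption about the contents of SimpleCountCollection.SimpleCountTableColumn, and the class names, method names, and sample data are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only: parses the lines of a TSV counts file into rows. */
final class CountsTsvSketch {

    /** Hypothetical holder for one entry row. */
    static final class CountRow {
        final String contig;
        final int start;
        final int end;
        final int count;

        CountRow(String contig, int start, int end, int count) {
            this.contig = contig;
            this.start = start;
            this.end = end;
            this.count = count;
        }
    }

    static List<CountRow> parse(List<String> lines) {
        List<CountRow> rows = new ArrayList<>();
        List<String> columns = null;
        for (String line : lines) {
            if (line.isEmpty() || line.startsWith("@")) {
                continue; // SAM-style header line (or blank line)
            }
            String[] fields = line.split("\t");
            if (columns == null) {
                columns = Arrays.asList(fields); // first non-header line names the columns
                continue;
            }
            rows.add(new CountRow(
                    fields[indexOf(columns, "CONTIG")],   // assumed column name
                    Integer.parseInt(fields[indexOf(columns, "START")]),
                    Integer.parseInt(fields[indexOf(columns, "END")]),
                    Integer.parseInt(fields[indexOf(columns, "COUNT")])));
        }
        return rows;
    }

    private static int indexOf(List<String> columns, String name) {
        int index = columns.indexOf(name);
        if (index < 0) {
            throw new IllegalArgumentException("Missing column: " + name);
        }
        return index;
    }

    public static void main(String[] args) {
        // Made-up file content for demonstration; real files come from the tool itself.
        List<String> lines = Arrays.asList(
                "@HD\tVN:1.6",
                "@SQ\tSN:chr1\tLN:248956422",
                "@RG\tID:example\tSM:sample",
                "CONTIG\tSTART\tEND\tCOUNT",
                "chr1\t1\t1000\t42",
                "chr1\t1001\t2000\t17");
        for (CountRow row : parse(lines)) {
            System.out.println(row.contig + ":" + row.start + "-" + row.end + " = " + row.count);
        }
    }
}
```

Looking up columns by name rather than by position keeps the sketch independent of column order. To read a real file, the lines could be obtained with java.nio.file.Files.readAllLines.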
Usage examples
gatk CollectReadCounts \
    -I sample.bam \
    -L intervals.interval_list \
    --interval-merging-rule OVERLAPPING_ONLY \
    -O sample.counts.hdf5
Nested Class Summary
Nested classes/interfaces inherited from class org.broadinstitute.hellbender.cmdline.CommandLineProgram
CommandLineProgram.AutoCloseableNoCheckedExceptions
Field Summary
Fields
FORMAT_LONG_NAME
Fields inherited from class org.broadinstitute.hellbender.engine.ReadWalker
FEATURE_CACHE_LOOKAHEAD
Fields inherited from class org.broadinstitute.hellbender.engine.GATKTool
addOutputSAMProgramRecord, addOutputVCFCommandLine, cloudIndexPrefetchBuffer, cloudPrefetchBuffer, createOutputBamIndex, createOutputBamMD5, createOutputVariantIndex, createOutputVariantMD5, disableBamIndexCaching, features, intervalArgumentCollection, lenientVCFProcessing, outputSitesOnlyVCFs, progressMeter, readArguments, referenceArguments, SECONDS_BETWEEN_PROGRESS_UPDATES_NAME, seqValidationArguments
Fields inherited from class org.broadinstitute.hellbender.cmdline.CommandLineProgram
GATK_CONFIG_FILE, logger, NIO_MAX_REOPENS, NIO_PROJECT_FOR_REQUESTER_PAYS, QUIET, specialArgumentsCollection, tmpDir, useJdkDeflater, useJdkInflater, VERBOSITY
Constructor Summary
Constructors
CollectReadCounts()
Method Summary
- void apply(GATKRead read, ReferenceContext referenceContext, FeatureContext featureContext): Process an individual read (with optional contextual information).
- List<ReadFilter> getDefaultReadFilters(): Returns the default list of CommandLineReadFilters that are used for this tool.
- void onTraversalStart(): Operations performed just prior to the start of traversal.
- Object onTraversalSuccess(): Operations performed immediately after a successful traversal (i.e., when no uncaught exceptions were thrown during the traversal).
- boolean requiresIntervals(): Does this tool require intervals? Traversal types and/or tools that do should override to return true.
Methods inherited from class org.broadinstitute.hellbender.engine.ReadWalker
getProgressMeterRecordLabel, onShutdown, onStartup, requiresReads, resetReadsDataSource, traverse
Methods inherited from class org.broadinstitute.hellbender.engine.WalkerBase
directlyAccessEngineFeatureManager, directlyAccessEngineReadsDataSource, directlyAccessEngineReferenceDataSource
Methods inherited from class org.broadinstitute.hellbender.engine.GATKTool
addFeatureInputsAfterInitialization, bamIndexCachingShouldBeEnabled, closeTool, createSAMWriter, createVCFWriter, createVCFWriter, createVCFWriter, disableProgressMeter, doWork, getBestAvailableSequenceDictionary, getDefaultCloudIndexPrefetchBufferSize, getDefaultCloudPrefetchBufferSize, getDefaultToolVCFHeaderLines, getDefaultVariantAnnotationGroups, getDefaultVariantAnnotations, getGenomicsDBOptions, getHeaderForFeatures, getHeaderForReads, getHeaderForSAMWriter, getMasterSequenceDictionary, getPluginDescriptors, getReferenceDictionary, getSequenceDictionaryValidationArgumentCollection, getToolName, getTransformedReadStream, getTraversalIntervals, getUserSuppliedIntervals, hasFeatures, hasReads, hasReference, hasUserSuppliedIntervals, initializeProgressMeter, makePostReadFilterTransformer, makePreReadFilterTransformer, makeReadFilter, makeSamReaderFactory, makeVariantAnnotations, requiresFeatures, requiresReference, transformTraversalIntervals, useVariantAnnotations
Methods inherited from class org.broadinstitute.hellbender.cmdline.CommandLineProgram
customCommandLineValidation, getCommandLine, getCommandLineParser, getDefaultHeaders, getMetricsFile, getSupportInformation, getToolkitName, getToolkitShortName, getToolStatusWarning, getUsage, getVersion, instanceMain, instanceMainPostParseArgs, isBetaFeature, isExperimentalFeature, parseArgs, printLibraryVersions, printSettings, printStartupMessage, runTool, setDefaultHeaders, warnOnToolStatus
Field Details
FORMAT_LONG_NAME
Constructor Details
CollectReadCounts
public CollectReadCounts()
Method Details
requiresIntervals
public boolean requiresIntervals()
Description copied from class: GATKTool
Does this tool require intervals? Traversal types and/or tools that do should override to return true.
- Overrides:
requiresIntervals in class GATKTool
- Returns:
true if this tool requires intervals, otherwise false
getDefaultReadFilters
public List<ReadFilter> getDefaultReadFilters()
Description copied from class: ReadWalker
Returns the default list of CommandLineReadFilters that are used for this tool. The filters returned by this method are subject to selective enabling/disabling by the user via the command line. The default implementation uses the WellformedReadFilter filter with all default options. Subclasses can override to provide alternative filters. Note: this method is called before command line parsing begins, and thus before a SAMFileHeader is available through getHeaderForReads().
- Overrides:
getDefaultReadFilters in class ReadWalker
- Returns:
List of individual filters to be applied for this tool.
onTraversalStart
public void onTraversalStart()
Description copied from class: GATKTool
Operations performed just prior to the start of traversal. Should be overridden by tool authors who need to process arguments local to their tool or perform other kinds of local initialization. Default implementation does nothing.
- Overrides:
onTraversalStart in class GATKTool
apply
public void apply(GATKRead read, ReferenceContext referenceContext, FeatureContext featureContext)
Description copied from class: ReadWalker
Process an individual read (with optional contextual information). Must be implemented by tool authors. In general, tool authors should simply stream their output from apply(), and maintain as little internal state as possible.
TODO: Determine whether and to what degree the GATK engine should provide a reduce operation to complement this operation. At a minimum, we should make apply() return a value to discourage statefulness in walkers, but how this value should be handled is TBD.
- Specified by:
apply in class ReadWalker
- Parameters:
read - current read
referenceContext - Reference bases spanning the current read. Will be an empty, but non-null, context object if there is no backing source of reference data (in which case all queries on it will return an empty array/iterator). Can request extra bases of context around the current read's interval by invoking ReferenceContext.setWindow(int, int) on this object before calling ReferenceContext.getBases()
featureContext - Features spanning the current read. Will be an empty, but non-null, context object if there is no backing source of Feature data (in which case all queries on it will return an empty List).
onTraversalSuccess
public Object onTraversalSuccess()
Description copied from class: GATKTool
Operations performed immediately after a successful traversal (i.e., when no uncaught exceptions were thrown during the traversal). Should be overridden by tool authors who need to close local resources, etc., after traversal. Also allows tools to return a value representing the traversal result, which is printed by the engine. Default implementation does nothing and returns null.
- Overrides:
onTraversalSuccess in class GATKTool
- Returns:
Object representing the traversal result, or null if a tool does not return a value
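The three hooks documented above (onTraversalStart, apply, onTraversalSuccess) form a simple lifecycle. The sketch below imitates that lifecycle with a hypothetical MiniWalker class so that it can run without GATK on the classpath. It is not the engine's traversal code: the read type is reduced to a String, and the traverse method here is an assumption about the call order implied by the descriptions.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only: a stand-in for the start / apply / success lifecycle. */
final class WalkerLifecycleSketch {

    /** Hypothetical walker base class; not part of GATK. */
    abstract static class MiniWalker {
        /** Called once before any read is processed; default does nothing. */
        void onTraversalStart() {}

        /** Called once per read. */
        abstract void apply(String read);

        /** Called once after all reads were processed without an exception; default returns null. */
        Object onTraversalSuccess() {
            return null;
        }

        /** Assumed call order: start hook, one apply per read, then the success hook. */
        final Object traverse(List<String> reads) {
            onTraversalStart();
            for (String read : reads) {
                apply(read);
            }
            return onTraversalSuccess();
        }
    }

    /** Counts the reads it sees and reports the total as the traversal result. */
    static final class CountingWalker extends MiniWalker {
        private int count;

        @Override
        void onTraversalStart() {
            count = 0; // local initialization before traversal
        }

        @Override
        void apply(String read) {
            count++;
        }

        @Override
        Object onTraversalSuccess() {
            return count;
        }
    }

    public static void main(String[] args) {
        Object result = new CountingWalker().traverse(Arrays.asList("read1", "read2", "read3"));
        System.out.println("Traversal result: " + result); // prints Traversal result: 3
    }
}
```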