@DocumentedFeature @BetaFeature public final class ExtractSVEvidenceSpark extends GATKSparkTool
This tool is used in development and should not be of interest to most researchers. It repackages the first two steps of the structural variation workflow as a separate tool for the convenience of developers.
This tool examines a SAM/BAM/CRAM for reads, or groups of reads, that demonstrate evidence of a structural variant in the vicinity. It records this evidence as a set of text files in a specified output directory, typically on the cluster's HDFS (Hadoop Distributed File System).
```
gatk ExtractSVEvidenceSpark \
    -I input_reads.bam \
    -O hdfs://my_cluster-m:8020/output_directory \
    --aligner-index-image ignored \
    --kmers-to-ignore ignored
```
This tool can be run without explicitly specifying any Spark options; invoked as in the example above, it runs locally on the machine where the command is issued. See Tutorial#10060 for an example of how to set up and run a Spark tool on a cloud Spark cluster.
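For developers, the same local run can also be launched programmatically. The sketch below is illustrative only: it uses new Main().instanceMain(...), a non-exiting variant of the entry point behind the gatk wrapper script (note instanceMain in the inherited-method list further down); the class name RunLocally and all file paths are placeholders.

```java
import org.broadinstitute.hellbender.Main;

// Illustrative sketch: launches ExtractSVEvidenceSpark in-process.
// With no Spark arguments supplied, the tool runs on a local Spark master.
public class RunLocally {
    public static void main(String[] argv) {
        final String[] args = {
            "ExtractSVEvidenceSpark",
            "-I", "input_reads.bam",      // placeholder input
            "-O", "output_directory",     // placeholder output directory
            "--aligner-index-image", "ignored",
            "--kmers-to-ignore", "ignored"
        };
        new Main().instanceMain(args); // parses args, runs the tool, returns its result
    }
}
```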
Nested classes/interfaces inherited from class org.broadinstitute.hellbender.engine.spark.GATKSparkTool:
GATKSparkTool.ReadInputMergingPolicy

Fields inherited from class org.broadinstitute.hellbender.engine.spark.GATKSparkTool:
addOutputVCFCommandLine, BAM_PARTITION_SIZE_LONG_NAME, bamPartitionSplitSize, CREATE_OUTPUT_BAM_SPLITTING_INDEX_LONG_NAME, createOutputBamIndex, createOutputBamSplittingIndex, createOutputVariantIndex, features, intervalArgumentCollection, NUM_REDUCERS_LONG_NAME, numReducers, OUTPUT_SHARD_DIR_LONG_NAME, readArguments, referenceArguments, sequenceDictionaryValidationArguments, SHARDED_OUTPUT_LONG_NAME, shardedOutput, shardedPartsDir, SPLITTING_INDEX_GRANULARITY, splittingIndexGranularity, USE_NIO, useNio

Fields inherited from class org.broadinstitute.hellbender.engine.spark.SparkCommandLineProgram:
programName, SPARK_PROGRAM_NAME_LONG_NAME, sparkArgs

Fields inherited from class org.broadinstitute.hellbender.cmdline.CommandLineProgram:
GATK_CONFIG_FILE, logger, NIO_MAX_REOPENS, NIO_PROJECT_FOR_REQUESTER_PAYS, QUIET, specialArgumentsCollection, tmpDir, useJdkDeflater, useJdkInflater, VERBOSITY
| Constructor and Description |
| --- |
| ExtractSVEvidenceSpark() |
| Modifier and Type | Method and Description |
| --- | --- |
| boolean | requiresReads() Does this tool require reads? Tools that do should override to return true. |
| protected void | runTool(org.apache.spark.api.java.JavaSparkContext ctx) Runs the tool itself after initializing and validating inputs. |
Methods inherited from class org.broadinstitute.hellbender.engine.spark.GATKSparkTool:
addReferenceFilesForSpark, addVCFsForSpark, editIntervals, getBestAvailableSequenceDictionary, getDefaultReadFilters, getDefaultToolVCFHeaderLines, getDefaultVariantAnnotationGroups, getDefaultVariantAnnotations, getGatkReadJavaRDD, getHeaderForReads, getIntervals, getPluginDescriptors, getReadInputMergingPolicy, getReads, getReadSourceHeaderMap, getReadSourceName, getRecommendedNumReducers, getReference, getReferenceSequenceDictionary, getReferenceWindowFunction, getSequenceDictionaryValidationArgumentCollection, getTargetPartitionSize, getUnfilteredReads, hasReads, hasReference, hasUserSuppliedIntervals, makeReadFilter, makeReadFilter, makeVariantAnnotations, requiresIntervals, requiresReference, runPipeline, useVariantAnnotations, validateSequenceDictionaries, writeReads, writeReads

Methods inherited from class org.broadinstitute.hellbender.engine.spark.SparkCommandLineProgram:
afterPipeline, doWork, getProgramName

Methods inherited from class org.broadinstitute.hellbender.cmdline.CommandLineProgram:
customCommandLineValidation, getCommandLine, getCommandLineParser, getDefaultHeaders, getMetricsFile, getSupportInformation, getToolkitName, getToolkitShortName, getToolStatusWarning, getUsage, getVersion, instanceMain, instanceMainPostParseArgs, isBetaFeature, isExperimentalFeature, onShutdown, onStartup, parseArgs, printLibraryVersions, printSettings, printStartupMessage, runTool, setDefaultHeaders, warnOnToolStatus
public boolean requiresReads()
Description copied from class: GATKSparkTool
Does this tool require reads? Tools that do should override to return true.
Overrides:
requiresReads in class GATKSparkTool
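As a point of reference, the override this contract calls for is a one-liner in any GATKSparkTool subclass. The fragment below is a generic sketch, not ExtractSVEvidenceSpark's actual source; a fuller subclass example follows the runTool entry below.

```java
@Override
public boolean requiresReads() {
    return true; // the engine then requires a reads input (-I) and exposes it via getReads()
}
```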
protected void runTool(org.apache.spark.api.java.JavaSparkContext ctx)
Description copied from class: GATKSparkTool
Runs the tool itself after initializing and validating inputs.
Specified by:
runTool in class GATKSparkTool
Parameters:
ctx - our Spark context
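To make the runTool contract concrete, here is a minimal hypothetical subclass (the name MyEvidenceTool is invented; this is not this tool's real implementation). By the time runTool receives the JavaSparkContext, the engine has already parsed arguments and validated inputs, so the body can consume reads directly from the inherited helpers listed above:

```java
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.broadinstitute.hellbender.engine.spark.GATKSparkTool;
import org.broadinstitute.hellbender.utils.read.GATKRead;

// Hypothetical GATKSparkTool subclass, for illustration only.
public final class MyEvidenceTool extends GATKSparkTool {
    private static final long serialVersionUID = 1L;

    @Override
    public boolean requiresReads() {
        return true; // see requiresReads() above
    }

    @Override
    protected void runTool(final JavaSparkContext ctx) {
        // getReads() (inherited, listed above) yields the filtered reads as a Spark RDD.
        final JavaRDD<GATKRead> reads = getReads();
        // A trivial Spark action stands in for real evidence gathering.
        logger.info("read count: " + reads.count());
    }
}
```

Real GATK tools additionally carry a @CommandLineProgramProperties annotation so the engine can discover and document them; it is omitted here to keep the sketch short.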