org.broadinstitute.hellbender.tools.spark.pipelines.BwaAndMarkDuplicatesPipelineSpark

All Implemented Interfaces: Serializable, org.broadinstitute.barclay.argparser.CommandLinePluginProvider

@DocumentedFeature @BetaFeature public final class BwaAndMarkDuplicatesPipelineSpark extends GATKSparkTool

Runs BWA and MarkDuplicates on Spark. It's an example of how to compose those two tools.
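The composition idea can be illustrated with a toy model: one stage assigns positions to reads and the next flags reads that share a position with an earlier one. This is only a sketch under stated assumptions. `ToyRead`, `ToyPipeline`, the substring search standing in for BWA, and the position-only duplicate rule are all hypothetical and are not the GATK implementation.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

final class ToyPipeline {
    // Hypothetical alignment stage: exact substring search stands in for BWA.
    static Function<List<ToyRead>, List<ToyRead>> align(String reference) {
        return reads -> {
            List<ToyRead> aligned = new ArrayList<>();
            for (ToyRead r : reads) {
                aligned.add(new ToyRead(r.sequence, reference.indexOf(r.sequence), false));
            }
            return aligned;
        };
    }

    // Hypothetical duplicate-marking stage: a later read at an already-seen position is flagged.
    static List<ToyRead> markDuplicates(List<ToyRead> reads) {
        Set<Integer> seen = new HashSet<>();
        List<ToyRead> marked = new ArrayList<>();
        for (ToyRead r : reads) {
            boolean dup = r.position >= 0 && !seen.add(r.position);
            marked.add(new ToyRead(r.sequence, r.position, dup));
        }
        return marked;
    }

    // Compose the two stages so the output of alignment feeds duplicate marking.
    static List<ToyRead> run(String reference, List<ToyRead> unaligned) {
        return align(reference).andThen(ToyPipeline::markDuplicates).apply(unaligned);
    }

    public static void main(String[] args) {
        List<ToyRead> input = new ArrayList<>();
        input.add(new ToyRead("GTT", -1, false));
        input.add(new ToyRead("GTT", -1, false));
        input.add(new ToyRead("TGC", -1, false));
        for (ToyRead r : run("ACGTTGCA", input)) {
            System.out.println(r.sequence + " pos=" + r.position + " dup=" + r.duplicate);
        }
    }
}

// Toy read, not a GATKRead: a sequence, an aligned position (-1 = unaligned), a duplicate flag.
final class ToyRead {
    final String sequence;
    final int position;
    final boolean duplicate;

    ToyRead(String sequence, int position, boolean duplicate) {
        this.sequence = sequence;
        this.position = position;
        this.duplicate = duplicate;
    }
}
```

Chaining with `Function.andThen` keeps each stage independently testable, which is the point of composing two tools rather than writing one monolithic step.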

See Also:

Serialized Form

Nested Class Summary

Nested classes/interfaces inherited from class org.broadinstitute.hellbender.engine.spark.GATKSparkTool
GATKSparkTool.ReadInputMergingPolicy

Nested classes/interfaces inherited from class org.broadinstitute.hellbender.cmdline.CommandLineProgram
CommandLineProgram.AutoCloseableNoCheckedExceptions
Field Summary

Fields

| Modifier and Type | Field | Description |
| --- | --- | --- |
| final BwaArgumentCollection | bwaArgs | |
| protected MarkDuplicatesSparkArgumentCollection | markDuplicatesSparkArgumentCollection | |
| protected String | output | |

Fields inherited from class org.broadinstitute.hellbender.engine.spark.GATKSparkTool
addOutputVCFCommandLine, BAM_PARTITION_SIZE_LONG_NAME, bamPartitionSplitSize, CREATE_OUTPUT_BAM_SPLITTING_INDEX_LONG_NAME, createOutputBamIndex, createOutputBamSplittingIndex, createOutputVariantIndex, features, intervalArgumentCollection, NUM_REDUCERS_LONG_NAME, numReducers, OUTPUT_SHARD_DIR_LONG_NAME, readArguments, referenceArguments, sequenceDictionaryValidationArguments, SHARDED_OUTPUT_LONG_NAME, shardedOutput, shardedPartsDir, SPLITTING_INDEX_GRANULARITY, splittingIndexGranularity, USE_NIO, useNio

Fields inherited from class org.broadinstitute.hellbender.engine.spark.SparkCommandLineProgram
programName, SPARK_PROGRAM_NAME_LONG_NAME, sparkArgs

Fields inherited from class org.broadinstitute.hellbender.cmdline.CommandLineProgram
GATK_CONFIG_FILE, logger, NIO_MAX_REOPENS, NIO_PROJECT_FOR_REQUESTER_PAYS, QUIET, specialArgumentsCollection, tmpDir, useJdkDeflater, useJdkInflater, VERBOSITY
Constructor Summary

Constructors

| Constructor | Description |
| --- | --- |
| BwaAndMarkDuplicatesPipelineSpark() | |
Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| protected SequenceDictionaryValidationArgumentCollection | getSequenceDictionaryValidationArgumentCollection() | Subclasses can override this to provide different default behavior for sequence dictionary validation. |
| boolean | requiresReads() | Does this tool require reads? Tools that do should override to return true. |
| boolean | requiresReference() | Does this tool require reference data? Tools that do should override to return true. |
| protected void | runTool(org.apache.spark.api.java.JavaSparkContext ctx) | Runs the tool itself after initializing and validating inputs. |

Methods inherited from class org.broadinstitute.hellbender.engine.spark.GATKSparkTool
addReferenceFilesForSpark, addVCFsForSpark, editIntervals, getBestAvailableSequenceDictionary, getDefaultReadFilters, getDefaultToolVCFHeaderLines, getDefaultVariantAnnotationGroups, getDefaultVariantAnnotations, getGatkReadJavaRDD, getHeaderForReads, getHeaderForReadsInput, getIntervals, getPluginDescriptors, getReadInputMergingPolicy, getReads, getReadSourceName, getRecommendedNumReducers, getReference, getReferenceSequenceDictionary, getReferenceWindowFunction, getTargetPartitionSize, getUnfilteredReads, hasReads, hasReference, hasUserSuppliedIntervals, makeReadFilter, makeReadFilter, makeVariantAnnotations, requiresIntervals, runPipeline, useVariantAnnotations, validateSequenceDictionaries, writeReads, writeReads

Methods inherited from class org.broadinstitute.hellbender.engine.spark.SparkCommandLineProgram
afterPipeline, doWork, getProgramName

Methods inherited from class org.broadinstitute.hellbender.cmdline.CommandLineProgram
customCommandLineValidation, getCommandLine, getCommandLineParser, getDefaultHeaders, getMetricsFile, getSupportInformation, getToolkitName, getToolkitShortName, getToolStatusWarning, getUsage, getVersion, instanceMain, instanceMainPostParseArgs, isBetaFeature, isExperimentalFeature, onShutdown, onStartup, parseArgs, printLibraryVersions, printSettings, printStartupMessage, runTool, setDefaultHeaders, warnOnToolStatus

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Details
- bwaArgs
  
  @ArgumentCollection public final BwaArgumentCollection bwaArgs
- output
  
  @Argument(doc="the output bam", shortName="O", fullName="output") protected String output
- markDuplicatesSparkArgumentCollection
  
  @ArgumentCollection protected MarkDuplicatesSparkArgumentCollection markDuplicatesSparkArgumentCollection
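
The annotated fields above are what an argument parser discovers by reflection. The following sketch shows that mechanism in isolation. The `Argument` annotation here is a local stand-in that mirrors only the three attributes shown on `output`, not the real Barclay annotation, and `ToyTool` and `describeArguments` are hypothetical names.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

final class ArgumentScanDemo {
    // Local stand-in for an argument annotation, reduced to doc, shortName and fullName.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Argument {
        String doc() default "";
        String shortName() default "";
        String fullName() default "";
    }

    // Hypothetical tool class declaring one argument the same way the output field does.
    static final class ToyTool {
        @Argument(doc = "the output bam", shortName = "O", fullName = "output")
        protected String output;
    }

    // Collect "--fullName / -shortName: doc" for each annotated field of the class.
    static List<String> describeArguments(Class<?> toolClass) {
        List<String> lines = new ArrayList<>();
        for (Field f : toolClass.getDeclaredFields()) {
            Argument a = f.getAnnotation(Argument.class);
            if (a != null) {
                lines.add("--" + a.fullName() + " / -" + a.shortName() + ": " + a.doc());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        describeArguments(ToyTool.class).forEach(System.out::println);
    }
}
```

Fields marked as argument collections (`bwaArgs`, `markDuplicatesSparkArgumentCollection`) group related arguments in a separate object; the sketch does not model that nesting.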
Constructor Details
- BwaAndMarkDuplicatesPipelineSpark
  
  public BwaAndMarkDuplicatesPipelineSpark()
Method Details
- requiresReads
  
  public boolean requiresReads()
  
  Description copied from class: GATKSparkTool
  
  Does this tool require reads? Tools that do should override to return true.
  
  Overrides:
  
  requiresReads in class GATKSparkTool
  
  Returns:
  
  true if this tool requires reads, otherwise false
- requiresReference
  
  public boolean requiresReference()
  
  Description copied from class: GATKSparkTool
  
  Does this tool require reference data? Tools that do should override to return true.
  
  Overrides:
  
  requiresReference in class GATKSparkTool
  
  Returns:
  
  true if this tool requires a reference, otherwise false
- getSequenceDictionaryValidationArgumentCollection
  
  protected SequenceDictionaryValidationArgumentCollection getSequenceDictionaryValidationArgumentCollection()
  
  Description copied from class: GATKSparkTool
  
  Subclasses can override this to provide different default behavior for sequence dictionary validation.
  
  Overrides:
  
  getSequenceDictionaryValidationArgumentCollection in class GATKSparkTool
  
  Returns:
  
  a SequenceDictionaryValidationArgumentCollection
- runTool
  
  protected void runTool(org.apache.spark.api.java.JavaSparkContext ctx)
  
  Description copied from class: GATKSparkTool
  
  Runs the tool itself after initializing and validating inputs. Must be implemented by subclasses.
  
  Specified by:
  
  runTool in class GATKSparkTool
  
  Parameters:
  
  ctx - our Spark context
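
The methods above follow a template-method pattern: a base class fixes the order of steps, consults the `requires...` hooks during validation, and then calls the subclass's `runTool`. The sketch below illustrates that pattern only. `ToyContext`, `ToyBaseTool`, `ToyAlignmentTool`, the file names, and the exact checks are hypothetical, and the real `GATKSparkTool` internals are not shown on this page.

```java
import java.util.ArrayList;
import java.util.List;

final class LifecycleDemo {
    // Hypothetical stand-in for the Spark context plus the inputs the user supplied.
    static final class ToyContext {
        final String readsPath;
        final String referencePath;
        final List<String> log = new ArrayList<>();

        ToyContext(String readsPath, String referencePath) {
            this.readsPath = readsPath;
            this.referencePath = referencePath;
        }
    }

    // Hypothetical base class: fixes the order of steps and leaves runTool to subclasses.
    abstract static class ToyBaseTool {
        public boolean requiresReads() { return false; }
        public boolean requiresReference() { return false; }

        final void runPipeline(ToyContext ctx) {
            ctx.log.add("initialize");
            validate(ctx);
            ctx.log.add("validate");
            runTool(ctx);
        }

        // Fail fast when an input the tool declares as required is missing.
        private void validate(ToyContext ctx) {
            if (requiresReads() && ctx.readsPath == null) {
                throw new IllegalArgumentException("this tool requires reads");
            }
            if (requiresReference() && ctx.referencePath == null) {
                throw new IllegalArgumentException("this tool requires a reference");
            }
        }

        protected abstract void runTool(ToyContext ctx);
    }

    // Hypothetical tool that declares both inputs as required.
    static final class ToyAlignmentTool extends ToyBaseTool {
        @Override public boolean requiresReads() { return true; }
        @Override public boolean requiresReference() { return true; }
        @Override protected void runTool(ToyContext ctx) { ctx.log.add("runTool"); }
    }

    static List<String> run(String readsPath, String referencePath) {
        ToyContext ctx = new ToyContext(readsPath, referencePath);
        new ToyAlignmentTool().runPipeline(ctx);
        return ctx.log;
    }

    public static void main(String[] args) {
        System.out.println(run("reads.bam", "ref.fasta"));
        try {
            run("reads.bam", null);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Keeping validation in the base class means a tool body can assume its declared inputs are present by the time `runTool` executes.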
