```java
@DocumentedFeature
@BetaFeature
public class ReadsPipelineSpark
extends GATKSparkTool
```
Example usage on a Google Cloud Dataproc cluster:

```
gatk ReadsPipelineSpark \
  -I gs://my-gcs-bucket/aligned_reads.bam \
  -R gs://my-gcs-bucket/reference.fasta \
  --known-sites gs://my-gcs-bucket/sites_of_variation.vcf \
  -O gs://my-gcs-bucket/output.vcf \
  -- \
  --sparkRunner GCS \
  --cluster my-dataproc-cluster
```
To additionally align reads with BWA-MEM:
```
gatk ReadsPipelineSpark \
  -I gs://my-gcs-bucket/unaligned_reads.bam \
  -R gs://my-gcs-bucket/reference.fasta \
  --known-sites gs://my-gcs-bucket/sites_of_variation.vcf \
  --align \
  -O gs://my-gcs-bucket/output.vcf \
  -- \
  --sparkRunner GCS \
  --cluster my-dataproc-cluster
```
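The same invocation can also be driven from Java rather than the gatk wrapper script. The sketch below is illustrative, not taken from this page: it assumes the GATK jar is on the classpath, uses placeholder local paths, and goes through org.broadinstitute.hellbender.Main, GATK's command-line entry point.

```java
// Minimal sketch: launching ReadsPipelineSpark programmatically.
// The file paths are placeholders; adjust them for your environment.
public class RunReadsPipelineSpark {
    public static void main(String[] args) {
        String[] toolArgs = {
            "ReadsPipelineSpark",
            "-I", "aligned_reads.bam",                  // placeholder input BAM
            "-R", "reference.fasta",                    // placeholder reference
            "--known-sites", "sites_of_variation.vcf",  // placeholder known sites
            "-O", "output.vcf"                          // placeholder output VCF
        };
        // Dispatches to the named tool, as the gatk script does.
        org.broadinstitute.hellbender.Main.main(toolArgs);
    }
}
```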
Nested classes/interfaces inherited from class GATKSparkTool: GATKSparkTool.ReadInputMergingPolicy
| Modifier and Type | Field and Description |
|---|---|
| ApplyBQSRUniqueArgumentCollection | applyBqsrArgs: command-line arguments to fine-tune the ApplyBQSR step. |
| AssemblyRegionArgumentCollection | assemblyRegionArgs |
| BwaArgumentCollection | bwaArgs |
| HaplotypeCallerArgumentCollection | hcArgs |
| protected java.util.List<java.lang.String> | knownVariants |
| protected MarkDuplicatesSparkArgumentCollection | markDuplicatesSparkArgumentCollection |
| protected java.lang.String | output |
| protected java.lang.String | outputBam |
| AssemblyRegionReadShardArgumentCollection | shardingArgs |
| boolean | strict |
Fields inherited from class GATKSparkTool: addOutputVCFCommandLine, BAM_PARTITION_SIZE_LONG_NAME, bamPartitionSplitSize, CREATE_OUTPUT_BAM_SPLITTING_INDEX_LONG_NAME, createOutputBamIndex, createOutputBamSplittingIndex, createOutputVariantIndex, features, intervalArgumentCollection, NUM_REDUCERS_LONG_NAME, numReducers, OUTPUT_SHARD_DIR_LONG_NAME, readArguments, referenceArguments, sequenceDictionaryValidationArguments, SHARDED_OUTPUT_LONG_NAME, shardedOutput, shardedPartsDir, USE_NIO, useNio

Fields inherited from class SparkCommandLineProgram: programName, SPARK_PROGRAM_NAME_LONG_NAME, sparkArgs

Fields inherited from class CommandLineProgram: GATK_CONFIG_FILE, logger, NIO_MAX_REOPENS, NIO_PROJECT_FOR_REQUESTER_PAYS, QUIET, specialArgumentsCollection, tmpDir, useJdkDeflater, useJdkInflater, VERBOSITY
| Constructor and Description |
|---|
| ReadsPipelineSpark() |
| Modifier and Type | Method and Description |
|---|---|
| java.util.List<java.lang.Class<? extends Annotation>> | getDefaultVariantAnnotationGroups() |
| java.util.Collection<Annotation> | makeVariantAnnotations() |
| boolean | requiresReads(): Does this tool require reads? Tools that do should override to return true. |
| boolean | requiresReference(): Does this tool require reference data? Tools that do should override to return true. |
| protected void | runTool(org.apache.spark.api.java.JavaSparkContext ctx): Runs the tool itself after initializing and validating inputs. |
| boolean | useVariantAnnotations() |
| protected void | validateSequenceDictionaries(): Validates standard tool inputs against each other. |
Methods inherited from class GATKSparkTool: addReferenceFilesForSpark, addVCFsForSpark, editIntervals, getBestAvailableSequenceDictionary, getDefaultReadFilters, getDefaultToolVCFHeaderLines, getDefaultVariantAnnotations, getGatkReadJavaRDD, getHeaderForReads, getIntervals, getPluginDescriptors, getReadInputMergingPolicy, getReads, getReadSourceHeaderMap, getReadSourceName, getRecommendedNumReducers, getReference, getReferenceSequenceDictionary, getReferenceWindowFunction, getSequenceDictionaryValidationArgumentCollection, getTargetPartitionSize, getUnfilteredReads, hasReads, hasReference, hasUserSuppliedIntervals, makeReadFilter, makeReadFilter, requiresIntervals, runPipeline, writeReads, writeReads

Methods inherited from class SparkCommandLineProgram: afterPipeline, doWork, getProgramName

Methods inherited from class CommandLineProgram: customCommandLineValidation, getCommandLine, getCommandLineParser, getDefaultHeaders, getMetricsFile, getSupportInformation, getToolkitName, getToolkitShortName, getToolStatusWarning, getUsage, getVersion, instanceMain, instanceMainPostParseArgs, isBetaFeature, isExperimentalFeature, onShutdown, onStartup, parseArgs, printLibraryVersions, printSettings, printStartupMessage, runTool, setDefaultHeaders, warnOnToolStatus
@Argument(doc="the known variants", fullName="known-sites", optional=false) protected java.util.List<java.lang.String> knownVariants
@Argument(doc="the output vcf", shortName="O", fullName="output", optional=false) protected java.lang.String output
@Argument(doc="the output bam", fullName="output-bam", optional=true) protected java.lang.String outputBam
@ArgumentCollection protected MarkDuplicatesSparkArgumentCollection markDuplicatesSparkArgumentCollection
@ArgumentCollection public final BwaArgumentCollection bwaArgs
@ArgumentCollection public final AssemblyRegionReadShardArgumentCollection shardingArgs
@ArgumentCollection public final AssemblyRegionArgumentCollection assemblyRegionArgs
@ArgumentCollection public ApplyBQSRUniqueArgumentCollection applyBqsrArgs
@ArgumentCollection public HaplotypeCallerArgumentCollection hcArgs
@Argument(doc="whether to use the strict implementation or not (defaults to the faster implementation that doesn\'t strictly match the walker version)", fullName="strict", optional=true) public boolean strict
```java
public boolean requiresReads()
```

Description copied from class: GATKSparkTool
Does this tool require reads? Tools that do should override to return true.

Overrides: requiresReads in class GATKSparkTool
```java
public boolean requiresReference()
```

Description copied from class: GATKSparkTool
Does this tool require reference data? Tools that do should override to return true.

Overrides: requiresReference in class GATKSparkTool
```java
public boolean useVariantAnnotations()
```

Overrides: useVariantAnnotations in class GATKSparkTool
See Also: GATKTool.useVariantAnnotations()
```java
public java.util.List<java.lang.Class<? extends Annotation>> getDefaultVariantAnnotationGroups()
```

Overrides: getDefaultVariantAnnotationGroups in class GATKSparkTool
See Also: GATKTool.getDefaultVariantAnnotationGroups()
```java
public java.util.Collection<Annotation> makeVariantAnnotations()
```

Overrides: makeVariantAnnotations in class GATKSparkTool
See Also: GATKTool.makeVariantAnnotations()
```java
protected void validateSequenceDictionaries()
```

Description copied from class: GATKSparkTool
Validates standard tool inputs against each other.

Overrides: validateSequenceDictionaries in class GATKSparkTool
```java
protected void runTool(org.apache.spark.api.java.JavaSparkContext ctx)
```

Description copied from class: GATKSparkTool
Runs the tool itself after initializing and validating inputs.

Specified by: runTool in class GATKSparkTool
Parameters: ctx - our Spark context
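For contrast, the skeleton below shows the general GATKSparkTool contract that ReadsPipelineSpark implements through these methods: a subclass declares the inputs it needs via requiresReads() and requiresReference(), then does its work in runTool(ctx). This is an illustrative sketch, not GATK source; the class name and body are invented, while the overridden signatures, getReads(), and logger come from the members listed above.

```java
import org.apache.spark.api.java.JavaSparkContext;
import org.broadinstitute.hellbender.engine.spark.GATKSparkTool;

// Hypothetical minimal Spark tool; a real tool would also carry annotations
// such as @CommandLineProgramProperties to be discoverable on the command line.
public class MyCountReadsSpark extends GATKSparkTool {
    private static final long serialVersionUID = 1L;

    @Override
    public boolean requiresReads() {
        return true;   // this tool needs read input (-I)
    }

    @Override
    public boolean requiresReference() {
        return false;  // counting reads needs no reference
    }

    @Override
    protected void runTool(final JavaSparkContext ctx) {
        // getReads() (inherited, listed above) exposes the input reads as an RDD.
        final long count = getReads().count();
        logger.info("Read count: " + count);
    }
}
```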