public abstract class LocusWalkerSpark extends GATKSparkTool

A Spark version of LocusWalker. Subclasses should implement processAlignments(JavaRDD, JavaSparkContext) and operate on the passed-in RDD.

Nested classes/interfaces inherited from class GATKSparkTool: GATKSparkTool.ReadInputMergingPolicy
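To make the subclassing contract concrete, here is a minimal sketch of a tool built on this class. ExampleLocusWalkerSpark is a hypothetical name and the body only counts covered loci; it assumes LocusWalkerContext exposes getAlignmentContext() as the single-machine LocusWalker does, and uses the inherited logger field listed below.

```java
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.broadinstitute.hellbender.engine.spark.LocusWalkerContext;
import org.broadinstitute.hellbender.engine.spark.LocusWalkerSpark;

public final class ExampleLocusWalkerSpark extends LocusWalkerSpark {
    private static final long serialVersionUID = 1L;

    @Override
    protected void processAlignments(JavaRDD<LocusWalkerContext> rdd, JavaSparkContext ctx) {
        // Count loci whose pileup contains at least one read; the RDD handed
        // in by runTool(ctx) is already bounded by -L intervals, if any.
        long covered = rdd.filter(locus -> !locus.getAlignmentContext().getBasePileup().isEmpty())
                          .count();
        logger.info("Loci with coverage: " + covered);
    }
}
```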
Modifier and Type | Field and Description |
---|---|
protected int | maxDepthPerSample |
int | readShardSize |
boolean | shuffle |
Fields inherited from class GATKSparkTool: addOutputVCFCommandLine, BAM_PARTITION_SIZE_LONG_NAME, bamPartitionSplitSize, CREATE_OUTPUT_BAM_SPLITTING_INDEX_LONG_NAME, createOutputBamIndex, createOutputBamSplittingIndex, createOutputVariantIndex, features, intervalArgumentCollection, NUM_REDUCERS_LONG_NAME, numReducers, OUTPUT_SHARD_DIR_LONG_NAME, readArguments, referenceArguments, sequenceDictionaryValidationArguments, SHARDED_OUTPUT_LONG_NAME, shardedOutput, shardedPartsDir, USE_NIO, useNio
Fields inherited from class SparkCommandLineProgram: programName, SPARK_PROGRAM_NAME_LONG_NAME, sparkArgs
Fields inherited from class CommandLineProgram: GATK_CONFIG_FILE, logger, NIO_MAX_REOPENS, NIO_PROJECT_FOR_REQUESTER_PAYS, QUIET, specialArgumentsCollection, tmpDir, useJdkDeflater, useJdkInflater, VERBOSITY
Constructor and Description |
---|
LocusWalkerSpark() |
Modifier and Type | Method and Description |
---|---|
protected int | defaultMaxDepthPerSample() Returns default value for the maxDepthPerSample parameter, if none is provided on the command line. |
boolean | emitEmptyLoci() Does this tool emit information for uncovered loci? Tools that do should override to return true. |
org.apache.spark.api.java.JavaRDD<LocusWalkerContext> | getAlignments(org.apache.spark.api.java.JavaSparkContext ctx) Loads alignments and the corresponding reference and features into a JavaRDD for the intervals specified. |
protected LIBSDownsamplingInfo | getDownsamplingInfo() Returns the downsampling info using maxDepthPerSample as target coverage. |
protected abstract void | processAlignments(org.apache.spark.api.java.JavaRDD<LocusWalkerContext> rdd, org.apache.spark.api.java.JavaSparkContext ctx) Process the alignments and write output. |
boolean | requiresReads() Does this tool require reads? Tools that do should override to return true. |
protected void | runTool(org.apache.spark.api.java.JavaSparkContext ctx) Runs the tool itself after initializing and validating inputs. |
Methods inherited from class GATKSparkTool: addReferenceFilesForSpark, addVCFsForSpark, editIntervals, getBestAvailableSequenceDictionary, getDefaultReadFilters, getDefaultToolVCFHeaderLines, getDefaultVariantAnnotationGroups, getDefaultVariantAnnotations, getGatkReadJavaRDD, getHeaderForReads, getIntervals, getPluginDescriptors, getReadInputMergingPolicy, getReads, getReadSourceHeaderMap, getReadSourceName, getRecommendedNumReducers, getReference, getReferenceSequenceDictionary, getReferenceWindowFunction, getSequenceDictionaryValidationArgumentCollection, getTargetPartitionSize, getUnfilteredReads, hasReads, hasReference, hasUserSuppliedIntervals, makeReadFilter, makeReadFilter, makeVariantAnnotations, requiresIntervals, requiresReference, runPipeline, useVariantAnnotations, validateSequenceDictionaries, writeReads, writeReads
Methods inherited from class SparkCommandLineProgram: afterPipeline, doWork, getProgramName
Methods inherited from class CommandLineProgram: customCommandLineValidation, getCommandLine, getCommandLineParser, getDefaultHeaders, getMetricsFile, getSupportInformation, getToolkitName, getToolkitShortName, getToolStatusWarning, getUsage, getVersion, instanceMain, instanceMainPostParseArgs, isBetaFeature, isExperimentalFeature, onShutdown, onStartup, parseArgs, printLibraryVersions, printSettings, printStartupMessage, runTool, setDefaultHeaders, warnOnToolStatus
@Argument(fullName="max-depth-per-sample", shortName="max-depth-per-sample", doc="Maximum number of reads to retain per sample per locus. Reads above this threshold will be downsampled. Set to 0 to disable.", optional=true) protected int maxDepthPerSample
@Argument(fullName="read-shard-size", shortName="read-shard-size", doc="Maximum size of each read shard, in bases.", optional=true) public int readShardSize
@Argument(doc="whether to use the shuffle implementation or overlaps partitioning (the default)", shortName="shuffle", fullName="shuffle", optional=true) public boolean shuffle
protected int defaultMaxDepthPerSample()
Returns default value for the maxDepthPerSample parameter, if none is provided on the command line. Default implementation returns 0 (no downsampling by default).
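For example, a subclass that wants downsampling on by default could override this method; a minimal sketch, with 1000 as an arbitrary illustrative cap:

```java
// Hypothetical override: retain at most 1000 reads per sample per locus
// unless the user sets --max-depth-per-sample on the command line.
@Override
protected int defaultMaxDepthPerSample() {
    return 1000;
}
```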
public boolean requiresReads()
Description copied from class: GATKSparkTool
Does this tool require reads? Tools that do should override to return true.
Overrides: requiresReads in class GATKSparkTool
protected final LIBSDownsamplingInfo getDownsamplingInfo()
Returns the downsampling info using maxDepthPerSample as target coverage.
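The method body is not shown in this reference, but the mapping from the field to the returned object is presumably along these lines; this is a sketch, not the actual GATK source, and it assumes a LIBSDownsamplingInfo(boolean, int) constructor:

```java
protected final LIBSDownsamplingInfo getDownsamplingInfo() {
    // Sketch: a positive maxDepthPerSample enables per-sample downsampling
    // to that target coverage; the default of 0 leaves pileups untouched.
    return new LIBSDownsamplingInfo(maxDepthPerSample > 0, maxDepthPerSample);
}
```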
public boolean emitEmptyLoci()
Does this tool emit information for uncovered loci? Tools that do should override to return true.
NOTE: Typically, this should only be used when intervals are specified.
NOTE: If MappedReadFilter is removed, then emitting empty loci will fail.
NOTE: If there is no available sequence dictionary and this is set to true, there should be a failure. Please consider requiring reads and/or references for all tools that wish to set this to true.
Returns: true if this tool requires uncovered loci information to be emitted, false otherwise
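A subclass that must report uncovered loci would override this as below; a sketch that, per the notes above, assumes intervals are supplied and the mapped-read filter is kept:

```java
// Hypothetical override: emit a LocusWalkerContext for every locus in the
// requested intervals, including loci with an empty pileup.
@Override
public boolean emitEmptyLoci() {
    return true;
}
```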
public org.apache.spark.api.java.JavaRDD<LocusWalkerContext> getAlignments(org.apache.spark.api.java.JavaSparkContext ctx)
Loads alignments and the corresponding reference and features into a JavaRDD for the intervals specified. If no intervals were specified, returns all the alignments.
Returns: all alignments as a JavaRDD, bounded by intervals if specified.
protected void runTool(org.apache.spark.api.java.JavaSparkContext ctx)
Description copied from class: GATKSparkTool
Runs the tool itself after initializing and validating inputs.
Specified by: runTool in class GATKSparkTool
Parameters: ctx - our Spark context
protected abstract void processAlignments(org.apache.spark.api.java.JavaRDD<LocusWalkerContext> rdd, org.apache.spark.api.java.JavaSparkContext ctx)
Process the alignments and write output.
Parameters:
rdd - a distributed collection of LocusWalkerContext
ctx - our Spark context
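Finally, a hedged sketch of what an implementation might do with the RDD: write per-locus depth as text. The outputPath field is a hypothetical @Argument, and the AlignmentContext accessors (getContig(), getStart(), getBasePileup()) are assumed from the single-machine LocusWalker API.

```java
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.broadinstitute.hellbender.engine.AlignmentContext;
import org.broadinstitute.hellbender.engine.spark.LocusWalkerContext;

// ...inside a hypothetical LocusWalkerSpark subclass...
@Override
protected void processAlignments(JavaRDD<LocusWalkerContext> rdd, JavaSparkContext ctx) {
    // One "contig:position<TAB>depth" record per locus; Spark writes the
    // result as sharded text files under outputPath.
    rdd.map(locus -> {
        AlignmentContext ac = locus.getAlignmentContext();
        return ac.getContig() + ":" + ac.getStart() + "\t" + ac.getBasePileup().size();
    }).saveAsTextFile(outputPath); // outputPath: hypothetical @Argument String field
}
```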