public abstract class ReadWalkerSpark extends GATKSparkTool
A Spark version of ReadWalker. Subclasses should implement processReads(JavaRDD, JavaSparkContext) and operate on the passed-in RDD.

Nested classes/interfaces inherited from class GATKSparkTool:
GATKSparkTool.ReadInputMergingPolicy
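For orientation, here is a minimal sketch of what a concrete subclass looks like. The tool name and the counting logic are invented for illustration; the import paths follow the GATK source layout but may need adjusting, and a real tool would also carry the Barclay @CommandLineProgramProperties annotation and a program group.

```java
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.broadinstitute.hellbender.engine.spark.ReadWalkerContext;
import org.broadinstitute.hellbender.engine.spark.ReadWalkerSpark;

// Hypothetical example tool: counts the reads it walks over.
public final class CountReadsExampleSpark extends ReadWalkerSpark {
    private static final long serialVersionUID = 1L;

    @Override
    protected void processReads(final JavaRDD<ReadWalkerContext> rdd, final JavaSparkContext ctx) {
        // Each ReadWalkerContext pairs one read with its reference and feature
        // context; this sketch only counts the elements of the RDD.
        final long nReads = rdd.count();
        // logger is inherited from CommandLineProgram (see the inherited-fields list below)
        logger.info("Walked over " + nReads + " reads.");
    }
}
```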
Modifier and Type | Field and Description |
---|---|
static int | FEATURE_CACHE_LOOKAHEAD: This number controls the size of the cache for our FeatureInputs (specifically, the number of additional bases worth of overlapping records to cache when querying feature sources). |
Fields inherited from class GATKSparkTool:
addOutputVCFCommandLine, BAM_PARTITION_SIZE_LONG_NAME, bamPartitionSplitSize, CREATE_OUTPUT_BAM_SPLITTING_INDEX_LONG_NAME, createOutputBamIndex, createOutputBamSplittingIndex, createOutputVariantIndex, features, intervalArgumentCollection, NUM_REDUCERS_LONG_NAME, numReducers, OUTPUT_SHARD_DIR_LONG_NAME, readArguments, referenceArguments, sequenceDictionaryValidationArguments, SHARDED_OUTPUT_LONG_NAME, shardedOutput, shardedPartsDir, USE_NIO, useNio

Fields inherited from class SparkCommandLineProgram:
programName, SPARK_PROGRAM_NAME_LONG_NAME, sparkArgs

Fields inherited from class CommandLineProgram:
GATK_CONFIG_FILE, logger, NIO_MAX_REOPENS, NIO_PROJECT_FOR_REQUESTER_PAYS, QUIET, specialArgumentsCollection, tmpDir, useJdkDeflater, useJdkInflater, VERBOSITY
Constructor and Description |
---|
ReadWalkerSpark() |
Modifier and Type | Method and Description |
---|---|
org.apache.spark.api.java.JavaRDD<ReadWalkerContext> | getReads(org.apache.spark.api.java.JavaSparkContext ctx): Loads reads and the corresponding reference and features into a JavaRDD for the intervals specified. |
protected abstract void | processReads(org.apache.spark.api.java.JavaRDD<ReadWalkerContext> rdd, org.apache.spark.api.java.JavaSparkContext ctx): Process the reads and write output. |
boolean | requiresReads(): Does this tool require reads? Tools that do should override to return true. |
protected void | runTool(org.apache.spark.api.java.JavaSparkContext ctx): Runs the tool itself after initializing and validating inputs. |
Methods inherited from class GATKSparkTool:
addReferenceFilesForSpark, addVCFsForSpark, editIntervals, getBestAvailableSequenceDictionary, getDefaultReadFilters, getDefaultToolVCFHeaderLines, getDefaultVariantAnnotationGroups, getDefaultVariantAnnotations, getGatkReadJavaRDD, getHeaderForReads, getIntervals, getPluginDescriptors, getReadInputMergingPolicy, getReads, getReadSourceHeaderMap, getReadSourceName, getRecommendedNumReducers, getReference, getReferenceSequenceDictionary, getReferenceWindowFunction, getSequenceDictionaryValidationArgumentCollection, getTargetPartitionSize, getUnfilteredReads, hasReads, hasReference, hasUserSuppliedIntervals, makeReadFilter, makeReadFilter, makeVariantAnnotations, requiresIntervals, requiresReference, runPipeline, useVariantAnnotations, validateSequenceDictionaries, writeReads, writeReads

Methods inherited from class SparkCommandLineProgram:
afterPipeline, doWork, getProgramName

Methods inherited from class CommandLineProgram:
customCommandLineValidation, getCommandLine, getCommandLineParser, getDefaultHeaders, getMetricsFile, getSupportInformation, getToolkitName, getToolkitShortName, getToolStatusWarning, getUsage, getVersion, instanceMain, instanceMainPostParseArgs, isBetaFeature, isExperimentalFeature, onShutdown, onStartup, parseArgs, printLibraryVersions, printSettings, printStartupMessage, runTool, setDefaultHeaders, warnOnToolStatus
public static final int FEATURE_CACHE_LOOKAHEAD
public boolean requiresReads()

Description copied from class: GATKSparkTool
Does this tool require reads? Tools that do should override to return true.

Overrides:
requiresReads in class GATKSparkTool
public org.apache.spark.api.java.JavaRDD<ReadWalkerContext> getReads(org.apache.spark.api.java.JavaSparkContext ctx)

Loads reads and the corresponding reference and features into a JavaRDD for the intervals specified. If no intervals were specified, returns all the reads.

Returns:
a JavaRDD, bounded by intervals if specified.

protected void runTool(org.apache.spark.api.java.JavaSparkContext ctx)
Description copied from class: GATKSparkTool
Runs the tool itself after initializing and validating inputs.

Overrides:
runTool in class GATKSparkTool

Parameters:
ctx - our Spark context
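Reading the two methods together, runTool presumably just wires getReads into processReads. A plausible sketch, inferred from the docs above rather than copied from the GATK source:

```java
@Override
protected void runTool(final JavaSparkContext ctx) {
    // Load the per-read contexts (bounded by any user-supplied intervals)
    // and hand them to the subclass's processReads implementation.
    processReads(getReads(ctx), ctx);
}
```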
protected abstract void processReads(org.apache.spark.api.java.JavaRDD<ReadWalkerContext> rdd, org.apache.spark.api.java.JavaSparkContext ctx)

Process the reads and write output.

Parameters:
rdd - a distributed collection of ReadWalkerContext
ctx - our Spark context
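To make the contract concrete, here is a hedged sketch of a processReads implementation that filters reads by mapping quality and writes them out with the inherited writeReads helper. The ReadWalkerContext::getRead accessor, the exact writeReads signature, and the output field are assumptions for illustration; GATKRead is org.broadinstitute.hellbender.utils.read.GATKRead.

```java
@Override
protected void processReads(final JavaRDD<ReadWalkerContext> rdd, final JavaSparkContext ctx) {
    // Assumed accessor: ReadWalkerContext::getRead yields the GATKRead being walked.
    final JavaRDD<GATKRead> highMapq = rdd
            .map(ReadWalkerContext::getRead)
            .filter(read -> read.getMappingQuality() >= 20);
    // writeReads is inherited from GATKSparkTool (see the inherited-methods list above);
    // "output" stands in for a tool @Argument naming the destination BAM. Signature assumed.
    writeReads(ctx, output, highMapq);
}
```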