rdd

Type Members

class FunctionRecorder extends Serializable
class InstrumentedOrderedRDDFunctions[K, V] extends Serializable

A version of OrderedRDDFunctions which enables instrumentation of its operations. For more details and usage instructions see the MetricsContext class.
abstract class InstrumentedOutputFormat[K, V] extends OutputFormat[K, V]

Implementation of org.apache.hadoop.mapreduce.OutputFormat, which instruments its RecordWriter's write method. Classes should extend this one and provide the class of the underlying output format using the outputFormatClass method.
This class is intended for use with the methods in InstrumentedPairRDDFunctions that save hadoop files (saveAs*HadoopFile).
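
The delegation idea described above can be sketched without Hadoop. This is only an illustration of the pattern: SketchWriter, TimedWriter, TimedFormat, ListWriter and ListFormat are hypothetical names, not part of this package, and the real InstrumentedOutputFormat may require more than the outputFormatClass method.

```scala
object OutputFormatSketch {
  // Hypothetical minimal writer interface standing in for Hadoop's RecordWriter.
  trait SketchWriter[K, V] { def write(key: K, value: V): Unit }

  // Wraps a delegate writer and measures every call to write.
  final class TimedWriter[K, V](delegate: SketchWriter[K, V]) extends SketchWriter[K, V] {
    var writes: Int = 0
    var totalNanos: Long = 0L
    def write(key: K, value: V): Unit = {
      val start = System.nanoTime()
      try delegate.write(key, value)
      finally { writes += 1; totalNanos += System.nanoTime() - start }
    }
  }

  // Subclasses only name the underlying writer class (assumed to have a
  // no-argument constructor); the base class instantiates and wraps it.
  abstract class TimedFormat[K, V] {
    def writerClass: Class[_ <: SketchWriter[K, V]]
    def newWriter(): TimedWriter[K, V] =
      new TimedWriter(writerClass.getDeclaredConstructor().newInstance())
  }

  // A concrete underlying writer that keeps records in memory.
  final class ListWriter extends SketchWriter[String, Int] {
    val written = scala.collection.mutable.ListBuffer.empty[(String, Int)]
    def write(key: String, value: Int): Unit = { written += ((key, value)); () }
  }

  // The only thing a subclass has to provide is the class to delegate to.
  final class ListFormat extends TimedFormat[String, Int] {
    def writerClass: Class[_ <: SketchWriter[String, Int]] = classOf[ListWriter]
  }
}
```

Calling `new OutputFormatSketch.ListFormat().newWriter()` returns a writer whose `writes` and `totalNanos` fields grow with each `write` call, which is roughly the role the instrumented RecordWriter plays here.
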
class InstrumentedPairRDDFunctions[K, V] extends Serializable

A version of PairRDDFunctions which enables instrumentation of its operations. For more details and usage instructions see the MetricsContext class.
class InstrumentedRDD[T] extends RDD[T]

An RDD which instruments its operations. For further details and usage instructions see the MetricsContext class.

Note
This class must be in the org.apache.spark.rdd package; otherwise Spark records an incorrect call site (which in turn becomes the stage name). This restriction can be lifted once Spark 1.1.1 is in use (it needs the fix for SPARK-1853).
class InstrumentedRDDFunctions[T] extends AnyRef

Functions which permit creation of instrumented RDDs, as well as the ability to stop instrumentation by calling the unInstrument method. For more details and usage instructions see the MetricsContext class.
class Timer extends Serializable

Represents a timer for timing a function. Call the time method, passing the function to be timed.
To record metrics, the Timer uses the passed-in MetricsRecorder if one is defined; otherwise it looks in the Metrics.Recorder field for a recorder. If neither is defined, no metrics are recorded and the function is simply executed.
The overhead of recording metrics has been measured at around 100 nanoseconds on an Intel i7-3720QM. The overhead of calling the time method when no metrics are being recorded (a recorder is not defined) is negligible.

Note
This class must be in the org.apache.spark.rdd package; otherwise Spark records a location inside the time method as the call site (which in turn becomes the stage name). This restriction can be lifted once Spark 1.1.1 is released (it needs the fix for SPARK-1853).
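
The recorder lookup described for Timer can be sketched as below. This is a minimal standalone illustration under stated assumptions, not the class's actual implementation: SketchRecorder, SketchTimer and globalRecorder are hypothetical stand-ins for MetricsRecorder, Timer and the Metrics.Recorder field, and their signatures are invented for the example.

```scala
object TimerSketch {
  // Hypothetical recorder: accumulates elapsed nanoseconds and call counts per timer name.
  final class SketchRecorder {
    private val totals = scala.collection.mutable.Map.empty[String, Long]
    private val counts = scala.collection.mutable.Map.empty[String, Int]
    def record(name: String, nanos: Long): Unit = {
      totals(name) = totals.getOrElse(name, 0L) + nanos
      counts(name) = counts.getOrElse(name, 0) + 1
    }
    def count(name: String): Int = counts.getOrElse(name, 0)
    def totalNanos(name: String): Long = totals.getOrElse(name, 0L)
  }

  // Hypothetical global fallback, playing the role of a shared recorder field.
  @volatile var globalRecorder: Option[SketchRecorder] = None

  final class SketchTimer(name: String, recorder: Option[SketchRecorder] = None) {
    // Prefer the passed-in recorder, then the global one; with neither,
    // run the function without measuring anything.
    def time[A](f: => A): A =
      recorder.orElse(globalRecorder) match {
        case Some(r) =>
          val start = System.nanoTime()
          try f finally r.record(name, System.nanoTime() - start)
        case None => f
      }
  }
}
```

With a recorder supplied, `new TimerSketch.SketchTimer("sum", Some(rec)).time { 1 + 2 }` returns the function's result and adds one entry to `rec`; with no recorder anywhere, the function simply runs.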

Value Members

object InstrumentedPairRDDFunctions extends Serializable
object InstrumentedRDD extends Serializable
object MetricsContext

Contains implicit conversions which enable instrumentation of Spark operations. This class should be used instead of org.apache.spark.SparkContext when instrumentation is required. Usage is as follows:
```
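// Bring the Metrics object itself into scope, as well as its members
import org.bdgenomics.utils.instrumentation.Metrics
import org.bdgenomics.utils.instrumentation.Metrics._
// Implicit conversions that add instrument() to RDDs
import org.apache.spark.rdd.MetricsContext._
// sparkContext is an existing SparkContext; rdd is any existing RDD
Metrics.initialize(sparkContext)
val instrumentedRDD = rdd.instrument()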
```
After this, any operation performed on instrumentedRDD is instrumented, along with any functions that operate on its data. All subsequent RDD operations are instrumented until the unInstrument method is called on an RDD.
Note
When using this class, avoid importing SparkContext._, because the implicit conversions it provides may conflict with those defined here. Instead, import only the specific parts of SparkContext that are needed.
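
The way an implicit conversion can add instrument() to a collection, and the way instrumentation carries through later operations until unInstrument is called, can be sketched on a plain Seq. This is an illustration of the mechanism only: InstrumentedSeq, InstrumentableSeq, CallCounter and seqToInstrumentable are hypothetical names, and the real conversions in MetricsContext operate on RDDs and record timings rather than call counts.

```scala
import scala.language.implicitConversions

object InstrumentSketch {
  // Counts how often user functions passed to instrumented operations are run.
  final class CallCounter { var calls: Int = 0 }

  // Wrapper whose map is instrumented and which keeps returning instrumented results.
  final class InstrumentedSeq[A](val underlying: Seq[A], val counter: CallCounter) {
    def map[B](f: A => B): InstrumentedSeq[B] =
      new InstrumentedSeq(underlying.map { a => counter.calls += 1; f(a) }, counter)
    // Stops instrumentation by handing back the plain collection.
    def unInstrument(): Seq[A] = underlying
  }

  // Carries the instrument() method that the implicit conversion makes available.
  final class InstrumentableSeq[A](seq: Seq[A]) {
    def instrument(): InstrumentedSeq[A] = new InstrumentedSeq(seq, new CallCounter)
  }

  // Importing this conversion is what makes seq.instrument() compile.
  implicit def seqToInstrumentable[A](seq: Seq[A]): InstrumentableSeq[A] =
    new InstrumentableSeq(seq)

  // Two chained operations are both counted; the result is a plain Seq again.
  def demo(): Seq[Int] =
    Seq(1, 2, 3).instrument().map(_ + 1).map(_ * 2).unInstrument()
}
```

Because the conversion is an ordinary implicit, a second conversion in scope that offers a method with the same name on the same type makes the call ambiguous, which is why narrow imports are the safer choice.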
