Trait

spark.jobserver

NamedObjects

Related Doc: package jobserver

trait NamedObjects extends AnyRef

NamedObjects is a trait that gives you safe, concurrent creation of and access to named objects such as RDDs or DataFrames (the native SparkContext interface only provides access to RDDs by number). It makes it easy to share data objects among jobs that share the same SparkContext. If two jobs simultaneously try to create a data object with the same name in the same namespace, only one will win and the other will retrieve the object the winner created.

Note that to take advantage of NamedObjectSupport, a job must mix it in and use the APIs here instead of the native DataFrame/RDD cache(); otherwise, the names will not be known to the job server.
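
The "only one will win" behaviour described above can be illustrated with a minimal, self-contained sketch. It is not the job server's implementation: `NamedCacheSketch`, `generatorCalls`, and `race` are hypothetical names introduced here, and a plain `ConcurrentHashMap` stands in for the real cache so that the example runs without Spark.

```scala
import java.util.concurrent.{Callable, ConcurrentHashMap, Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger
import java.util.function.{Function => JFunction}

// Toy stand-in for a per-context named-object cache (assumption, not the real class).
object NamedCacheSketch {
  private val cache = new ConcurrentHashMap[String, AnyRef]()

  // Counts how often a generator actually ran.
  val generatorCalls = new AtomicInteger(0)

  // Returns the object cached under `name`, running `gen` at most once per name.
  def getOrElseCreate(name: String, gen: () => AnyRef): AnyRef =
    cache.computeIfAbsent(name, new JFunction[String, AnyRef] {
      override def apply(key: String): AnyRef = {
        generatorCalls.incrementAndGet()
        gen()
      }
    })

  // Two "jobs" race to create the same name; both receive the same instance.
  def race(): (AnyRef, AnyRef) = {
    val pool = Executors.newFixedThreadPool(2)
    try {
      val task = new Callable[AnyRef] {
        override def call(): AnyRef = getOrElseCreate("shared", () => new Object)
      }
      val first = pool.submit(task)
      val second = pool.submit(task)
      (first.get(5, TimeUnit.SECONDS), second.get(5, TimeUnit.SECONDS))
    } finally pool.shutdown()
  }
}
```

In this sketch both callers end up holding the same reference, and the generator runs once, because `computeIfAbsent` applies the mapping function atomically per key.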

Linear Supertypes
AnyRef, Any

Abstract Value Members

  1. abstract def defaultTimeout: Timeout

  2. abstract def destroy[O <: NamedObject](objOfType: O, name: String)(implicit persister: NamedObjectPersister[O]): Unit

    Destroys the named object with the given name, if one existed. The reference to the object is removed from the cache and the persister is asked asynchronously to unpersist the object iff it was found in the list of named objects. Has no effect if no named object with this name is known to the cache.

    name

    the unique name of the object. The uniqueness is scoped to the current SparkContext.

  3. abstract def forget(name: String): Unit

    Removes the named object with the given name from the cache, if one existed. Has no effect if no named object with this name exists.

    The persister is not (!) asked to unpersist the object; use destroy instead if that is desired.

    name

    the unique name of the object. The uniqueness is scoped to the current SparkContext.
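
The difference between forget and destroy can be sketched with a toy cache. Everything here is an assumption made for illustration: `ToyCache` and `RecordingPersister` are hypothetical stand-ins, not job server classes, and the unpersist call is made synchronously for simplicity, whereas the documentation above says the real persister is asked asynchronously.

```scala
import scala.collection.mutable

object ForgetVsDestroySketch {
  // Hypothetical persister that only records which names it was asked to unpersist.
  final class RecordingPersister {
    val unpersisted = mutable.ListBuffer.empty[String]
    def unpersist(name: String): Unit = unpersisted += name
  }

  // Toy cache keyed by name (assumption, not the real implementation).
  final class ToyCache(persister: RecordingPersister) {
    private val entries = mutable.Map.empty[String, AnyRef]

    def put(name: String, obj: AnyRef): Unit = entries(name) = obj

    def names: Set[String] = entries.keySet.toSet

    // forget: drop the reference only; the persister is never involved.
    def forget(name: String): Unit = { entries.remove(name); () }

    // destroy: drop the reference and, only if it was present, ask to unpersist.
    def destroy(name: String): Unit =
      entries.remove(name).foreach(_ => persister.unpersist(name))
  }

  // Returns the names left in the cache and the names the persister saw.
  def demo(): (Set[String], List[String]) = {
    val persister = new RecordingPersister
    val cache = new ToyCache(persister)
    cache.put("a", new Object)
    cache.put("b", new Object)
    cache.forget("a")        // removed, persister not called
    cache.destroy("b")       // removed, persister called
    cache.destroy("missing") // unknown name: no effect
    (cache.names, persister.unpersisted.toList)
  }
}
```

After `demo()` the cache is empty, but only "b" reached the persister; "a" was forgotten without being unpersisted.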

  4. abstract def get[O <: NamedObject](name: String)(implicit timeout: Timeout = defaultTimeout): Option[O]

    Gets a named object (NObj) with the given name if it already exists and is cached. If the NObj does not exist, None is returned.

    Note that a previously known named object could 'disappear' if it hasn't been used for a while because, for example, the SparkContext garbage-collects old cached RDDs.

    name

    the unique name of the NObj. The uniqueness is scoped to the current SparkContext.

    timeout

    if the RddManager doesn't respond within this timeout, an error will be thrown.

    returns

    the NObj with the given name, or None if it does not exist.

    Exceptions thrown

    java.util.concurrent.TimeoutException if the request to the RddManager times out.

  5. abstract def getNames(): Iterable[String]

    Returns the names of all named objects that are managed by the named objects implementation.

    Note: this returns a snapshot of object names at one point in time. The caller should always expect that the data returned from this method may be stale and incorrect.

    returns

    a collection of object names representing the objects managed by the NamedObjects implementation.

  6. abstract def getOrElseCreate[O <: NamedObject](name: String, objGen: ⇒ O)(implicit timeout: Timeout = defaultTimeout, persister: NamedObjectPersister[O]): O

    Gets a named object (NObj) with the given name, or creates it if one doesn't already exist.

    If the given NObj has already been computed by another job and cached in memory, this method will return a reference to the cached NObj. If the NObj has never been computed, then the generator will be called to compute it, in the caller's thread, and the result will be cached and returned to the caller.

    If an NObj is requested by thread B while thread A is generating the NObj, thread B will block for up to the duration specified by timeout. If thread A finishes generating the NObj within that time, then thread B will get a reference to the newly created NObj. If thread A does not finish generating the NObj within that time, then thread B will throw a timeout exception.

    O

    <: NamedObject the generic type of the named object.

    name

    the unique name of the NObj. The uniqueness is scoped to the current SparkContext.

    timeout

    if the named object isn't created within this timeout, an error will be thrown.

    returns

    the NObj with the given name.

    Exceptions thrown

    java.lang.RuntimeException wrapping any error that occurs within the generator function.

    java.util.concurrent.TimeoutException if the request times out.
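
The blocking behaviour seen by thread B can be sketched with the standard library alone. This is a simplified model under stated assumptions, not the job server's mechanism: `BlockingWaitSketch` and `awaitResult` are hypothetical names, a `Promise` stands in for the object that thread A is still generating, and a plain `FiniteDuration` replaces the `Timeout` parameter.

```scala
import java.util.concurrent.TimeoutException
import scala.concurrent.{Await, Promise}
import scala.concurrent.duration._
import scala.util.{Success, Try}

object BlockingWaitSketch {
  // Thread B's view: block for at most `timeout` on the result thread A is producing.
  def awaitResult[T](inFlight: Promise[T], timeout: FiniteDuration): Try[T] =
    Try(Await.result(inFlight.future, timeout))

  // Returns (did the slow case time out?, did the fast case deliver the value?).
  def demo(): (Boolean, Boolean) = {
    // Case 1: the generator never completes, so the waiter times out.
    val slow = Promise[String]()
    val timedOut = awaitResult(slow, 50.millis).failed.toOption
      .exists(_.isInstanceOf[TimeoutException])

    // Case 2: "thread A" completes the promise, so the waiter gets the value.
    val fast = Promise[String]()
    val threadA = new Thread(new Runnable {
      override def run(): Unit = { fast.success("nobj-ref"); () }
    })
    threadA.start()
    val got = awaitResult(fast, 5.seconds) == Success("nobj-ref")
    threadA.join()

    (timedOut, got)
  }
}
```

The first wait fails with a `java.util.concurrent.TimeoutException`, matching the exception type listed above; the second returns the value once the generating thread has completed.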

  7. abstract def update[O <: NamedObject](name: String, objGen: ⇒ O)(implicit timeout: Timeout = defaultTimeout, persister: NamedObjectPersister[O]): O

    Replaces an existing named object (NObj) with a given name with a new object. If an old named object for the given name existed, it is un-persisted (non-blocking) and destroyed. It is safe to call this method when there is no existing named object with the given name. If multiple threads call this around the same time, the end result is undefined: one of the generated objects will win and will be returned from future calls to get().

    The object generator function will be called from the caller's thread. Note that if this is called at the same time as getOrElseCreate() for the same name, and completes before the getOrElseCreate() call, then threads waiting for the result of getOrElseCreate() will unblock with the result of this update() call. When the getOrElseCreate() succeeds, it will replace the result of this update() call.

    O

    <: NamedObject the generic type of the object.

    name

    the unique name of the named object. The uniqueness is scoped to the current SparkContext.

    objGen

    a 0-ary function which will be called to generate the object in the caller's thread.

    returns

    the object with the given name.

Concrete Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes: AnyRef → Any
  2. final def ##(): Int

    Definition Classes: AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes: AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes: Any
  5. def clone(): AnyRef

    Attributes: protected[java.lang]
    Definition Classes: AnyRef
    Annotations: @throws( ... )
  6. final def eq(arg0: AnyRef): Boolean

    Definition Classes: AnyRef
  7. def equals(arg0: Any): Boolean

    Definition Classes: AnyRef → Any
  8. def finalize(): Unit

    Attributes: protected[java.lang]
    Definition Classes: AnyRef
    Annotations: @throws( classOf[java.lang.Throwable] )
  9. final def getClass(): Class[_]

    Definition Classes: AnyRef → Any
  10. def hashCode(): Int

    Definition Classes: AnyRef → Any
  11. final def isInstanceOf[T0]: Boolean

    Definition Classes: Any
  12. final def ne(arg0: AnyRef): Boolean

    Definition Classes: AnyRef
  13. final def notify(): Unit

    Definition Classes: AnyRef
  14. final def notifyAll(): Unit

    Definition Classes: AnyRef
  15. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes: AnyRef
  16. def toString(): String

    Definition Classes: AnyRef → Any
  17. final def wait(): Unit

    Definition Classes: AnyRef
    Annotations: @throws( ... )
  18. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes: AnyRef
    Annotations: @throws( ... )
  19. final def wait(arg0: Long): Unit

    Definition Classes: AnyRef
    Annotations: @throws( ... )
