Object

org.apache.flink.ml.preprocessing

Splitter

Related Doc: package preprocessing

object Splitter

Linear Supertypes
AnyRef, Any

Type Members

  1. case class TrainTestDataSet[T](training: DataSet[T], testing: DataSet[T])(implicit evidence$1: TypeInformation[T], evidence$2: ClassTag[T]) extends Product with Serializable

  2. case class TrainTestHoldoutDataSet[T](training: DataSet[T], testing: DataSet[T], holdout: DataSet[T])(implicit evidence$3: TypeInformation[T], evidence$4: ClassTag[T]) extends Product with Serializable

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  6. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  7. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  8. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  9. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  10. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  11. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  12. def kFoldSplit[T](input: DataSet[T], kFolds: Int, seed: Long = Utils.RNG.nextLong())(implicit arg0: TypeInformation[T], arg1: ClassTag[T]): Array[TrainTestDataSet[T]]

    Split a DataSet into an array of TrainTestDataSets.

    input

    DataSet to be split

    kFolds

    The number of TrainTestDataSets to be returned. Each testing set is a randomly sampled 1/k of the dataset, and the matching training set is the remainder of the dataset. The DataSet is split into kFolds folds first, so that no observation occurs in multiple folds.

    seed

    Random number generator seed.

    returns

    An array of TrainTestDataSets
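
    The fold logic described above can be sketched on plain Scala collections. This is a minimal illustration, not Flink's implementation: TrainTest and KFoldSketch are hypothetical names, and Seq stands in for DataSet so that the block runs without a Flink dependency.

    ```scala
    import scala.util.Random

    // Hypothetical stand-in for TrainTestDataSet, backed by Seq instead of DataSet.
    final case class TrainTest[T](training: Seq[T], testing: Seq[T])

    object KFoldSketch {
      // Assign every element to exactly one fold, then use each fold once as
      // the testing set and all remaining folds as the training set.
      def kFoldSplit[T](input: Seq[T], kFolds: Int, seed: Long): Array[TrainTest[T]] = {
        require(kFolds >= 2, "kFolds must be at least 2")
        val rng = new Random(seed)
        // Shuffle, then deal elements round-robin into folds 0 until kFolds.
        val tagged = rng.shuffle(input).zipWithIndex.map { case (x, i) => (i % kFolds, x) }
        Array.tabulate(kFolds) { fold =>
          val (test, train) = tagged.partition(_._1 == fold)
          TrainTest(train.map(_._2), test.map(_._2))
        }
      }
    }
    ```

    Because every element carries a single fold tag, it appears in exactly one testing set and in the training set of every other fold.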

  13. def multiRandomSplit[T](input: DataSet[T], fracArray: Array[Double], seed: Long = Utils.RNG.nextLong())(implicit arg0: TypeInformation[T], arg1: ClassTag[T]): Array[DataSet[T]]

    Split a DataSet into several DataSets according to the proportions given in an array.

    input

    DataSet to be split

    fracArray

    An array of PROPORTIONS for splitting the DataSet. Unlike in the randomSplit function, numbers greater than 1 do not lead to oversampling. The number of splits is dictated by the length of this array. The numbers are normalized, e.g. Array(1.0, 2.0) would yield two data sets with a split of roughly 33%/67%.

    seed

    Random number generator seed.

    returns

    An array of DataSets whose length is equal to the length of fracArray
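
    The normalization of fracArray can be illustrated with a short sketch on plain Scala collections. It assumes a common bucketing technique (cumulative bounds on [0, 1) plus one uniform draw per element); whether Flink uses the same technique is not confirmed by this page, and MultiSplitSketch is a hypothetical name.

    ```scala
    import scala.util.Random

    object MultiSplitSketch {
      // Rescale the proportions so they sum to 1.0,
      // e.g. Array(1.0, 2.0) becomes Array(1.0 / 3, 2.0 / 3).
      def normalize(fracArray: Array[Double]): Array[Double] = {
        require(fracArray.nonEmpty && fracArray.forall(_ >= 0.0), "proportions must be non-negative")
        val total = fracArray.sum
        require(total > 0.0, "at least one proportion must be positive")
        fracArray.map(_ / total)
      }

      // Draw one uniform number per element and place the element in the first
      // bucket whose cumulative upper bound exceeds that number.
      def multiRandomSplit[T](input: Seq[T], fracArray: Array[Double], seed: Long): Array[Seq[T]] = {
        val bounds = normalize(fracArray).scanLeft(0.0)(_ + _).tail
        val rng = new Random(seed)
        val tagged = input.map { x =>
          val idx = bounds.indexWhere(rng.nextDouble() < _)
          // Rounding can leave the last bound slightly below 1.0; fall back to the last bucket.
          (if (idx >= 0) idx else bounds.length - 1, x)
        }
        Array.tabulate(fracArray.length)(i => tagged.filter(_._1 == i).map(_._2))
      }
    }
    ```

    Each element lands in exactly one bucket, so proportions above 1 only change the relative sizes of the splits and never duplicate elements.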

  14. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  15. final def notify(): Unit

    Definition Classes
    AnyRef
  16. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  17. def randomSplit[T](input: DataSet[T], fraction: Double, precise: Boolean = false, seed: Long = Utils.RNG.nextLong())(implicit arg0: TypeInformation[T], arg1: ClassTag[T]): Array[DataSet[T]]

    Split a DataSet in two, choosing each element for the first split with probability fraction.

    input

    DataSet to be split

    fraction

    Probability that each element is chosen; should be in [0,1]. This fraction refers to the first element in the resulting array.

    precise

    Sampling by default is random and can result in slightly lop-sided sample sets. When precise is true, equal sample set sizes are forced; however, this is somewhat less efficient.

    seed

    Random number generator seed.

    returns

    An array of two datasets
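
    The difference between the default and the precise mode can be sketched on plain Scala collections. The sketch assumes one plausible reading of precise, namely that the first split receives exactly round(fraction * n) elements; this page does not confirm that Flink behaves this way, and RandomSplitSketch is a hypothetical name.

    ```scala
    import scala.util.Random

    object RandomSplitSketch {
      def randomSplit[T](input: Seq[T], fraction: Double, precise: Boolean, seed: Long): Array[Seq[T]] = {
        require(fraction >= 0.0 && fraction <= 1.0, "fraction must be in [0,1]")
        val rng = new Random(seed)
        val indexed = input.zipWithIndex
        val chosen: Set[Int] =
          if (precise) {
            // Assumed meaning of precise: fix the size of the first split up front.
            val n = math.round(fraction * input.size).toInt
            rng.shuffle(indexed.map(_._2)).take(n).toSet
          } else {
            // Default: an independent coin flip per element, so sizes vary around fraction * n.
            indexed.collect { case (_, i) if rng.nextDouble() < fraction => i }.toSet
          }
        val (first, second) = indexed.partition { case (_, i) => chosen(i) }
        Array(first.map(_._1), second.map(_._1))
      }
    }
    ```

    In this sketch the precise branch needs the input size before it can sample, which hints at why a fixed-size split may cost more than per-element sampling.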

  18. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  19. def toString(): String

    Definition Classes
    AnyRef → Any
  20. def trainTestHoldoutSplit[T](input: DataSet[T], fracTuple: (Double, Double, Double) = (0.6,0.3,0.1), seed: Long = Utils.RNG.nextLong())(implicit arg0: TypeInformation[T], arg1: ClassTag[T]): TrainTestHoldoutDataSet[T]

    A wrapper for multiRandomSplit that yields a TrainTestHoldoutDataSet.

    input

    DataSet to be split

    fracTuple

    A tuple of three doubles, where the first element specifies the size of the training set, the second element the size of the testing set, and the third element the size of the holdout set. These are proportions and will be normalized internally.

    seed

    Random number generator seed.

    returns

    A TrainTestHoldoutDataSet
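
    Since fracTuple holds proportions rather than absolute fractions, tuples that differ only by a constant factor describe the same split. A minimal sketch of that arithmetic follows; HoldoutFractionSketch is a hypothetical helper, not part of Splitter.

    ```scala
    object HoldoutFractionSketch {
      // Normalize a (training, testing, holdout) tuple so its parts sum to 1.0.
      def normalize(fracTuple: (Double, Double, Double)): (Double, Double, Double) = {
        val (train, test, holdout) = fracTuple
        val total = train + test + holdout
        require(total > 0.0, "at least one proportion must be positive")
        (train / total, test / total, holdout / total)
      }
    }
    ```

    For example, (6.0, 3.0, 1.0) normalizes to the same proportions as the default (0.6, 0.3, 0.1).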

  21. def trainTestSplit[T](input: DataSet[T], fraction: Double = 0.6, precise: Boolean = false, seed: Long = Utils.RNG.nextLong())(implicit arg0: TypeInformation[T], arg1: ClassTag[T]): TrainTestDataSet[T]

    A wrapper for randomSplit that yields a TrainTestDataSet.

    input

    DataSet to be split

    fraction

    Probability that each element is chosen; should be in [0,1]. This fraction refers to the training set of the resulting TrainTestDataSet.

    precise

    Sampling by default is random and can result in slightly lop-sided sample sets. When precise is true, equal sample set sizes are forced; however, this is somewhat less efficient.

    seed

    Random number generator seed.

    returns

    A TrainTestDataSet

  22. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  23. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  24. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
