Package

org.apache.spark.ml.clustering

tupol

package tupol

Type Members

  1. class XKMeans extends KMeans with XKMeansParams

    Extended KMeans algorithm.

    Calculates the following:

      - cluster (prediction), already available in the default KMeans algorithm
      - distance to cluster
      - probability
      - probability by feature (dimension)

    Note: The probability-by-feature algorithm is based on the ideas presented in https://github.com/tupol/naive-ml, in particular https://github.com/tupol/naive-ml/blob/master/src/main/scala/tupol/ml/clustering/KMeansGaussian.scala.

    Note: The probability-by-feature algorithm becomes useless if a feature/dimension reduction algorithm is applied before XKMeans, because the exact feature that caused a record to be classified as an anomaly can no longer be traced back.

    Note: This is still far from a perfect solution, because it assumes that the data follows a normal distribution, which is not always the case.

    Annotations
    @Experimental()
  2. class XKMeansModel extends KMeansModel with XKMeansParams
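
    The quantities listed for XKMeans (distance to cluster, probability, probability by feature) can be illustrated with a minimal, dependency-free sketch. This is not the XKMeans implementation: the object and method names below are hypothetical, the per-feature means and variances of a cluster are assumed to be already known, and a univariate normal density per feature (with features treated as independent) stands in for the "probability" values.

```scala
// Hypothetical sketch; names and structure are assumptions, not the XKMeans API.
object GaussianByFeatureSketch {

  /** Euclidean distance between a point and a cluster centre. */
  def distance(point: Array[Double], center: Array[Double]): Double = {
    require(point.length == center.length, "dimension mismatch")
    math.sqrt(point.zip(center).map { case (x, c) => (x - c) * (x - c) }.sum)
  }

  /** Univariate normal density of x for the given mean and variance. */
  def normalDensity(x: Double, mean: Double, variance: Double): Double = {
    require(variance > 0.0, "variance must be positive")
    math.exp(-(x - mean) * (x - mean) / (2.0 * variance)) /
      math.sqrt(2.0 * math.Pi * variance)
  }

  /** One density value per feature, so a low value points at a specific feature. */
  def densityByFeature(
      point: Array[Double],
      means: Array[Double],
      variances: Array[Double]): Array[Double] = {
    require(
      point.length == means.length && means.length == variances.length,
      "dimension mismatch")
    point.indices.map(i => normalDensity(point(i), means(i), variances(i))).toArray
  }

  /** Joint density, assuming (strongly) that the features are independent. */
  def jointDensity(byFeature: Array[Double]): Double = byFeature.product

  def main(args: Array[String]): Unit = {
    val point     = Array(1.0, 10.0) // second feature lies far from the cluster mean
    val means     = Array(1.0, 2.0)  // assumed cluster centre
    val variances = Array(1.0, 4.0)  // assumed per-feature variances within the cluster

    val byFeature = densityByFeature(point, means, variances)
    println(s"distance to cluster: ${distance(point, means)}")
    println(s"density by feature:  ${byFeature.mkString(", ")}")
    println(s"joint density:       ${jointDensity(byFeature)}")
  }
}
```

    In this example the second feature receives a much lower density than the first, which is the kind of per-feature signal that is lost once a dimension reduction step mixes the original features.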

Value Members

  1. object XKMeansModel extends Serializable
  2. object XKMeansReporting

    Defines a set of reports generated from a PipelineModel with XKMeansModel and proper feature names.

  3. package evaluation
  4. package implicits

  5. object vectorops

    Additional operations for linalg.Vector
