KNN

Implements a k-nearest neighbor join.

Calculates the k-nearest neighbor points in the training set for each point in the test set.
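
As a rough illustration of what the join computes, the following plain-Scala sketch runs a brute-force search over in-memory collections. It does not use Flink: `KnnJoinSketch`, `Point`, `euclidean`, and `knnJoin` are hypothetical names chosen for this example, and the class documented here works on distributed `DataSet`s instead.

```scala
object KnnJoinSketch {
  // Hypothetical stand-in for a feature vector; not Flink's Vector type.
  type Point = Seq[Double]

  // Euclidean distance between two points of equal dimension.
  def euclidean(a: Point, b: Point): Double =
    math.sqrt(a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum)

  // For each test point, pair it with the k training points closest to it,
  // ordered from nearest to farthest.
  def knnJoin(training: Seq[Point], testing: Seq[Point], k: Int): Seq[(Point, Seq[Point])] =
    testing.map { t =>
      (t, training.sortBy(p => euclidean(t, p)).take(k))
    }
}
```

The result has the same shape as the `(Vector, Array[Vector])` pairs in the example below: one entry per test point, each carrying its k neighbors.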

Example:

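```
import org.apache.flink.api.scala._
import org.apache.flink.ml.math.Vector
import org.apache.flink.ml.metrics.distances.EuclideanDistanceMetric
import org.apache.flink.ml.nn.KNN

val trainingDS: DataSet[Vector] = ...
val testingDS: DataSet[Vector] = ...

val knn = KNN()
  .setK(10)
  .setBlocks(5)
  .setDistanceMetric(EuclideanDistanceMetric())

// Store the training set, then find the k nearest training points for each test point.
knn.fit(trainingDS)
val predictionDS: DataSet[(Vector, Array[Vector])] = knn.predict(testingDS)
```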
Parameters

- org.apache.flink.ml.nn.KNN.K: Sets K, the number of nearest points selected as neighbors. (Default value: 5)
- org.apache.flink.ml.nn.KNN.DistanceMetric: Sets the distance metric used to calculate the distance between two points. If no metric is specified, org.apache.flink.ml.metrics.distances.EuclideanDistanceMetric is used. (Default value: EuclideanDistanceMetric())
- org.apache.flink.ml.nn.KNN.Blocks: Sets the number of blocks into which the input data will be split. This number should be at least the degree of parallelism. If no value is specified, the parallelism of the input DataSet is used as the number of blocks. (Default value: None)
- org.apache.flink.ml.nn.KNN.UseQuadTree: A Boolean that specifies whether to use a quadtree to partition the training set, which can simplify the KNN search. If no value is specified, the code decides automatically whether to use a quadtree. A quadtree scales well with the number of training and testing points, but poorly with the dimension. (Default value: None)
- org.apache.flink.ml.nn.KNN.SizeHint: Specifies whether the training set or the test set is small, to optimize the cross product operation needed for the KNN search. Set this to CrossHint.FIRST_IS_SMALL if the training set is small, or to CrossHint.SECOND_IS_SMALL if the test set is small. (Default value: None)
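
To make the role of the Blocks parameter and the cross product more concrete, here is a minimal plain-Scala sketch. It is an assumption about the general technique, not code taken from the Flink source: `BlockedKnnSketch`, `split`, and `knn` are hypothetical names, and the round-robin split is chosen only for simplicity. Both inputs are split into blocks, every test block is crossed with every training block to find local candidates, and the candidates are merged into the final k neighbors.

```scala
object BlockedKnnSketch {
  // Hypothetical stand-in for a feature vector; not Flink's Vector type.
  type Point = Seq[Double]

  def squaredDistance(a: Point, b: Point): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  // Split a data set round-robin into at most `blocks` groups of similar size.
  def split(points: Seq[Point], blocks: Int): Seq[Seq[Point]] =
    points.zipWithIndex
      .groupBy(_._2 % blocks)
      .toSeq
      .sortBy(_._1)
      .map(_._2.map(_._1))

  // Note: identical test points are merged into one map entry in this sketch.
  def knn(training: Seq[Point], testing: Seq[Point], k: Int, blocks: Int): Map[Point, Seq[Point]] = {
    val trainBlocks = split(training, blocks)
    val testBlocks = split(testing, blocks)

    // Cross product of blocks: every test block meets every training block,
    // and each pairing yields the k best candidates from that training block.
    val candidates = for {
      testBlock <- testBlocks
      trainBlock <- trainBlocks
      t <- testBlock
    } yield (t, trainBlock.sortBy(p => squaredDistance(t, p)).take(k))

    // Merge the local candidates of each test point and keep the overall k best.
    candidates.groupBy(_._1).map { case (t, local) =>
      t -> local.flatMap(_._2).sortBy(p => squaredDistance(t, p)).take(k)
    }
  }
}
```

In a distributed setting each block pairing could run as an independent task, which is one reason the block count should not fall below the degree of parallelism.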

Linear Supertypes

Predictor[KNN], Estimator[KNN], WithParameters, AnyRef, Any

Instance Constructors

new KNN()

Value Members

final def !=(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def ##(): Int

Definition Classes
AnyRef → Any
final def ==(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def asInstanceOf[T0]: T0

Definition Classes
Any
def clone(): AnyRef

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( ... )
final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
def equals(arg0: Any): Boolean

Definition Classes
AnyRef → Any
def evaluate[Testing, PredictionValue](testing: DataSet[Testing], evaluateParameters: ParameterMap = ParameterMap.Empty)(implicit evaluator: EvaluateDataSetOperation[KNN, Testing, PredictionValue]): DataSet[(PredictionValue, PredictionValue)]

Evaluates the testing data by computing the prediction value and returning a pair of true label value and prediction value. It is important that the implementation chooses a Testing type from which it can extract the true label value.

Definition Classes
Predictor
def finalize(): Unit

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
def fit[Training](training: DataSet[Training], fitParameters: ParameterMap = ParameterMap.Empty)(implicit fitOperation: FitOperation[KNN, Training]): Unit

Fits the estimator to the given input data. The fitting logic is contained in the FitOperation. The computed state will be stored in the implementing class.
- Training: Type of the training data
- training: Training data
- fitParameters: Additional parameters for the FitOperation
- fitOperation: FitOperation which encapsulates the algorithm logic

Definition Classes
Estimator
final def getClass(): Class[_]

Definition Classes
AnyRef → Any
def hashCode(): Int

Definition Classes
AnyRef → Any
final def isInstanceOf[T0]: Boolean

Definition Classes
Any
final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def notify(): Unit

Definition Classes
AnyRef
final def notifyAll(): Unit

Definition Classes
AnyRef
val parameters: ParameterMap

Definition Classes
WithParameters
def predict[Testing, Prediction](testing: DataSet[Testing], predictParameters: ParameterMap = ParameterMap.Empty)(implicit predictor: PredictDataSetOperation[KNN, Testing, Prediction]): DataSet[Prediction]

Predicts the testing data according to the learned model. The implementing class has to provide a corresponding implementation of PredictDataSetOperation which contains the prediction logic.
- Testing: Type of the testing data
- Prediction: Type of the prediction data
- testing: Testing data to be predicted
- predictParameters: Additional parameters for the prediction
- predictor: PredictDataSetOperation which encapsulates the prediction logic

Definition Classes
Predictor
def setBlocks(n: Int): KNN

Sets the number of data blocks/partitions.
- n: the number of data blocks
def setDistanceMetric(metric: DistanceMetric): KNN

Sets the distance metric.
- metric: the distance metric used to calculate the distance between two points
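
For intuition about what a distance metric supplies to the search, the following plain-Scala sketch defines three common metrics as ordinary functions. The object and function names are hypothetical and are not the Flink `DistanceMetric` API. It also illustrates that squared Euclidean distance ranks neighbors in the same order as Euclidean distance while skipping the square root.

```scala
object MetricSketch {
  // Hypothetical stand-in for a feature vector; not Flink's Vector type.
  type Point = Seq[Double]

  // Sum of squared coordinate differences.
  def squaredEuclidean(a: Point, b: Point): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  // Straight-line distance: the square root of the squared Euclidean distance.
  def euclidean(a: Point, b: Point): Double =
    math.sqrt(squaredEuclidean(a, b))

  // Sum of absolute coordinate differences.
  def manhattan(a: Point, b: Point): Double =
    a.zip(b).map { case (x, y) => math.abs(x - y) }.sum

  // Order candidate points from nearest to farthest under the given metric.
  def rank(query: Point, candidates: Seq[Point], metric: (Point, Point) => Double): Seq[Point] =
    candidates.sortBy(p => metric(query, p))
}
```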
def setK(k: Int): KNN

Sets K.
- k: the number of selected points as neighbors
def setSizeHint(sizeHint: CrossHint): KNN

Sets a hint that the user can specify if either the training set or the test set is small.
- sizeHint: the cross hint, which tells the system which sizes to expect from the data sets
def setUseQuadTree(useQuadTree: Boolean): KNN

Sets the Boolean variable that decides whether to use the QuadTree or not
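
One way to see why a quadtree can scale poorly with the dimension: if a node is split by halving its bounding box along every dimension, it gets 2^d children in d dimensions. The sketch below only computes that count. It assumes this generalized splitting scheme and is not taken from the Flink QuadTree source; `QuadTreeFanOut` and `childrenPerSplit` are hypothetical names.

```scala
object QuadTreeFanOut {
  // Number of child boxes produced when a box is halved along each of its
  // `dimension` axes: 2 choices per axis, so 2^dimension children in total.
  def childrenPerSplit(dimension: Int): BigInt = {
    require(dimension >= 0, "dimension must be non-negative")
    BigInt(2).pow(dimension)
  }
}
```

Under this assumption a 2-dimensional split yields 4 children, while a 20-dimensional split already yields more than a million, so explicitly disabling the quadtree with `setUseQuadTree(false)` may be worth trying for high-dimensional data.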
final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
def toString(): String

Definition Classes
AnyRef → Any
var trainingSet: Option[DataSet[Block[Vector]]]
final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )


class KNN extends Predictor[KNN]
