Class OnlineKMeans

  • All Implemented Interfaces:
    Serializable, org.apache.flink.ml.api.Estimator<OnlineKMeans,OnlineKMeansModel>, org.apache.flink.ml.api.Stage<OnlineKMeans>, KMeansModelParams<OnlineKMeans>, OnlineKMeansParams<OnlineKMeans>, org.apache.flink.ml.common.param.HasBatchStrategy<OnlineKMeans>, org.apache.flink.ml.common.param.HasDecayFactor<OnlineKMeans>, org.apache.flink.ml.common.param.HasDistanceMeasure<OnlineKMeans>, org.apache.flink.ml.common.param.HasFeaturesCol<OnlineKMeans>, org.apache.flink.ml.common.param.HasGlobalBatchSize<OnlineKMeans>, org.apache.flink.ml.common.param.HasPredictionCol<OnlineKMeans>, org.apache.flink.ml.common.param.HasSeed<OnlineKMeans>, org.apache.flink.ml.param.WithParams<OnlineKMeans>

    public class OnlineKMeans
    extends Object
    implements org.apache.flink.ml.api.Estimator<OnlineKMeans,OnlineKMeansModel>, OnlineKMeansParams<OnlineKMeans>
    OnlineKMeans extends the functionality of KMeans to train a K-Means model continuously on an unbounded stream of training data.

    OnlineKMeans updates the centroids with the "mini-batch" K-Means rule, generalized to incorporate forgetfulness (i.e. decay). After estimating centroids on the current batch, OnlineKMeans computes each new centroid as the weighted average of the original centroid and the estimated centroid. The weight of an estimated centroid is the number of points assigned to it in the current batch. The weight of an original centroid is the point count accumulated for it so far, multiplied by the decay factor.

    The decay factor scales the contribution of the clusters as estimated thus far. If the decay factor is 1, all batches are weighted equally. If the decay factor is 0, new centroids are determined entirely by recent data. Lower values correspond to more forgetting.
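
    The update rule described above can be sketched in plain Java. This is an illustrative sketch only, not the Flink ML implementation: the class name DecayedCentroidUpdate, the method update, and all parameter names are hypothetical, and it assumes the weight of the original centroid is the point count accumulated for it so far.

```java
import java.util.Arrays;

// Hypothetical helper illustrating the decayed mini-batch update; not part of Flink ML.
class DecayedCentroidUpdate {

    /**
     * Returns the weighted average of the original centroid and the centroid
     * estimated on the current batch.
     *
     * @param oldCentroid   centroid before the batch
     * @param oldWeight     point count credited to the centroid so far (assumption)
     * @param batchCentroid centroid estimated from the current batch
     * @param batchCount    number of batch points assigned to this centroid
     * @param decayFactor   value in [0, 1]; lower values forget faster
     */
    public static double[] update(
            double[] oldCentroid,
            double oldWeight,
            double[] batchCentroid,
            double batchCount,
            double decayFactor) {
        // The original centroid's weight is discounted by the decay factor.
        double discounted = oldWeight * decayFactor;
        double total = discounted + batchCount;
        if (total == 0.0) {
            // Nothing to average: keep the original centroid unchanged.
            return oldCentroid.clone();
        }
        double[] updated = new double[oldCentroid.length];
        for (int d = 0; d < updated.length; d++) {
            updated[d] = (oldCentroid[d] * discounted + batchCentroid[d] * batchCount) / total;
        }
        return updated;
    }

    public static void main(String[] args) {
        double[] oldCentroid = {0.0, 0.0};
        double[] batchCentroid = {4.0, 2.0};
        // Decay factor 1: old and new points count equally.
        System.out.println(Arrays.toString(update(oldCentroid, 10, batchCentroid, 10, 1.0)));
        // Decay factor 0: only the current batch matters.
        System.out.println(Arrays.toString(update(oldCentroid, 10, batchCentroid, 10, 0.0)));
    }
}
```

    With a decay factor of 1 the sketch prints [2.0, 1.0], the midpoint of the two centroids, because both sides carry a weight of 10. With a decay factor of 0 it prints [4.0, 2.0], the batch centroid alone.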

    See Also:
    Serialized Form
    • Field Summary

      • Fields inherited from interface org.apache.flink.ml.common.param.HasBatchStrategy

        BATCH_STRATEGY, COUNT_STRATEGY
      • Fields inherited from interface org.apache.flink.ml.common.param.HasDecayFactor

        DECAY_FACTOR
      • Fields inherited from interface org.apache.flink.ml.common.param.HasDistanceMeasure

        DISTANCE_MEASURE
      • Fields inherited from interface org.apache.flink.ml.common.param.HasFeaturesCol

        FEATURES_COL
      • Fields inherited from interface org.apache.flink.ml.common.param.HasGlobalBatchSize

        GLOBAL_BATCH_SIZE
      • Fields inherited from interface org.apache.flink.ml.common.param.HasPredictionCol

        PREDICTION_COL
      • Fields inherited from interface org.apache.flink.ml.common.param.HasSeed

        SEED
    • Method Summary

      Modifier and Type    Method    Description
      OnlineKMeansModel fit(org.apache.flink.table.api.Table... inputs)
      Map<org.apache.flink.ml.param.Param<?>,Object> getParamMap()
      static OnlineKMeans load(org.apache.flink.table.api.bridge.java.StreamTableEnvironment tEnv, String path)
      void save(String path)
      Saves the metadata and the bounded model data table (if it exists) to the given path.
      OnlineKMeans setInitialModelData(org.apache.flink.table.api.Table initModelDataTable)
      Sets the initial model data of the online training process with the provided model data table.
      • Methods inherited from interface org.apache.flink.ml.common.param.HasBatchStrategy

        getBatchStrategy
      • Methods inherited from interface org.apache.flink.ml.common.param.HasDecayFactor

        getDecayFactor, setDecayFactor
      • Methods inherited from interface org.apache.flink.ml.common.param.HasDistanceMeasure

        getDistanceMeasure, setDistanceMeasure
      • Methods inherited from interface org.apache.flink.ml.common.param.HasFeaturesCol

        getFeaturesCol, setFeaturesCol
      • Methods inherited from interface org.apache.flink.ml.common.param.HasGlobalBatchSize

        getGlobalBatchSize, setGlobalBatchSize
      • Methods inherited from interface org.apache.flink.ml.common.param.HasPredictionCol

        getPredictionCol, setPredictionCol
      • Methods inherited from interface org.apache.flink.ml.common.param.HasSeed

        getSeed, setSeed
      • Methods inherited from interface org.apache.flink.ml.param.WithParams

        get, getParam, set
    • Constructor Detail

      • OnlineKMeans

        public OnlineKMeans()
    • Method Detail

      • save

        public void save(String path)
                  throws IOException
        Saves the metadata and the bounded model data table (if it exists) to the given path.
        Specified by:
        save in interface org.apache.flink.ml.api.Stage<OnlineKMeans>
        Throws:
        IOException
      • getParamMap

        public Map<org.apache.flink.ml.param.Param<?>,Object> getParamMap()
        Specified by:
        getParamMap in interface org.apache.flink.ml.param.WithParams<OnlineKMeans>
      • setInitialModelData

        public OnlineKMeans setInitialModelData(org.apache.flink.table.api.Table initModelDataTable)
        Sets the initial model data of the online training process with the provided model data table.