Class WorkLoadManager


  • public class WorkLoadManager
    extends java.lang.Object
    WorkLoadManager manages the workload of the worker threads. It scales the number of worker threads for each model up or down as required.
    • Constructor Detail

      • WorkLoadManager

        public WorkLoadManager()
        Constructs a WorkLoadManager instance.
    • Method Detail

      • registerModel

        public <I,O> WorkerPool<I,O> registerModel(ModelInfo<I,O> modelInfo)
        Registers a model and returns the WorkerPool for it.

        This operation is idempotent: if the model is already registered, the existing WorkerPool is returned.

        Type Parameters:
        I - the model input class
        O - the model output class
        Parameters:
        modelInfo - the model to create the worker pool for
        Returns:
        the WorkerPool
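        The idempotent behavior described above can be illustrated with a minimal sketch. It is not taken from the WorkLoadManager source: the Pool class, the String model ids, and the registry class are hypothetical stand-ins for WorkerPool and ModelInfo, and computeIfAbsent is only one common way to obtain "create once, then return the existing instance" semantics.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class IdempotentRegistrySketch {

    /** Hypothetical stand-in for a per-model worker pool. */
    static final class Pool {
        final String modelId;

        Pool(String modelId) {
            this.modelId = modelId;
        }
    }

    // One pool per model id; ConcurrentHashMap keeps registration thread-safe.
    private final Map<String, Pool> pools = new ConcurrentHashMap<>();

    /** Returns the existing pool for the model, creating one only on first registration. */
    Pool register(String modelId) {
        return pools.computeIfAbsent(modelId, Pool::new);
    }

    /** Removes the pool, if any, so the model is no longer managed. */
    void unregister(String modelId) {
        pools.remove(modelId);
    }

    /** Number of models currently registered. */
    int size() {
        return pools.size();
    }

    public static void main(String[] args) {
        IdempotentRegistrySketch registry = new IdempotentRegistrySketch();
        Pool first = registry.register("resnet");
        Pool second = registry.register("resnet");
        // Repeated registration hands back the same pool instance.
        System.out.println("same pool: " + (first == second));
        System.out.println("registered models: " + registry.size());
    }
}
```

        With this pattern a caller does not need to check whether a model is already registered before calling the registration method.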
      • unregisterModel

        public void unregisterModel(ModelInfo<?,?> model)
        Removes a model from management.
        Parameters:
        model - the model to remove
      • runJob

        public <I,O> java.util.concurrent.CompletableFuture<O> runJob(Job<I,O> job)
        Adds an inference job to the job queue of the next free worker. Scales up the number of workers if necessary.
        Type Parameters:
        I - the model input class
        O - the model output class
        Parameters:
        job - an inference job to be executed.
        Returns:
        a CompletableFuture that completes with the model output of type O
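        Because the method signature returns a CompletableFuture, the caller decides whether to block or to react asynchronously. The sketch below shows both styles with standard java.util.concurrent calls only. The runJob method here is a hypothetical stand-in that upper-cases a String on an executor; it is not the WorkLoadManager implementation, and the way real jobs report failure is an assumption.

```java
import java.util.Locale;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class RunJobSketch {

    /** Hypothetical stand-in for runJob: performs the "inference" on a worker thread. */
    static CompletableFuture<String> runJob(String input, ExecutorService workers) {
        return CompletableFuture.supplyAsync(() -> {
            if (input.isEmpty()) {
                // Assumed failure mode, used only to demonstrate error handling.
                throw new IllegalArgumentException("empty input");
            }
            return input.toUpperCase(Locale.ROOT);
        }, workers);
    }

    public static void main(String[] args) {
        ExecutorService workers = Executors.newFixedThreadPool(2);
        try {
            // Blocking style: wait until the worker has produced the output.
            String result = runJob("cat", workers).join();
            System.out.println(result);

            // Non-blocking style: map a failed job to a fallback value.
            String fallback = runJob("", workers)
                    .exceptionally(t -> "failed")
                    .join();
            System.out.println(fallback);
        } finally {
            workers.shutdown();
        }
    }
}
```

        Note that join() rethrows a failure as an unchecked CompletionException, so callers that block should either catch it or attach exceptionally or handle first.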
      • getNumRunningWorkers

        public int getNumRunningWorkers(ModelInfo<?,?> modelInfo)
        Returns the number of running workers for a model. Running workers are workers that are not stopped, not in an error state, and not scheduled to scale down.
        Parameters:
        modelInfo - the model we are interested in.
        Returns:
        number of running workers.
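        The definition of a running worker given above amounts to a filter over worker states. The following sketch spells that filter out. The State enum and its constant names are hypothetical and chosen for illustration; the actual state names used by the worker classes may differ.

```java
import java.util.Arrays;
import java.util.List;

public class RunningWorkerCountSketch {

    /** Hypothetical worker states; the real state names may differ. */
    enum State { STARTED, STOPPED, ERROR, SCALED_DOWN }

    /** Counts workers that are not stopped, not in error, and not scheduled to scale down. */
    static int countRunning(List<State> workerStates) {
        int running = 0;
        for (State state : workerStates) {
            if (state != State.STOPPED
                    && state != State.ERROR
                    && state != State.SCALED_DOWN) {
                running++;
            }
        }
        return running;
    }

    public static void main(String[] args) {
        List<State> states = Arrays.asList(
                State.STARTED, State.STARTED, State.ERROR, State.SCALED_DOWN);
        System.out.println("running workers: " + countRunning(states));
    }
}
```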
      • getWorkerPool

        public <I,O> WorkerPool<I,O> getWorkerPool(ModelInfo<I,O> modelInfo)
        Returns the WorkerPool for a model.
        Type Parameters:
        I - the model input class
        O - the model output class
        Parameters:
        modelInfo - the model to get the worker pool for
        Returns:
        the WorkerPool
      • close

        public void close()
        Closes all models managed by this WorkLoadManager.