Package ai.djl.serving.wlm
Class WorkerPool<I,O>
- java.lang.Object
  - ai.djl.serving.wlm.WorkerPool<I,O>
-
public class WorkerPool<I,O> extends java.lang.Object
Manages the workload for a single model.
-
-
Method Summary
void cleanup()
    Removes all stopped workers and workers in the error state from the pool.
int decreaseRef()
    Decreases the reference count and returns the current count.
java.util.concurrent.LinkedBlockingDeque<WorkerJob<I,O>> getJobQueue()
    Returns the JobQueue for this model.
int getMaxWorkers()
    Returns the maximum number of workers for a model across all devices.
ModelInfo<I,O> getModel()
    Returns the model of the worker pool.
java.util.Map<ai.djl.Device,WorkerGroup<I,O>> getWorkerGroups()
    Returns a map of WorkerGroup.
java.util.List<WorkerThread<I,O>> getWorkers()
    Returns a list of worker threads.
void increaseRef()
    Increases the reference count.
void initWorkers(java.lang.String deviceName, int minWorkers, int maxWorkers)
    Initializes new worker capacities for this model.
boolean isAllWorkerBusy()
    Returns true if all workers are busy.
boolean isAllWorkerDied()
    Returns true if all workers have died.
boolean isFullyScaled()
    Returns true if the worker groups are fully scaled.
void scaleWorkers(java.lang.String deviceName, int minWorkers, int maxWorkers)
    Sets new worker capacities for this model.
void shutdown()
    Shuts down all the worker threads in the worker pool.
void shutdownWorkers()
    Shuts down all workers.
-
-
-
Method Detail
-
increaseRef
public void increaseRef()
Increases the reference count.
-
decreaseRef
public int decreaseRef()
Decreases the reference count and returns the current count.
- Returns:
- the current count
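The pairing of increaseRef() and decreaseRef() can be illustrated with a minimal sketch. The class RefCountSketch below is a hypothetical stand-in, not the WorkerPool implementation, and the use of AtomicInteger is an assumption made for thread safety. It only mirrors the documented contract: decreaseRef() returns the count that remains after the decrement.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a pool's reference counter; not the DJL implementation. */
final class RefCountSketch {
    // Assumption: an atomic counter, so concurrent callers see consistent values.
    private final AtomicInteger refCnt = new AtomicInteger();

    /** Registers one more user of the pool. */
    void increaseRef() {
        refCnt.incrementAndGet();
    }

    /** Releases one user and returns the count that remains. */
    int decreaseRef() {
        return refCnt.decrementAndGet();
    }

    public static void main(String[] args) {
        RefCountSketch pool = new RefCountSketch();
        pool.increaseRef();
        pool.increaseRef();
        System.out.println(pool.decreaseRef()); // one user remains
        if (pool.decreaseRef() <= 0) {
            System.out.println("no users left, safe to shut down");
        }
    }
}
```

A caller would typically release resources only when the returned count reaches zero, which is why the method returns the count instead of void.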
-
getModel
public ModelInfo<I,O> getModel()
Returns the model of the worker pool.
- Returns:
- the model of the worker pool
-
getWorkerGroups
public java.util.Map<ai.djl.Device,WorkerGroup<I,O>> getWorkerGroups()
Returns a map of WorkerGroup.
- Returns:
- a map of WorkerGroup
-
getWorkers
public java.util.List<WorkerThread<I,O>> getWorkers()
Returns a list of worker threads.
- Returns:
- the workers
-
getJobQueue
public java.util.concurrent.LinkedBlockingDeque<WorkerJob<I,O>> getJobQueue()
Returns the JobQueue for this model.
- Returns:
- the jobQueue
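Because the job queue is a java.util.concurrent.LinkedBlockingDeque, the standard deque operations apply to it. The sketch below uses String elements in place of WorkerJob<I,O> and a capacity of 2, both of which are illustrative assumptions; the class JobQueueSketch and its submit helper are hypothetical. It shows that offer() on a bounded deque returns false instead of blocking when the queue is full, and that poll() takes jobs in FIFO order.

```java
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.TimeUnit;

/** Hypothetical demo of bounded-deque behavior; jobs are plain strings here. */
final class JobQueueSketch {
    /** Tries to enqueue a job, returning false when the bounded queue is full. */
    static boolean submit(LinkedBlockingDeque<String> queue, String job) {
        return queue.offer(job);
    }

    public static void main(String[] args) throws InterruptedException {
        // Capacity of 2 is illustrative only.
        LinkedBlockingDeque<String> queue = new LinkedBlockingDeque<>(2);
        System.out.println(submit(queue, "job-1"));
        System.out.println(submit(queue, "job-2"));
        System.out.println(submit(queue, "job-3")); // rejected: queue is full

        // A consumer waits up to 100 ms for the oldest job.
        String next = queue.poll(100, TimeUnit.MILLISECONDS);
        System.out.println(next);
    }
}
```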
-
getMaxWorkers
public int getMaxWorkers()
Returns the maximum number of workers for a model across all devices.
- Returns:
- the maximum number of workers for a model across all devices
-
isAllWorkerDied
public boolean isAllWorkerDied()
Returns true if all workers have died.
- Returns:
- true if all workers have died
-
isAllWorkerBusy
public boolean isAllWorkerBusy()
Returns true if all workers are busy.
- Returns:
- true if all workers are busy
-
isFullyScaled
public boolean isFullyScaled()
Returns true if the worker groups are fully scaled.
- Returns:
- true if the worker groups are fully scaled
-
initWorkers
public void initWorkers(java.lang.String deviceName, int minWorkers, int maxWorkers)
Initializes new worker capacities for this model.
- Parameters:
deviceName - the device for the model, null for default devices
minWorkers - the minimum number of workers
maxWorkers - the maximum number of workers
-
scaleWorkers
public void scaleWorkers(java.lang.String deviceName, int minWorkers, int maxWorkers)
Sets new worker capacities for this model.
- Parameters:
deviceName - the device for the model, null for all loaded devices
minWorkers - the minimum number of workers
maxWorkers - the maximum number of workers
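The relationship between a current worker count and the minWorkers/maxWorkers bounds can be sketched as a simple clamp. This page does not describe how scaleWorkers reconciles running workers with the new bounds, so the helper below is an assumption for illustration only: the class ScaleSketch, the method workerDelta, and the argument validation are all hypothetical.

```java
/** Hypothetical helper; the page does not specify the actual scaling policy. */
final class ScaleSketch {
    /** Returns how many workers to add (positive) or stop (negative) to fit [min, max]. */
    static int workerDelta(int current, int minWorkers, int maxWorkers) {
        // Assumed validation: bounds must be non-negative and ordered.
        if (minWorkers < 0 || maxWorkers < minWorkers) {
            throw new IllegalArgumentException("require 0 <= minWorkers <= maxWorkers");
        }
        // Clamp the current count into the allowed range.
        int target = Math.max(minWorkers, Math.min(current, maxWorkers));
        return target - current;
    }

    public static void main(String[] args) {
        System.out.println(workerDelta(1, 2, 4)); // below the minimum: add workers
        System.out.println(workerDelta(6, 2, 4)); // above the maximum: stop workers
        System.out.println(workerDelta(3, 2, 4)); // already within bounds
    }
}
```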
-
shutdownWorkers
public void shutdownWorkers()
Shuts down all workers.
-
cleanup
public void cleanup()
Removes all stopped workers and workers in the error state from the pool.
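The effect of cleanup() can be pictured as filtering a worker list by state. The sketch below is not the WorkerPool code: the State enum, the class CleanupSketch, and the returned count are hypothetical, since this page does not show the worker state type. It only illustrates dropping stopped and errored entries while keeping the rest.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of pruning stopped and errored workers. */
final class CleanupSketch {
    /** Assumed worker states; the real state type is not shown on this page. */
    enum State { STARTED, STOPPED, ERROR }

    /** Removes stopped and errored workers in place and returns how many were dropped. */
    static int cleanup(List<State> workers) {
        int before = workers.size();
        workers.removeIf(s -> s == State.STOPPED || s == State.ERROR);
        return before - workers.size();
    }

    public static void main(String[] args) {
        List<State> workers = new ArrayList<>(
                List.of(State.STARTED, State.STOPPED, State.ERROR, State.STARTED));
        System.out.println(cleanup(workers)); // number of workers removed
        System.out.println(workers);          // only running workers remain
    }
}
```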
-
shutdown
public void shutdown()
Shuts down all the worker threads in the worker pool.
-
-