Package ai.djl.serving.wlm
Class WorkerPoolConfig<I,O>
java.lang.Object
ai.djl.serving.wlm.WorkerPoolConfig<I,O>
- Type Parameters:
I - the input type
O - the output type
- Direct Known Subclasses:
ModelInfo
A WorkerPoolConfig represents a task that could be run in the WorkLoadManager.
Each WorkerThread (also WorkerPool and WorkerGroup) focuses on executing a single worker type. They contain the configuration for the thread, any persistent data, and the code to run on the thread.
-
Nested Class Summary
Modifier and Type  Class  Description
static enum WorkerPoolConfig.Status
    An enum that represents the state of a worker type.
static class WorkerPoolConfig.ThreadConfig<I,O>
    The part of the WorkerPoolConfig for an individual WorkerThread.
-
Field Summary
-
Constructor Summary
-
Method Summary
Modifier and Type  Method  Description
abstract void close()
    Closes all loaded workers.
boolean equals(Object o)
int getBatchSize()
    Returns the configured batch size.
String getId()
    Returns the worker config's ID.
abstract String[] getLoadOnDevices()
    Returns the devices the worker type will be loaded on at startup.
int getMaxBatchDelayMillis()
    Returns the maximum delay in milliseconds to aggregate a batch.
int getMaxIdleSeconds()
    Returns the configured maximum idle time of workers, in seconds.
int getMaxWorkers(ai.djl.Device device)
    Returns the maximum number of workers.
int getMinWorkers(ai.djl.Device device)
    Returns the minimum number of workers.
String getModelUrl()
    Returns the worker type URL.
int getQueueSize()
    Returns the configured size of the workers queue.
abstract WorkerPoolConfig.Status getStatus()
    Returns the worker type loading status.
String getUid()
    Returns the worker config's unique ID.
String getVersion()
    Returns the worker type version.
int hashCode()
abstract void initialize()
    Initializes the worker.
abstract boolean isParallelLoading()
    Returns whether the worker type can be loaded in parallel on multiple devices.
abstract void load(ai.djl.Device device)
    Loads the worker type to the specified device.
abstract WorkerPoolConfig.ThreadConfig<I,O> newThread(ai.djl.Device device)
    Starts a new WorkerThread for this WorkerPoolConfig.
void setBatchSize(int batchSize)
    Sets the configured batch size.
void setId(String id)
    Sets the worker config's ID.
void setMaxBatchDelayMillis(int maxBatchDelayMillis)
    Sets the maximum delay in milliseconds to aggregate a batch.
void setMaxIdleSeconds(int maxIdleSeconds)
    Sets the configured maximum idle time of workers, in seconds.
void setMaxWorkers(int maxWorkers)
    Sets the starting maximum number of workers.
void setMinMaxWorkers(int minWorkers, int maxWorkers)
    Sets the starting minimum and maximum number of workers.
void setMinWorkers(int minWorkers)
    Sets the starting minimum number of workers.
void setQueueSize(int queueSize)
    Sets the configured size of the workers queue.
String toString()
ai.djl.Device withDefaultDevice(String deviceName)
    Returns the device with the given name, or the default device for this model if the device name is null.
-
Field Details
-
id
-
uid
-
version
-
modelUrl
-
queueSize
protected int queueSize
-
batchSize
protected int batchSize
-
maxBatchDelayMillis
protected int maxBatchDelayMillis
-
maxIdleSeconds
protected int maxIdleSeconds
-
minWorkers
-
maxWorkers
-
-
Constructor Details
-
WorkerPoolConfig
protected WorkerPoolConfig()
-
-
Method Details
-
load
Loads the worker type to the specified device.
- Parameters:
device - the device to load the worker type on
- Throws:
IOException - if the worker type file cannot be read
ai.djl.ModelException - if the specified model fails to load
-
newThread
Starts a new WorkerThread for this WorkerPoolConfig.
- Parameters:
device - the device to run on
- Returns:
the new WorkerPoolConfig.ThreadConfig
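The split between the pool-level configuration and its per-thread part can be sketched without DJL on the classpath. The example below is only an illustration of the pattern: PoolConfigSketch, ThreadPart, UpperCaseConfig, and the run method are hypothetical names, and devices are modeled as plain strings rather than ai.djl.Device.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the pool-level configuration; not the DJL class.
abstract class PoolConfigSketch<I, O> {

    // Hypothetical per-thread part: holds per-thread state and the code to run.
    abstract static class ThreadPart<I, O> {
        abstract List<O> run(List<I> inputs);
    }

    // Creates the per-thread part for a worker bound to the given device name.
    abstract ThreadPart<I, O> newThread(String device);

    public static void main(String[] args) {
        PoolConfigSketch<String, String> config = new UpperCaseConfig();
        System.out.println(config.newThread("cpu").run(List.of("a", "b")));
    }
}

// A toy worker type that upper-cases its inputs and tags them with the device.
class UpperCaseConfig extends PoolConfigSketch<String, String> {

    @Override
    ThreadPart<String, String> newThread(String device) {
        return new ThreadPart<String, String>() {
            @Override
            List<String> run(List<String> inputs) {
                List<String> outputs = new ArrayList<>();
                for (String input : inputs) {
                    outputs.add(input.toUpperCase() + "@" + device);
                }
                return outputs;
            }
        };
    }
}
```

Each call to newThread returns a fresh per-thread object, so any state kept inside it is not shared between workers.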
-
initialize
Initializes the worker.
- Throws:
IOException - if the worker cannot be downloaded
ai.djl.repository.zoo.ModelNotFoundException - if the model is not found
ai.djl.ModelException
-
close
public abstract void close()
Closes all loaded workers.
-
withDefaultDevice
Returns the device with the given name, or the default device for this model if the device name is null.
- Parameters:
deviceName - the device to use if it is not null
- Returns:
a non-null device
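The null-fallback contract described for withDefaultDevice can be illustrated with a small stand-alone sketch. DefaultDeviceSketch and its defaultDevice field are hypothetical, and device names are plain strings here, whereas the documented method returns an ai.djl.Device.

```java
import java.util.Objects;

// Hypothetical stand-in; not the DJL class.
class DefaultDeviceSketch {

    // Assumed default device name, used only when the caller passes null.
    private final String defaultDevice;

    DefaultDeviceSketch(String defaultDevice) {
        this.defaultDevice = Objects.requireNonNull(defaultDevice);
    }

    // Returns the requested device name, or the default when none was given.
    String withDefaultDevice(String deviceName) {
        return deviceName != null ? deviceName : defaultDevice;
    }

    public static void main(String[] args) {
        DefaultDeviceSketch config = new DefaultDeviceSketch("cpu");
        System.out.println(config.withDefaultDevice("gpu0"));
        System.out.println(config.withDefaultDevice(null));
    }
}
```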
-
getStatus
Returns the worker type loading status.
- Returns:
the worker type loading status
-
isParallelLoading
public abstract boolean isParallelLoading()
Returns whether the worker type can be loaded in parallel on multiple devices.
- Returns:
true if the worker type can be loaded in parallel on multiple devices
-
getLoadOnDevices
Returns the devices the worker type will be loaded on at startup.
- Returns:
the devices the worker type will be loaded on at startup
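How a caller might combine the device list with the parallel-loading flag can be sketched as follows. This is an assumption about usage, not the WorkLoadManager implementation: ParallelLoadSketch, loadAll, and the loader callback are hypothetical, and devices are plain strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical helper; not part of DJL.
class ParallelLoadSketch {

    // Runs the loader once per device: concurrently when parallelLoading is
    // true, one device at a time otherwise. Returns after all loads finish.
    static void loadAll(String[] devices, boolean parallelLoading, Consumer<String> loader) {
        if (!parallelLoading) {
            for (String device : devices) {
                loader.accept(device);
            }
            return;
        }
        List<CompletableFuture<Void>> pending = new ArrayList<>();
        for (String device : devices) {
            pending.add(CompletableFuture.runAsync(() -> loader.accept(device)));
        }
        CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
    }

    public static void main(String[] args) {
        // A thread-safe set, because the loader may run on several threads.
        Set<String> loaded = ConcurrentHashMap.newKeySet();
        loadAll(new String[] {"gpu0", "gpu1"}, true, loaded::add);
        System.out.println(loaded.size());
    }
}
```

In the parallel branch the loader runs on several threads at once, so it must be thread-safe.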
-
setId
Sets the worker config's ID.
- Parameters:
id - the worker config's ID
-
getUid
Returns the worker config's unique ID.
- Returns:
the worker config's unique ID
-
getId
Returns the worker config's ID.
- Returns:
the worker config's ID
-
getVersion
Returns the worker type version.
- Returns:
the worker type version
-
getModelUrl
Returns the worker type URL.
- Returns:
the worker type URL
-
setMaxIdleSeconds
public void setMaxIdleSeconds(int maxIdleSeconds)
Sets the configured maximum idle time of workers, in seconds.
- Parameters:
maxIdleSeconds - the configured maximum idle time of workers, in seconds
-
getMaxIdleSeconds
public int getMaxIdleSeconds()
Returns the configured maximum idle time of workers, in seconds.
- Returns:
the maximum idle time in seconds
-
setBatchSize
public void setBatchSize(int batchSize)
Sets the configured batch size.
- Parameters:
batchSize - the configured batch size
-
getBatchSize
public int getBatchSize()
Returns the configured batch size.
- Returns:
the configured batch size
-
setMaxBatchDelayMillis
public void setMaxBatchDelayMillis(int maxBatchDelayMillis)
Sets the maximum delay in milliseconds to aggregate a batch.
- Parameters:
maxBatchDelayMillis - the maximum delay in milliseconds to aggregate a batch
-
getMaxBatchDelayMillis
public int getMaxBatchDelayMillis()
Returns the maximum delay in milliseconds to aggregate a batch.
- Returns:
the maximum delay in milliseconds to aggregate a batch
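The interplay between the batch size and the maximum batch delay can be illustrated with a generic aggregation loop. This is a minimal sketch of one common batching strategy, assumed here for illustration; it is not taken from the WorkLoadManager source, and BatchAggregatorSketch and nextBatch are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; not part of DJL.
class BatchAggregatorSketch {

    // Blocks for the first job, then keeps collecting until the batch is full
    // or maxBatchDelayMillis has elapsed since the first job was taken.
    static <T> List<T> nextBatch(BlockingQueue<T> queue, int batchSize, int maxBatchDelayMillis) {
        List<T> batch = new ArrayList<>();
        try {
            batch.add(queue.take());
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxBatchDelayMillis);
            while (batch.size() < batchSize) {
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0) {
                    break; // delay budget used up: return a partial batch
                }
                T job = queue.poll(remaining, TimeUnit.NANOSECONDS);
                if (job == null) {
                    break; // timed out waiting for more jobs
                }
                batch.add(job);
            }
        } catch (InterruptedException e) {
            // Restore the interrupt flag and return whatever was collected.
            Thread.currentThread().interrupt();
        }
        return batch;
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < 5; i++) {
            queue.add("job-" + i);
        }
        System.out.println(nextBatch(queue, 4, 100).size()); // full batch
        System.out.println(nextBatch(queue, 4, 100).size()); // partial batch after the delay
    }
}
```

A larger delay gives the queue more time to fill a batch at the cost of added latency for the first job in it.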
-
setQueueSize
public void setQueueSize(int queueSize)
Sets the configured size of the workers queue.
- Parameters:
queueSize - the configured size of the workers queue
-
getQueueSize
public int getQueueSize()
Returns the configured size of the workers queue.
- Returns:
the configured size of the workers queue
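The effect of a fixed queue size can be shown with a bounded queue from java.util.concurrent. Whether the WorkLoadManager rejects or blocks when its queue is full is not stated here, so the sketch below simply assumes rejection; BoundedQueueSketch, submit, and pending are hypothetical names.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical stand-in; not the DJL job queue.
class BoundedQueueSketch {

    private final BlockingQueue<String> jobs;

    BoundedQueueSketch(int queueSize) {
        this.jobs = new ArrayBlockingQueue<>(queueSize);
    }

    // Returns false instead of blocking when the queue is already full.
    boolean submit(String job) {
        return jobs.offer(job);
    }

    // Number of jobs currently waiting in the queue.
    int pending() {
        return jobs.size();
    }

    public static void main(String[] args) {
        BoundedQueueSketch queue = new BoundedQueueSketch(2);
        System.out.println(queue.submit("a"));
        System.out.println(queue.submit("b"));
        System.out.println(queue.submit("c")); // queue is full at this point
    }
}
```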
-
setMinWorkers
public void setMinWorkers(int minWorkers)
Sets the starting minimum number of workers.
- Parameters:
minWorkers - the starting minimum number of workers
-
getMinWorkers
public int getMinWorkers(ai.djl.Device device)
Returns the minimum number of workers.
- Parameters:
device - the device to get the min workers for
- Returns:
the minimum number of workers
-
setMaxWorkers
public void setMaxWorkers(int maxWorkers)
Sets the starting maximum number of workers.
- Parameters:
maxWorkers - the starting maximum number of workers
-
getMaxWorkers
public int getMaxWorkers(ai.djl.Device device)
Returns the maximum number of workers.
- Parameters:
device - the device to get the max workers for
- Returns:
the maximum number of workers
-
setMinMaxWorkers
public void setMinMaxWorkers(int minWorkers, int maxWorkers)
Sets the starting minimum and maximum number of workers.
- Parameters:
minWorkers - the new minimum number of workers
maxWorkers - the new maximum number of workers
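As a rough analogy only, the minimum and maximum worker counts, the idle timeout, and the queue size resemble the constructor arguments of java.util.concurrent.ThreadPoolExecutor. The sketch below maps them one to one; it is not how WorkLoadManager is implemented, and WorkerLimitsSketch and newPool are hypothetical names.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; an analogy using the JDK thread pool, not DJL code.
class WorkerLimitsSketch {

    // minWorkers     -> corePoolSize (threads kept alive even when idle)
    // maxWorkers     -> maximumPoolSize (upper bound on threads)
    // maxIdleSeconds -> keepAliveTime for threads above the core size
    // queueSize      -> capacity of the bounded work queue
    static ThreadPoolExecutor newPool(
            int minWorkers, int maxWorkers, int maxIdleSeconds, int queueSize) {
        return new ThreadPoolExecutor(
                minWorkers,
                maxWorkers,
                maxIdleSeconds,
                TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(queueSize));
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = newPool(1, 4, 60, 100);
        System.out.println(pool.getCorePoolSize() + ".." + pool.getMaximumPoolSize());
        pool.shutdown();
    }
}
```

One difference to keep in mind: ThreadPoolExecutor starts threads above the core size only once its queue is full, which may not match how a worker pool decides to scale.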
-
equals
-
hashCode
public int hashCode()
-
toString
-