Package ai.djl.serving.wlm
Class WorkerPoolConfig<I,O>
- java.lang.Object
-
- ai.djl.serving.wlm.WorkerPoolConfig<I,O>
-
- Type Parameters:
I - the input type
O - the output type
- Direct Known Subclasses:
ModelInfo
public abstract class WorkerPoolConfig<I,O> extends java.lang.Object
A WorkerPoolConfig represents a task that could be run in the WorkLoadManager.
Each WorkerThread (also WorkerPool and WorkerGroup) focuses on executing a single worker type. They contain the configuration for the thread, any persistent data, and the code to run on the thread.
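The relationship between a pool configuration and its per-thread part can be pictured with a minimal, self-contained sketch. The classes below are hypothetical stand-ins, not the DJL types: devices are plain strings, only the method names initialize, load, newThread, and close mirror this page, and the order in which they are called is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical stand-in for a worker pool configuration; NOT the DJL class. */
abstract class PoolConfigSketch<I, O> {
    /** Prepares the worker type, for example by fetching its artifacts. */
    abstract void initialize();

    /** Loads the worker type onto a device, identified here by a plain string. */
    abstract void load(String device);

    /** Creates the per-thread part that processes inputs. */
    abstract ThreadConfigSketch<I, O> newThread(String device);

    /** Releases everything that was loaded. */
    abstract void close();
}

/** Hypothetical per-thread part: the code that runs on one worker thread. */
interface ThreadConfigSketch<I, O> {
    List<O> run(List<I> inputs);
}

/** A toy configuration whose workers upper-case strings and record each call. */
final class UpperCaseConfig extends PoolConfigSketch<String, String> {
    final List<String> events = new ArrayList<>();

    @Override
    void initialize() {
        events.add("initialize");
    }

    @Override
    void load(String device) {
        events.add("load:" + device);
    }

    @Override
    ThreadConfigSketch<String, String> newThread(String device) {
        events.add("newThread:" + device);
        return inputs -> {
            List<String> out = new ArrayList<>();
            for (String s : inputs) {
                out.add(s.toUpperCase(Locale.ROOT));
            }
            return out;
        };
    }

    @Override
    void close() {
        events.add("close");
    }
}

final class LifecycleDemo {
    /** Walks through the assumed call order once and returns the worker output. */
    static List<String> runOnce(UpperCaseConfig config, List<String> inputs) {
        config.initialize();
        config.load("cpu");
        ThreadConfigSketch<String, String> thread = config.newThread("cpu");
        List<String> result = thread.run(inputs);
        config.close();
        return result;
    }

    public static void main(String[] args) {
        UpperCaseConfig config = new UpperCaseConfig();
        System.out.println(runOnce(config, List.of("a", "b")));
        System.out.println(config.events);
    }
}
```

In this sketch the configuration owns the persistent state (the event log), while the object returned by newThread carries the code that runs on a worker thread.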
-
-
Nested Class Summary
static class WorkerPoolConfig.Status - An enum that represents the state of a worker type.
static class WorkerPoolConfig.ThreadConfig<I,O> - The part of the WorkerPoolConfig for an individual WorkerThread.
-
Field Summary
protected int batchSize
protected java.lang.String id
protected int maxBatchDelayMillis
protected int maxIdleSeconds
protected java.lang.Integer maxWorkers
protected java.lang.Integer minWorkers
protected java.lang.String modelUrl
protected int queueSize
protected java.lang.String version
-
Constructor Summary
WorkerPoolConfig()
-
Method Summary
abstract void close() - Closes all loaded workers.
boolean equals(java.lang.Object o)
int getBatchSize() - Returns the configured batch size.
java.lang.String getId() - Returns the worker type ID.
abstract java.lang.String[] getLoadOnDevices() - Returns the devices the worker type will be loaded on at startup.
int getMaxBatchDelayMillis() - Returns the maximum delay in milliseconds to aggregate a batch.
int getMaxIdleSeconds() - Returns the configured max idle time in seconds of workers.
int getMaxWorkers(ai.djl.Device device) - Returns the maximum number of workers.
int getMinWorkers(ai.djl.Device device) - Returns the minimum number of workers.
java.lang.String getModelUrl() - Returns the worker type URL.
int getQueueSize() - Returns the configured size of the workers queue.
abstract WorkerPoolConfig.Status getStatus() - Returns the worker type loading status.
java.lang.String getVersion() - Returns the worker type version.
int hashCode()
abstract void initialize() - Initializes the worker.
abstract boolean isParallelLoading() - Returns whether the worker type can be loaded in parallel on multiple devices.
abstract void load(ai.djl.Device device) - Loads the worker type to the specified device.
abstract WorkerPoolConfig.ThreadConfig<I,O> newThread(ai.djl.Device device) - Starts a new WorkerThread for this WorkerPoolConfig.
void setBatchSize(int batchSize) - Sets the configured batch size.
void setId(java.lang.String id) - Sets the worker type ID.
void setMaxBatchDelayMillis(int maxBatchDelayMillis) - Sets the maximum delay in milliseconds to aggregate a batch.
void setMaxIdleSeconds(int maxIdleSeconds) - Sets the configured max idle time in seconds of workers.
void setMaxWorkers(int maxWorkers) - Sets the starting number of max workers.
void setMinMaxWorkers(int minWorkers, int maxWorkers) - Sets the starting minimum and maximum number of workers.
void setMinWorkers(int minWorkers) - Sets the starting number of min workers.
void setQueueSize(int queueSize) - Sets the configured size of the workers queue.
java.lang.String toString()
ai.djl.Device withDefaultDevice(java.lang.String deviceName) - Returns the device for the given name, or the default device for this model if the name is null.
-
-
-
Field Detail
-
id
protected transient java.lang.String id
-
version
protected java.lang.String version
-
modelUrl
protected java.lang.String modelUrl
-
queueSize
protected int queueSize
-
batchSize
protected int batchSize
-
maxBatchDelayMillis
protected int maxBatchDelayMillis
-
maxIdleSeconds
protected int maxIdleSeconds
-
minWorkers
protected java.lang.Integer minWorkers
-
maxWorkers
protected java.lang.Integer maxWorkers
-
-
Method Detail
-
load
public abstract void load(ai.djl.Device device) throws ai.djl.ModelException, java.io.IOException
Loads the worker type to the specified device.
- Parameters:
device - the device to load the worker type on
- Throws:
java.io.IOException - if the worker type file cannot be read
ai.djl.ModelException - if the specified model fails to load
-
newThread
public abstract WorkerPoolConfig.ThreadConfig<I,O> newThread(ai.djl.Device device)
Starts a new WorkerThread for this WorkerPoolConfig.
- Parameters:
device - the device to run on
- Returns:
the new WorkerPoolConfig.ThreadConfig
-
initialize
public abstract void initialize() throws java.io.IOException, ai.djl.ModelException
Initializes the worker.
- Throws:
java.io.IOException - if the worker fails to download
ai.djl.repository.zoo.ModelNotFoundException - if the model is not found
ai.djl.ModelException
-
close
public abstract void close()
Closes all loaded workers.
-
withDefaultDevice
public ai.djl.Device withDefaultDevice(java.lang.String deviceName)
Returns the device for the given name, or the default device for this model if the name is null.
- Parameters:
deviceName - the device to use if it is not null
- Returns:
a non-null device
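The null-fallback contract described above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses plain strings instead of ai.djl.Device, and the "cpu" fallback is a hypothetical default, not necessarily what the library chooses.

```java
final class DefaultDeviceSketch {
    /** Hypothetical fallback used when no device name is given. */
    static final String FALLBACK_DEVICE = "cpu";

    /** Returns the given name when present, otherwise the fallback; never returns null. */
    static String withDefaultDevice(String deviceName) {
        return deviceName != null ? deviceName : FALLBACK_DEVICE;
    }

    public static void main(String[] args) {
        System.out.println(withDefaultDevice(null));   // prints "cpu"
        System.out.println(withDefaultDevice("gpu0")); // prints "gpu0"
    }
}
```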
-
getStatus
public abstract WorkerPoolConfig.Status getStatus()
Returns the worker type loading status.
- Returns:
the worker type loading status
-
isParallelLoading
public abstract boolean isParallelLoading()
Returns whether the worker type can be loaded in parallel on multiple devices.
- Returns:
true if the worker type can be loaded in parallel on multiple devices
-
getLoadOnDevices
public abstract java.lang.String[] getLoadOnDevices()
Returns the devices the worker type will be loaded on at startup.
- Returns:
the devices the worker type will be loaded on at startup
-
setId
public void setId(java.lang.String id)
Sets the worker type ID.
- Parameters:
id - the worker type ID
-
getId
public java.lang.String getId()
Returns the worker type ID.
- Returns:
the worker type ID
-
getVersion
public java.lang.String getVersion()
Returns the worker type version.
- Returns:
the worker type version
-
getModelUrl
public java.lang.String getModelUrl()
Returns the worker type URL.
- Returns:
the worker type URL
-
setMaxIdleSeconds
public void setMaxIdleSeconds(int maxIdleSeconds)
Sets the configured max idle time in seconds of workers.
- Parameters:
maxIdleSeconds - the configured max idle time in seconds of workers
-
getMaxIdleSeconds
public int getMaxIdleSeconds()
Returns the configured max idle time in seconds of workers.
- Returns:
the max idle time in seconds
-
setBatchSize
public void setBatchSize(int batchSize)
Sets the configured batch size.
- Parameters:
batchSize - the configured batch size
-
getBatchSize
public int getBatchSize()
Returns the configured batch size.
- Returns:
the configured batch size
-
setMaxBatchDelayMillis
public void setMaxBatchDelayMillis(int maxBatchDelayMillis)
Sets the maximum delay in milliseconds to aggregate a batch.
- Parameters:
maxBatchDelayMillis - the maximum delay in milliseconds to aggregate a batch
-
getMaxBatchDelayMillis
public int getMaxBatchDelayMillis()
Returns the maximum delay in milliseconds to aggregate a batch.
- Returns:
the maximum delay in milliseconds to aggregate a batch
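To illustrate how a batch size and a maximum batch delay could interact, the sketch below collects items from a queue until either the batch is full or the delay has elapsed. It is a generic batching loop built on java.util.concurrent, not the library's implementation; the class and method names are hypothetical, and starting the delay when the first item arrives is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

final class BatchAggregatorSketch {
    /**
     * Blocks until a first item is available, then keeps collecting until either
     * batchSize items are gathered or maxBatchDelayMillis has elapsed since the
     * first item was taken.
     */
    static <T> List<T> nextBatch(BlockingQueue<T> queue, int batchSize, int maxBatchDelayMillis)
            throws InterruptedException {
        List<T> batch = new ArrayList<>();
        batch.add(queue.take()); // wait without a time limit for the first item
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxBatchDelayMillis);
        while (batch.size() < batchSize) {
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                break; // delay used up: return a partial batch
            }
            T item = queue.poll(remainingNanos, TimeUnit.NANOSECONDS);
            if (item == null) {
                break; // nothing arrived before the deadline
            }
            batch.add(item);
        }
        return batch;
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.add("a");
        queue.add("b");
        queue.add("c");
        System.out.println(nextBatch(queue, 2, 50)); // full batch, returned immediately
        System.out.println(nextBatch(queue, 2, 50)); // partial batch, returned after the delay
    }
}
```

A larger batch size tends to improve throughput, while a smaller delay bounds the extra latency a lone request pays waiting for companions.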
-
setQueueSize
public void setQueueSize(int queueSize)
Sets the configured size of the workers queue.
- Parameters:
queueSize - the configured size of the workers queue
-
getQueueSize
public int getQueueSize()
Returns the configured size of the workers queue.
- Returns:
the requested size of the workers queue
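One way a configured queue size might bound the number of pending jobs is a capacity-limited blocking queue that rejects new work once it is full. The sketch below is an assumption about usage, not the library's code; BoundedJobQueueSketch and its methods are hypothetical names.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class BoundedJobQueueSketch {
    private final BlockingQueue<String> jobs;

    BoundedJobQueueSketch(int queueSize) {
        // The capacity plays the role of the configured queue size.
        this.jobs = new LinkedBlockingQueue<>(queueSize);
    }

    /** Returns true if the job was accepted, false if the queue is already full. */
    boolean submit(String job) {
        return jobs.offer(job);
    }

    /** Returns the number of jobs currently waiting. */
    int pending() {
        return jobs.size();
    }

    public static void main(String[] args) {
        BoundedJobQueueSketch queue = new BoundedJobQueueSketch(2);
        System.out.println(queue.submit("job-1")); // true
        System.out.println(queue.submit("job-2")); // true
        System.out.println(queue.submit("job-3")); // false, capacity reached
    }
}
```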
-
setMinWorkers
public void setMinWorkers(int minWorkers)
Sets the starting number of min workers.
- Parameters:
minWorkers - the starting number of min workers
-
getMinWorkers
public int getMinWorkers(ai.djl.Device device)
Returns the minimum number of workers.
- Parameters:
device - the device to get the min workers for
- Returns:
the minimum number of workers
-
setMaxWorkers
public void setMaxWorkers(int maxWorkers)
Sets the starting number of max workers.
- Parameters:
maxWorkers - the starting number of max workers
-
getMaxWorkers
public int getMaxWorkers(ai.djl.Device device)
Returns the maximum number of workers.
- Parameters:
device - the device to get the max workers for
- Returns:
the maximum number of workers
-
setMinMaxWorkers
public void setMinMaxWorkers(int minWorkers, int maxWorkers)
Sets the starting minimum and maximum number of workers.
- Parameters:
minWorkers - the new minimum number of workers
maxWorkers - the new maximum number of workers
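The minWorkers and maxWorkers fields are boxed Integer values, while the getters return int and take a device. One plausible reading, sketched below, is that an unset (null) limit is resolved to a per-device default at lookup time. This is an assumption, as is the range check in the setter; the class is a hypothetical stand-in and passes the default as a plain int instead of deriving it from an ai.djl.Device.

```java
final class WorkerLimitsSketch {
    // Boxed types so that null can mean "not configured yet".
    private Integer minWorkers;
    private Integer maxWorkers;

    /** Sets both limits at once. The validation is an assumption of this sketch. */
    void setMinMaxWorkers(int minWorkers, int maxWorkers) {
        if (minWorkers < 0 || maxWorkers < minWorkers) {
            throw new IllegalArgumentException(
                    "expected 0 <= minWorkers <= maxWorkers, got "
                            + minWorkers + " and " + maxWorkers);
        }
        this.minWorkers = minWorkers;
        this.maxWorkers = maxWorkers;
    }

    /** Returns the configured minimum, or the supplied default when none is set. */
    int getMinWorkers(int deviceDefault) {
        return minWorkers != null ? minWorkers : deviceDefault;
    }

    /** Returns the configured maximum, or the supplied default when none is set. */
    int getMaxWorkers(int deviceDefault) {
        return maxWorkers != null ? maxWorkers : deviceDefault;
    }

    public static void main(String[] args) {
        WorkerLimitsSketch limits = new WorkerLimitsSketch();
        System.out.println(limits.getMaxWorkers(4)); // nothing configured, so the default applies
        limits.setMinMaxWorkers(1, 2);
        System.out.println(limits.getMinWorkers(0) + ".." + limits.getMaxWorkers(4));
    }
}
```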
-
equals
public boolean equals(java.lang.Object o)
- Overrides:
equals in class java.lang.Object
-
hashCode
public int hashCode()
- Overrides:
hashCode in class java.lang.Object
-
toString
public java.lang.String toString()
- Overrides:
toString in class java.lang.Object
-
-