A B C E F G H I J L M O P R S T U V W
A
- ai.djl.serving.wlm - package ai.djl.serving.wlm
-
Contains the model server backend which manages worker threads and executes jobs on models.
- ai.djl.serving.wlm.util - package ai.djl.serving.wlm.util
-
Contains utilities to support the WorkLoadManager.
B
- build() - Method in class ai.djl.serving.wlm.WorkerThread.Builder
-
Builds the WorkerThread with the provided data.
- builder(Class<I>, Class<O>) - Static method in class ai.djl.serving.wlm.WorkerThread
-
Creates a builder to build a WorkerThread.
C
- cleanup() - Method in class ai.djl.serving.wlm.WorkLoadManager.WorkerPool
-
Removes all stopped workers and workers in the error state from the pool.
- close() - Method in class ai.djl.serving.wlm.ModelInfo
- close() - Method in class ai.djl.serving.wlm.WorkLoadManager
- close() - Method in class ai.djl.serving.wlm.WorkLoadManager.WorkerPool
- configureModelBatch(int, int) - Method in class ai.djl.serving.wlm.ModelInfo
-
Sets a new batchSize and returns a new configured ModelInfo object.
- configurePool(int) - Method in class ai.djl.serving.wlm.ModelInfo
-
Sets new configuration for the workerPool backing this model and returns a new configured ModelInfo object.
E
- equals(Object) - Method in class ai.djl.serving.wlm.ModelInfo
F
- FAILED - ai.djl.serving.wlm.ModelInfo.Status
- forDevice(Device) - Method in class ai.djl.serving.wlm.WorkLoadManager.WorkerPool
-
Returns the WorkLoadManager.WorkerPool.WorkerPoolDevice for a particular Device.
G
- generate() - Method in class ai.djl.serving.wlm.WorkerIdGenerator
-
Generates a new worker ID.
- getBatchSize() - Method in class ai.djl.serving.wlm.ModelInfo
-
Returns the configured batch size.
- getBatchSize() - Method in class ai.djl.serving.wlm.util.WlmConfigManager
-
Returns the default batchSize for workers.
- getBegin() - Method in class ai.djl.serving.wlm.Job
-
Returns the job begin time.
- getDefaultMaxWorkers(ModelInfo<?, ?>, Device, int) - Method in class ai.djl.serving.wlm.util.WlmConfigManager
-
Returns the default maximum number of workers for a new registered model.
- getDefaultMinWorkers(ModelInfo<?, ?>, Device, int, int) - Method in class ai.djl.serving.wlm.util.WlmConfigManager
-
Returns the default minimum number of workers for a new registered model.
- getDevice() - Method in class ai.djl.serving.wlm.WorkerThread
-
Returns the device used by the thread.
- getEngineName() - Method in class ai.djl.serving.wlm.ModelInfo
-
Returns the engine name.
- getFuture() - Method in class ai.djl.serving.wlm.util.WorkerJob
-
Returns the future for the job.
- getInput() - Method in class ai.djl.serving.wlm.Job
-
Returns the input data.
- getInputClass() - Method in class ai.djl.serving.wlm.ModelInfo
-
Returns the model input class.
- getInstance() - Static method in class ai.djl.serving.wlm.util.WlmConfigManager
-
Returns the singleton ConfigManager instance.
- getJob() - Method in class ai.djl.serving.wlm.util.WorkerJob
-
Returns the Job.
- getJobQueue() - Method in class ai.djl.serving.wlm.WorkLoadManager.WorkerPool
-
Returns the JobQueue for this model.
- getJobQueueSize() - Method in class ai.djl.serving.wlm.util.WlmConfigManager
-
Returns the default job queue size.
- getMaxBatchDelay() - Method in class ai.djl.serving.wlm.ModelInfo
-
Returns the maximum delay in milliseconds to aggregate a batch.
- getMaxBatchDelay() - Method in class ai.djl.serving.wlm.util.WlmConfigManager
-
Returns the default maxBatchDelay for the working queue.
- getMaxIdleTime() - Method in class ai.djl.serving.wlm.ModelInfo
-
Returns the configured maxIdleTime of workers.
- getMaxIdleTime() - Method in class ai.djl.serving.wlm.util.WlmConfigManager
-
Returns the default max idle time for workers.
- getMaxWorkers() - Method in class ai.djl.serving.wlm.WorkLoadManager.WorkerPool
-
Returns the maximum number of workers for a model across all devices.
- getMaxWorkers() - Method in class ai.djl.serving.wlm.WorkLoadManager.WorkerPool.WorkerPoolDevice
-
Returns the max number of workers for the model and device.
- getMinWorkers() - Method in class ai.djl.serving.wlm.WorkLoadManager.WorkerPool
-
Returns the minimum number of workers for a model across all devices.
- getMinWorkers() - Method in class ai.djl.serving.wlm.WorkLoadManager.WorkerPool.WorkerPoolDevice
-
Returns the min number of workers for the model and device.
- getModel() - Method in class ai.djl.serving.wlm.Job
-
Returns the model associated with this job.
- getModel(Device) - Method in class ai.djl.serving.wlm.ModelInfo
-
Returns the loaded ZooModel for a device.
- getModelDir() - Method in class ai.djl.serving.wlm.ModelInfo
-
Returns the model cache directory.
- getModelId() - Method in class ai.djl.serving.wlm.ModelInfo
-
Returns the model ID.
- getModelUrl() - Method in class ai.djl.serving.wlm.ModelInfo
-
Returns the model URL.
- getNumRunningWorkers(ModelInfo<?, ?>) - Method in class ai.djl.serving.wlm.WorkLoadManager
-
Returns the number of running workers of a model.
- getOutputClass() - Method in class ai.djl.serving.wlm.ModelInfo
-
Returns the model output class.
- getQueueLength(ModelInfo<?, ?>) - Method in class ai.djl.serving.wlm.WorkLoadManager
-
Returns the current number of requests in the queue.
- getQueueSize() - Method in class ai.djl.serving.wlm.ModelInfo
-
Returns the configured size of the workers queue.
- getStartTime() - Method in class ai.djl.serving.wlm.WorkerThread
-
Returns the thread start time.
- getState() - Method in class ai.djl.serving.wlm.WorkerThread
-
Returns the worker state.
- getStatus() - Method in class ai.djl.serving.wlm.ModelInfo
-
Returns the model loading status.
- getVersion() - Method in class ai.djl.serving.wlm.ModelInfo
-
Returns the model version.
- getWaitingTime() - Method in class ai.djl.serving.wlm.Job
-
Returns the wait time of this job.
- getWorkerId() - Method in class ai.djl.serving.wlm.WorkerThread
-
Returns the worker thread ID.
- getWorkerPoolForModel(ModelInfo<I, O>) - Method in class ai.djl.serving.wlm.WorkLoadManager
-
Returns the WorkLoadManager.WorkerPool for a model.
- getWorkers() - Method in class ai.djl.serving.wlm.WorkLoadManager.WorkerPool
-
Returns a list of worker threads.
- getWorkers(ModelInfo<I, O>) - Method in class ai.djl.serving.wlm.WorkLoadManager
-
Returns the workers for the specific model.
H
- hashCode() - Method in class ai.djl.serving.wlm.ModelInfo
- hasInputOutputClass(Class<I>, Class<O>) - Method in class ai.djl.serving.wlm.ModelInfo
-
Clarifies the input and output class when not specified.
I
- inferModelNameFromUrl(String) - Static method in class ai.djl.serving.wlm.ModelInfo
-
Infers the model name from the model URL when no model name is provided.
- isDebug() - Method in class ai.djl.serving.wlm.util.WlmConfigManager
-
Returns whether debug is enabled.
- isFinished() - Method in class ai.djl.serving.wlm.PermanentBatchAggregator
-
Checks if this BatchAggregator and the thread can be shut down or if this aggregator waits for more data.
- isFinished() - Method in class ai.djl.serving.wlm.TemporaryBatchAggregator
-
Checks if this BatchAggregator and the thread can be shut down or if this aggregator waits for more data.
- isFixPoolThread() - Method in class ai.djl.serving.wlm.WorkerThread
-
Checks whether this worker is one of the fixed threads of a pool.
- isRunning() - Method in class ai.djl.serving.wlm.WorkerThread
-
Returns true if the worker thread is running.
J
- Job<I,O> - Class in ai.djl.serving.wlm
-
A class that represents an inference job.
- Job(ModelInfo<I, O>, I) - Constructor for class ai.djl.serving.wlm.Job
-
Constructs a new Job instance.
L
- load(Device) - Method in class ai.djl.serving.wlm.ModelInfo
-
Loads the model to the specified device.
- log() - Method in class ai.djl.serving.wlm.WorkLoadManager.WorkerPool
-
Logs the current state of this WorkerPool when level "Debug" is enabled.
M
- ModelInfo<I,O> - Class in ai.djl.serving.wlm
-
A class that represents a loaded model and its metadata.
- ModelInfo(String, Criteria<I, O>) - Constructor for class ai.djl.serving.wlm.ModelInfo
-
Constructs a ModelInfo based on a Criteria.
- ModelInfo(String, Class<I>, Class<O>) - Constructor for class ai.djl.serving.wlm.ModelInfo
-
Constructs a new ModelInfo instance.
- ModelInfo(String, String, String, String, Class<I>, Class<O>, int, int, int, int) - Constructor for class ai.djl.serving.wlm.ModelInfo
-
Constructs a new ModelInfo instance.
- ModelInfo.Status - Enum in ai.djl.serving.wlm
-
An enum that represents the state of a model.
O
- optAggregator(BatchAggregator<I, O>) - Method in class ai.djl.serving.wlm.WorkerThread.Builder
-
Sets a BatchAggregator which overrides the instantiated default BatchAggregator.
- optFixPoolThread(boolean) - Method in class ai.djl.serving.wlm.WorkerThread.Builder
-
Sets whether the WorkerThread should be part of the fixed pool.
P
- PENDING - ai.djl.serving.wlm.ModelInfo.Status
- PermanentBatchAggregator<I,O> - Class in ai.djl.serving.wlm
-
A batch aggregator that never terminates by itself.
- PermanentBatchAggregator(ModelInfo<I, O>, LinkedBlockingDeque<WorkerJob<I, O>>) - Constructor for class ai.djl.serving.wlm.PermanentBatchAggregator
-
Constructs a PermanentBatchAggregator instance.
- pollBatch() - Method in class ai.djl.serving.wlm.PermanentBatchAggregator
-
Fills in the list with a batch of jobs.
- pollBatch() - Method in class ai.djl.serving.wlm.TemporaryBatchAggregator
-
Fills in the list with a batch of jobs.
- preBuildProcessing() - Method in class ai.djl.serving.wlm.WorkerThread.Builder
R
- READY - ai.djl.serving.wlm.ModelInfo.Status
- registerModel(ModelInfo<I, O>) - Method in class ai.djl.serving.wlm.WorkLoadManager
-
Registers a model and returns the WorkLoadManager.WorkerPool for it.
- run() - Method in class ai.djl.serving.wlm.WorkerThread
- runJob(Job<I, O>) - Method in class ai.djl.serving.wlm.WorkLoadManager
-
Adds an inference job to the job queue of the next free worker.
S
- scaleWorkers(Device, int, int) - Method in class ai.djl.serving.wlm.WorkLoadManager.WorkerPool
-
Sets new worker capacities for this model.
- self() - Method in class ai.djl.serving.wlm.WorkerThread.Builder
-
Returns self reference to this builder.
- setBatchSize(int) - Method in class ai.djl.serving.wlm.ModelInfo
-
Sets the configured batch size.
- setBatchSize(int) - Method in class ai.djl.serving.wlm.util.WlmConfigManager
-
Sets the default batchSize for workers.
- setDevice(Device) - Method in class ai.djl.serving.wlm.WorkerThread.Builder
-
Sets the device to run operations on.
- setJobQueue(LinkedBlockingDeque<WorkerJob<I, O>>) - Method in class ai.djl.serving.wlm.WorkerThread.Builder
-
Sets the jobQueue used to poll for new jobs.
- setJobQueueSize(int) - Method in class ai.djl.serving.wlm.util.WlmConfigManager
-
Sets the default job queue size.
- setMaxBatchDelay(int) - Method in class ai.djl.serving.wlm.ModelInfo
-
Sets the maximum delay in milliseconds to aggregate a batch.
- setMaxBatchDelay(int) - Method in class ai.djl.serving.wlm.util.WlmConfigManager
-
Sets the default maxBatchDelay for the working queue.
- setMaxIdleTime(int) - Method in class ai.djl.serving.wlm.ModelInfo
-
Sets the configured maxIdleTime of workers.
- setMaxIdleTime(int) - Method in class ai.djl.serving.wlm.util.WlmConfigManager
-
Sets the default max idle time for workers.
- setModel(ModelInfo<I, O>) - Method in class ai.djl.serving.wlm.WorkerThread.Builder
-
Sets the ModelInfo the thread will be responsible for.
- setModelId(String) - Method in class ai.djl.serving.wlm.ModelInfo
-
Sets the model ID.
- setQueueSize(int) - Method in class ai.djl.serving.wlm.ModelInfo
-
Sets the configured size of the workers queue.
- shutdown(WorkerState) - Method in class ai.djl.serving.wlm.WorkerThread
-
Shuts down the worker thread.
T
- TemporaryBatchAggregator<I,O> - Class in ai.djl.serving.wlm
-
A batch aggregator that terminates after a maximum idle time.
- TemporaryBatchAggregator(ModelInfo<I, O>, LinkedBlockingDeque<WorkerJob<I, O>>) - Constructor for class ai.djl.serving.wlm.TemporaryBatchAggregator
-
Constructs a batch aggregator that terminates after a maximum idle time.
- toString() - Method in class ai.djl.serving.wlm.ModelInfo
U
- unregisterModel(ModelInfo<?, ?>) - Method in class ai.djl.serving.wlm.WorkLoadManager
-
Removes a model from management.
V
- validate() - Method in class ai.djl.serving.wlm.WorkerThread.Builder
- valueOf(String) - Static method in enum ai.djl.serving.wlm.ModelInfo.Status
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum ai.djl.serving.wlm.WorkerState
-
Returns the enum constant of this type with the specified name.
- values() - Static method in enum ai.djl.serving.wlm.ModelInfo.Status
-
Returns an array containing the constants of this enum type, in the order they are declared.
- values() - Static method in enum ai.djl.serving.wlm.WorkerState
-
Returns an array containing the constants of this enum type, in the order they are declared.
W
- withDefaultDevice(Device) - Method in class ai.djl.serving.wlm.ModelInfo
-
Returns the default device for this model if device is null.
- WlmCapacityException - Exception in ai.djl.serving.wlm.util
-
Thrown to throttle when a job is run but the job queue capacity is exceeded.
- WlmCapacityException(String) - Constructor for exception ai.djl.serving.wlm.util.WlmCapacityException
-
Constructs a WlmCapacityException with the specified detail message.
- WlmCapacityException(String, Throwable) - Constructor for exception ai.djl.serving.wlm.util.WlmCapacityException
-
Constructs a WlmCapacityException with the specified detail message and cause.
- WlmConfigManager - Class in ai.djl.serving.wlm.util
-
This manages some configurations used by the WorkLoadManager.
- WlmConfigManager() - Constructor for class ai.djl.serving.wlm.util.WlmConfigManager
- WlmException - Exception in ai.djl.serving.wlm.util
-
Thrown when an exception occurs inside the WorkLoadManager.
- WlmException(String) - Constructor for exception ai.djl.serving.wlm.util.WlmException
-
Constructs a WlmException with the specified detail message.
- WlmException(String, Throwable) - Constructor for exception ai.djl.serving.wlm.util.WlmException
-
Constructs a WlmException with the specified detail message and cause.
- WlmShutdownException - Exception in ai.djl.serving.wlm.util
-
Thrown when a job is run but all workers are shut down.
- WlmShutdownException(String) - Constructor for exception ai.djl.serving.wlm.util.WlmShutdownException
-
Constructs a WlmShutdownException with the specified detail message.
- WlmShutdownException(String, Throwable) - Constructor for exception ai.djl.serving.wlm.util.WlmShutdownException
-
Constructs a WlmShutdownException with the specified detail message and cause.
- WORKER_ERROR - ai.djl.serving.wlm.WorkerState
- WORKER_MODEL_LOADED - ai.djl.serving.wlm.WorkerState
- WORKER_SCALED_DOWN - ai.djl.serving.wlm.WorkerState
- WORKER_STARTED - ai.djl.serving.wlm.WorkerState
- WORKER_STOPPED - ai.djl.serving.wlm.WorkerState
- WorkerIdGenerator - Class in ai.djl.serving.wlm
-
A class that generates a unique worker ID.
- WorkerIdGenerator() - Constructor for class ai.djl.serving.wlm.WorkerIdGenerator
- WorkerJob<I,O> - Class in ai.djl.serving.wlm.util
-
A Job containing metadata from the WorkLoadManager.
- WorkerJob(Job<I, O>, CompletableFuture<O>) - Constructor for class ai.djl.serving.wlm.util.WorkerJob
-
Constructs a new WorkerJob.
- WorkerPool(ModelInfo<I, O>) - Constructor for class ai.djl.serving.wlm.WorkLoadManager.WorkerPool
-
Constructs and initializes the data structure.
- WorkerState - Enum in ai.djl.serving.wlm
-
An enum that represents the state of a worker.
- WorkerThread<I,O> - Class in ai.djl.serving.wlm
-
The WorkerThread is the worker managed by the WorkLoadManager.
- WorkerThread.Builder<I,O> - Class in ai.djl.serving.wlm
-
A Builder to construct a WorkerThread.
- WorkLoadManager - Class in ai.djl.serving.wlm
-
WorkLoadManager is responsible for managing the workload of worker threads.
- WorkLoadManager() - Constructor for class ai.djl.serving.wlm.WorkLoadManager
-
Constructs a WorkLoadManager instance.
- WorkLoadManager.WorkerPool<I,O> - Class in ai.djl.serving.wlm
-
Manages the work load for a single model.
- WorkLoadManager.WorkerPool.WorkerPoolDevice - Class in ai.djl.serving.wlm
-
The WorkLoadManager.WorkerPool.WorkerPoolDevice manages the WorkLoadManager.WorkerPool for a particular Device.
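
Several entries in this index (pollBatch(), getMaxBatchDelay(), getMaxIdleTime(), getQueueSize(), WlmCapacityException) describe aggregating jobs from a bounded queue into batches. The following is a minimal stand-alone sketch of that general idea, using only java.util.concurrent. The class BatchSketch, its method names, and its parameters are hypothetical, and IllegalStateException merely stands in for a capacity error. It is not the DJL Serving implementation, and the library may apply these settings differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.TimeUnit;

/** Hypothetical illustration only; none of these names come from ai.djl.serving.wlm. */
final class BatchSketch {

    /**
     * Collects up to batchSize jobs. Waits at most maxIdleMs for the first job,
     * then at most maxBatchDelayMs in total for the remaining ones.
     * An empty result means the queue stayed idle (a temporary worker could stop here).
     */
    static <T> List<T> pollBatch(
            LinkedBlockingDeque<T> queue, int batchSize, long maxBatchDelayMs, long maxIdleMs) {
        List<T> batch = new ArrayList<>(batchSize);
        try {
            T first = queue.poll(maxIdleMs, TimeUnit.MILLISECONDS);
            if (first == null) {
                return batch; // idle timeout, nothing to process
            }
            batch.add(first);
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxBatchDelayMs);
            while (batch.size() < batchSize) {
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0) {
                    break; // batch delay used up
                }
                T next = queue.poll(remaining, TimeUnit.NANOSECONDS);
                if (next == null) {
                    break; // no further job arrived in time
                }
                batch.add(next);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt flag, return what we have
        }
        return batch;
    }

    /** Adds a job without blocking; fails fast when the bounded queue is full. */
    static <T> void submit(LinkedBlockingDeque<T> queue, T job) {
        if (!queue.offer(job)) {
            throw new IllegalStateException("job queue capacity exceeded");
        }
    }

    public static void main(String[] args) {
        LinkedBlockingDeque<String> queue = new LinkedBlockingDeque<>(3); // capacity 3
        submit(queue, "a");
        submit(queue, "b");
        submit(queue, "c");
        System.out.println(pollBatch(queue, 2, 10, 10)); // [a, b]
        System.out.println(pollBatch(queue, 2, 10, 10)); // [c]
        System.out.println(pollBatch(queue, 2, 10, 10)); // []
    }
}
```

In this sketch a full queue rejects the job immediately instead of blocking the caller, which is one way to throttle producers. An empty batch signals that the idle timeout elapsed with no work.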