A B C D E F G H I J L M O P R S T U V W
All Classes All Packages
A
- ai.djl.serving.wlm - package ai.djl.serving.wlm
-
Contains the model server backend which manages worker threads and executes jobs on models.
- ai.djl.serving.wlm.util - package ai.djl.serving.wlm.util
-
Contains utilities to support the WorkLoadManager.
B
- build() - Method in class ai.djl.serving.wlm.WorkerThread.Builder
-
Builds the WorkerThread with the provided data.
- builder(ModelInfo<I, O>) - Static method in class ai.djl.serving.wlm.WorkerThread
-
Creates a builder to build a WorkerThread.
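The builder entries in this index (builder, setDevice, setJobQueue, optFixPoolThread, build) follow the usual builder pattern. The sketch below illustrates that pattern only. The class WorkerSketch, its String device, the chained setters, and the default for fixPoolThread are assumptions made for illustration and are not taken from the library.

```java
import java.util.concurrent.LinkedBlockingDeque;

// Hypothetical worker type; it only mirrors the builder method names listed in this index.
class WorkerSketch {
    final String device;
    final LinkedBlockingDeque<String> jobQueue;
    final boolean fixPoolThread;

    private WorkerSketch(Builder b) {
        this.device = b.device;
        this.jobQueue = b.jobQueue;
        this.fixPoolThread = b.fixPoolThread;
    }

    // Entry point, analogous to a static builder(...) factory.
    static Builder builder() {
        return new Builder();
    }

    static final class Builder {
        private String device;
        private LinkedBlockingDeque<String> jobQueue;
        private boolean fixPoolThread = true; // assumed default, not taken from the library

        Builder setDevice(String device) {
            this.device = device;
            return this;
        }

        Builder setJobQueue(LinkedBlockingDeque<String> jobQueue) {
            this.jobQueue = jobQueue;
            return this;
        }

        Builder optFixPoolThread(boolean fixPoolThread) {
            this.fixPoolThread = fixPoolThread;
            return this;
        }

        // Validates required fields before creating the worker.
        WorkerSketch build() {
            if (jobQueue == null) {
                throw new IllegalStateException("jobQueue is required");
            }
            return new WorkerSketch(this);
        }
    }
}
```

Required values use set* names and optional ones use opt* names, which matches the naming visible in the index.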
C
- cleanup() - Method in class ai.djl.serving.wlm.WorkerPool
-
Removes all stopped workers and workers in the error state from the pool.
- close() - Method in class ai.djl.serving.wlm.ModelInfo
-
Closes all loaded models.
- close() - Method in class ai.djl.serving.wlm.WorkLoadManager
-
Closes all models related to the WorkLoadManager.
- configureWorkers(int, int) - Method in class ai.djl.serving.wlm.WorkerGroup
-
Configures minimum and maximum number of workers.
D
- decreaseRef() - Method in class ai.djl.serving.wlm.WorkerPool
-
Decreases the reference count and returns the current count.
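The reference counting described by decreaseRef() and increaseRef() can be sketched as follows. This is a minimal illustration, not the WorkerPool implementation: the class RefCountSketch and its use of AtomicInteger are assumptions, and "current count" is read here as the value after the decrement.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a pool that is shared by several owners.
class RefCountSketch {
    private final AtomicInteger refCnt = new AtomicInteger();

    // Records one more owner of the pool.
    void increaseRef() {
        refCnt.incrementAndGet();
    }

    // Drops one owner and returns the count after the decrement.
    int decreaseRef() {
        return refCnt.decrementAndGet();
    }
}
```

A caller would typically release the pool's resources once decreaseRef() returns 0.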
E
- equals(Object) - Method in class ai.djl.serving.wlm.ModelInfo
F
- FAILED - ai.djl.serving.wlm.ModelInfo.Status
G
- generate() - Method in class ai.djl.serving.wlm.WorkerIdGenerator
-
Generates a new worker ID.
- getBatchSize() - Method in class ai.djl.serving.wlm.ModelInfo
-
Returns the configured batch size.
- getBatchSize() - Method in class ai.djl.serving.wlm.util.WlmConfigManager
-
Returns the default batchSize for workers.
- getDefaultMaxWorkers(Model) - Method in class ai.djl.serving.wlm.util.WlmConfigManager
-
Returns the default maximum number of workers for a new registered model.
- getDefaultMinWorkers(Model) - Method in class ai.djl.serving.wlm.util.WlmConfigManager
-
Returns the default minimum number of workers for a new registered model.
- getDevice() - Method in class ai.djl.serving.wlm.WorkerGroup
-
Returns the device of the worker group.
- getDevice() - Method in class ai.djl.serving.wlm.WorkerThread
-
Returns the device used by the thread.
- getEngineName() - Method in class ai.djl.serving.wlm.ModelInfo
-
Returns the engine name.
- getFuture() - Method in class ai.djl.serving.wlm.util.WorkerJob
-
Returns the future for the job.
- getId() - Method in class ai.djl.serving.wlm.ModelInfo
-
Returns the model ID.
- getInput() - Method in class ai.djl.serving.wlm.Job
-
Returns the input data.
- getInputClass() - Method in class ai.djl.serving.wlm.ModelInfo
-
Returns the model input class.
- getInstance() - Static method in class ai.djl.serving.wlm.util.WlmConfigManager
-
Returns the singleton ConfigManager instance.
- getJob() - Method in class ai.djl.serving.wlm.util.WorkerJob
-
Returns the Job.
- getJobQueue() - Method in class ai.djl.serving.wlm.WorkerPool
-
Returns the JobQueue for this model.
- getJobQueueSize() - Method in class ai.djl.serving.wlm.util.WlmConfigManager
-
Returns the default job queue size.
- getLoadOnDevices() - Method in class ai.djl.serving.wlm.ModelInfo
-
Returns the devices the model will be loaded on at startup.
- getLoadOnDevices() - Method in class ai.djl.serving.wlm.util.WlmConfigManager
-
Returns the devices the model will be loaded on at startup.
- getMaxBatchDelayMillis() - Method in class ai.djl.serving.wlm.ModelInfo
-
Returns the maximum delay in milliseconds to aggregate a batch.
- getMaxBatchDelayMillis() - Method in class ai.djl.serving.wlm.util.WlmConfigManager
-
Returns the default max batch delay in milliseconds for the working queue.
- getMaxIdleSeconds() - Method in class ai.djl.serving.wlm.ModelInfo
-
Returns the configured max idle time in seconds of workers.
- getMaxIdleSeconds() - Method in class ai.djl.serving.wlm.util.WlmConfigManager
-
Returns the default max idle time for workers.
- getMaxWorkers() - Method in class ai.djl.serving.wlm.ModelInfo
-
Returns the maximum number of workers.
- getMaxWorkers() - Method in class ai.djl.serving.wlm.WorkerGroup
-
Returns the max number of workers for the model and device.
- getMaxWorkers() - Method in class ai.djl.serving.wlm.WorkerPool
-
Returns the maximum number of workers for a model across all devices.
- getMinWorkers() - Method in class ai.djl.serving.wlm.ModelInfo
-
Returns the minimum number of workers.
- getMinWorkers() - Method in class ai.djl.serving.wlm.WorkerGroup
-
Returns the min number of workers for the model and device.
- getModel() - Method in class ai.djl.serving.wlm.Job
-
Returns the model associated with this job.
- getModel() - Method in class ai.djl.serving.wlm.WorkerPool
-
Returns the model of the worker pool.
- getModel(Device) - Method in class ai.djl.serving.wlm.ModelInfo
-
Returns the loaded ZooModel for a device.
- getModels() - Method in class ai.djl.serving.wlm.ModelInfo
-
Returns all loaded models.
- getModelUrl() - Method in class ai.djl.serving.wlm.ModelInfo
-
Returns the model url.
- getNumRunningWorkers(ModelInfo<?, ?>) - Method in class ai.djl.serving.wlm.WorkLoadManager
-
Returns the number of running workers of a model.
- getOutputClass() - Method in class ai.djl.serving.wlm.ModelInfo
-
Returns the model output class.
- getQueueSize() - Method in class ai.djl.serving.wlm.ModelInfo
-
Returns the configured size of the workers queue.
- getReservedMemoryMb() - Method in class ai.djl.serving.wlm.util.WlmConfigManager
-
Returns the default reserved memory in MB.
- getStartTime() - Method in class ai.djl.serving.wlm.WorkerThread
-
Returns the thread start time.
- getState() - Method in class ai.djl.serving.wlm.WorkerThread
-
Returns the worker state.
- getStatus() - Method in class ai.djl.serving.wlm.ModelInfo
-
Returns the model loading status.
- getVersion() - Method in class ai.djl.serving.wlm.ModelInfo
-
Returns the model version.
- getWaitingMicroSeconds() - Method in class ai.djl.serving.wlm.Job
-
Returns the wait time of this job in microseconds.
- getWorkerGroups() - Method in class ai.djl.serving.wlm.WorkerPool
-
Returns a map of WorkerGroup.
- getWorkerId() - Method in class ai.djl.serving.wlm.WorkerThread
-
Returns the worker thread ID.
- getWorkerPool(ModelInfo<I, O>) - Method in class ai.djl.serving.wlm.WorkLoadManager
-
Returns the WorkerPool for a model.
- getWorkers() - Method in class ai.djl.serving.wlm.WorkerGroup
-
Returns a list of workers.
- getWorkers() - Method in class ai.djl.serving.wlm.WorkerPool
-
Returns a list of worker threads.
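The generate() entry for WorkerIdGenerator in this section describes producing a new worker ID on each call. One common way to do that is an atomic counter, sketched below. The class WorkerIdSketch, the int return type, and the starting value of 1 are assumptions for illustration and may differ from the library.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical ID generator backed by an atomic counter.
class WorkerIdSketch {
    private final AtomicInteger counter = new AtomicInteger(1);

    // Returns a new ID on every call; safe to call from several threads.
    int generate() {
        return counter.getAndIncrement();
    }
}
```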
H
- hashCode() - Method in class ai.djl.serving.wlm.ModelInfo
- hasInputOutputClass(Class<I>, Class<O>) - Method in class ai.djl.serving.wlm.ModelInfo
-
Clarifies the input and output class when not specified.
I
- increaseRef() - Method in class ai.djl.serving.wlm.WorkerPool
-
Increases the reference count.
- inferModelNameFromUrl(String) - Static method in class ai.djl.serving.wlm.ModelInfo
-
Infers the model name from the model URL when the model name is not provided.
- initialize() - Method in class ai.djl.serving.wlm.ModelInfo
-
Initializes the model.
- initWorkers(String, int, int) - Method in class ai.djl.serving.wlm.WorkerPool
-
Initializes new worker capacities for this model.
- isAllWorkerBusy() - Method in class ai.djl.serving.wlm.WorkerPool
-
Returns true if all workers are busy.
- isAllWorkerDied() - Method in class ai.djl.serving.wlm.WorkerPool
-
Returns true if all workers have died.
- isDebug() - Method in class ai.djl.serving.wlm.util.WlmConfigManager
-
Returns whether debug is enabled.
- isFinished() - Method in class ai.djl.serving.wlm.PermanentBatchAggregator
-
Checks if this BatchAggregator and the thread can be shut down or if this aggregator waits for more data.
- isFinished() - Method in class ai.djl.serving.wlm.TemporaryBatchAggregator
-
Checks if this BatchAggregator and the thread can be shut down or if this aggregator waits for more data.
- isFixPoolThread() - Method in class ai.djl.serving.wlm.WorkerThread
-
Checks whether this worker is one of the fixed threads of a pool.
- isFullyScaled() - Method in class ai.djl.serving.wlm.WorkerPool
-
Returns whether the worker groups are fully scaled.
- isParallelLoading() - Method in class ai.djl.serving.wlm.ModelInfo
-
Returns whether the model can be loaded in parallel on multiple devices.
- isRunning() - Method in class ai.djl.serving.wlm.WorkerThread
-
Returns true if the worker thread is running.
J
- Job<I,O> - Class in ai.djl.serving.wlm
-
A class that represents an inference job.
- Job(ModelInfo<I, O>, I) - Constructor for class ai.djl.serving.wlm.Job
-
Constructs a new Job instance.
L
- LmiUtils - Class in ai.djl.serving.wlm
-
A utility class to detect the optimal engine for an LMI model.
- load(Device) - Method in class ai.djl.serving.wlm.ModelInfo
-
Loads the model to the specified device.
M
- ModelInfo<I,O> - Class in ai.djl.serving.wlm
-
A class that represents a loaded model and its metadata.
- ModelInfo(String) - Constructor for class ai.djl.serving.wlm.ModelInfo
-
Constructs a new ModelInfo instance.
- ModelInfo(String, String, Criteria<I, O>) - Constructor for class ai.djl.serving.wlm.ModelInfo
-
Constructs a ModelInfo based on a Criteria.
- ModelInfo(String, String, String, String, String, Class<I>, Class<O>, int, int, int, int, int, int) - Constructor for class ai.djl.serving.wlm.ModelInfo
-
Constructs a new ModelInfo instance.
- ModelInfo.Status - Enum in ai.djl.serving.wlm
-
An enum that represents the state of a model.
O
- optFixPoolThread(boolean) - Method in class ai.djl.serving.wlm.WorkerThread.Builder
-
Sets if the workerThread should be part of the fixed pool.
P
- PENDING - ai.djl.serving.wlm.ModelInfo.Status
- PermanentBatchAggregator<I,O> - Class in ai.djl.serving.wlm
-
A batch aggregator that never terminates by itself.
- PermanentBatchAggregator(ModelInfo<I, O>, LinkedBlockingDeque<WorkerJob<I, O>>) - Constructor for class ai.djl.serving.wlm.PermanentBatchAggregator
-
Constructs a PermanentBatchAggregator instance.
- pollBatch() - Method in class ai.djl.serving.wlm.PermanentBatchAggregator
-
Fills in the list with a batch of jobs.
- pollBatch() - Method in class ai.djl.serving.wlm.TemporaryBatchAggregator
-
Fills in the list with a batch of jobs.
- postWorkflowParsing(String) - Method in class ai.djl.serving.wlm.ModelInfo
-
Performs post workflow parsing initialization.
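The pollBatch() entries in this section, together with the batch size and max batch delay settings listed elsewhere in the index, suggest a loop that collects jobs until the batch is full or the delay runs out. The sketch below shows one way to write such a loop with the JDK alone. The class BatchSketch, the method signature, and the decision to start the delay when polling begins are assumptions, not the aggregator's actual logic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.TimeUnit;

// Hypothetical batching helper; not part of the library.
class BatchSketch {
    // Collects up to batchSize jobs, waiting at most maxDelayMillis in total.
    static <T> List<T> pollBatch(LinkedBlockingDeque<T> queue, int batchSize, long maxDelayMillis) {
        List<T> batch = new ArrayList<>(batchSize);
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxDelayMillis);
        try {
            while (batch.size() < batchSize) {
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0) {
                    break; // delay budget used up
                }
                T next = queue.poll(remaining, TimeUnit.NANOSECONDS);
                if (next == null) {
                    break; // no job arrived before the deadline
                }
                batch.add(next);
            }
        } catch (InterruptedException e) {
            // Preserve the interrupt and return what was collected so far.
            Thread.currentThread().interrupt();
        }
        return batch;
    }
}
```

A larger delay trades latency for fuller batches; a delay close to zero returns whatever is already queued.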
R
- READY - ai.djl.serving.wlm.ModelInfo.Status
- registerModel(ModelInfo<I, O>) - Method in class ai.djl.serving.wlm.WorkLoadManager
-
Registers a model and returns the WorkerPool for it.
- run() - Method in class ai.djl.serving.wlm.WorkerThread
- runJob(Job<I, O>) - Method in class ai.djl.serving.wlm.WorkLoadManager
-
Adds an inference job to the job queue of the next free worker.
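The runJob entry above adds a job to a queue, and the index also lists an exception for the case where the job queue capacity is exceeded. The sketch below shows that throttling idea with a bounded LinkedBlockingDeque and a CompletableFuture. The class RunJobSketch, the nested Pending and CapacityExceeded types, and the choice to fail the returned future rather than throw are all assumptions; the index does not say how the real manager reports the condition.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingDeque;

// Hypothetical job submitter with a bounded queue.
class RunJobSketch {
    // Stand-in for a capacity exception; unchecked to keep the sketch short.
    static class CapacityExceeded extends RuntimeException {
        CapacityExceeded(String message) {
            super(message);
        }
    }

    // Pairs an input with the future that a worker would later complete.
    static final class Pending {
        final String input;
        final CompletableFuture<String> future = new CompletableFuture<>();

        Pending(String input) {
            this.input = input;
        }
    }

    private final LinkedBlockingDeque<Pending> queue;

    RunJobSketch(int queueSize) {
        this.queue = new LinkedBlockingDeque<>(queueSize);
    }

    // Enqueues without blocking; a full queue fails the returned future immediately.
    CompletableFuture<String> runJob(String input) {
        Pending job = new Pending(input);
        if (!queue.offer(job)) {
            job.future.completeExceptionally(new CapacityExceeded("job queue is full"));
        }
        return job.future;
    }
}
```

Using offer instead of put keeps the caller from blocking when the queue is full, which is what makes the rejection usable as a throttle signal.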
S
- scaleWorkers(String, int, int) - Method in class ai.djl.serving.wlm.WorkerPool
-
Sets new worker capacities for this model.
- setBatchSize(int) - Method in class ai.djl.serving.wlm.ModelInfo
-
Sets the configured batch size.
- setBatchSize(int) - Method in class ai.djl.serving.wlm.util.WlmConfigManager
-
Sets the default batchSize for workers.
- setDevice(Device) - Method in class ai.djl.serving.wlm.WorkerThread.Builder
-
Sets the device to run operations on.
- setId(String) - Method in class ai.djl.serving.wlm.ModelInfo
-
Sets the model ID.
- setJobQueue(LinkedBlockingDeque<WorkerJob<I, O>>) - Method in class ai.djl.serving.wlm.WorkerThread.Builder
-
Sets the jobQueue used to poll for new jobs.
- setJobQueueSize(int) - Method in class ai.djl.serving.wlm.util.WlmConfigManager
-
Sets the default job queue size.
- setLoadOnDevices(String) - Method in class ai.djl.serving.wlm.util.WlmConfigManager
-
Sets the devices the model will be loaded on at startup.
- setMaxBatchDelayMillis(int) - Method in class ai.djl.serving.wlm.ModelInfo
-
Sets the maximum delay in milliseconds to aggregate a batch.
- setMaxBatchDelayMillis(int) - Method in class ai.djl.serving.wlm.util.WlmConfigManager
-
Sets the default max batch delay in milliseconds for the working queue.
- setMaxIdleSeconds(int) - Method in class ai.djl.serving.wlm.ModelInfo
-
Sets the configured max idle time in seconds of workers.
- setMaxIdleSeconds(int) - Method in class ai.djl.serving.wlm.util.WlmConfigManager
-
Sets the default max idle time in seconds for workers.
- setQueueSize(int) - Method in class ai.djl.serving.wlm.ModelInfo
-
Sets the configured size of the workers queue.
- setReservedMemoryMb(int) - Method in class ai.djl.serving.wlm.util.WlmConfigManager
-
Sets the reserved memory in MB.
- shutdown() - Method in class ai.djl.serving.wlm.WorkerPool
-
Shuts down all the worker threads in the worker pool.
- shutdown(WorkerState) - Method in class ai.djl.serving.wlm.WorkerThread
-
Shuts down the worker thread.
- shutdownWorkers() - Method in class ai.djl.serving.wlm.WorkerPool
-
Shuts down all workers.
T
- TemporaryBatchAggregator<I,O> - Class in ai.djl.serving.wlm
-
A batch aggregator that terminates after a maximum idle time.
- TemporaryBatchAggregator(ModelInfo<I, O>, LinkedBlockingDeque<WorkerJob<I, O>>) - Constructor for class ai.djl.serving.wlm.TemporaryBatchAggregator
-
A batch aggregator that terminates after a maximum idle time.
- toString() - Method in class ai.djl.serving.wlm.ModelInfo
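The TemporaryBatchAggregator entries above describe an aggregator that terminates after a maximum idle time, while the permanent variant never terminates by itself. A minimal sketch of the idle-timeout idea follows. The class IdleAggregatorSketch, the pollNext method, and the millisecond unit are assumptions for illustration and do not reproduce the library's classes.

```java
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.TimeUnit;

// Hypothetical aggregator that reports itself finished once it has sat idle too long.
class IdleAggregatorSketch<T> {
    private final LinkedBlockingDeque<T> queue;
    private final long maxIdleMillis;
    private boolean idleExpired;

    IdleAggregatorSketch(LinkedBlockingDeque<T> queue, long maxIdleMillis) {
        this.queue = queue;
        this.maxIdleMillis = maxIdleMillis;
    }

    // Waits up to maxIdleMillis for a job; returns null when none arrives in time.
    T pollNext() {
        try {
            T job = queue.poll(maxIdleMillis, TimeUnit.MILLISECONDS);
            idleExpired = (job == null);
            return job;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            idleExpired = true;
            return null;
        }
    }

    // True once a poll has timed out, so the owning thread may stop.
    boolean isFinished() {
        return idleExpired;
    }
}
```

In this framing, a permanent aggregator would simply return false from isFinished() every time.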
U
- unregisterModel(ModelInfo<?, ?>) - Method in class ai.djl.serving.wlm.WorkLoadManager
-
Removes a model from management.
V
- valueOf(String) - Static method in enum ai.djl.serving.wlm.ModelInfo.Status
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum ai.djl.serving.wlm.WorkerState
-
Returns the enum constant of this type with the specified name.
- values() - Static method in enum ai.djl.serving.wlm.ModelInfo.Status
-
Returns an array containing the constants of this enum type, in the order they are declared.
- values() - Static method in enum ai.djl.serving.wlm.WorkerState
-
Returns an array containing the constants of this enum type, in the order they are declared.
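The valueOf(String) and values() entries above are the methods every Java enum gets automatically. valueOf matches the constant name exactly and throws IllegalArgumentException for an unknown name, so callers that parse untrusted strings often wrap it. The sketch below uses a hypothetical enum with the three status names that appear in this index; the declaration order and the parse helper are assumptions.

```java
// Hypothetical wrapper around the standard enum lookup methods.
class EnumLookupSketch {
    // Stand-in enum; the declaration order here is an assumption.
    enum Status { PENDING, READY, FAILED }

    // Parses a status name, returning null instead of throwing on unknown names.
    static Status parse(String name) {
        try {
            return Status.valueOf(name);
        } catch (IllegalArgumentException e) {
            return null;
        }
    }
}
```

The lookup is case-sensitive, so parse("ready") yields null while parse("READY") yields the constant.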
W
- withDefaultDevice(String) - Method in class ai.djl.serving.wlm.ModelInfo
-
Returns the default device for this model if device is null.
- WlmCapacityException - Exception in ai.djl.serving.wlm.util
-
Thrown to throttle when a job is run but the job queue capacity is exceeded.
- WlmCapacityException(String) - Constructor for exception ai.djl.serving.wlm.util.WlmCapacityException
-
Constructs a WlmCapacityException with the specified detail message.
- WlmCapacityException(String, Throwable) - Constructor for exception ai.djl.serving.wlm.util.WlmCapacityException
-
Constructs a WlmCapacityException with the specified detail message and cause.
- WlmConfigManager - Class in ai.djl.serving.wlm.util
-
This manages some configurations used by the WorkLoadManager.
- WlmException - Exception in ai.djl.serving.wlm.util
-
Thrown when an exception occurs inside the WorkLoadManager.
- WlmException(String) - Constructor for exception ai.djl.serving.wlm.util.WlmException
-
Constructs a WlmException with the specified detail message.
- WlmException(String, Throwable) - Constructor for exception ai.djl.serving.wlm.util.WlmException
-
Constructs a WlmException with the specified detail message and cause.
- WlmOutOfMemoryException - Exception in ai.djl.serving.wlm.util
-
Thrown when there is not enough memory to load the model.
- WlmOutOfMemoryException(String) - Constructor for exception ai.djl.serving.wlm.util.WlmOutOfMemoryException
-
Constructs a WlmOutOfMemoryException with the specified detail message.
- WlmShutdownException - Exception in ai.djl.serving.wlm.util
-
Thrown when a job is run but all workers are shut down.
- WlmShutdownException(String) - Constructor for exception ai.djl.serving.wlm.util.WlmShutdownException
-
Constructs a WlmShutdownException with the specified detail message.
- WlmShutdownException(String, Throwable) - Constructor for exception ai.djl.serving.wlm.util.WlmShutdownException
-
Constructs a WlmShutdownException with the specified detail message and cause.
- WORKER_BUSY - ai.djl.serving.wlm.WorkerState
- WORKER_ERROR - ai.djl.serving.wlm.WorkerState
- WORKER_MODEL_LOADED - ai.djl.serving.wlm.WorkerState
- WORKER_SCALED_DOWN - ai.djl.serving.wlm.WorkerState
- WORKER_STARTED - ai.djl.serving.wlm.WorkerState
- WORKER_STOPPED - ai.djl.serving.wlm.WorkerState
- WorkerGroup<I,O> - Class in ai.djl.serving.wlm
- WorkerIdGenerator - Class in ai.djl.serving.wlm
-
A class to generate a unique worker ID.
- WorkerIdGenerator() - Constructor for class ai.djl.serving.wlm.WorkerIdGenerator
- WorkerJob<I,O> - Class in ai.djl.serving.wlm.util
-
A Job containing metadata from the WorkLoadManager.
- WorkerJob(Job<I, O>, CompletableFuture<O>) - Constructor for class ai.djl.serving.wlm.util.WorkerJob
-
Constructs a new WorkerJob.
- WorkerPool<I,O> - Class in ai.djl.serving.wlm
-
Manages the work load for a single model.
- WorkerState - Enum in ai.djl.serving.wlm
-
An enum that represents the state of a worker.
- WorkerThread<I,O> - Class in ai.djl.serving.wlm
-
The WorkerThread is the worker managed by the WorkLoadManager.
- WorkerThread.Builder<I,O> - Class in ai.djl.serving.wlm
-
A Builder to construct a WorkerThread.
- WorkLoadManager - Class in ai.djl.serving.wlm
-
WorkLoadManager is responsible for managing the workload of worker threads.
- WorkLoadManager() - Constructor for class ai.djl.serving.wlm.WorkLoadManager
-
Constructs a WorkLoadManager instance.