Index (DJL Serving WorkLoadManager 0.18.0)

A B C E F G H I J L M O P R S T U V W

A

ai.djl.serving.wlm - package ai.djl.serving.wlm: Contains the model server backend which manages worker threads and executes jobs on models.
ai.djl.serving.wlm.util - package ai.djl.serving.wlm.util: Contains utilities to support the WorkLoadManager.

B

build() - Method in class ai.djl.serving.wlm.WorkerThread.Builder: Builds the WorkerThread with the provided data.
builder(Class<I>, Class<O>) - Static method in class ai.djl.serving.wlm.WorkerThread: Creates a builder to build a WorkerThread.

C

cleanup() - Method in class ai.djl.serving.wlm.WorkLoadManager.WorkerPool: Removes all stopped workers and workers in the error state from the pool.
close() - Method in class ai.djl.serving.wlm.ModelInfo
close() - Method in class ai.djl.serving.wlm.WorkLoadManager
close() - Method in class ai.djl.serving.wlm.WorkLoadManager.WorkerPool
configureModelBatch(int, int) - Method in class ai.djl.serving.wlm.ModelInfo: Sets a new batchSize and returns a new configured ModelInfo object.
configurePool(int) - Method in class ai.djl.serving.wlm.ModelInfo: Sets new configuration for the workerPool backing this model and returns a new configured ModelInfo object.

E

equals(Object) - Method in class ai.djl.serving.wlm.ModelInfo

F

FAILED - ai.djl.serving.wlm.ModelInfo.Status
forDevice(Device) - Method in class ai.djl.serving.wlm.WorkLoadManager.WorkerPool: Returns the WorkLoadManager.WorkerPool.WorkerPoolDevice for a particular Device.

G

generate() - Method in class ai.djl.serving.wlm.WorkerIdGenerator: Generates a new worker ID.
getBatchSize() - Method in class ai.djl.serving.wlm.ModelInfo: Returns the configured batch size.
getBatchSize() - Method in class ai.djl.serving.wlm.util.WlmConfigManager: Returns the default batchSize for workers.
getBegin() - Method in class ai.djl.serving.wlm.Job: Returns the job begin time.
getDefaultMaxWorkers(ModelInfo<?, ?>, Device, int) - Method in class ai.djl.serving.wlm.util.WlmConfigManager: Returns the default maximum number of workers for a new registered model.
getDefaultMinWorkers(ModelInfo<?, ?>, Device, int, int) - Method in class ai.djl.serving.wlm.util.WlmConfigManager: Returns the default minimum number of workers for a new registered model.
getDevice() - Method in class ai.djl.serving.wlm.WorkerThread: Returns the device used by the thread.
getEngineName() - Method in class ai.djl.serving.wlm.ModelInfo: Returns the engine name.
getFuture() - Method in class ai.djl.serving.wlm.util.WorkerJob: Returns the future for the job.
getInput() - Method in class ai.djl.serving.wlm.Job: Returns the input data.
getInputClass() - Method in class ai.djl.serving.wlm.ModelInfo: Returns the model input class.
getInstance() - Static method in class ai.djl.serving.wlm.util.WlmConfigManager: Returns the singleton WlmConfigManager instance.
getJob() - Method in class ai.djl.serving.wlm.util.WorkerJob: Returns the Job.
getJobQueue() - Method in class ai.djl.serving.wlm.WorkLoadManager.WorkerPool: Returns the JobQueue for this model.
getJobQueueSize() - Method in class ai.djl.serving.wlm.util.WlmConfigManager: Returns the default job queue size.
getMaxBatchDelay() - Method in class ai.djl.serving.wlm.ModelInfo: Returns the maximum delay in milliseconds to aggregate a batch.
getMaxBatchDelay() - Method in class ai.djl.serving.wlm.util.WlmConfigManager: Returns the default maxBatchDelay for the working queue.
getMaxIdleTime() - Method in class ai.djl.serving.wlm.ModelInfo: Returns the configured maxIdleTime of workers.
getMaxIdleTime() - Method in class ai.djl.serving.wlm.util.WlmConfigManager: Returns the default max idle time for workers.
getMaxWorkers() - Method in class ai.djl.serving.wlm.WorkLoadManager.WorkerPool: Returns the maximum number of workers for a model across all devices.
getMaxWorkers() - Method in class ai.djl.serving.wlm.WorkLoadManager.WorkerPool.WorkerPoolDevice: Returns the max number of workers for the model and device.
getMinWorkers() - Method in class ai.djl.serving.wlm.WorkLoadManager.WorkerPool: Returns the minimum number of workers for a model across all devices.
getMinWorkers() - Method in class ai.djl.serving.wlm.WorkLoadManager.WorkerPool.WorkerPoolDevice: Returns the min number of workers for the model and device.
getModel() - Method in class ai.djl.serving.wlm.Job: Returns the model associated with this job.
getModel(Device) - Method in class ai.djl.serving.wlm.ModelInfo: Returns the loaded ZooModel for a device.
getModelDir() - Method in class ai.djl.serving.wlm.ModelInfo: Returns the model cache directory.
getModelId() - Method in class ai.djl.serving.wlm.ModelInfo: Returns the model ID.
getModelUrl() - Method in class ai.djl.serving.wlm.ModelInfo: Returns the model url.
getNumRunningWorkers(ModelInfo<?, ?>) - Method in class ai.djl.serving.wlm.WorkLoadManager: Returns the number of running workers of a model.
getOutputClass() - Method in class ai.djl.serving.wlm.ModelInfo: Returns the model output class.
getQueueLength(ModelInfo<?, ?>) - Method in class ai.djl.serving.wlm.WorkLoadManager: Returns the current number of requests in the queue.
getQueueSize() - Method in class ai.djl.serving.wlm.ModelInfo: Returns the configured size of the workers queue.
getStartTime() - Method in class ai.djl.serving.wlm.WorkerThread: Returns the thread start time.
getState() - Method in class ai.djl.serving.wlm.WorkerThread: Returns the worker state.
getStatus() - Method in class ai.djl.serving.wlm.ModelInfo: Returns the model loading status.
getVersion() - Method in class ai.djl.serving.wlm.ModelInfo: Returns the model version.
getWaitingTime() - Method in class ai.djl.serving.wlm.Job: Returns the wait time of this job.
getWorkerId() - Method in class ai.djl.serving.wlm.WorkerThread: Returns the worker thread ID.
getWorkerPoolForModel(ModelInfo<I, O>) - Method in class ai.djl.serving.wlm.WorkLoadManager: Returns the WorkLoadManager.WorkerPool for a model.
getWorkers() - Method in class ai.djl.serving.wlm.WorkLoadManager.WorkerPool: Returns a list of worker threads.
getWorkers(ModelInfo<I, O>) - Method in class ai.djl.serving.wlm.WorkLoadManager: Returns the workers for the specific model.

H

hashCode() - Method in class ai.djl.serving.wlm.ModelInfo
hasInputOutputClass(Class<I>, Class<O>) - Method in class ai.djl.serving.wlm.ModelInfo: Clarifies the input and output class when not specified.

I

inferModelNameFromUrl(String) - Static method in class ai.djl.serving.wlm.ModelInfo: Infers the model name from the model URL when the model name is not provided.
isDebug() - Method in class ai.djl.serving.wlm.util.WlmConfigManager: Returns whether debug is enabled.
isFinished() - Method in class ai.djl.serving.wlm.PermanentBatchAggregator: Checks whether this BatchAggregator and the thread can be shut down or whether this aggregator is waiting for more data.
isFinished() - Method in class ai.djl.serving.wlm.TemporaryBatchAggregator: Checks whether this BatchAggregator and the thread can be shut down or whether this aggregator is waiting for more data.
isFixPoolThread() - Method in class ai.djl.serving.wlm.WorkerThread: Checks whether this worker is one of the fixed threads of a pool.
isRunning() - Method in class ai.djl.serving.wlm.WorkerThread: Returns true if the worker thread is running.

J

Job<I,O> - Class in ai.djl.serving.wlm: A class that represents an inference job.
Job(ModelInfo<I, O>, I) - Constructor for class ai.djl.serving.wlm.Job: Constructs a new Job instance.

L

load(Device) - Method in class ai.djl.serving.wlm.ModelInfo: Loads the model to the specified device.
log() - Method in class ai.djl.serving.wlm.WorkLoadManager.WorkerPool: Logs the current state of this WorkerPool when level "Debug" is enabled.

M

ModelInfo<I,O> - Class in ai.djl.serving.wlm: A class that represents a loaded model and its metadata.
ModelInfo(String, Criteria<I, O>) - Constructor for class ai.djl.serving.wlm.ModelInfo: Constructs a ModelInfo based on a Criteria.
ModelInfo(String, Class<I>, Class<O>) - Constructor for class ai.djl.serving.wlm.ModelInfo: Constructs a new ModelInfo instance.
ModelInfo(String, String, String, String, Class<I>, Class<O>, int, int, int, int) - Constructor for class ai.djl.serving.wlm.ModelInfo: Constructs a new ModelInfo instance.
ModelInfo.Status - Enum in ai.djl.serving.wlm: An enum that represents the state of a model.

O

optAggregator(BatchAggregator<I, O>) - Method in class ai.djl.serving.wlm.WorkerThread.Builder: Sets a BatchAggregator which overrides the instantiated default BatchAggregator.
optFixPoolThread(boolean) - Method in class ai.djl.serving.wlm.WorkerThread.Builder: Sets if the workerThread should be part of the fixed pool.

P

PENDING - ai.djl.serving.wlm.ModelInfo.Status
PermanentBatchAggregator<I,O> - Class in ai.djl.serving.wlm: A batch aggregator that never terminates by itself.
PermanentBatchAggregator(ModelInfo<I, O>, LinkedBlockingDeque<WorkerJob<I, O>>) - Constructor for class ai.djl.serving.wlm.PermanentBatchAggregator: Constructs a PermanentBatchAggregator instance.
pollBatch() - Method in class ai.djl.serving.wlm.PermanentBatchAggregator: Fills in the list with a batch of jobs.
pollBatch() - Method in class ai.djl.serving.wlm.TemporaryBatchAggregator: Fills in the list with a batch of jobs.
preBuildProcessing() - Method in class ai.djl.serving.wlm.WorkerThread.Builder
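
The pollBatch() entries above, together with the batch size and maxBatchDelay settings on ModelInfo, suggest a collect-until-full-or-timeout loop over the LinkedBlockingDeque job queue. The following is a minimal sketch of that idea using only the JDK. The class name BatchPollSketch, the method signature, and the exact waiting policy are assumptions made for illustration; this index does not show how PermanentBatchAggregator or TemporaryBatchAggregator are actually implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not the DJL Serving implementation.
final class BatchPollSketch {

    /**
     * Blocks until one job is available, then keeps collecting jobs until
     * either batchSize jobs are gathered or maxBatchDelayMillis has elapsed.
     */
    static <T> List<T> pollBatch(
            LinkedBlockingDeque<T> queue, int batchSize, long maxBatchDelayMillis)
            throws InterruptedException {
        List<T> batch = new ArrayList<>();
        // Wait without a time limit for the first job (assumed behavior).
        batch.add(queue.take());

        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxBatchDelayMillis);
        while (batch.size() < batchSize) {
            long remaining = deadline - System.nanoTime();
            if (remaining <= 0) {
                break; // the batch delay has been used up
            }
            T next = queue.poll(remaining, TimeUnit.NANOSECONDS);
            if (next == null) {
                break; // nothing arrived before the deadline
            }
            batch.add(next);
        }
        return batch;
    }

    public static void main(String[] args) throws InterruptedException {
        LinkedBlockingDeque<String> queue = new LinkedBlockingDeque<>(10);
        queue.add("a");
        queue.add("b");
        queue.add("c");
        System.out.println(pollBatch(queue, 2, 50)); // prints [a, b]
        System.out.println(pollBatch(queue, 2, 50)); // prints [c]
    }
}
```

In this sketch a larger delay trades latency for fuller batches, which is presumably the reason both values are configurable per model.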

R

READY - ai.djl.serving.wlm.ModelInfo.Status
registerModel(ModelInfo<I, O>) - Method in class ai.djl.serving.wlm.WorkLoadManager: Registers a model and returns the WorkLoadManager.WorkerPool for it.
run() - Method in class ai.djl.serving.wlm.WorkerThread
runJob(Job<I, O>) - Method in class ai.djl.serving.wlm.WorkLoadManager: Adds an inference job to the job queue of the next free worker.

S

scaleWorkers(Device, int, int) - Method in class ai.djl.serving.wlm.WorkLoadManager.WorkerPool: Sets new worker capacities for this model.
self() - Method in class ai.djl.serving.wlm.WorkerThread.Builder: Returns self reference to this builder.
setBatchSize(int) - Method in class ai.djl.serving.wlm.ModelInfo: Sets the configured batch size.
setBatchSize(int) - Method in class ai.djl.serving.wlm.util.WlmConfigManager: Sets the default batchSize for workers.
setDevice(Device) - Method in class ai.djl.serving.wlm.WorkerThread.Builder: Sets the device to run operations on.
setJobQueue(LinkedBlockingDeque<WorkerJob<I, O>>) - Method in class ai.djl.serving.wlm.WorkerThread.Builder: Sets the jobQueue used to poll for new jobs.
setJobQueueSize(int) - Method in class ai.djl.serving.wlm.util.WlmConfigManager: Sets the default job queue size.
setMaxBatchDelay(int) - Method in class ai.djl.serving.wlm.ModelInfo: Sets the maximum delay in milliseconds to aggregate a batch.
setMaxBatchDelay(int) - Method in class ai.djl.serving.wlm.util.WlmConfigManager: Sets the default maxBatchDelay for the working queue.
setMaxIdleTime(int) - Method in class ai.djl.serving.wlm.ModelInfo: Sets the configured maxIdleTime of workers.
setMaxIdleTime(int) - Method in class ai.djl.serving.wlm.util.WlmConfigManager: Sets the default max idle time for workers.
setModel(ModelInfo<I, O>) - Method in class ai.djl.serving.wlm.WorkerThread.Builder: Sets the ModelInfo the thread will be responsible for.
setModelId(String) - Method in class ai.djl.serving.wlm.ModelInfo: Sets the model ID.
setQueueSize(int) - Method in class ai.djl.serving.wlm.ModelInfo: Sets the configured size of the workers queue.
shutdown(WorkerState) - Method in class ai.djl.serving.wlm.WorkerThread: Shuts down the worker thread.

T

TemporaryBatchAggregator<I,O> - Class in ai.djl.serving.wlm: A batch aggregator that terminates after a maximum idle time.
TemporaryBatchAggregator(ModelInfo<I, O>, LinkedBlockingDeque<WorkerJob<I, O>>) - Constructor for class ai.djl.serving.wlm.TemporaryBatchAggregator: A batch aggregator that terminates after a maximum idle time.
toString() - Method in class ai.djl.serving.wlm.ModelInfo

U

unregisterModel(ModelInfo<?, ?>) - Method in class ai.djl.serving.wlm.WorkLoadManager: Removes a model from management.

V

validate() - Method in class ai.djl.serving.wlm.WorkerThread.Builder
valueOf(String) - Static method in enum ai.djl.serving.wlm.ModelInfo.Status: Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum ai.djl.serving.wlm.WorkerState: Returns the enum constant of this type with the specified name.
values() - Static method in enum ai.djl.serving.wlm.ModelInfo.Status: Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum ai.djl.serving.wlm.WorkerState: Returns an array containing the constants of this enum type, in the order they are declared.

W

withDefaultDevice(Device) - Method in class ai.djl.serving.wlm.ModelInfo: Returns the default device for this model if device is null.
WlmCapacityException - Exception in ai.djl.serving.wlm.util: Thrown to throttle when a job is run but the job queue capacity is exceeded.
WlmCapacityException(String) - Constructor for exception ai.djl.serving.wlm.util.WlmCapacityException: Constructs a WlmCapacityException with the specified detail message.
WlmCapacityException(String, Throwable) - Constructor for exception ai.djl.serving.wlm.util.WlmCapacityException: Constructs a WlmCapacityException with the specified detail message and cause.
WlmConfigManager - Class in ai.djl.serving.wlm.util: This manages some configurations used by the WorkLoadManager.
WlmConfigManager() - Constructor for class ai.djl.serving.wlm.util.WlmConfigManager
WlmException - Exception in ai.djl.serving.wlm.util: Thrown when an exception occurs inside the WorkLoadManager.
WlmException(String) - Constructor for exception ai.djl.serving.wlm.util.WlmException: Constructs a WlmException with the specified detail message.
WlmException(String, Throwable) - Constructor for exception ai.djl.serving.wlm.util.WlmException: Constructs a WlmException with the specified detail message and cause.
WlmShutdownException - Exception in ai.djl.serving.wlm.util: Thrown when a job is run but all workers are shut down.
WlmShutdownException(String) - Constructor for exception ai.djl.serving.wlm.util.WlmShutdownException: Constructs a WlmShutdownException with the specified detail message.
WlmShutdownException(String, Throwable) - Constructor for exception ai.djl.serving.wlm.util.WlmShutdownException: Constructs a WlmShutdownException with the specified detail message and cause.
WORKER_ERROR - ai.djl.serving.wlm.WorkerState
WORKER_MODEL_LOADED - ai.djl.serving.wlm.WorkerState
WORKER_SCALED_DOWN - ai.djl.serving.wlm.WorkerState
WORKER_STARTED - ai.djl.serving.wlm.WorkerState
WORKER_STOPPED - ai.djl.serving.wlm.WorkerState
WorkerIdGenerator - Class in ai.djl.serving.wlm: Class to generate a unique worker ID.
WorkerIdGenerator() - Constructor for class ai.djl.serving.wlm.WorkerIdGenerator
WorkerJob<I,O> - Class in ai.djl.serving.wlm.util: A Job containing metadata from the WorkLoadManager.
WorkerJob(Job<I, O>, CompletableFuture<O>) - Constructor for class ai.djl.serving.wlm.util.WorkerJob: Constructs a new WorkerJob.
WorkerPool(ModelInfo<I, O>) - Constructor for class ai.djl.serving.wlm.WorkLoadManager.WorkerPool: Constructs and initializes the data structure.
WorkerState - Enum in ai.djl.serving.wlm: An enum that represents the state of a worker.
WorkerThread<I,O> - Class in ai.djl.serving.wlm: The WorkerThread is the worker managed by the WorkLoadManager.
WorkerThread.Builder<I,O> - Class in ai.djl.serving.wlm: A Builder to construct a WorkerThread.
WorkLoadManager - Class in ai.djl.serving.wlm: WorkLoadManager is responsible for managing the workload of worker threads.
WorkLoadManager() - Constructor for class ai.djl.serving.wlm.WorkLoadManager: Constructs a WorkLoadManager instance.
WorkLoadManager.WorkerPool<I,O> - Class in ai.djl.serving.wlm: Manages the work load for a single model.
WorkLoadManager.WorkerPool.WorkerPoolDevice - Class in ai.djl.serving.wlm: The WorkLoadManager.WorkerPool.WorkerPoolDevice manages the WorkLoadManager.WorkerPool for a particular Device.
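
The entries for runJob(Job<I, O>), WorkerJob(Job<I, O>, CompletableFuture<O>), and WlmCapacityException describe a bounded job queue that rejects work when it is full. A rough JDK-only sketch of that throttling pattern follows. CapacitySketch, PendingJob, QueueFullException, and submit are hypothetical stand-ins, not DJL classes, and this index does not say whether the real runJob throws directly or reports the failure through the returned future; the sketch throws for simplicity.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingDeque;

// Hypothetical illustration only; none of these types are part of DJL Serving.
final class CapacitySketch {

    /** Stand-in for an exception such as WlmCapacityException. */
    static final class QueueFullException extends RuntimeException {
        QueueFullException(String message) {
            super(message);
        }
    }

    /** Pairs an input with the future that will eventually carry its result. */
    static final class PendingJob<I, O> {
        final I input;
        final CompletableFuture<O> future = new CompletableFuture<>();

        PendingJob(I input) {
            this.input = input;
        }
    }

    /** Enqueues a job, or rejects it immediately when the bounded queue is full. */
    static <I, O> CompletableFuture<O> submit(
            LinkedBlockingDeque<PendingJob<I, O>> queue, I input) {
        PendingJob<I, O> job = new PendingJob<>(input);
        // offer() returns false without blocking when the deque is at capacity.
        if (!queue.offer(job)) {
            throw new QueueFullException("job queue capacity exceeded: " + queue.size());
        }
        return job.future;
    }
}
```

A worker would later take the PendingJob from the queue and complete its future with the inference result; rejecting early keeps callers from waiting on work that could never be scheduled in time.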