Package ai.djl.serving.wlm
Contains the model server backend which manages worker threads and executes jobs on models.
Class Summary:

- Adapter: An adapter is a modification producing a variation of a model that can be used during prediction.
- Job<I,O>: Represents an inference job.
- JobFunction<I,O>: A function describing the action to take in a Job.
- LmiConfigRecommender: A utility class to auto-configure LMI model properties.
- LmiUtils: A utility class to detect the optimal engine for an LMI model.
- ModelInfo<I,O>: Represents a loaded model and its metadata.
- PermanentBatchAggregator: A batch aggregator that never terminates by itself.
- PyAdapter: An overload of Adapter for the Python engine.
- SageMakerUtils: A utility class to detect the optimal engine for a SageMaker saved model.
- TemporaryBatchAggregator: A batch aggregator that terminates after a maximum idle time.
- WorkerGroup<I,O>
- WorkerPool<I,O>: Manages the workload for a single model.
- WorkerPoolConfig<I,O>: Represents a task that can be run in the WorkLoadManager.
- An enum representing the state of a worker type.
- WorkerPoolConfig.ThreadConfig: The part of the WorkerPoolConfig for an individual WorkerThread.
- WorkerState: An enum representing the state of a worker.
- WorkerThread<I,O>: The worker managed by the WorkLoadManager.
- WorkerThread.Builder<I,O>: A builder to construct a WorkerThread.
- WorkLoadManager: Responsible for managing the workload across worker threads.