This document is the API specification for the DJL Serving WorkLoadManager.
This module provides worker and thread management for a high-performance inference server.
| Package | Description |
|---|---|
| ai.djl.serving.wlm | Contains the model server backend which manages worker threads and executes jobs on models. |
| ai.djl.serving.wlm.util | Contains utilities to support the WorkLoadManager. |
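
The sketch below illustrates how the classes in `ai.djl.serving.wlm` might be wired together: a model is described, a job is submitted to the `WorkLoadManager`, and the result is returned asynchronously once a worker thread has executed it. The constructor arguments, the `Job` wrapper, and methods such as `runJob` and `close` are assumptions based on the package descriptions above; consult the Javadoc of `ai.djl.serving.wlm` for the exact signatures.

```java
import ai.djl.modality.Input;
import ai.djl.modality.Output;
import ai.djl.serving.wlm.Job;
import ai.djl.serving.wlm.ModelInfo;
import ai.djl.serving.wlm.WorkLoadManager;

import java.util.concurrent.CompletableFuture;

public class WlmExample {

    public static void main(String[] args) {
        // Hypothetical model location; replace with a real model URL or path.
        ModelInfo<Input, Output> model =
                new ModelInfo<>("https://example.com/my-model.zip");

        // The WorkLoadManager owns the worker pools and thread lifecycle.
        WorkLoadManager wlm = new WorkLoadManager();

        Input input = new Input();
        input.add("data", "example payload");

        // Submitting a job hands it to a worker thread for the model;
        // the future completes when inference finishes.
        CompletableFuture<Output> future = wlm.runJob(new Job<>(model, input));
        Output output = future.join();
        System.out.println(output.getCode());

        // Release worker threads when the server shuts down.
        wlm.close();
    }
}
```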