Package ai.djl.serving.wlm
Class ModelInfo<I,O>
java.lang.Object
ai.djl.serving.wlm.WorkerPoolConfig<I,O>
ai.djl.serving.wlm.ModelInfo<I,O>
A class that represents a loaded model and its metadata.
-
Nested Class Summary
Nested classes/interfaces inherited from class ai.djl.serving.wlm.WorkerPoolConfig
WorkerPoolConfig.Status, WorkerPoolConfig.ThreadConfig<I,O>
-
Field Summary
Fields inherited from class ai.djl.serving.wlm.WorkerPoolConfig
batchSize, id, maxBatchDelayMillis, maxIdleSeconds, maxWorkers, minWorkers, modelUrl, queueSize, uid, version
-
Constructor Summary
Constructor - Description
ModelInfo(String modelUrl) - Constructs a new ModelInfo instance.
ModelInfo(String id, String modelUrl, Criteria<I,O> criteria) - Constructs a ModelInfo based on a Criteria.
ModelInfo(String id, String modelUrl, String version, String engineName, String loadOnDevices, Class<I> inputClass, Class<O> outputClass, int queueSize, int maxIdleSeconds, int maxBatchDelayMillis, int batchSize, int minWorkers, int maxWorkers) - Constructs a new ModelInfo instance.
-
Method Summary
Modifier and Type, Method - Description
void close() - Closes all loaded workers.
getAdapter(String name) - Returns an adapter on this ModelInfo.
getAdapters() - Returns the adapters for this model.
ai.djl.engine.Engine getEngine() - Returns the engine.
getEngineName() - Returns the engine name.
getInputClass() - Returns the model input class.
String[] getLoadOnDevices() - Returns the devices the worker type will be loaded on at startup.
int getMaxWorkers(ai.djl.Device device) - Returns the maximum number of workers.
int getMinWorkers(ai.djl.Device device) - Returns the minimum number of workers.
getModel(ai.djl.Device device) - Returns the loaded ZooModel for a device.
getModels() - Returns all loaded models.
getOutputClass() - Returns the model output class.
getProperties() - Returns the properties of the model.
getStatus() - Returns the worker type loading status.
void hasInputOutputClass(Class<I> inputClass, Class<O> outputClass) - Clarifies the input and output class when not specified.
static String inferModelNameFromUrl(String url) - Infers the model name from the model URL if no model name is provided.
void initialize() - Initializes the worker.
boolean isParallelLoading() - Returns whether the worker type can be loaded in parallel on multiple devices.
void load(ai.djl.Device device) - Loads the worker type to the specified device.
newThread(ai.djl.Device device) - Starts a new WorkerThread for this WorkerPoolConfig.
void registerAdapter(Adapter adapter) - Adds an adapter to this ModelInfo.
unregisterAdapter(String name) - Removes an adapter from this ModelInfo.
ai.djl.Device withDefaultDevice(String deviceName) - Returns the default device for this model if device is null.
Methods inherited from class ai.djl.serving.wlm.WorkerPoolConfig
equals, getBatchSize, getId, getMaxBatchDelayMillis, getMaxIdleSeconds, getModelUrl, getQueueSize, getUid, getVersion, hashCode, setBatchSize, setId, setMaxBatchDelayMillis, setMaxIdleSeconds, setMaxWorkers, setMinMaxWorkers, setMinWorkers, setQueueSize, toString
-
Constructor Details
-
ModelInfo
Constructs a new ModelInfo instance.
Parameters:
modelUrl - the model URL
-
ModelInfo
Constructs a ModelInfo based on a Criteria.
Parameters:
id - the ID for the created ModelInfo
modelUrl - the model URL
criteria - the model criteria
-
ModelInfo
public ModelInfo(String id, String modelUrl, String version, String engineName, String loadOnDevices, Class<I> inputClass, Class<O> outputClass, int queueSize, int maxIdleSeconds, int maxBatchDelayMillis, int batchSize, int minWorkers, int maxWorkers)
Constructs a new ModelInfo instance.
Parameters:
id - the ID of the model that will be used by the workflow
modelUrl - the model URL
version - the version of the model
engineName - the engine to load the model
loadOnDevices - the devices to load the model on
inputClass - the model input class
outputClass - the model output class
queueSize - the maximum request queue size
maxIdleSeconds - the initial maximum idle time for workers
maxBatchDelayMillis - the initial maximum delay when scaling up before giving up
batchSize - the batch size for this model
minWorkers - the minimum number of workers
maxWorkers - the maximum number of workers
-
-
Method Details
-
getProperties
Returns the properties of the model.
Returns:
the properties of the model
-
load
Loads the worker type to the specified device.
Specified by:
load in class WorkerPoolConfig<I,O>
Parameters:
device - the device to load the worker type on
Throws:
ai.djl.ModelException - if failed to load the specified model
IOException - if failed to read the worker type file
-
getModels
Returns all loaded models.
Returns:
all loaded models
-
getModel
Returns the loaded ZooModel for a device.
Parameters:
device - the device to return the model on
Returns:
the loaded ZooModel
-
newThread
Starts a new WorkerThread for this WorkerPoolConfig.
Specified by:
newThread in class WorkerPoolConfig<I,O>
Parameters:
device - the device to run on
Returns:
the new WorkerPoolConfig.ThreadConfig
-
getEngine
public ai.djl.engine.Engine getEngine()
Returns the engine.
Returns:
the engine
-
getEngineName
Returns the engine name.
Returns:
the engine name
-
getStatus
Returns the worker type loading status.
Specified by:
getStatus in class WorkerPoolConfig<I,O>
Returns:
the worker type loading status
-
getInputClass
Returns the model input class.
Returns:
the model input class
-
getOutputClass
Returns the model output class.
Returns:
the model output class
-
hasInputOutputClass
Clarifies the input and output class when not specified.
Warning: This is intended for internal use with reflection.
Parameters:
inputClass - the model input class
outputClass - the model output class
-
getMinWorkers
public int getMinWorkers(ai.djl.Device device)
Returns the minimum number of workers.
Overrides:
getMinWorkers in class WorkerPoolConfig<I,O>
Parameters:
device - the device to get the min workers for
Returns:
the minimum number of workers
-
getMaxWorkers
public int getMaxWorkers(ai.djl.Device device)
Returns the maximum number of workers.
Overrides:
getMaxWorkers in class WorkerPoolConfig<I,O>
Parameters:
device - the device to get the max workers for
Returns:
the maximum number of workers
-
initialize
Initializes the worker.
Specified by:
initialize in class WorkerPoolConfig<I,O>
Throws:
IOException - if failed to download the worker
ModelNotFoundException - if the model is not found
ai.djl.ModelException
-
registerAdapter
Adds an adapter to this ModelInfo.
Parameters:
adapter - the adapter to add
-
unregisterAdapter
Removes an adapter from this ModelInfo.
Parameters:
name - the adapter to remove
Returns:
the removed adapter
-
getAdapters
Returns the adapters for this model.
Returns:
the adapters for this model
-
getAdapter
Returns an adapter on this ModelInfo.
Parameters:
name - the adapter name to get
Returns:
the adapter
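Illustrative sketch, not the DJL Serving implementation: the adapter methods (registerAdapter, unregisterAdapter, getAdapters, getAdapter) behave like a registry keyed by adapter name. The class AdapterRegistrySketch and its nested stand-in Adapter type below are hypothetical, and rejecting duplicate names is an assumption of this sketch rather than documented behavior.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical name-keyed registry; a stand-in for illustration only. */
final class AdapterRegistrySketch {

    /** Hypothetical minimal adapter: a name plus a source location. */
    static final class Adapter {
        private final String name;
        private final String src;

        Adapter(String name, String src) {
            this.name = name;
            this.src = src;
        }

        String getName() {
            return name;
        }

        String getSrc() {
            return src;
        }
    }

    private final Map<String, Adapter> adapters = new ConcurrentHashMap<>();

    /** Adds an adapter. Rejecting a duplicate name is an assumption of this sketch. */
    void registerAdapter(Adapter adapter) {
        if (adapters.putIfAbsent(adapter.getName(), adapter) != null) {
            throw new IllegalArgumentException(
                    "Adapter already registered: " + adapter.getName());
        }
    }

    /** Removes and returns the adapter, or null if the name is unknown. */
    Adapter unregisterAdapter(String name) {
        return adapters.remove(name);
    }

    /** Returns the adapter registered under the name, or null if absent. */
    Adapter getAdapter(String name) {
        return adapters.get(name);
    }

    /** Returns a read-only view of all registered adapters. */
    Map<String, Adapter> getAdapters() {
        return Collections.unmodifiableMap(adapters);
    }

    public static void main(String[] args) {
        AdapterRegistrySketch registry = new AdapterRegistrySketch();
        registry.registerAdapter(new Adapter("demo", "/tmp/demo-adapter"));
        System.out.println(registry.getAdapters().keySet());
        System.out.println(registry.unregisterAdapter("demo").getSrc());
    }
}
```

Keying by name keeps lookup and removal symmetric with the name parameter that getAdapter and unregisterAdapter accept.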
-
close
public void close()
Closes all loaded workers.
Specified by:
close in class WorkerPoolConfig<I,O>
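Illustrative sketch, not the DJL Serving implementation: load, getModel, getModels, and close together suggest per-device bookkeeping of loaded models. The class PerDeviceModelsSketch, the FakeModel type, and the use of plain strings as device names are all hypothetical stand-ins; throwing when a device has no loaded model is likewise an assumption.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical per-device model table; a stand-in for illustration only. */
final class PerDeviceModelsSketch implements AutoCloseable {

    /** Hypothetical closeable resource standing in for a loaded model. */
    static final class FakeModel implements AutoCloseable {
        final String device;
        boolean closed;

        FakeModel(String device) {
            this.device = device;
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    private final Map<String, FakeModel> models = new ConcurrentHashMap<>();

    /** Loads a model for the device once; repeated calls reuse the existing one. */
    void load(String device) {
        models.computeIfAbsent(device, FakeModel::new);
    }

    /** Returns the model loaded on the device; failing when absent is an assumption. */
    FakeModel getModel(String device) {
        FakeModel model = models.get(device);
        if (model == null) {
            throw new IllegalStateException("Model not loaded on device: " + device);
        }
        return model;
    }

    /** Returns a read-only view of all loaded models, keyed by device. */
    Map<String, FakeModel> getModels() {
        return Collections.unmodifiableMap(models);
    }

    /** Closes every loaded model and forgets it. */
    @Override
    public void close() {
        for (FakeModel model : models.values()) {
            model.close();
        }
        models.clear();
    }

    public static void main(String[] args) {
        try (PerDeviceModelsSketch info = new PerDeviceModelsSketch()) {
            info.load("cpu");
            System.out.println(info.getModels().keySet());
        }
    }
}
```

Implementing AutoCloseable in the sketch lets try-with-resources release every per-device resource in one place.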
-
inferModelNameFromUrl
Infers the model name from the model URL if no model name is provided.
Parameters:
url - the model URL
Returns:
the model name
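Illustrative sketch of one way a name could be derived from a URL. This page does not document the actual rules used by inferModelNameFromUrl, so everything below is an assumption: the helper ModelNameSketch.inferName is hypothetical, as are the choices to take the last path segment, strip a known archive extension, and fall back to "model".

```java
/** Hypothetical helper; not the DJL Serving implementation. */
final class ModelNameSketch {

    // Archive suffixes stripped from the last path segment (assumed list).
    private static final String[] ARCHIVE_SUFFIXES = {".tar.gz", ".tgz", ".zip", ".tar"};

    private ModelNameSketch() {}

    /** Derives a name from the last path segment of the URL. */
    static String inferName(String url) {
        String s = url;
        // Drop any query string and fragment.
        int query = s.indexOf('?');
        if (query >= 0) {
            s = s.substring(0, query);
        }
        int fragment = s.indexOf('#');
        if (fragment >= 0) {
            s = s.substring(0, fragment);
        }
        // Ignore trailing slashes so directory URLs still yield a name.
        while (s.endsWith("/")) {
            s = s.substring(0, s.length() - 1);
        }
        String name = s.substring(s.lastIndexOf('/') + 1);
        for (String suffix : ARCHIVE_SUFFIXES) {
            if (name.endsWith(suffix)) {
                name = name.substring(0, name.length() - suffix.length());
                break;
            }
        }
        // Fallback value is an assumption of this sketch.
        return name.isEmpty() ? "model" : name;
    }

    public static void main(String[] args) {
        System.out.println(inferName("https://example.com/models/resnet18.zip"));
        System.out.println(inferName("file:///opt/ml/model/"));
    }
}
```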
-
withDefaultDevice
Returns the default device for this model if device is null.
Overrides:
withDefaultDevice in class WorkerPoolConfig<I,O>
Parameters:
deviceName - the device to use if it is not null
Returns:
a non-null device
-
getLoadOnDevices
Returns the devices the worker type will be loaded on at startup.
Specified by:
getLoadOnDevices in class WorkerPoolConfig<I,O>
Returns:
the devices the worker type will be loaded on at startup
-
isParallelLoading
public boolean isParallelLoading()
Returns whether the worker type can be loaded in parallel on multiple devices.
Specified by:
isParallelLoading in class WorkerPoolConfig<I,O>
Returns:
whether the worker type can be loaded in parallel on multiple devices
-