Package ai.djl.serving.wlm
Class ModelInfo<I,O>
- java.lang.Object
-
- ai.djl.serving.wlm.WorkerPoolConfig<I,O>
-
- ai.djl.serving.wlm.ModelInfo<I,O>
-
public final class ModelInfo<I,O> extends WorkerPoolConfig<I,O>
A class that represents a loaded model and its metadata.
-
-
Nested Class Summary
Modifier and Type | Class | Description
--- | --- | ---
protected class | ModelInfo.ModelThread |
-
Nested classes/interfaces inherited from class ai.djl.serving.wlm.WorkerPoolConfig
WorkerPoolConfig.Status, WorkerPoolConfig.ThreadConfig<I,O>
-
-
Field Summary
-
Fields inherited from class ai.djl.serving.wlm.WorkerPoolConfig
batchSize, id, maxBatchDelayMillis, maxIdleSeconds, maxWorkers, minWorkers, modelUrl, queueSize, version
-
-
Constructor Summary
Constructor | Description
--- | ---
ModelInfo(java.lang.String modelUrl) | Constructs a new ModelInfo instance.
ModelInfo(java.lang.String id, java.lang.String modelUrl, ai.djl.repository.zoo.Criteria<I,O> criteria) | Constructs a ModelInfo based on a Criteria.
ModelInfo(java.lang.String id, java.lang.String modelUrl, java.lang.String version, java.lang.String engineName, java.lang.String loadOnDevices, java.lang.Class<I> inputClass, java.lang.Class<O> outputClass, int queueSize, int maxIdleSeconds, int maxBatchDelayMillis, int batchSize, int minWorkers, int maxWorkers) | Constructs a new ModelInfo instance.
-
Method Summary
Modifier and Type | Method | Description
--- | --- | ---
void | close() | Closes all loaded workers.
Adapter | getAdapter(java.lang.String name) | Returns an adapter on this ModelInfo.
java.util.Map<java.lang.String,Adapter> | getAdapters() | Returns the adapters for this model.
ai.djl.engine.Engine | getEngine() | Returns the engine.
java.lang.String | getEngineName() | Returns the engine name.
java.lang.Class<I> | getInputClass() | Returns the model input class.
java.lang.String[] | getLoadOnDevices() | Returns the devices the worker type will be loaded on at startup.
int | getMaxWorkers(ai.djl.Device device) | Returns the maximum number of workers.
int | getMinWorkers(ai.djl.Device device) | Returns the minimum number of workers.
ai.djl.repository.zoo.ZooModel<I,O> | getModel(ai.djl.Device device) | Returns the loaded ZooModel for a device.
java.util.Map<ai.djl.Device,ai.djl.repository.zoo.ZooModel<I,O>> | getModels() | Returns all loaded models.
java.lang.Class<O> | getOutputClass() | Returns the model output class.
WorkerPoolConfig.Status | getStatus() | Returns the worker type loading status.
void | hasInputOutputClass(java.lang.Class<I> inputClass, java.lang.Class<O> outputClass) | Clarifies the input and output class when not specified.
static java.lang.String | inferModelNameFromUrl(java.lang.String url) | Infers the model name from the model URL when no model name is provided.
void | initialize() | Initializes the worker.
boolean | isParallelLoading() | Returns whether the worker type can be loaded in parallel on multiple devices.
void | load(ai.djl.Device device) | Loads the worker type to the specified device.
WorkerPoolConfig.ThreadConfig<I,O> | newThread(ai.djl.Device device) | Starts a new WorkerThread for this WorkerPoolConfig.
void | postWorkflowParsing(java.lang.String workflowDir) | Performs post workflow parsing initialization.
void | registerAdapter(Adapter adapter) | Adds an adapter to this ModelInfo.
Adapter | unregisterAdapter(java.lang.String name) | Removes an adapter from this ModelInfo.
ai.djl.Device | withDefaultDevice(java.lang.String deviceName) | Returns the default device for this model if device is null.
-
Methods inherited from class ai.djl.serving.wlm.WorkerPoolConfig
equals, getBatchSize, getId, getMaxBatchDelayMillis, getMaxIdleSeconds, getModelUrl, getQueueSize, getVersion, hashCode, setBatchSize, setId, setMaxBatchDelayMillis, setMaxIdleSeconds, setMaxWorkers, setMinMaxWorkers, setMinWorkers, setQueueSize, toString
-
-
-
-
Constructor Detail
-
ModelInfo
public ModelInfo(java.lang.String modelUrl)
Constructs a new ModelInfo instance.
Parameters:
modelUrl - the model URL
-
ModelInfo
public ModelInfo(java.lang.String id, java.lang.String modelUrl, ai.djl.repository.zoo.Criteria<I,O> criteria)
Constructs a ModelInfo based on a Criteria.
Parameters:
id - the id for the created ModelInfo
modelUrl - the model URL
criteria - the model criteria
-
ModelInfo
public ModelInfo(java.lang.String id, java.lang.String modelUrl, java.lang.String version, java.lang.String engineName, java.lang.String loadOnDevices, java.lang.Class<I> inputClass, java.lang.Class<O> outputClass, int queueSize, int maxIdleSeconds, int maxBatchDelayMillis, int batchSize, int minWorkers, int maxWorkers)
Constructs a new ModelInfo instance.
Parameters:
id - the ID of the model that will be used by the workflow
modelUrl - the model URL
version - the version of the model
engineName - the engine to load the model
loadOnDevices - the devices to load the model on
inputClass - the model input class
outputClass - the model output class
queueSize - the maximum request queue size
maxIdleSeconds - the initial maximum idle time for workers
maxBatchDelayMillis - the initial maximum delay when scaling up before giving up
batchSize - the batch size for this model
minWorkers - the minimum number of workers
maxWorkers - the maximum number of workers
-
-
Method Detail
-
postWorkflowParsing
public void postWorkflowParsing(java.lang.String workflowDir)
Performs post workflow parsing initialization.
Parameters:
workflowDir - the workflow parent directory
-
load
public void load(ai.djl.Device device) throws ai.djl.ModelException, java.io.IOException
Loads the worker type to the specified device.
Specified by:
load in class WorkerPoolConfig<I,O>
Parameters:
device - the device to load the worker type on
Throws:
ai.djl.ModelException - if the specified model fails to load
java.io.IOException - if the worker type file cannot be read
-
getModels
public java.util.Map<ai.djl.Device,ai.djl.repository.zoo.ZooModel<I,O>> getModels()
Returns all loaded models.
Returns:
all loaded models
-
getModel
public ai.djl.repository.zoo.ZooModel<I,O> getModel(ai.djl.Device device)
Returns the loaded ZooModel for a device.
Parameters:
device - the device to return the model on
Returns:
the loaded ZooModel
-
newThread
public WorkerPoolConfig.ThreadConfig<I,O> newThread(ai.djl.Device device)
Starts a new WorkerThread for this WorkerPoolConfig.
Specified by:
newThread in class WorkerPoolConfig<I,O>
Parameters:
device - the device to run on
Returns:
the new WorkerPoolConfig.ThreadConfig
-
getEngine
public ai.djl.engine.Engine getEngine()
Returns the engine.
Returns:
the engine
-
getEngineName
public java.lang.String getEngineName()
Returns the engine name.
Returns:
the engine name
-
getStatus
public WorkerPoolConfig.Status getStatus()
Returns the worker type loading status.
Specified by:
getStatus in class WorkerPoolConfig<I,O>
Returns:
the worker type loading status
-
getInputClass
public java.lang.Class<I> getInputClass()
Returns the model input class.
Returns:
the model input class
-
getOutputClass
public java.lang.Class<O> getOutputClass()
Returns the model output class.
Returns:
the model output class
-
hasInputOutputClass
public void hasInputOutputClass(java.lang.Class<I> inputClass, java.lang.Class<O> outputClass)
Clarifies the input and output class when not specified.
Warning: This is intended for internal use with reflection.
Parameters:
inputClass - the model input class
outputClass - the model output class
-
getMinWorkers
public int getMinWorkers(ai.djl.Device device)
Returns the minimum number of workers.
Overrides:
getMinWorkers in class WorkerPoolConfig<I,O>
Parameters:
device - the device to get the min workers for
Returns:
the minimum number of workers
-
getMaxWorkers
public int getMaxWorkers(ai.djl.Device device)
Returns the maximum number of workers.
Overrides:
getMaxWorkers in class WorkerPoolConfig<I,O>
Parameters:
device - the device to get the max workers for
Returns:
the maximum number of workers
-
initialize
public void initialize() throws java.io.IOException, ai.djl.ModelException
Initializes the worker.
Specified by:
initialize in class WorkerPoolConfig<I,O>
Throws:
java.io.IOException - if the worker fails to download
ai.djl.repository.zoo.ModelNotFoundException - if the model is not found
ai.djl.ModelException
-
registerAdapter
public void registerAdapter(Adapter adapter)
Adds an adapter to this ModelInfo.
Parameters:
adapter - the adapter to add
-
unregisterAdapter
public Adapter unregisterAdapter(java.lang.String name)
Removes an adapter from this ModelInfo.
Parameters:
name - the name of the adapter to remove
Returns:
the removed adapter
-
getAdapters
public java.util.Map<java.lang.String,Adapter> getAdapters()
Returns the adapters for this model.
Returns:
the adapters for this model
-
getAdapter
public Adapter getAdapter(java.lang.String name)
Returns an adapter on this ModelInfo.
Parameters:
name - the adapter name to get
Returns:
the adapter
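The four adapter methods above (registerAdapter, unregisterAdapter, getAdapters, getAdapter) have the shape of a name-keyed registry. The following is a minimal, self-contained sketch of that pattern only. NamedAdapter and AdapterRegistrySketch are hypothetical stand-ins, not DJL Serving classes, and the use of ConcurrentHashMap and the null result for an unknown name are assumptions rather than documented ModelInfo behavior.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for an adapter; only its name matters in this sketch.
final class NamedAdapter {
    final String name;
    final String src;

    NamedAdapter(String name, String src) {
        this.name = name;
        this.src = src;
    }
}

// Hypothetical registry that mirrors the signatures of the four adapter methods.
class AdapterRegistrySketch {
    // Assumption: a concurrent map, since a serving process may touch it from several threads.
    private final Map<String, NamedAdapter> adapters = new ConcurrentHashMap<>();

    // Adds an adapter, keyed by its name (replaces any adapter with the same name).
    void registerAdapter(NamedAdapter adapter) {
        adapters.put(adapter.name, adapter);
    }

    // Removes an adapter by name and returns it, or null if none was registered (assumption).
    NamedAdapter unregisterAdapter(String name) {
        return adapters.remove(name);
    }

    // Returns all registered adapters, keyed by name.
    Map<String, NamedAdapter> getAdapters() {
        return adapters;
    }

    // Looks up a single adapter by name, or null if absent (assumption).
    NamedAdapter getAdapter(String name) {
        return adapters.get(name);
    }

    public static void main(String[] args) {
        AdapterRegistrySketch registry = new AdapterRegistrySketch();
        registry.registerAdapter(new NamedAdapter("adapter-a", "/hypothetical/path/adapter-a"));
        System.out.println(registry.getAdapters().keySet());
        System.out.println(registry.unregisterAdapter("adapter-a").name);
    }
}
```

Keying by name keeps lookup and removal to a single map operation, which matches the String name parameter that getAdapter and unregisterAdapter take.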
-
close
public void close()
Closes all loaded workers.
Specified by:
close in class WorkerPoolConfig<I,O>
-
inferModelNameFromUrl
public static java.lang.String inferModelNameFromUrl(java.lang.String url)
Infers the model name from the model URL when no model name is provided.
Parameters:
url - the model URL
Returns:
the model name
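The documentation for inferModelNameFromUrl does not say how the name is derived. As an illustration of the general idea only, the sketch below takes the last path segment of a URL and strips a few common archive extensions. ModelNameSketch, inferName, the extension list, and the example URLs are all hypothetical; this is not DJL Serving's implementation and its results may differ from the real method.

```java
// Hypothetical sketch; not the DJL Serving implementation of inferModelNameFromUrl.
class ModelNameSketch {

    // Assumed list of archive suffixes to strip; longer suffixes come first.
    private static final String[] ARCHIVE_SUFFIXES = {".tar.gz", ".tgz", ".zip", ".tar"};

    // Derives a name from the last path segment of the given URL string.
    static String inferName(String url) {
        String s = url;
        // Drop any query string or fragment.
        int query = s.indexOf('?');
        if (query >= 0) {
            s = s.substring(0, query);
        }
        int fragment = s.indexOf('#');
        if (fragment >= 0) {
            s = s.substring(0, fragment);
        }
        // Ignore trailing slashes so a directory URL still yields its last segment.
        while (s.endsWith("/")) {
            s = s.substring(0, s.length() - 1);
        }
        // Keep only the text after the final slash.
        s = s.substring(s.lastIndexOf('/') + 1);
        // Strip the first matching archive suffix, if any.
        for (String suffix : ARCHIVE_SUFFIXES) {
            if (s.endsWith(suffix)) {
                return s.substring(0, s.length() - suffix.length());
            }
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(inferName("https://example.com/models/bert.zip?version=1"));
        System.out.println(inferName("file:///opt/ml/model/resnet18.tar.gz"));
    }
}
```

Plain string operations are used instead of java.net.URI so that URL strings containing characters a strict URI parser rejects do not throw.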
-
withDefaultDevice
public ai.djl.Device withDefaultDevice(java.lang.String deviceName)
Returns the default device for this model if device is null.
Overrides:
withDefaultDevice in class WorkerPoolConfig<I,O>
Parameters:
deviceName - the device to use if it is not null
Returns:
a non-null device
-
getLoadOnDevices
public java.lang.String[] getLoadOnDevices()
Returns the devices the worker type will be loaded on at startup.
Specified by:
getLoadOnDevices in class WorkerPoolConfig<I,O>
Returns:
the devices the worker type will be loaded on at startup
-
isParallelLoading
public boolean isParallelLoading()
Returns whether the worker type can be loaded in parallel on multiple devices.
Specified by:
isParallelLoading in class WorkerPoolConfig<I,O>
Returns:
true if the worker type can be loaded in parallel on multiple devices
-
-