Package ai.djl.serving.wlm
Class ModelInfo<I,O>
- java.lang.Object
-
- ai.djl.serving.wlm.ModelInfo<I,O>
-
- All Implemented Interfaces:
java.lang.AutoCloseable
public final class ModelInfo<I,O> extends java.lang.Object implements java.lang.AutoCloseable
A class that represents a loaded model and its metadata.
-
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static class | ModelInfo.Status | An enum that represents the state of a model.
-
Constructor Summary
Constructors
Constructor | Description
ModelInfo(java.lang.String id, ai.djl.repository.zoo.Criteria<I,O> criteria) | Constructs a ModelInfo based on a Criteria.
ModelInfo(java.lang.String modelUrl, java.lang.Class<I> inputClass, java.lang.Class<O> outputClass) | Constructs a new ModelInfo instance.
ModelInfo(java.lang.String id, java.lang.String modelUrl, java.lang.String version, java.lang.String engineName, java.lang.Class<I> inputClass, java.lang.Class<O> outputClass, int queueSize, int maxIdleTime, int maxBatchDelay, int batchSize) | Constructs a new ModelInfo instance.
-
Method Summary
Modifier and Type | Method | Description
void | close() |
ModelInfo<I,O> | configureModelBatch(int batchSize, int maxBatchDelay) | Sets a new batchSize and returns a new configured ModelInfo object.
ModelInfo<I,O> | configurePool(int maxIdleTime) | Sets new configuration for the workerPool backing this model and returns a new configured ModelInfo object.
boolean | equals(java.lang.Object o) |
int | getBatchSize() | Returns the configured batch size.
java.lang.String | getEngineName() | Returns the engine name.
java.lang.Class<I> | getInputClass() | Returns the model input class.
int | getMaxBatchDelay() | Returns the maximum delay in milliseconds to aggregate a batch.
int | getMaxIdleTime() | Returns the configured maxIdleTime of workers.
ai.djl.repository.zoo.ZooModel<I,O> | getModel(ai.djl.Device device) | Returns the loaded ZooModel for a device.
java.nio.file.Path | getModelDir() | Returns the model cache directory.
java.lang.String | getModelId() | Returns the model ID.
java.lang.String | getModelUrl() | Returns the model URL.
java.lang.Class<O> | getOutputClass() | Returns the model output class.
int | getQueueSize() | Returns the configured size of the workers queue.
ModelInfo.Status | getStatus() | Returns the model loading status.
java.lang.String | getVersion() | Returns the model version.
int | hashCode() |
void | hasInputOutputClass(java.lang.Class<I> inputClass, java.lang.Class<O> outputClass) | Clarifies the input and output class when not specified.
static java.lang.String | inferModelNameFromUrl(java.lang.String url) | Infers the model name from the model URL in case the model name is not provided.
void | load(ai.djl.Device device) | Loads the model to the specified device.
void | setBatchSize(int batchSize) | Sets the configured batch size.
void | setMaxBatchDelay(int maxBatchDelay) | Sets the maximum delay in milliseconds to aggregate a batch.
void | setMaxIdleTime(int maxIdleTime) | Sets the configured maxIdleTime of workers.
void | setModelId(java.lang.String id) | Sets the model ID.
void | setQueueSize(int queueSize) | Sets the configured size of the workers queue.
java.lang.String | toString() |
ai.djl.Device | withDefaultDevice(ai.djl.Device device) | Returns the default device for this model if device is null.
-
-
-
Constructor Detail
-
ModelInfo
public ModelInfo(java.lang.String modelUrl, java.lang.Class<I> inputClass, java.lang.Class<O> outputClass)
Constructs a new ModelInfo instance.
- Parameters:
inputClass - the model input class
outputClass - the model output class
modelUrl - the model URL
-
ModelInfo
public ModelInfo(java.lang.String id, ai.djl.repository.zoo.Criteria<I,O> criteria)
Constructs a ModelInfo based on a Criteria.
- Parameters:
id - the id for the created ModelInfo
criteria - the model criteria
-
ModelInfo
public ModelInfo(java.lang.String id, java.lang.String modelUrl, java.lang.String version, java.lang.String engineName, java.lang.Class<I> inputClass, java.lang.Class<O> outputClass, int queueSize, int maxIdleTime, int maxBatchDelay, int batchSize)
Constructs a new ModelInfo instance.
- Parameters:
id - the ID of the model that will be used by workflow
modelUrl - the model URL
version - the version of the model
engineName - the engine to load the model
inputClass - the model input class
outputClass - the model output class
queueSize - the maximum request queue size
maxIdleTime - the initial maximum idle time for workers
maxBatchDelay - the initial maximum delay when scaling up before giving up
batchSize - the batch size for this model
-
-
Method Detail
-
load
public void load(ai.djl.Device device) throws ai.djl.ModelException, java.io.IOException
Loads the model to the specified device.
- Parameters:
device - the device to load the model on
- Throws:
java.io.IOException - if the model file cannot be read
ai.djl.ModelException - if the specified model fails to load
-
configureModelBatch
public ModelInfo<I,O> configureModelBatch(int batchSize, int maxBatchDelay)
Sets a new batchSize and returns a new configured ModelInfo object. You have to triggerUpdates in the ModelManager using this new model.
- Parameters:
batchSize - the batchSize to set
maxBatchDelay - the maximum time to wait for free space in the worker queue, after scaling up workers, before giving up on offering the job to the queue
- Returns:
the new configured ModelInfo
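The "returns a new configured ModelInfo object" contract can be pictured with a small stand-in class. The sketch below is not DJL code: BatchSettings and its fields are hypothetical, and it assumes that the original object is left unchanged, which is why the caller has to pass the returned object on instead of continuing to use the old one.

```java
/** Hypothetical stand-in for a model configuration; not part of DJL. */
final class BatchSettings {
    private final int batchSize;
    private final int maxBatchDelay;

    BatchSettings(int batchSize, int maxBatchDelay) {
        this.batchSize = batchSize;
        this.maxBatchDelay = maxBatchDelay;
    }

    /** Returns a new object with the requested values; this object is not modified. */
    BatchSettings configureModelBatch(int newBatchSize, int newMaxBatchDelay) {
        return new BatchSettings(newBatchSize, newMaxBatchDelay);
    }

    int getBatchSize() {
        return batchSize;
    }

    int getMaxBatchDelay() {
        return maxBatchDelay;
    }
}
```

Under this assumption, code that keeps a reference to the old object continues to see the old batch size, so the new object is the one that has to be registered.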
-
configurePool
public ModelInfo<I,O> configurePool(int maxIdleTime)
Sets new configuration for the workerPool backing this model and returns a new configured ModelInfo object. You have to triggerUpdates in the ModelManager using this new model.
- Parameters:
maxIdleTime - the time a WorkerThread can be idle before this worker is scaled down
- Returns:
the new configured ModelInfo
-
getModel
public ai.djl.repository.zoo.ZooModel<I,O> getModel(ai.djl.Device device)
Returns the loaded ZooModel for a device.
- Parameters:
device - the device to return the model on
- Returns:
the loaded ZooModel
-
setModelId
public void setModelId(java.lang.String id)
Sets the model ID.
- Parameters:
id - the model ID
-
getModelId
public java.lang.String getModelId()
Returns the model ID.
- Returns:
the model ID
-
getVersion
public java.lang.String getVersion()
Returns the model version.
- Returns:
the model version
-
getEngineName
public java.lang.String getEngineName()
Returns the engine name.
- Returns:
the engine name
-
getModelUrl
public java.lang.String getModelUrl()
Returns the model URL.
- Returns:
the model URL
-
getStatus
public ModelInfo.Status getStatus()
Returns the model loading status.
- Returns:
the model loading status
-
getModelDir
public java.nio.file.Path getModelDir()
Returns the model cache directory.
- Returns:
the model cache directory
-
getInputClass
public java.lang.Class<I> getInputClass()
Returns the model input class.
- Returns:
the model input class
-
getOutputClass
public java.lang.Class<O> getOutputClass()
Returns the model output class.
- Returns:
the model output class
-
hasInputOutputClass
public void hasInputOutputClass(java.lang.Class<I> inputClass, java.lang.Class<O> outputClass)
Clarifies the input and output class when not specified.
Warning: This is intended for internal use with reflection.
- Parameters:
inputClass - the model input class
outputClass - the model output class
-
setMaxIdleTime
public void setMaxIdleTime(int maxIdleTime)
Sets the configured maxIdleTime of workers.
- Parameters:
maxIdleTime - the configured maxIdleTime of workers
-
getMaxIdleTime
public int getMaxIdleTime()
Returns the configured maxIdleTime of workers.
- Returns:
the maxIdleTime
-
setBatchSize
public void setBatchSize(int batchSize)
Sets the configured batch size.
- Parameters:
batchSize - the configured batch size
-
getBatchSize
public int getBatchSize()
Returns the configured batch size.
- Returns:
the configured batch size
-
setMaxBatchDelay
public void setMaxBatchDelay(int maxBatchDelay)
Sets the maximum delay in milliseconds to aggregate a batch.
- Parameters:
maxBatchDelay - the maximum delay in milliseconds to aggregate a batch
-
getMaxBatchDelay
public int getMaxBatchDelay()
Returns the maximum delay in milliseconds to aggregate a batch.
- Returns:
the maximum delay in milliseconds to aggregate a batch
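One way to read "maximum delay in milliseconds to aggregate a batch" is as a deadline on collecting queued requests. The sketch below uses only JDK classes. BatchAggregator and nextBatch are hypothetical names, and the loop is an assumption about the general technique, not a copy of how the DJL worker threads are implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper; illustrates deadline-bounded batching, not DJL internals. */
final class BatchAggregator {
    private BatchAggregator() {}

    /**
     * Collects jobs until the batch holds batchSize items or maxBatchDelayMillis
     * has elapsed, whichever happens first. May return fewer items than batchSize.
     */
    static <T> List<T> nextBatch(BlockingQueue<T> queue, int batchSize, long maxBatchDelayMillis) {
        List<T> batch = new ArrayList<>();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxBatchDelayMillis);
        while (batch.size() < batchSize) {
            long remaining = deadline - System.nanoTime();
            if (remaining <= 0) {
                break; // the delay budget is used up
            }
            try {
                T job = queue.poll(remaining, TimeUnit.NANOSECONDS);
                if (job == null) {
                    break; // timed out while waiting for another job
                }
                batch.add(job);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt flag for the caller
                break;
            }
        }
        return batch;
    }
}
```

In this sketch a larger delay trades latency for fuller batches, and a batch size of 1 returns as soon as a single job is available.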
-
setQueueSize
public void setQueueSize(int queueSize)
Sets the configured size of the workers queue.
- Parameters:
queueSize - the configured size of the workers queue
-
getQueueSize
public int getQueueSize()
Returns the configured size of the workers queue.
- Returns:
the requested size of the workers queue
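To illustrate what a fixed queue size means for callers, here is a minimal sketch built on java.util.concurrent.ArrayBlockingQueue. BoundedJobQueue is a hypothetical class, and the reject-when-full behaviour is an assumption chosen for the example, not a statement about how the DJL worker queue reacts.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical bounded job queue; not part of DJL. */
final class BoundedJobQueue {
    private final BlockingQueue<String> jobs;

    BoundedJobQueue(int queueSize) {
        // The capacity is fixed at construction time.
        this.jobs = new ArrayBlockingQueue<>(queueSize);
    }

    /** Returns false instead of blocking when the queue is already full. */
    boolean submit(String job) {
        return jobs.offer(job);
    }

    int pending() {
        return jobs.size();
    }
}
```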
-
close
public void close()
- Specified by:
close in interface java.lang.AutoCloseable
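Because close() comes from java.lang.AutoCloseable and declares no checked exception, an instance can be managed with try-with-resources. The sketch below shows the pattern with a hypothetical TrackedResource class standing in for ModelInfo, so that it runs without DJL on the classpath.

```java
/** Hypothetical stand-in for a closeable model holder; not part of DJL. */
final class TrackedResource implements AutoCloseable {
    private boolean closed;

    boolean isClosed() {
        return closed;
    }

    /** Declares no checked exception, so callers need no catch block for close(). */
    @Override
    public void close() {
        closed = true;
    }

    /** Uses a resource inside try-with-resources and returns it after the block exits. */
    static TrackedResource useAndClose() {
        TrackedResource resource = new TrackedResource();
        try (TrackedResource r = resource) {
            // Work with r here; close() runs automatically when the block ends.
            if (r.isClosed()) {
                throw new IllegalStateException("resource closed too early");
            }
        }
        return resource;
    }
}
```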
-
inferModelNameFromUrl
public static java.lang.String inferModelNameFromUrl(java.lang.String url)
Infers the model name from the model URL in case the model name is not provided.
- Parameters:
url - the model URL
- Returns:
the model name
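The exact inference rules are not spelled out here. As a rough illustration only, the sketch below takes the last path segment of the URL and strips a final file extension. ModelNameSketch and inferName are hypothetical names, the example URLs are made up, and the real inferModelNameFromUrl may handle trailing slashes, archive extensions, or special characters differently.

```java
import java.net.URI;

/** Hypothetical sketch of name inference from a URL; not the DJL implementation. */
final class ModelNameSketch {
    private ModelNameSketch() {}

    static String inferName(String url) {
        String path = URI.create(url).getPath();
        if (path == null) {
            path = url; // opaque URI without a path component
        }
        // Ignore trailing slashes so that directory URLs still yield a name.
        while (path.length() > 1 && path.endsWith("/")) {
            path = path.substring(0, path.length() - 1);
        }
        String name = path.substring(path.lastIndexOf('/') + 1);
        // Strip only the final extension, for example ".zip".
        int dot = name.lastIndexOf('.');
        if (dot > 0) {
            name = name.substring(0, dot);
        }
        return name;
    }
}
```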
-
withDefaultDevice
public ai.djl.Device withDefaultDevice(ai.djl.Device device)
Returns the default device for this model if device is null.
- Parameters:
device - the device to use if it is not null
- Returns:
a non-null device
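The null-fallback described above can be sketched as follows. DeviceFallback is a hypothetical class, and devices are plain strings here rather than ai.djl.Device instances, purely so the example runs without DJL. How the real method chooses the model's default device is not covered by this sketch.

```java
import java.util.Objects;

/** Hypothetical sketch of a null-fallback; devices are plain strings, not ai.djl.Device. */
final class DeviceFallback {
    private final String defaultDevice;

    DeviceFallback(String defaultDevice) {
        this.defaultDevice = Objects.requireNonNull(defaultDevice, "defaultDevice");
    }

    /** Returns the requested device when one is given, otherwise the default; never null. */
    String withDefaultDevice(String device) {
        return device != null ? device : defaultDevice;
    }
}
```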
-
equals
public boolean equals(java.lang.Object o)
- Overrides:
equals in class java.lang.Object
-
hashCode
public int hashCode()
- Overrides:
hashCode in class java.lang.Object
-
toString
public java.lang.String toString()
- Overrides:
toString in class java.lang.Object
-
-