Class ModelInfo<I,​O>

  • All Implemented Interfaces:
    java.lang.AutoCloseable

    public final class ModelInfo<I,​O>
    extends java.lang.Object
    implements java.lang.AutoCloseable
    A class that represents a loaded model and its metadata.
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      static class  ModelInfo.Status
      An enum that represents the state of a model.
    • Constructor Summary

      Constructors 
      Constructor Description
      ModelInfo​(java.lang.String id, ai.djl.repository.zoo.Criteria<I,​O> criteria)
      Constructs a ModelInfo based on a Criteria.
      ModelInfo​(java.lang.String modelUrl, java.lang.Class<I> inputClass, java.lang.Class<O> outputClass)
      Constructs a new ModelInfo instance.
      ModelInfo​(java.lang.String id, java.lang.String modelUrl, java.lang.String version, java.lang.String engineName, java.lang.Class<I> inputClass, java.lang.Class<O> outputClass, int queueSize, int maxIdleTime, int maxBatchDelay, int batchSize)
      Constructs a new ModelInfo instance.
    • Method Summary

      All Methods Static Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      void close()
      ModelInfo<I,​O> configureModelBatch​(int batchSize, int maxBatchDelay)
      Sets a new batchSize and returns a new configured ModelInfo object.
      ModelInfo<I,​O> configurePool​(int maxIdleTime)
      Sets new configuration for the workerPool backing this model and returns a new configured ModelInfo object.
      boolean equals​(java.lang.Object o)
      int getBatchSize()
      Returns the configured batch size.
      java.lang.String getEngineName()
      Returns the engine name.
      java.lang.Class<I> getInputClass()
      Returns the model input class.
      int getMaxBatchDelay()
      Returns the maximum delay in milliseconds to aggregate a batch.
      int getMaxIdleTime()
      Returns the configured maxIdleTime of workers.
      ai.djl.repository.zoo.ZooModel<I,​O> getModel​(ai.djl.Device device)
      Returns the loaded ZooModel for a device.
      java.nio.file.Path getModelDir()
      Returns the model cache directory.
      java.lang.String getModelId()
      Returns the model ID.
      java.lang.String getModelUrl()
      Returns the model URL.
      java.lang.Class<O> getOutputClass()
      Returns the model output class.
      int getQueueSize()
      Returns the configured size of the workers queue.
      ModelInfo.Status getStatus()
      Returns the model loading status.
      java.lang.String getVersion()
      Returns the model version.
      int hashCode()
      void hasInputOutputClass​(java.lang.Class<I> inputClass, java.lang.Class<O> outputClass)
      Clarifies the input and output class when not specified.
      static java.lang.String inferModelNameFromUrl​(java.lang.String url)
      Infers the model name from the model URL in case the model name is not provided.
      void load​(ai.djl.Device device)
      Loads the model to the specified device.
      void setBatchSize​(int batchSize)
      Sets the configured batch size.
      void setMaxBatchDelay​(int maxBatchDelay)
      Sets the maximum delay in milliseconds to aggregate a batch.
      void setMaxIdleTime​(int maxIdleTime)
      Sets the configured maxIdleTime of workers.
      void setModelId​(java.lang.String id)
      Sets the model ID.
      void setQueueSize​(int queueSize)
      Sets the configured size of the workers queue.
      java.lang.String toString()
      ai.djl.Device withDefaultDevice​(ai.djl.Device device)
      Returns the default device for this model if device is null.
      • Methods inherited from class java.lang.Object

        clone, finalize, getClass, notify, notifyAll, wait, wait, wait
    • Constructor Detail

      • ModelInfo

        public ModelInfo​(java.lang.String modelUrl,
                         java.lang.Class<I> inputClass,
                         java.lang.Class<O> outputClass)
        Constructs a new ModelInfo instance.
        Parameters:
        modelUrl - the model URL
        inputClass - the model input class
        outputClass - the model output class
      • ModelInfo

        public ModelInfo​(java.lang.String id,
                         ai.djl.repository.zoo.Criteria<I,​O> criteria)
        Constructs a ModelInfo based on a Criteria.
        Parameters:
        id - the id for the created ModelInfo
        criteria - the model criteria
      • ModelInfo

        public ModelInfo​(java.lang.String id,
                         java.lang.String modelUrl,
                         java.lang.String version,
                         java.lang.String engineName,
                         java.lang.Class<I> inputClass,
                         java.lang.Class<O> outputClass,
                         int queueSize,
                         int maxIdleTime,
                         int maxBatchDelay,
                         int batchSize)
        Constructs a new ModelInfo instance.
        Parameters:
        id - the ID of the model that will be used by workflow
        modelUrl - the model URL
        version - the version of the model
        engineName - the engine to load the model
        inputClass - the model input class
        outputClass - the model output class
        queueSize - the maximum request queue size
        maxIdleTime - the initial maximum idle time for workers
        maxBatchDelay - the initial maximum delay when scaling up before giving up
        batchSize - the batch size for this model
    • Method Detail

      • load

        public void load​(ai.djl.Device device)
                  throws ai.djl.ModelException,
                         java.io.IOException
        Loads the model to the specified device.
        Parameters:
        device - the device to load the model on
        Throws:
        java.io.IOException - if the model file cannot be read
        ai.djl.ModelException - if the specified model fails to load
      • configureModelBatch

        public ModelInfo<I,​O> configureModelBatch​(int batchSize,
                                                        int maxBatchDelay)
        Sets a new batchSize and returns a new configured ModelInfo object. You must then trigger updates in the ModelManager using the returned ModelInfo.
        Parameters:
        batchSize - the batchSize to set
        maxBatchDelay - the maximum time to wait for free space in the worker queue, after scaling up workers, before giving up on offering the job to the queue
        Returns:
        new configured ModelInfo.
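        The "returns a new configured ModelInfo object" wording suggests a copy-and-modify style. The sketch below illustrates that pattern with a hypothetical stand-in class (BatchConfigSketch is not part of the library). It assumes the original instance is left unchanged, which this page does not state explicitly.

```java
// Hypothetical stand-in for illustration; this is not the real ModelInfo class.
final class BatchConfigSketch {

    private final String modelId;
    private final int batchSize;
    private final int maxBatchDelay;

    BatchConfigSketch(String modelId, int batchSize, int maxBatchDelay) {
        this.modelId = modelId;
        this.batchSize = batchSize;
        this.maxBatchDelay = maxBatchDelay;
    }

    /** Returns a copy with the new batch settings; this instance is not modified (assumption). */
    BatchConfigSketch configureModelBatch(int newBatchSize, int newMaxBatchDelay) {
        return new BatchConfigSketch(modelId, newBatchSize, newMaxBatchDelay);
    }

    String getModelId() {
        return modelId;
    }

    int getBatchSize() {
        return batchSize;
    }

    int getMaxBatchDelay() {
        return maxBatchDelay;
    }

    public static void main(String[] args) {
        BatchConfigSketch original = new BatchConfigSketch("demo", 1, 100);
        BatchConfigSketch updated = original.configureModelBatch(8, 50);
        // Only the returned object carries the new settings.
        System.out.println(original.getBatchSize() + " -> " + updated.getBatchSize());
    }
}
```

        Under this reading, discarding the return value has no effect; the returned object is the one to hand on to the ModelManager.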
      • configurePool

        public ModelInfo<I,​O> configurePool​(int maxIdleTime)
        Sets a new configuration for the workerPool backing this model and returns a new configured ModelInfo object. You must then trigger updates in the ModelManager using the returned ModelInfo.
        Parameters:
        maxIdleTime - the time a WorkerThread can be idle before this worker is scaled down
        Returns:
        new configured ModelInfo.
      • getModel

        public ai.djl.repository.zoo.ZooModel<I,​O> getModel​(ai.djl.Device device)
        Returns the loaded ZooModel for a device.
        Parameters:
        device - the device the model is loaded on
        Returns:
        the loaded ZooModel
      • setModelId

        public void setModelId​(java.lang.String id)
        Sets the model ID.
        Parameters:
        id - the model ID
      • getModelId

        public java.lang.String getModelId()
        Returns the model ID.
        Returns:
        the model ID
      • getVersion

        public java.lang.String getVersion()
        Returns the model version.
        Returns:
        the model version
      • getEngineName

        public java.lang.String getEngineName()
        Returns the engine name.
        Returns:
        the engine name
      • getModelUrl

        public java.lang.String getModelUrl()
        Returns the model URL.
        Returns:
        the model URL
      • getStatus

        public ModelInfo.Status getStatus()
        Returns the model loading status.
        Returns:
        the model loading status
      • getModelDir

        public java.nio.file.Path getModelDir()
        Returns the model cache directory.
        Returns:
        the model cache directory
      • getInputClass

        public java.lang.Class<I> getInputClass()
        Returns the model input class.
        Returns:
        the model input class
      • getOutputClass

        public java.lang.Class<O> getOutputClass()
        Returns the model output class.
        Returns:
        the model output class
      • hasInputOutputClass

        public void hasInputOutputClass​(java.lang.Class<I> inputClass,
                                        java.lang.Class<O> outputClass)
        Clarifies the input and output class when not specified.

        Warning: This is intended for internal use with reflection.

        Parameters:
        inputClass - the model input class
        outputClass - the model output class
      • setMaxIdleTime

        public void setMaxIdleTime​(int maxIdleTime)
        Sets the configured maxIdleTime of workers.
        Parameters:
        maxIdleTime - the configured maxIdleTime of workers
      • getMaxIdleTime

        public int getMaxIdleTime()
        Returns the configured maxIdleTime of workers.
        Returns:
        the maxIdleTime
      • setBatchSize

        public void setBatchSize​(int batchSize)
        Sets the configured batch size.
        Parameters:
        batchSize - the configured batch size
      • getBatchSize

        public int getBatchSize()
        Returns the configured batch size.
        Returns:
        the configured batch size
      • setMaxBatchDelay

        public void setMaxBatchDelay​(int maxBatchDelay)
        Sets the maximum delay in milliseconds to aggregate a batch.
        Parameters:
        maxBatchDelay - the maximum delay in milliseconds to aggregate a batch
      • getMaxBatchDelay

        public int getMaxBatchDelay()
        Returns the maximum delay in milliseconds to aggregate a batch.
        Returns:
        the maximum delay in milliseconds to aggregate a batch
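        To make the interplay of batchSize and maxBatchDelay concrete, here is a minimal sketch of one plausible way a worker could aggregate a batch from a queue. BatchAggregatorSketch and nextBatch are hypothetical names, and the logic is an assumption for illustration, not the scheduler the library actually uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of batch aggregation; not the library's implementation.
final class BatchAggregatorSketch {

    private BatchAggregatorSketch() {}

    /**
     * Collects up to batchSize jobs. Blocks for the first job, then waits at most
     * maxBatchDelay milliseconds in total for the batch to fill up.
     */
    static <T> List<T> nextBatch(BlockingQueue<T> queue, int batchSize, int maxBatchDelay) {
        List<T> batch = new ArrayList<>();
        try {
            batch.add(queue.take());
            // Grab whatever is already waiting without blocking.
            queue.drainTo(batch, batchSize - 1);
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxBatchDelay);
            while (batch.size() < batchSize) {
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0) {
                    break; // delay budget used up
                }
                T job = queue.poll(remaining, TimeUnit.NANOSECONDS);
                if (job == null) {
                    break; // timed out waiting for more jobs
                }
                batch.add(job);
            }
        } catch (InterruptedException e) {
            // Preserve the interrupt status and return what was collected so far.
            Thread.currentThread().interrupt();
        }
        return batch;
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(List.of("a", "b", "c", "d", "e"));
        System.out.println(nextBatch(queue, 3, 10)); // full batch, no waiting needed
        System.out.println(nextBatch(queue, 4, 20)); // partial batch after the delay elapses
    }
}
```

        In this sketch a larger maxBatchDelay trades latency for fuller batches, while a delay of 0 returns whatever is immediately available.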
      • setQueueSize

        public void setQueueSize​(int queueSize)
        Sets the configured size of the workers queue.
        Parameters:
        queueSize - the configured size of the workers queue
      • getQueueSize

        public int getQueueSize()
        Returns the configured size of the workers queue.
        Returns:
        the configured size of the workers queue
      • close

        public void close()
        Specified by:
        close in interface java.lang.AutoCloseable
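        Because the class implements java.lang.AutoCloseable, instances can be managed with try-with-resources. The sketch below shows the pattern with a hypothetical stand-in (CloseableSketch is not the real ModelInfo); what the real close() releases is not described on this page.

```java
// Hypothetical stand-in for a closeable resource; this is not the real ModelInfo class.
final class CloseableSketch implements AutoCloseable {

    private boolean closed;

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        // Placeholder for whatever cleanup the real class performs.
        closed = true;
    }

    /** Uses a resource inside a try-with-resources block and returns it for inspection. */
    static CloseableSketch useOnce() {
        CloseableSketch resource = new CloseableSketch();
        try (CloseableSketch r = resource) {
            // Work with r here; close() runs automatically when the block exits.
            if (r.isClosed()) {
                throw new IllegalStateException("resource closed too early");
            }
        }
        return resource;
    }

    public static void main(String[] args) {
        System.out.println("closed after block: " + useOnce().isClosed());
    }
}
```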
      • inferModelNameFromUrl

        public static java.lang.String inferModelNameFromUrl​(java.lang.String url)
        Infers the model name from the model URL in case the model name is not provided.
        Parameters:
        url - the model URL
        Returns:
        the model name
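        This page does not spell out the inference rule. As a rough illustration only, the sketch below takes the last path segment of the URL and drops any file extension. InferNameSketch and inferName are hypothetical names, and the real method may handle more cases or normalize the name differently.

```java
import java.net.URI;

// Hypothetical illustration only; not the library's implementation.
final class InferNameSketch {

    private InferNameSketch() {}

    /** Derives a name from the last path segment of a URL, dropping a file extension. */
    static String inferName(String url) {
        String path = URI.create(url).getPath();
        if (path == null || path.isEmpty()) {
            return "";
        }
        // Ignore a trailing slash so a directory URL yields the directory name.
        if (path.endsWith("/")) {
            path = path.substring(0, path.length() - 1);
        }
        String name = path.substring(path.lastIndexOf('/') + 1);
        // Cut at the first dot so "model.tar.gz" becomes "model".
        int dot = name.indexOf('.');
        return dot > 0 ? name.substring(0, dot) : name;
    }

    public static void main(String[] args) {
        System.out.println(inferName("https://example.com/models/resnet18.zip"));
        System.out.println(inferName("file:///opt/ml/model/"));
    }
}
```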
      • withDefaultDevice

        public ai.djl.Device withDefaultDevice​(ai.djl.Device device)
        Returns the default device for this model if device is null.
        Parameters:
        device - the device to use if it is not null
        Returns:
        a non-null device
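        The contract reads as a simple null fallback: a non-null argument is returned as is, and null is replaced by the model's default. The sketch below shows that behavior with a hypothetical stand-in in which a String plays the role of ai.djl.Device; how the real class picks its default device is not covered here.

```java
import java.util.Objects;

// Hypothetical stand-in; a String plays the role of ai.djl.Device here.
final class DefaultDeviceSketch {

    private final String defaultDevice;

    DefaultDeviceSketch(String defaultDevice) {
        this.defaultDevice = Objects.requireNonNull(defaultDevice, "defaultDevice");
    }

    /** Returns the given device unless it is null, in which case the default is returned. */
    String withDefaultDevice(String device) {
        return device != null ? device : defaultDevice;
    }

    public static void main(String[] args) {
        DefaultDeviceSketch model = new DefaultDeviceSketch("cpu");
        System.out.println(model.withDefaultDevice(null));
        System.out.println(model.withDefaultDevice("gpu0"));
    }
}
```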
      • equals

        public boolean equals​(java.lang.Object o)
        Overrides:
        equals in class java.lang.Object
      • hashCode

        public int hashCode()
        Overrides:
        hashCode in class java.lang.Object
      • toString

        public java.lang.String toString()
        Overrides:
        toString in class java.lang.Object