Class ModelInfo<I,O>

java.lang.Object
ai.djl.serving.wlm.WorkerPoolConfig<I,O>
ai.djl.serving.wlm.ModelInfo<I,O>

public final class ModelInfo<I,O> extends WorkerPoolConfig<I,O>
A class that represents a loaded model and its metadata.
  • Constructor Details

    • ModelInfo

      public ModelInfo(String modelUrl)
      Constructs a new ModelInfo instance.
      Parameters:
      modelUrl - the model URL
    • ModelInfo

      public ModelInfo(String id, String modelUrl, ai.djl.repository.zoo.Criteria<I,O> criteria)
      Constructs a ModelInfo based on a Criteria.
      Parameters:
      id - the id for the created ModelInfo
      modelUrl - the model URL
      criteria - the model criteria
    • ModelInfo

      public ModelInfo(String id, String modelUrl, String version, String engineName, String loadOnDevices, Class<I> inputClass, Class<O> outputClass, int queueSize, int maxIdleSeconds, int maxBatchDelayMillis, int batchSize, int minWorkers, int maxWorkers)
      Constructs a new ModelInfo instance.
      Parameters:
      id - the ID of the model that will be used by workflow
      modelUrl - the model URL
      version - the version of the model
      engineName - the engine to load the model
      loadOnDevices - the devices to load the model on
      inputClass - the model input class
      outputClass - the model output class
      queueSize - the maximum request queue size
      maxIdleSeconds - the initial maximum idle time for workers
      maxBatchDelayMillis - the initial maximum delay when scaling up before giving up
      batchSize - the batch size for this model
      minWorkers - the minimum number of workers
      maxWorkers - the maximum number of workers
  • Method Details

    • getProperties

      public Properties getProperties()
      Returns the properties of the model.
      Returns:
      the properties of the model
    • load

      public void load(ai.djl.Device device) throws ai.djl.ModelException, IOException
      Loads the worker type to the specified device.
      Specified by:
      load in class WorkerPoolConfig<I,O>
      Parameters:
      device - the device to load the worker type on
      Throws:
      ai.djl.ModelException - if the specified model fails to load
      IOException - if the worker type file cannot be read
    • getModels

      public Map<ai.djl.Device,ai.djl.repository.zoo.ZooModel<I,O>> getModels()
      Returns all loaded models.
      Returns:
      all loaded models
    • getModel

      public ai.djl.repository.zoo.ZooModel<I,O> getModel(ai.djl.Device device)
      Returns the loaded ZooModel for a device.
      Parameters:
      device - the device to return the model on
      Returns:
      the loaded ZooModel
    • newThread

      public WorkerPoolConfig.ThreadConfig<I,O> newThread(ai.djl.Device device)
      Starts a new WorkerThread for this WorkerPoolConfig.
      Specified by:
      newThread in class WorkerPoolConfig<I,O>
      Parameters:
      device - the device to run on
      Returns:
      the new WorkerPoolConfig.ThreadConfig
    • getEngine

      public ai.djl.engine.Engine getEngine()
      Returns the engine.
      Returns:
      the engine
    • getEngineName

      public String getEngineName()
      Returns the engine name.
      Returns:
      the engine name
    • getStatus

      public WorkerPoolConfig.Status getStatus()
      Returns the worker type loading status.
      Specified by:
      getStatus in class WorkerPoolConfig<I,O>
      Returns:
      the worker type loading status
    • getInputClass

      public Class<I> getInputClass()
      Returns the model input class.
      Returns:
      the model input class
    • getOutputClass

      public Class<O> getOutputClass()
      Returns the model output class.
      Returns:
      the model output class
    • hasInputOutputClass

      public void hasInputOutputClass(Class<I> inputClass, Class<O> outputClass)
      Clarifies the input and output class when not specified.

      Warning: This is intended for internal use with reflection.

      Parameters:
      inputClass - the model input class
      outputClass - the model output class
    • getMinWorkers

      public int getMinWorkers(ai.djl.Device device)
      Returns the minimum number of workers.
      Overrides:
      getMinWorkers in class WorkerPoolConfig<I,O>
      Parameters:
      device - the device to get the min workers for
      Returns:
      the minimum number of workers
    • getMaxWorkers

      public int getMaxWorkers(ai.djl.Device device)
      Returns the maximum number of workers.
      Overrides:
      getMaxWorkers in class WorkerPoolConfig<I,O>
      Parameters:
      device - the device to get the max workers for
      Returns:
      the maximum number of workers
    • initialize

      public void initialize() throws IOException, ai.djl.ModelException
      Initializes the worker.
      Specified by:
      initialize in class WorkerPoolConfig<I,O>
      Throws:
      IOException - if the worker fails to download
      ModelNotFoundException - if the model is not found
      ai.djl.ModelException
    • registerAdapter

      public void registerAdapter(Adapter adapter)
      Adds an adapter to this ModelInfo.
      Parameters:
      adapter - the adapter to add
    • unregisterAdapter

      public Adapter unregisterAdapter(String name)
      Removes an adapter from this ModelInfo.
      Parameters:
      name - the adapter to remove
      Returns:
      the removed adapter
    • getAdapters

      public Map<String,Adapter> getAdapters()
      Returns the adapters for this model.
      Returns:
      the adapters for this model
    • getAdapter

      public Adapter getAdapter(String name)
      Returns an adapter on this ModelInfo.
      Parameters:
      name - the adapter name to get
      Returns:
      the adapter
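The four adapter methods above (registerAdapter, unregisterAdapter, getAdapters, getAdapter) behave like a name-keyed registry. The following is a minimal sketch of that pattern, not the ModelInfo implementation: SketchAdapter and AdapterRegistrySketch are hypothetical stand-ins, and rejecting a duplicate name with an IllegalArgumentException is an assumption of this sketch.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for an adapter; the real Adapter class is not reproduced here.
final class SketchAdapter {
    private final String name;

    SketchAdapter(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

// Hypothetical name-keyed registry mirroring the four adapter methods.
final class AdapterRegistrySketch {
    private final Map<String, SketchAdapter> adapters = new ConcurrentHashMap<>();

    // Adds an adapter under its own name; duplicates are rejected (assumption).
    void registerAdapter(SketchAdapter adapter) {
        if (adapters.putIfAbsent(adapter.getName(), adapter) != null) {
            throw new IllegalArgumentException(
                    "Adapter already registered: " + adapter.getName());
        }
    }

    // Removes and returns the adapter, or null if no adapter has that name.
    SketchAdapter unregisterAdapter(String name) {
        return adapters.remove(name);
    }

    // Returns a read-only view so callers cannot bypass register/unregister.
    Map<String, SketchAdapter> getAdapters() {
        return Collections.unmodifiableMap(adapters);
    }

    // Returns the adapter with the given name, or null if absent.
    SketchAdapter getAdapter(String name) {
        return adapters.get(name);
    }
}
```

A concurrent map is used here on the assumption that adapters may be added and removed while requests are being served.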
    • close

      public void close()
      Closes all loaded workers.
      Specified by:
      close in class WorkerPoolConfig<I,O>
    • inferModelNameFromUrl

      public static String inferModelNameFromUrl(String url)
      Infers the model name from the model URL when no model name is provided.
      Parameters:
      url - the model URL
      Returns:
      the model name
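To illustrate what inferring a name from a URL can look like, here is a minimal sketch. It is not the library's implementation: ModelNameSketch and inferName are hypothetical names, and the rules used (take the last path segment, drop one file extension unless the URL ends in a slash) are assumptions chosen for the example.

```java
import java.net.URI;

// Hypothetical helper; the rules below are assumptions, not the ModelInfo logic.
final class ModelNameSketch {
    private ModelNameSketch() {}

    static String inferName(String url) {
        URI uri = URI.create(url);
        String path = uri.getPath();
        if (path == null) {
            // Opaque URIs have no path component; fall back to the scheme-specific part.
            path = uri.getSchemeSpecificPart();
        }
        // A trailing slash is treated as a directory, which keeps its full name.
        boolean isDirectory = path.endsWith("/");
        while (path.length() > 1 && path.endsWith("/")) {
            path = path.substring(0, path.length() - 1);
        }
        int slash = path.lastIndexOf('/');
        String name = slash >= 0 ? path.substring(slash + 1) : path;
        if (!isDirectory) {
            // Drop a single file extension such as ".zip".
            int dot = name.lastIndexOf('.');
            if (dot > 0) {
                name = name.substring(0, dot);
            }
        }
        return name;
    }
}
```

Under these assumed rules, a URL such as https://example.com/models/resnet18.zip yields resnet18, and file:///opt/ml/model/ yields model.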
    • withDefaultDevice

      public ai.djl.Device withDefaultDevice(String deviceName)
      Returns the device for the given name, or the default device for this model if the name is null.
      Overrides:
      withDefaultDevice in class WorkerPoolConfig<I,O>
      Parameters:
      deviceName - the device to use if it is not null
      Returns:
      a non-null device
    • getLoadOnDevices

      public String[] getLoadOnDevices()
      Returns the devices the worker type will be loaded on at startup.
      Specified by:
      getLoadOnDevices in class WorkerPoolConfig<I,O>
      Returns:
      the devices the worker type will be loaded on at startup
    • isParallelLoading

      public boolean isParallelLoading()
      Returns whether the worker type can be loaded in parallel on multiple devices.
      Specified by:
      isParallelLoading in class WorkerPoolConfig<I,O>
      Returns:
      true if the worker type can be loaded in parallel on multiple devices
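The values returned by getLoadOnDevices and isParallelLoading together suggest a choice between loading on each device in turn and loading on all devices concurrently. The sketch below shows one way a caller might act on that choice. It is an illustration under stated assumptions: DeviceLoadingSketch and loadAll are hypothetical, the device labels are placeholders, and the Consumer stands in for a call such as load(device).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

// Hypothetical helper; it does not reproduce how DJL Serving schedules loading.
final class DeviceLoadingSketch {
    private DeviceLoadingSketch() {}

    // Runs the loader once per device, concurrently when parallel is true.
    static void loadAll(String[] devices, boolean parallel, Consumer<String> loader) {
        if (!parallel) {
            // Sequential path: each device finishes before the next one starts.
            for (String device : devices) {
                loader.accept(device);
            }
            return;
        }
        // Parallel path: submit every load, then wait for all of them to finish.
        List<CompletableFuture<Void>> futures = new ArrayList<>();
        for (String device : devices) {
            futures.add(CompletableFuture.runAsync(() -> loader.accept(device)));
        }
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
    }
}
```

In the parallel path the loader is invoked from several threads, so it must be thread-safe; join() rethrows a failure from any load wrapped in a CompletionException.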