public class WorkLoadManager
extends java.lang.Object
Modifier and Type | Class and Description |
---|---|
class | WorkLoadManager.WorkerPool: Manages the work load for a single model. |
Constructor and Description |
---|
WorkLoadManager(): Constructs a WorkLoadManager instance. |
Modifier and Type | Method and Description |
---|---|
int | getNumRunningWorkers(ModelInfo modelInfo): Returns the number of running workers of a model. |
int | getQueueLength(ModelInfo modelInfo): Returns the current number of requests in the queue. |
WorkLoadManager.WorkerPool | getWorkerPoolForModel(ModelInfo modelInfo): Returns the WorkLoadManager.WorkerPool for a model. |
java.util.List<WorkerThread> | getWorkers(ModelInfo modelInfo): Returns the workers for the specific model. |
java.util.concurrent.CompletableFuture<ai.djl.modality.Output> | runJob(Job job): Adds an inference job to the job queue of the next free worker. |
void | unregisterModel(ModelInfo model): Removes a model from management. |
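Because runJob returns a CompletableFuture rather than a direct result, callers need some way to wait for the result and to handle failures. The following is a minimal sketch of one such pattern, not DJL code: RunJobSketch, its runJob stand-in, and awaitOrFallback are hypothetical names, String stands in for ai.djl.modality.Output, and the 5-second timeout is an arbitrary assumption.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a manager whose runJob returns a future.
// This is NOT the DJL class; String replaces ai.djl.modality.Output.
final class RunJobSketch {

    // Stand-in for runJob(Job): completes asynchronously with a result,
    // or exceptionally when the input is rejected.
    static CompletableFuture<String> runJob(String input) {
        if (input == null || input.isEmpty()) {
            return CompletableFuture.failedFuture(
                    new IllegalArgumentException("empty job"));
        }
        return CompletableFuture.supplyAsync(() -> "result:" + input);
    }

    // Waits for the result; a failure or a timeout yields the fallback value.
    static String awaitOrFallback(CompletableFuture<String> future, String fallback) {
        return future
                .orTimeout(5, TimeUnit.SECONDS) // assumed timeout, adjust as needed
                .exceptionally(t -> fallback)
                .join();
    }

    public static void main(String[] args) {
        System.out.println(awaitOrFallback(runJob("image-1"), "failed"));
        System.out.println(awaitOrFallback(runJob(""), "failed"));
    }
}
```

In a server, chaining a callback with thenAccept or whenComplete instead of calling join would avoid blocking the calling thread.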
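public WorkLoadManager()
Constructs a WorkLoadManager instance.

public java.util.List<WorkerThread> getWorkers(ModelInfo modelInfo)
Returns the workers for the specific model.
Parameters:
modelInfo - the model we are looking for.

public void unregisterModel(ModelInfo model)
Removes a model from management.
Parameters:
model - the model to remove

public java.util.concurrent.CompletableFuture<ai.djl.modality.Output> runJob(Job job)
Adds an inference job to the job queue of the next free worker.
Parameters:
job - an inference job to be executed.
Returns:
a CompletableFuture that supplies the ai.djl.modality.Output of the job.

public int getNumRunningWorkers(ModelInfo modelInfo)
Returns the number of running workers of a model.
Parameters:
modelInfo - the model we are interested in.

public int getQueueLength(ModelInfo modelInfo)
Returns the current number of requests in the queue.
Parameters:
modelInfo - the model

public WorkLoadManager.WorkerPool getWorkerPoolForModel(ModelInfo modelInfo)
Returns the WorkLoadManager.WorkerPool for a model.
Parameters:
modelInfo - the model to get the worker pool for
Returns:
the WorkLoadManager.WorkerPool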
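
The per-model methods above (getWorkerPoolForModel, getQueueLength, unregisterModel) suggest bookkeeping that is keyed by model. The toy sketch below illustrates that idea only; it makes no claim about how DJL implements it. Every name in it (PoolBookkeepingSketch, Pool, poolFor, enqueue, queueLength, unregister) is hypothetical, models and jobs are plain strings, and the queue capacity of 2 is an arbitrary assumption.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Toy model of per-model bookkeeping; all names here are hypothetical.
final class PoolBookkeepingSketch {

    // One pool per model, holding a bounded queue of pending jobs.
    static final class Pool {
        final BlockingQueue<String> jobs;

        Pool(int capacity) {
            jobs = new LinkedBlockingQueue<>(capacity);
        }
    }

    private final Map<String, Pool> pools = new ConcurrentHashMap<>();

    // Returns the pool for a model, creating it on first use.
    Pool poolFor(String model) {
        return pools.computeIfAbsent(model, m -> new Pool(2));
    }

    // Queues a job; returns false when the model's queue is full.
    boolean enqueue(String model, String job) {
        return poolFor(model).jobs.offer(job);
    }

    // Number of jobs currently waiting for the model (0 if unknown).
    int queueLength(String model) {
        Pool pool = pools.get(model);
        return pool == null ? 0 : pool.jobs.size();
    }

    // Drops all bookkeeping for the model.
    void unregister(String model) {
        pools.remove(model);
    }

    public static void main(String[] args) {
        PoolBookkeepingSketch sketch = new PoolBookkeepingSketch();
        sketch.enqueue("resnet", "job-1");
        System.out.println(sketch.queueLength("resnet"));
        sketch.unregister("resnet");
        System.out.println(sketch.queueLength("resnet"));
    }
}
```

Bounding the queue lets a full pool reject new jobs immediately instead of letting pending requests grow without limit.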