Interface Repository
- All Known Implementing Classes:
AbstractRepository, JarRepository, LocalRepository, RemoteRepository, SimpleRepository, SimpleUrlRepository
Repository is a format for storing data Artifacts for various uses, including deep learning models and datasets.
This repository format is based on the design of the Maven Repository format (see maven).
Unlike in Maven, the data doesn't need to be located within the repository. Instead, the repository only stores metadata, including the URL and checksum of the actual data. When the artifact is prepared, the data is downloaded, checked, and then stored in the ~/.djl.ai/cache folder.
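The "downloaded, checked" step can be illustrated with a short sketch that compares downloaded bytes against a checksum recorded in metadata. This uses only the JDK and is not DJL's implementation: the class name ChecksumCheck, its methods, and the choice of SHA-1 as the digest are all assumptions made for the example.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper for illustration; not part of the Repository API. */
final class ChecksumCheck {

    private ChecksumCheck() {}

    /** Returns the lowercase hexadecimal SHA-1 digest of the given bytes. */
    static String sha1Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-1.
            throw new IllegalStateException(e);
        }
    }

    /** Returns true if the downloaded bytes match the checksum stored in the metadata. */
    static boolean matches(byte[] downloaded, String expectedHex) {
        return sha1Hex(downloaded).equalsIgnoreCase(expectedHex);
    }
}
```

A caller would typically refuse to move the file into the cache folder when matches returns false.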
The artifacts are first divided into a number of Metadata files that can each have multiple artifacts. The metadata files are identified by an MRL, which contains:
- Type - The resource type, e.g. model or dataset.
- Application - The resource application (see Application).
- Group Id - The group id identifies the group publishing the artifacts using a reverse domain name system.
- Artifact Id - The artifact id identifies the different artifacts published by a single group.
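To make the four components concrete, the sketch below models them as a small value type and derives a Maven-style relative path from them. The class name Coordinates, the method toRelativePath, and the exact path layout are assumptions for illustration; the source states only that the format is based on Maven's design, not how an MRL is mapped to a location.

```java
/** Hypothetical value type mirroring the four MRL components; not the MRL class itself. */
final class Coordinates {
    final String type;        // e.g. "model" or "dataset"
    final String application; // e.g. "cv/image_classification"
    final String groupId;     // reverse domain name, e.g. "com.example"
    final String artifactId;  // e.g. "resnet"

    Coordinates(String type, String application, String groupId, String artifactId) {
        this.type = type;
        this.application = application;
        this.groupId = groupId;
        this.artifactId = artifactId;
    }

    /**
     * Builds a relative path in the Maven style, where dots in the group id
     * become directory separators. The layout is assumed, not taken from the source.
     */
    String toRelativePath() {
        return type + '/' + application + '/' + groupId.replace('.', '/') + '/' + artifactId + '/';
    }
}
```

For example, new Coordinates("model", "cv/image_classification", "com.example", "resnet").toRelativePath() yields model/cv/image_classification/com/example/resnet/.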
Within each metadata file are a number of artifacts that share the same groupId, artifactId, name, description, website, and update date. The artifacts within the metadata differ primarily based on name and properties. Note that there is a metadata name and a separate artifact name. The properties are a map with string property names and string property values that can be used to represent key differentiators between artifacts, such as datasets, flavors, and image sizes. For example, you might have a ResNet metadata file with different artifacts to represent the different hyperparameters and datasets used for training the ResNet.
Each artifact contains a Version number (which can be a snapshot version). The data in the artifacts are represented by files in the format of an Artifact.Item and a parsed JSON object of arguments. The files can either be a single file, an automatically extracted gzip file, or an automatically extracted zip file that will be treated as a directory. These can be used to store data such as the dataset, model parameters, and synset files. The arguments can be used to store data about the model used for initialization. For example, they can store the image size, which can be used by the model loader for both initializing the block and setting up resizing in the translator.
There are three kinds of repositories: LocalRepository, RemoteRepository, and SimpleRepository. For all three kinds, new repositories should be created by calling newInstance(String, String) with the location of the repository.
-
Method Summary
- void addResource(MRL mrl) - Adds resource to the repository.
- default MRL dataset(Application application, String groupId, String artifactId) - Creates a dataset MRL with specified application.
- default MRL dataset(Application application, String groupId, String artifactId, String version) - Creates a dataset MRL with specified application.
- URI getBaseUri() - Returns the URI to the base of the repository.
- Path getCacheDirectory() - Returns the cache directory for the repository.
- Path getFile(Artifact.Item item, String path) - Returns the path to a file for the item.
- String getName() - Returns the repository name.
- default Path getResourceDirectory(Artifact artifact) - Returns the resource directory for an artifact.
- List<MRL> getResources() - Returns a list of MRLs in the repository.
- boolean isRemote() - Returns whether the repository is a remote repository.
- String[] listDirectory(Artifact.Item item, String path) - Returns the list of files directly within a specified directory in a zipped directory item.
- Metadata locate(MRL mrl) - Returns the metadata at an MRL.
- default MRL model(Application application, String groupId, String artifactId) - Creates a model MRL with specified application.
- default MRL model(Application application, String groupId, String artifactId, String version) - Creates a model MRL with specified application.
- default MRL model(Application application, String groupId, String artifactId, String version, String artifactName) - Creates a model MRL with specified application.
- static Repository newInstance(String name, String url) - Creates a new instance of a repository with a name and url.
- static Repository newInstance(String name, Path path) - Creates a new instance of a repository with a name and path.
- InputStream openStream(Artifact.Item item, String path) - Returns an InputStream for an item in a repository.
- default void prepare(Artifact artifact) - Prepares the artifact for use.
- void prepare(Artifact artifact, Progress progress) - Prepares the artifact for use with progress tracking.
- static void registerRepositoryFactory(RepositoryFactory factory) - Registers a RepositoryFactory to handle the specified url scheme.
- Artifact resolve(MRL mrl, Map<String, String> filter) - Returns the artifact matching an MRL, version, and property filter.
-
Method Details
-
newInstance
static Repository newInstance(String name, Path path)
Creates a new instance of a repository with a name and path.
- Parameters:
name - the repository name
path - the repository location
- Returns:
the new repository
-
newInstance
static Repository newInstance(String name, String url)
Creates a new instance of a repository with a name and url.
- Parameters:
name - the repository name
url - the repository location
- Returns:
the new repository
-
registerRepositoryFactory
static void registerRepositoryFactory(RepositoryFactory factory)
Registers a RepositoryFactory to handle the specified url scheme.
- Parameters:
factory - the RepositoryFactory to be registered
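The idea of dispatching on a url scheme can be sketched as a small registry keyed by scheme. Everything here is hypothetical and JDK-only: SchemeRegistry, register, and create are illustrative names, and the factory is reduced to a function that returns a description string rather than a Repository. The real RepositoryFactory interface may look quite different.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiFunction;

/** Hypothetical scheme-to-factory registry; a stand-in for illustration only. */
final class SchemeRegistry {

    // Maps a lowercase url scheme to a function that builds something from (name, uri).
    private static final Map<String, BiFunction<String, URI, String>> FACTORIES =
            new ConcurrentHashMap<>();

    private SchemeRegistry() {}

    /** Registers a factory for the given scheme, replacing any earlier registration. */
    static void register(String scheme, BiFunction<String, URI, String> factory) {
        FACTORIES.put(scheme.toLowerCase(Locale.ROOT), factory);
    }

    /** Looks up the factory for the url's scheme and applies it. */
    static String create(String name, String url) {
        URI uri = URI.create(url);
        String scheme = uri.getScheme();
        if (scheme == null) {
            throw new IllegalArgumentException("URL has no scheme: " + url);
        }
        BiFunction<String, URI, String> factory = FACTORIES.get(scheme.toLowerCase(Locale.ROOT));
        if (factory == null) {
            throw new IllegalArgumentException("No factory registered for scheme: " + scheme);
        }
        return factory.apply(name, uri);
    }
}
```

Keying the map by lowercase scheme keeps lookups case-insensitive, which matches how URI schemes are defined.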
-
model
default MRL model(Application application, String groupId, String artifactId)
Creates a model MRL with specified application.
- Parameters:
application - the desired application
groupId - the desired groupId
artifactId - the desired artifactId
- Returns:
a model MRL
-
model
default MRL model(Application application, String groupId, String artifactId, String version)
Creates a model MRL with specified application.
- Parameters:
application - the desired application
groupId - the desired groupId
artifactId - the desired artifactId
version - the resource version
- Returns:
a model MRL
-
model
default MRL model(Application application, String groupId, String artifactId, String version, String artifactName)
Creates a model MRL with specified application.
- Parameters:
application - the desired application
groupId - the desired groupId
artifactId - the desired artifactId
version - the resource version
artifactName - the desired artifact name
- Returns:
a model MRL
-
dataset
default MRL dataset(Application application, String groupId, String artifactId)
Creates a dataset MRL with specified application.
- Parameters:
application - the desired application
groupId - the desired groupId
artifactId - the desired artifactId
- Returns:
a dataset MRL
-
dataset
default MRL dataset(Application application, String groupId, String artifactId, String version)
Creates a dataset MRL with specified application.
- Parameters:
application - the desired application
groupId - the desired groupId
artifactId - the desired artifactId
version - the resource version
- Returns:
a dataset MRL
-
isRemote
boolean isRemote()
Returns whether the repository is a remote repository.
- Returns:
whether the repository is a remote repository
-
getName
String getName()
Returns the repository name.
- Returns:
the repository name
-
getBaseUri
URI getBaseUri()
Returns the URI to the base of the repository.
- Returns:
the URI
-
locate
Metadata locate(MRL mrl) throws IOException
Returns the metadata at an MRL.
- Parameters:
mrl - the MRL of the metadata to retrieve
- Returns:
the metadata
- Throws:
IOException - if it failed to load the metadata
-
resolve
Artifact resolve(MRL mrl, Map<String, String> filter) throws IOException
Returns the artifact matching an MRL, version, and property filter.
- Parameters:
mrl - the MRL to match the artifact against
filter - the property filter
- Returns:
the matched artifact
- Throws:
IOException - if it failed to load the artifact
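One plausible reading of "property filter" is that an artifact matches when every key in the filter appears in its properties with an equal value. The sketch below illustrates that rule over plain string maps. PropertyFilter, matches, and firstMatch are hypothetical names, and choosing the first match is an assumption; the source does not describe how the library picks among several matching artifacts.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical property matcher for illustration; not the library's resolution logic. */
final class PropertyFilter {

    private PropertyFilter() {}

    /** Returns true if every filter entry is present in properties with the same value. */
    static boolean matches(Map<String, String> properties, Map<String, String> filter) {
        if (filter == null || filter.isEmpty()) {
            return true; // an absent or empty filter accepts every artifact
        }
        for (Map.Entry<String, String> e : filter.entrySet()) {
            if (!e.getValue().equals(properties.get(e.getKey()))) {
                return false;
            }
        }
        return true;
    }

    /** Returns the first candidate whose properties satisfy the filter, if any. */
    static Optional<Map<String, String>> firstMatch(
            List<Map<String, String>> candidates, Map<String, String> filter) {
        return candidates.stream().filter(p -> matches(p, filter)).findFirst();
    }
}
```

Under this rule, a filter of {dataset=imagenet} would select a ResNet artifact trained on ImageNet and skip one trained on CIFAR-10.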
-
openStream
InputStream openStream(Artifact.Item item, String path) throws IOException
Returns an InputStream for an item in a repository.
- Parameters:
item - the item to open
path - the path to a file if the item is a zipped directory. Otherwise, pass null
- Returns:
the file stream
- Throws:
IOException - if it failed to open the stream
-
getFile
Path getFile(Artifact.Item item, String path) throws IOException
Returns the path to a file for the item.
- Parameters:
item - the item to find the path for
path - the path to a file if the item is a zipped directory. Otherwise, pass null
- Returns:
the file path
- Throws:
IOException - if it failed to find the path
-
listDirectory
String[] listDirectory(Artifact.Item item, String path) throws IOException
Returns the list of files directly within a specified directory in a zipped directory item.
- Parameters:
item - the zipped directory item
path - the path within the zip directory
- Returns:
the list of files/directories
- Throws:
IOException - if it failed to list the directory
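Listing only the entries "directly within" a directory of a zip means collapsing deeper entries to their first path segment. The sketch below shows one way to do that with the JDK's ZipInputStream over in-memory bytes. ZipListing and its methods are hypothetical, operate on a byte array rather than an Artifact.Item, and wrap IOException in UncheckedIOException for brevity; the real method declares IOException.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Set;
import java.util.TreeSet;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical zip helper for illustration; not the library's implementation. */
final class ZipListing {

    private ZipListing() {}

    /** Returns the sorted names of entries directly inside dir ("" means the zip root). */
    static String[] listDirectory(byte[] zipBytes, String dir) {
        String prefix = dir.isEmpty() || dir.endsWith("/") ? dir : dir + "/";
        Set<String> names = new TreeSet<>();
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                String name = entry.getName();
                if (!name.startsWith(prefix) || name.length() == prefix.length()) {
                    continue; // outside the directory, or the directory entry itself
                }
                String rest = name.substring(prefix.length());
                int slash = rest.indexOf('/');
                // Keep only the first segment so nested entries collapse to their top-level directory.
                names.add(slash < 0 ? rest : rest.substring(0, slash));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names.toArray(new String[0]);
    }

    /** Builds an in-memory zip containing empty entries with the given names (demo helper). */
    static byte[] zipOf(String... entryNames) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bos)) {
            for (String n : entryNames) {
                zos.putNextEntry(new ZipEntry(n));
                zos.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }
}
```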
-
prepare
default void prepare(Artifact artifact) throws IOException
Prepares the artifact for use.
- Parameters:
artifact - the artifact to prepare
- Throws:
IOException - if it failed to prepare
-
prepare
void prepare(Artifact artifact, Progress progress) throws IOException
Prepares the artifact for use with progress tracking.
- Parameters:
artifact - the artifact to prepare
progress - the progress tracker
- Throws:
IOException - if it failed to prepare
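Progress tracking during preparation usually amounts to reporting how many bytes have been transferred so far. The sketch below copies a stream in chunks and reports the running total after each chunk. ProgressCopy is a hypothetical name, and the LongConsumer callback is a stand-in assumed for the example; it does not reflect the shape of the progress tracker type this method accepts.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.function.LongConsumer;

/** Hypothetical copy-with-progress helper for illustration only. */
final class ProgressCopy {

    private ProgressCopy() {}

    /**
     * Copies all bytes from in to out using a buffer of the given size and reports
     * the cumulative byte count after each chunk. Returns the total bytes copied.
     */
    static long copy(InputStream in, OutputStream out, int bufferSize, LongConsumer onProgress) {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        try {
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
                total += read;
                onProgress.accept(total);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }
}
```

Reporting the cumulative total rather than the chunk size lets the caller render a percentage without keeping its own counter.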
-
getCacheDirectory
Path getCacheDirectory() throws IOException
Returns the cache directory for the repository.
- Returns:
the cache directory path
- Throws:
IOException - if it failed to ensure the creation of the cache directory
-
getResourceDirectory
default Path getResourceDirectory(Artifact artifact) throws IOException
Returns the resource directory for an artifact.
- Parameters:
artifact - the artifact whose resource directory to return
- Returns:
the resource directory path
- Throws:
IOException - if it failed to ensure the creation of the cache directory
-
getResources
List<MRL> getResources()
Returns a list of MRLs in the repository.
An empty list will be returned if the underlying Repository implementation does not support this feature.
- Returns:
a list of MRLs in the repository
-
addResource
void addResource(MRL mrl)
Adds resource to the repository.
- Parameters:
mrl - the resource to add
-