public class DataManager extends Object
Constructor and Description |
---|
DataManager(RestClient client)<br>Construct a DataManager using the given RestClient to communicate with the API. |
DataManager(String apiKeyId, String apiPassword)<br>Construct a DataManager accessing the GATE Cloud public API with the given credentials. |
Modifier and Type | Method and Description |
---|---|
DataBundle | createARCBundleFromS3(String bundleName, String accessKeyId, String secretKey, InputType inputType, String encoding, String mimeTypeOverride, String mimeTypes, String... locations)<br>Create a new data bundle from a list of s3://bucket/key URLs that point to ARC or WARC archives that are already hosted on Amazon S3. |
DataBundle | createARCBundleFromUploads(String bundleName, InputType inputType, String encoding, String mimeTypeOverride, String mimeTypes, File... localFiles)<br>Create a new data bundle by uploading local ARC or WARC archives to GATE Cloud managed storage. |
DataBundle | createArchiveBundleFromS3(String bundleName, String accessKeyId, String secretKey, InputType inputType, String encoding, String mimeTypeOverride, String fileExtensions, String... locations)<br>Create a new data bundle from a list of s3://bucket/key URLs that point to ZIP or TAR archives or Twitter JSON files that are already hosted on Amazon S3. |
DataBundle | createArchiveBundleFromUploads(String bundleName, InputType inputType, String encoding, String mimeTypeOverride, String fileExtensions, File... localFiles)<br>Create a new data bundle by uploading local ZIP or TAR archives or Twitter JSON files to GATE Cloud managed storage. |
protected DataBundle | createS3Bundle(String bundleName, String accessKeyId, String secretKey, InputType inputType, String encoding, String mimeTypeOverride, String fileExtensions, String mimeTypes, String... locations)<br>Common logic for creating bundles from S3. |
DataBundle | getBundle(long id)<br>Get details of a particular bundle given its ID. |
DataBundle | getBundle(String url)<br>Get details of a specific bundle given its detail URL (which will have been received from an earlier API call). |
List<DataBundleSummary> | listBundles()<br>List all the data bundles that are owned by the authenticating user. |
protected DataBundle | uploadBundle(String bundleName, InputType inputType, String encoding, String mimeTypeOverride, String fileExtensions, String mimeTypes, File... localFiles)<br>Common logic for creating bundles from uploads. |
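
The fileExtensions and mimeTypes arguments in the signatures above are plain strings. The parameter descriptions on this page specify a comma-separated list for fileExtensions and a space-separated list for mimeTypes, with null meaning that no filter is applied. The following is a minimal sketch of how such strings might be assembled from Java collections. BundleFilters is a hypothetical helper, not part of the GATE Cloud client library, and returning null for an empty list is an assumption made here so that the documented default applies. Whether extensions should carry a leading dot is not stated on this page, so values are passed through unchanged apart from trimming.

```java
import java.util.List;
import java.util.StringJoiner;

/** Hypothetical helper, not part of the GATE Cloud client library. */
final class BundleFilters {
    private BundleFilters() {}

    /** Joins extensions with commas, the format described for fileExtensions. */
    static String fileExtensions(List<String> extensions) {
        return join(extensions, ",");
    }

    /** Joins MIME types with single spaces, the format described for mimeTypes. */
    static String mimeTypes(List<String> types) {
        return join(types, " ");
    }

    /**
     * Trims each item, skips null or blank items, and joins the rest.
     * Returns null when nothing remains (assumption: fall back to "no filter").
     */
    private static String join(List<String> items, String separator) {
        if (items == null) {
            return null;
        }
        StringJoiner joiner = new StringJoiner(separator);
        int kept = 0;
        for (String item : items) {
            if (item == null || item.trim().isEmpty()) {
                continue;
            }
            String trimmed = item.trim();
            if (trimmed.contains(separator)) {
                // An item containing the separator would be split into two entries.
                throw new IllegalArgumentException("item contains the separator: " + trimmed);
            }
            joiner.add(trimmed);
            kept++;
        }
        return kept == 0 ? null : joiner.toString();
    }
}
```

The returned strings could then be supplied as the fileExtensions or mimeTypes argument of the bundle creation methods listed in the table.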
public DataManager(RestClient client)
Construct a DataManager using the given RestClient to communicate with the API.
Parameters:
client - the client object used for communication

public List<DataBundleSummary> listBundles()
List all the data bundles that are owned by the authenticating user.
Returns:
DataBundleSummary objects representing the user's bundles - call the details method to get the full detail (which requires another API call).

public DataBundle getBundle(long id)
Get details of a particular bundle given its ID.
Parameters:
id - the ID of the required bundle

public DataBundle getBundle(String url)
Get details of a specific bundle given its detail URL (which will have been received from an earlier API call).
Parameters:
url - the detail URL for the required bundle

public DataBundle createArchiveBundleFromS3(String bundleName, String accessKeyId, String secretKey, InputType inputType, String encoding, String mimeTypeOverride, String fileExtensions, String... locations)
Create a new data bundle from a list of s3://bucket/key URLs that point to ZIP or TAR archives or Twitter JSON files that are already hosted on Amazon S3. Note that all the files in a bundle must be the same type - one bundle cannot contain a mixture of ZIP and TAR archives, for example. Job parameters for the file encoding, MIME type, and the file extensions used to filter the entries from the archives can also be supplied, and will feed through to any annotation jobs that take their input from the created bundle.
Parameters:
bundleName - a name for the new bundle.
accessKeyId - an AWS access key ID (typically a limited IAM user) with permission to get the specified object.
secretKey - the corresponding AWS secret key.
inputType - the type of the input.
encoding - character encoding to use when reading entries from the archive. If null, UTF-8 will be used. Should be left as null for Twitter input types.
mimeTypeOverride - the MIME type to use when parsing entries from the archive. If null, the appropriate type will be guessed based on the file name extension. Should be left as null for Twitter input types.
fileExtensions - comma-separated list of file extensions that will be processed. Entries that do not match any of these extensions will be ignored. If null, all entries that represent files (as opposed to directories) will be processed. Should be left as null for Twitter input types.
locations - "URLs" of the form s3://bucketname/key denoting the target objects in Amazon S3.

public DataBundle createARCBundleFromS3(String bundleName, String accessKeyId, String secretKey, InputType inputType, String encoding, String mimeTypeOverride, String mimeTypes, String... locations)
Create a new data bundle from a list of s3://bucket/key URLs that point to ARC or WARC archives that are already hosted on Amazon S3. Note that all the files in a bundle must be the same type - one bundle cannot contain a mixture of ARC and WARC archives, for example. Job parameters for the file encoding, MIME type, and the MIME types used to filter the entries from the archives can also be supplied, and will feed through to any annotation jobs that take their input from the created bundle.
Parameters:
bundleName - a name for the new bundle.
accessKeyId - an AWS access key ID (typically a limited IAM user) with permission to get the specified object.
secretKey - the corresponding AWS secret key.
inputType - the type of the input.
encoding - character encoding to use when reading entries from the archive. If null, UTF-8 will be used. Should be left as null for Twitter input types.
mimeTypeOverride - the MIME type to use when parsing entries from the archive. If null, the appropriate type will be guessed based on the file name extension. Should be left as null for Twitter input types.
mimeTypes - space-separated list of MIME types used to filter the entries of interest from the ARC file. Entries whose MIME type does not match any of these will be ignored.
locations - "URLs" of the form s3://bucketname/key denoting the target objects in Amazon S3.

protected DataBundle createS3Bundle(String bundleName, String accessKeyId, String secretKey, InputType inputType, String encoding, String mimeTypeOverride, String fileExtensions, String mimeTypes, String... locations)
Common logic for creating bundles from S3.
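
The S3-based methods above take their locations as strings of the form s3://bucketname/key and require every file in a bundle to be the same type. The following is a minimal sketch of a pre-flight step that formats the location strings and rejects an obviously mixed list before any API call is made. S3Locations is a hypothetical helper, not part of the GATE Cloud client library, and comparing file name extensions is only an assumed heuristic for "same type": the library itself takes the type through the inputType argument.

```java
import java.util.List;
import java.util.Locale;

/** Hypothetical helper, not part of the GATE Cloud client library. */
final class S3Locations {
    private S3Locations() {}

    /** Formats one bucket/key pair as an s3://bucket/key location string. */
    static String toLocation(String bucket, String key) {
        if (bucket == null || bucket.isEmpty() || key == null || key.isEmpty()) {
            throw new IllegalArgumentException("bucket and key must be non-empty");
        }
        return "s3://" + bucket + "/" + key;
    }

    /** Lower-cased text after the last dot of the final path segment, or "" if none. */
    static String extensionOf(String key) {
        String name = key.substring(key.lastIndexOf('/') + 1);
        int dot = name.lastIndexOf('.');
        return dot < 0 ? "" : name.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    /**
     * Builds the locations array for one bundle. Keys whose extensions differ
     * are rejected (assumption: differing extensions mean mixed archive types).
     */
    static String[] toLocations(String bucket, List<String> keys) {
        if (keys == null || keys.isEmpty()) {
            throw new IllegalArgumentException("at least one key is required");
        }
        String expected = null;
        String[] locations = new String[keys.size()];
        for (int i = 0; i < keys.size(); i++) {
            String key = keys.get(i);
            locations[i] = toLocation(bucket, key); // validates bucket and key
            String extension = extensionOf(key);
            if (expected == null) {
                expected = extension;
            } else if (!extension.equals(expected)) {
                throw new IllegalArgumentException(
                        "mixed file types in one bundle: ." + expected + " and ." + extension);
            }
        }
        return locations;
    }
}
```

Because Java accepts a String[] wherever a String... parameter is declared, the returned array could be passed directly as the locations argument of createArchiveBundleFromS3 or createARCBundleFromS3.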

public DataBundle createArchiveBundleFromUploads(String bundleName, InputType inputType, String encoding, String mimeTypeOverride, String fileExtensions, File... localFiles)
Create a new data bundle by uploading local ZIP or TAR archives or Twitter JSON files to GATE Cloud managed storage.
Parameters:
bundleName - a name for the new bundle.
inputType - the type of the input.
encoding - character encoding to use when reading entries from the archive. If null, UTF-8 will be used. Should be left as null for Twitter input types.
mimeTypeOverride - the MIME type to use when parsing entries from the archive. If null, the appropriate type will be guessed based on the file name extension. Should be left as null for Twitter input types.
fileExtensions - comma-separated list of file extensions that will be processed. Entries that do not match any of these extensions will be ignored. If null, all entries that represent files (as opposed to directories) will be processed. Should be left as null for Twitter input types.
localFiles - the files to upload.

public DataBundle createARCBundleFromUploads(String bundleName, InputType inputType, String encoding, String mimeTypeOverride, String mimeTypes, File... localFiles)
Create a new data bundle by uploading local ARC or WARC archives to GATE Cloud managed storage.
Parameters:
bundleName - a name for the new bundle.
inputType - the type of the input.
encoding - character encoding to use when reading entries from the archive. If null, UTF-8 will be used. Should be left as null for Twitter input types.
mimeTypeOverride - the MIME type to use when parsing entries from the archive. If null, the appropriate type will be guessed based on the file name extension. Should be left as null for Twitter input types.
mimeTypes - space-separated list of MIME types used to filter the entries of interest from the ARC file. Entries whose MIME type does not match any of these will be ignored.
localFiles - the files to upload.

Copyright © 2023 GATE. All rights reserved.