@InterfaceAudience.Private @InterfaceStability.Evolving public final class S3AUtils extends Object
Modifier and Type | Class and Description |
---|---|
static interface |
S3AUtils.CallOnLocatedFileStatus
An interface for use in lambda-expressions working with
directory tree listings.
|
static interface |
S3AUtils.LocatedFileStatusMap<T>
An interface for use in lambda-expressions working with
directory tree listings.
|
Modifier and Type | Field and Description |
---|---|
static org.apache.hadoop.fs.PathFilter |
ACCEPT_ALL
A Path filter which accepts all filenames.
|
static String |
EOF_MESSAGE_IN_XML_PARSER |
static String |
EOF_READ_DIFFERENT_LENGTH |
static org.apache.hadoop.fs.PathFilter |
HIDDEN_FILE_FILTER
Path filter which ignores any file which starts with .
|
static String |
SSE_C_NO_KEY_ERROR
Encryption SSE-C used but the config lacks an encryption key.
|
static String |
SSE_S3_WITH_KEY_ERROR
Encryption SSE-S3 is used but the caller also set an encryption key.
|
Modifier and Type | Method and Description |
---|---|
static long |
applyLocatedFiles(org.apache.hadoop.fs.RemoteIterator<? extends org.apache.hadoop.fs.LocatedFileStatus> iterator,
S3AUtils.CallOnLocatedFileStatus eval)
Apply an operation to every
LocatedFileStatus in a remote
iterator. |
static EncryptionSecrets |
buildEncryptionSecrets(String bucket,
org.apache.hadoop.conf.Configuration conf)
Get the server-side encryption or client side encryption algorithm.
|
static boolean |
checkDiskBuffer(org.apache.hadoop.conf.Configuration conf)
Check whether the configuration for S3ABlockOutputStream is
consistent or not.
|
static void |
clearBucketOption(org.apache.hadoop.conf.Configuration conf,
String bucket,
String genericKey)
Clear a bucket-specific property.
|
static void |
closeAll(org.slf4j.Logger log,
Closeable... closeables)
Deprecated.
|
static void |
closeAutocloseables(org.slf4j.Logger log,
AutoCloseable... closeables)
Close the Closeable objects and ignore any Exception or
null pointers.
|
static S3AFileStatus |
createFileStatus(org.apache.hadoop.fs.Path keyPath,
software.amazon.awssdk.services.s3.model.S3Object s3Object,
long blockSize,
String owner,
String eTag,
String versionId,
boolean isCSEEnabled)
Create a file status instance from a listing.
|
static S3AFileStatus |
createUploadFileStatus(org.apache.hadoop.fs.Path keyPath,
boolean isDir,
long size,
long blockSize,
String owner,
String eTag,
String versionId)
Create a file status for an object we just uploaded.
|
static long |
dateToLong(Date date)
Date to long conversion.
|
static void |
deleteQuietly(org.apache.hadoop.fs.FileSystem fs,
org.apache.hadoop.fs.Path path,
boolean recursive)
Delete a path quietly: failures are logged at DEBUG.
|
static void |
deleteWithWarning(org.apache.hadoop.fs.FileSystem fs,
org.apache.hadoop.fs.Path path,
boolean recursive)
Delete a path: failures are logged at WARN.
|
static int |
ensureOutputParameterInRange(String name,
long size)
Ensure that the long value is in the range of an integer.
|
static IOException |
extractException(String operation,
String path,
CompletionException ce)
Extract an exception from a failed future, and convert to an IOE.
|
static IOException |
extractException(String operation,
String path,
ExecutionException ee)
Extract an exception from a failed future, and convert to an IOE.
|
static <T> List<T> |
flatmapLocatedFiles(org.apache.hadoop.fs.RemoteIterator<org.apache.hadoop.fs.LocatedFileStatus> iterator,
S3AUtils.LocatedFileStatusMap<Optional<T>> eval)
Map an operation to every
LocatedFileStatus in a remote
iterator, returning a list of all the results which were not empty. |
static String |
formatRange(long rangeStart,
long rangeEnd)
Format a byte range for a request header.
|
static S3xLoginHelper.Login |
getAWSAccessKeys(URI name,
org.apache.hadoop.conf.Configuration conf)
Return the access key and secret for S3 API use.
|
static String |
getBucketOption(org.apache.hadoop.conf.Configuration conf,
String bucket,
String genericKey)
Get a bucket-specific property.
|
static S3AEncryptionMethods |
getEncryptionAlgorithm(String bucket,
org.apache.hadoop.conf.Configuration conf)
Get the server-side encryption or client side encryption algorithm.
|
static <InstanceT> InstanceT |
getInstanceFromReflection(String className,
org.apache.hadoop.conf.Configuration conf,
URI uri,
Class<? extends InstanceT> interfaceImplemented,
String methodName,
String configKey)
Creates an instance of a class using reflection.
|
static long |
getMultipartSizeProperty(org.apache.hadoop.conf.Configuration conf,
String property,
long defVal)
Get a size property from the configuration: this property must
be at least equal to
Constants.MULTIPART_MIN_SIZE . |
static String |
getS3EncryptionKey(String bucket,
org.apache.hadoop.conf.Configuration conf)
Get any S3 encryption key, without propagating exceptions from
JCEKS files.
|
static String |
getS3EncryptionKey(String bucket,
org.apache.hadoop.conf.Configuration conf,
boolean propagateExceptions)
Get any SSE/CSE key from a configuration/credential provider.
|
static Map<String,String> |
getTrimmedStringCollectionSplitByEquals(org.apache.hadoop.conf.Configuration configuration,
String name)
Get the key-value pairs of the
name property, delimited by the equals sign (=), as
a collection of pairs of Strings, trimmed of leading and trailing whitespace
after splitting the value of name by comma and newline separators. |
static int |
intOption(org.apache.hadoop.conf.Configuration conf,
String key,
int defVal,
int min)
Get an integer option >= the minimum allowed value.
|
static boolean |
isMessageTranslatableToEOF(software.amazon.awssdk.core.exception.SdkException ex)
Cue that an AWS exception is likely to be an EOF Exception based
on the message coming back from the client.
|
static boolean |
isThrottleException(Exception ex)
Is the exception an instance of a throttling exception.
|
static S3AFileStatus[] |
iteratorToStatuses(org.apache.hadoop.fs.RemoteIterator<S3AFileStatus> iterator)
Convert the data of an iterator of
S3AFileStatus to
an array. |
static org.apache.hadoop.fs.RemoteIterator<org.apache.hadoop.fs.LocatedFileStatus> |
listAndFilter(org.apache.hadoop.fs.FileSystem fileSystem,
org.apache.hadoop.fs.Path path,
boolean recursive,
org.apache.hadoop.fs.PathFilter filter)
List located files and filter them as a classic listFiles(path, filter)
would do.
|
static long |
longBytesOption(org.apache.hadoop.conf.Configuration conf,
String key,
long defVal,
long min)
Get a long option >= the minimum allowed value, supporting memory
prefixes K,M,G,T,P.
|
static long |
longOption(org.apache.hadoop.conf.Configuration conf,
String key,
long defVal,
long min)
Get a long option >= the minimum allowed value.
|
static String |
lookupPassword(String bucket,
org.apache.hadoop.conf.Configuration conf,
String baseKey)
Get a password from a configuration, including JCEKS files, handling both
the absolute key and bucket override.
|
static String |
lookupPassword(String bucket,
org.apache.hadoop.conf.Configuration conf,
String baseKey,
String overrideVal)
Deprecated.
|
static String |
lookupPassword(String bucket,
org.apache.hadoop.conf.Configuration conf,
String baseKey,
String overrideVal,
String defVal)
Get a password from a configuration, including JCEKS files, handling both
the absolute key and bucket override.
|
static <T> List<T> |
mapLocatedFiles(org.apache.hadoop.fs.RemoteIterator<? extends org.apache.hadoop.fs.LocatedFileStatus> iterator,
S3AUtils.LocatedFileStatusMap<T> eval)
Map an operation to every
LocatedFileStatus in a remote
iterator, returning a list of the results. |
static <T> Optional<T> |
maybe(boolean include,
T value)
Convert a value into a non-empty Optional instance if
the value of
include is true. |
static String |
maybeAddTrailingSlash(String key)
Turns a path (relative or otherwise) into an S3 key, adding a trailing
"/" if the path is not the root and does not already have a "/"
at the end.
|
static boolean |
objectRepresentsDirectory(String name)
Predicate: does the object represent a directory?
|
static org.apache.hadoop.conf.Configuration |
propagateBucketOptions(org.apache.hadoop.conf.Configuration source,
String bucket)
Propagates bucket-specific settings into generic S3A configuration keys.
|
static void |
setBucketOption(org.apache.hadoop.conf.Configuration conf,
String bucket,
String genericKey,
String value)
Set a bucket-specific property to a particular value.
|
static boolean |
setIfDefined(org.apache.hadoop.conf.Configuration config,
String key,
String val,
String origin)
Set a key if the value is non-empty.
|
static String |
stringify(software.amazon.awssdk.awscore.exception.AwsServiceException e)
Get low level details of an amazon exception for logging; multi-line.
|
static String |
stringify(software.amazon.awssdk.services.s3.model.S3Object s3Object)
String information about a summary entry for debug messages.
|
static IOException |
translateException(String operation,
org.apache.hadoop.fs.Path path,
software.amazon.awssdk.core.exception.SdkException exception)
Translate an exception raised in an operation into an IOException.
|
static IOException |
translateException(String operation,
String path,
software.amazon.awssdk.core.exception.SdkException exception)
Translate an exception raised in an operation into an IOException.
|
static void |
validateOutputStreamConfiguration(org.apache.hadoop.fs.Path path,
org.apache.hadoop.conf.Configuration conf)
Validates the output stream configuration.
|
public static final String SSE_C_NO_KEY_ERROR
public static final String SSE_S3_WITH_KEY_ERROR
public static final String EOF_MESSAGE_IN_XML_PARSER
public static final String EOF_READ_DIFFERENT_LENGTH
public static final org.apache.hadoop.fs.PathFilter HIDDEN_FILE_FILTER
public static final org.apache.hadoop.fs.PathFilter ACCEPT_ALL
public static IOException translateException(String operation, org.apache.hadoop.fs.Path path, software.amazon.awssdk.core.exception.SdkException exception)
Translate an exception raised in an operation into an IOException.
The specific type of IOException depends on the class of SdkException
passed in, and any status codes included in the operation. That is: HTTP
error codes are examined and can be used to build a more specific response.
operation - operation
path - path operated on (must not be null)
exception - amazon exception raised

public static IOException translateException(@Nullable String operation, @Nullable String path, software.amazon.awssdk.core.exception.SdkException exception)
Translate an exception raised in an operation into an IOException.
The specific type of IOException depends on the class of SdkException
passed in, and any status codes included in the operation. That is: HTTP
error codes are examined and can be used to build a more specific response.
operation - operation
path - path operated on (may be null)
exception - amazon exception raised

public static IOException extractException(String operation, String path, ExecutionException ee)
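As a rough illustration of what `extractException(String, String, ExecutionException)` does with a failed future, the sketch below unwraps the cause and hands it back as an `IOException`. It is a JDK-only approximation, not the Hadoop implementation: the class name `ExtractExceptionSketch` and the message format are assumptions, and the real method may translate AWS SDK exceptions into more specific types.

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

/** Hypothetical JDK-only sketch; not the S3AUtils implementation. */
final class ExtractExceptionSketch {

  /** Unwrap the cause of a failed future: pass IOExceptions through, wrap anything else. */
  static IOException extractException(String operation, String path, ExecutionException ee) {
    Throwable cause = ee.getCause();
    if (cause instanceof IOException) {
      return (IOException) cause;
    }
    // The path may be null, as the parameter description allows.
    String where = (path == null) ? "" : " on " + path;
    // Assumption: every other cause is wrapped rather than translated.
    return new IOException(operation + where + ": " + cause, cause);
  }

  public static void main(String[] args) throws InterruptedException {
    CompletableFuture<String> future = new CompletableFuture<>();
    future.completeExceptionally(new IllegalStateException("boom"));
    try {
      future.get();
    } catch (ExecutionException ee) {
      IOException ioe = extractException("upload", "s3a://example-bucket/key", ee);
      System.out.println(ioe.getMessage());
    }
  }
}
```

Returning the exception instead of throwing it lets the caller write `throw extractException(...)`, which keeps the compiler's flow analysis happy.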
operation - operation which failed
path - path operated on (may be null)
ee - execution exception

public static IOException extractException(String operation, String path, CompletionException ce)
operation - operation which failed
path - path operated on (may be null)
ce - completion exception

public static boolean isThrottleException(Exception ex)
Is the exception an instance of a throttling exception. That is, an
AWSServiceThrottledException, or anything which the AWS SDK's RetryUtils
considers to be a throttling exception.
ex - exception to examine

public static boolean isMessageTranslatableToEOF(software.amazon.awssdk.core.exception.SdkException ex)
ex - exception

public static String stringify(software.amazon.awssdk.awscore.exception.AwsServiceException e)
e - exception

public static S3AFileStatus createFileStatus(org.apache.hadoop.fs.Path keyPath, software.amazon.awssdk.services.s3.model.S3Object s3Object, long blockSize, String owner, String eTag, String versionId, boolean isCSEEnabled)
keyPath - path to entry
s3Object - s3Object entry
blockSize - block size to declare.
owner - owner of the file
eTag - S3 object eTag or null if unavailable
versionId - S3 object versionId or null if unavailable
isCSEEnabled - is client side encryption enabled?

public static S3AFileStatus createUploadFileStatus(org.apache.hadoop.fs.Path keyPath, boolean isDir, long size, long blockSize, String owner, String eTag, String versionId)
keyPath - path for created object
isDir - true iff directory
size - file length
blockSize - block size for file status
owner - Hadoop username
eTag - S3 object eTag or null if unavailable
versionId - S3 object versionId or null if unavailable

public static boolean objectRepresentsDirectory(String name)
name - object name

public static long dateToLong(Date date)
date - date from AWS query

public static <InstanceT> InstanceT getInstanceFromReflection(String className, org.apache.hadoop.conf.Configuration conf, @Nullable URI uri, Class<? extends InstanceT> interfaceImplemented, String methodName, String configKey) throws IOException
InstanceT - Instance of class
className - name of class for which instance is to be created
conf - configuration
uri - URI of the FS
interfaceImplemented - interface that this class implements
methodName - name of factory method to be invoked
configKey - config key under which this class is specified
IOException - on any problem

public static boolean setIfDefined(org.apache.hadoop.conf.Configuration config, String key, String val, String origin)
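The summary describes `setIfDefined` as setting a key only if the value is non-empty. A minimal sketch of that contract follows, using a plain `Map` as a stand-in for `Configuration`. The class name `SetIfDefinedSketch` is hypothetical, the `origin` argument is left out because a `Map` cannot record where a value came from, and the assumption that the returned boolean reports whether the key was set is inferred from the signature.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical JDK-only sketch; a Map stands in for Hadoop's Configuration. */
final class SetIfDefinedSketch {

  /** Set key to val only when val is non-empty; report whether anything was set. */
  static boolean setIfDefined(Map<String, String> config, String key, String val) {
    if (val != null && !val.isEmpty()) {
      config.put(key, val);
      return true;
    }
    // Empty or missing value: leave the existing setting alone.
    return false;
  }

  public static void main(String[] args) {
    Map<String, String> config = new HashMap<>();
    config.put("example.key", "original");
    System.out.println(setIfDefined(config, "example.key", ""));
    System.out.println(setIfDefined(config, "example.key", "patched"));
    System.out.println(config.get("example.key"));
  }
}
```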
config - config to patch
key - key to set
val - value to probe and set
origin - origin

public static S3xLoginHelper.Login getAWSAccessKeys(URI name, org.apache.hadoop.conf.Configuration conf) throws IOException
name - the URI for which we need the access keys; may be null
conf - the Configuration object to interrogate for keys.
IOException - problems retrieving passwords from KMS.

@Deprecated public static String lookupPassword(String bucket, org.apache.hadoop.conf.Configuration conf, String baseKey, String overrideVal) throws IOException
bucket - bucket or "" if none known
conf - configuration
baseKey - base key to look up, e.g "fs.s3a.secret.key"
overrideVal - override value: if non empty this is used instead of querying the configuration.
IOException - on any IO problem
IllegalArgumentException - bad arguments

public static String lookupPassword(String bucket, org.apache.hadoop.conf.Configuration conf, String baseKey) throws IOException
bucket - bucket or "" if none known
conf - configuration
baseKey - base key to look up, e.g "fs.s3a.secret.key"
IOException - on any IO problem
IllegalArgumentException - bad arguments

@InterfaceAudience.LimitedPrivate(value="Ranger") public static String lookupPassword(String bucket, org.apache.hadoop.conf.Configuration conf, String baseKey, String overrideVal, String defVal) throws IOException
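The five-argument `lookupPassword` combines an override value, a per-bucket key, the base key and a default. One plausible resolution order is sketched below with a plain `Map` in place of `Configuration`. The order shown (override, then bucket-specific key, then base key, then default) is an assumption inferred from the parameter descriptions. JCEKS credential providers are not modelled, and the class name `LookupPasswordSketch` is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical JDK-only sketch; a Map stands in for Configuration and its credential providers. */
final class LookupPasswordSketch {
  static final String FS_S3A_PREFIX = "fs.s3a.";
  static final String FS_S3A_BUCKET_PREFIX = "fs.s3a.bucket.";

  static String lookupPassword(String bucket, Map<String, String> conf,
      String baseKey, String overrideVal, String defVal) {
    if (baseKey == null || !baseKey.startsWith(FS_S3A_PREFIX)) {
      throw new IllegalArgumentException("Key does not start with " + FS_S3A_PREFIX + ": " + baseKey);
    }
    // 1. A non-empty override wins without consulting the configuration.
    if (overrideVal != null && !overrideVal.isEmpty()) {
      return overrideVal;
    }
    // 2. Assumed next step: the bucket-specific form of the key.
    if (bucket != null && !bucket.isEmpty()) {
      String suffix = baseKey.substring(FS_S3A_PREFIX.length());
      String bucketValue = conf.get(FS_S3A_BUCKET_PREFIX + bucket + "." + suffix);
      if (bucketValue != null && !bucketValue.isEmpty()) {
        return bucketValue;
      }
    }
    // 3. Fall back to the base key, then to the supplied default.
    String baseValue = conf.get(baseKey);
    return (baseValue != null && !baseValue.isEmpty()) ? baseValue : defVal;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put("fs.s3a.secret.key", "base-secret");
    conf.put("fs.s3a.bucket.logs.secret.key", "logs-secret");
    System.out.println(lookupPassword("logs", conf, "fs.s3a.secret.key", "", ""));
    System.out.println(lookupPassword("", conf, "fs.s3a.secret.key", "", ""));
  }
}
```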
bucket - bucket or "" if none known
conf - configuration
baseKey - base key to look up, e.g "fs.s3a.secret.key"
overrideVal - override value: if non empty this is used instead of querying the configuration.
defVal - value to return if there is no password
IOException - on any IO problem
IllegalArgumentException - bad arguments

public static String stringify(software.amazon.awssdk.services.s3.model.S3Object s3Object)
s3Object - s3Object entry

public static int intOption(org.apache.hadoop.conf.Configuration conf, String key, int defVal, int min)
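`intOption` reads an integer option and rejects values below `min` with an `IllegalArgumentException`. The sketch below shows that contract against a plain `Map`. The class name `IntOptionSketch`, the error message text and the use of `Integer.parseInt` are assumptions, not the Hadoop code.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical JDK-only sketch; a Map stands in for Hadoop's Configuration. */
final class IntOptionSketch {

  /** Read an int option, falling back to defVal, and reject anything below min. */
  static int intOption(Map<String, String> conf, String key, int defVal, int min) {
    String raw = conf.get(key);
    int value = (raw == null) ? defVal : Integer.parseInt(raw.trim());
    if (value < min) {
      throw new IllegalArgumentException(
          String.format("Value of %s: %d is below the minimum value %d", key, value, min));
    }
    return value;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put("example.threads", "16"); // hypothetical key
    System.out.println(intOption(conf, "example.threads", 4, 1));
    System.out.println(intOption(conf, "example.unset", 4, 1));
  }
}
```

Note that the default is also checked against the minimum, so a caller passing `defVal < min` gets the same exception when the key is unset.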
conf - configuration
key - key to look up
defVal - default value
min - minimum value
IllegalArgumentException - if the value is below the minimum

public static long longOption(org.apache.hadoop.conf.Configuration conf, String key, long defVal, long min)
conf - configuration
key - key to look up
defVal - default value
min - minimum value
IllegalArgumentException - if the value is below the minimum

public static long longBytesOption(org.apache.hadoop.conf.Configuration conf, String key, long defVal, long min)
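`longBytesOption` is summarised as supporting the memory prefixes K, M, G, T and P. The sketch below shows one way such values could be parsed and range-checked. It assumes binary multipliers (K = 1024) and a single trailing letter. The class name `LongBytesOptionSketch`, the helper `parseBytes` and the key used in `main` are hypothetical.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical JDK-only sketch; a Map stands in for Hadoop's Configuration. */
final class LongBytesOptionSketch {
  // Assumed binary multipliers: K = 2^10, M = 2^20, G = 2^30, T = 2^40, P = 2^50.
  private static final String PREFIXES = "KMGTP";

  /** Parse "512", "8K", "64M" and so on into a byte count. */
  static long parseBytes(String raw) {
    String s = raw.trim().toUpperCase(Locale.ROOT);
    if (s.isEmpty()) {
      throw new IllegalArgumentException("empty size value");
    }
    int idx = PREFIXES.indexOf(s.charAt(s.length() - 1));
    if (idx < 0) {
      return Long.parseLong(s); // plain number of bytes
    }
    long number = Long.parseLong(s.substring(0, s.length() - 1).trim());
    // multiplyExact throws ArithmeticException instead of silently overflowing.
    return Math.multiplyExact(number, 1L << (10 * (idx + 1)));
  }

  /** Read a size option, falling back to defVal, and reject anything below min. */
  static long longBytesOption(Map<String, String> conf, String key, long defVal, long min) {
    String raw = conf.get(key);
    long value = (raw == null) ? defVal : parseBytes(raw);
    if (value < min) {
      throw new IllegalArgumentException(
          String.format("Value of %s: %d is below the minimum value %d", key, value, min));
    }
    return value;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put("example.part.size", "64M"); // hypothetical key
    System.out.println(longBytesOption(conf, "example.part.size", 1024L, 1L));
  }
}
```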
conf - configuration
key - key to look up
defVal - default value
min - minimum value
IllegalArgumentException - if the value is below the minimum

public static long getMultipartSizeProperty(org.apache.hadoop.conf.Configuration conf, String property, long defVal)
Get a size property from the configuration: this property must be at least
equal to Constants.MULTIPART_MIN_SIZE.
If it is too small, it is rounded up to that minimum, and a warning
printed.
conf - configuration
property - property name
defVal - default value

public static void validateOutputStreamConfiguration(org.apache.hadoop.fs.Path path, org.apache.hadoop.conf.Configuration conf) throws org.apache.hadoop.fs.PathIOException
path - path: for error messages
conf - configuration object for the given context
org.apache.hadoop.fs.PathIOException - Unsupported configuration.

public static boolean checkDiskBuffer(org.apache.hadoop.conf.Configuration conf)
conf - configuration object for the given context

public static int ensureOutputParameterInRange(String name, long size)
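`ensureOutputParameterInRange` takes a `long` and returns an `int`, so it has to decide what to do with values that do not fit. The sketch below assumes the oversized value is capped at `Integer.MAX_VALUE` with a warning rather than rejected. That behaviour, the class name `OutputRangeSketch` and the use of `System.err` for the warning are all assumptions.

```java
/** Hypothetical JDK-only sketch; not the S3AUtils implementation. */
final class OutputRangeSketch {

  /** Narrow a size to int, capping values above Integer.MAX_VALUE. Sizes are assumed non-negative. */
  static int ensureOutputParameterInRange(String name, long size) {
    if (size > Integer.MAX_VALUE) {
      // Assumption: warn and cap instead of throwing.
      System.err.println(name + ": capping " + size + " to " + Integer.MAX_VALUE);
      return Integer.MAX_VALUE;
    }
    return (int) size;
  }

  public static void main(String[] args) {
    System.out.println(ensureOutputParameterInRange("example.size", 4096L));
    System.out.println(ensureOutputParameterInRange("example.size", 1L << 40));
  }
}
```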
name - property name for error messages
size - original size

@InterfaceAudience.LimitedPrivate(value="Ranger") public static org.apache.hadoop.conf.Configuration propagateBucketOptions(org.apache.hadoop.conf.Configuration source, String bucket)
Propagates bucket-specific settings into generic S3A configuration keys.
This is done by propagating the values of the form
fs.s3a.bucket.${bucket}.key to fs.s3a.key, for all values of "key" other
than a small set of unmodifiable values.

The source of the updated property is set to the key name of the bucket
property, to aid in diagnostics of where things came from.

Returns a new configuration. Why the clone?
You can use the same conf for different filesystems, and the original
values are not updated.

The fs.s3a.impl property cannot be set, nor can any with the prefix
fs.s3a.bucket.

This method does not propagate security provider path information from
the S3A property into the Hadoop common provider: callers must call
patchSecurityCredentialProviders(Configuration) explicitly.
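The propagation rule for `propagateBucketOptions` can be sketched with a plain `Map` standing in for `Configuration`. This is an illustration under stated assumptions, not the Hadoop code: the class name `BucketOptionsSketch` is hypothetical, the "small set of unmodifiable values" is reduced to the two cases the text names (`fs.s3a.impl` and anything under `fs.s3a.bucket.`), and property-source tracking is omitted.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical JDK-only sketch; a Map stands in for Hadoop's Configuration. */
final class BucketOptionsSketch {
  static final String FS_S3A_PREFIX = "fs.s3a.";
  static final String FS_S3A_BUCKET_PREFIX = "fs.s3a.bucket.";

  /** Copy fs.s3a.bucket.BUCKET.key onto fs.s3a.key in a clone of the source. */
  static Map<String, String> propagateBucketOptions(Map<String, String> source, String bucket) {
    if (bucket == null || bucket.isEmpty()) {
      throw new IllegalArgumentException("bucket is null/empty");
    }
    String bucketPrefix = FS_S3A_BUCKET_PREFIX + bucket + ".";
    // Work on a copy so the same source can serve several filesystems.
    Map<String, String> dest = new HashMap<>(source);
    for (Map.Entry<String, String> entry : source.entrySet()) {
      String key = entry.getKey();
      if (!key.startsWith(bucketPrefix) || key.length() == bucketPrefix.length()) {
        continue; // not an option for this bucket
      }
      String stripped = key.substring(bucketPrefix.length());
      if (stripped.equals("impl") || stripped.startsWith("bucket.")) {
        continue; // fs.s3a.impl and fs.s3a.bucket.* cannot be set this way
      }
      dest.put(FS_S3A_PREFIX + stripped, entry.getValue());
    }
    return dest;
  }

  public static void main(String[] args) {
    Map<String, String> source = new HashMap<>();
    source.put("fs.s3a.endpoint", "default-endpoint");
    source.put("fs.s3a.bucket.logs.endpoint", "logs-endpoint");
    Map<String, String> patched = propagateBucketOptions(source, "logs");
    System.out.println(patched.get("fs.s3a.endpoint"));
    System.out.println(source.get("fs.s3a.endpoint"));
  }
}
```

Because the method returns a copy, the caller's original map keeps its generic value, which mirrors the "Why the clone?" note above.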
source - Source Configuration object.
bucket - bucket name. Must not be empty.

public static void deleteQuietly(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, boolean recursive)
fs - filesystem
path - path
recursive - recursive?

public static void deleteWithWarning(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, boolean recursive)
fs - filesystem
path - path
recursive - recursive?

public static S3AFileStatus[] iteratorToStatuses(org.apache.hadoop.fs.RemoteIterator<S3AFileStatus> iterator) throws IOException
Convert the data of an iterator of S3AFileStatus to an array.
iterator - a non-null iterator
IOException - failure

public static long applyLocatedFiles(org.apache.hadoop.fs.RemoteIterator<? extends org.apache.hadoop.fs.LocatedFileStatus> iterator, S3AUtils.CallOnLocatedFileStatus eval) throws IOException
Apply an operation to every LocatedFileStatus in a remote iterator.
iterator - iterator from a list
eval - closure to evaluate
IOException - anything in the closure, or iteration logic.

public static <T> List<T> mapLocatedFiles(org.apache.hadoop.fs.RemoteIterator<? extends org.apache.hadoop.fs.LocatedFileStatus> iterator, S3AUtils.LocatedFileStatusMap<T> eval) throws IOException
Map an operation to every LocatedFileStatus in a remote iterator, returning a list of the results.
T - return type of map
iterator - iterator from a list
eval - closure to evaluate
IOException - anything in the closure, or iteration logic.

public static <T> List<T> flatmapLocatedFiles(org.apache.hadoop.fs.RemoteIterator<org.apache.hadoop.fs.LocatedFileStatus> iterator, S3AUtils.LocatedFileStatusMap<Optional<T>> eval) throws IOException
Map an operation to every LocatedFileStatus in a remote iterator, returning a list of all the results which were not empty.
T - return type of map
iterator - iterator from a list
eval - closure to evaluate
IOException - anything in the closure, or iteration logic.

public static org.apache.hadoop.fs.RemoteIterator<org.apache.hadoop.fs.LocatedFileStatus> listAndFilter(org.apache.hadoop.fs.FileSystem fileSystem, org.apache.hadoop.fs.Path path, boolean recursive, org.apache.hadoop.fs.PathFilter filter) throws IOException
fileSystem - filesystem
path - path to list
recursive - recursive listing?
filter - filter for the filename
IOException - IO failure.

public static <T> Optional<T> maybe(boolean include, T value)
Convert a value into a non-empty Optional instance if the value of include is true.
T - type of option.
include - flag to indicate the value is to be included.
value - value to return

public static String getS3EncryptionKey(String bucket, org.apache.hadoop.conf.Configuration conf)
bucket - bucket to query for
conf - configuration to examine
IllegalArgumentException - bad arguments.

public static String getS3EncryptionKey(String bucket, org.apache.hadoop.conf.Configuration conf, boolean propagateExceptions) throws IOException
Get any SSE/CSE key from a configuration/credential provider, as set in the
option SERVER_SIDE_ENCRYPTION_KEY.
IOExceptions raised during retrieval are swallowed.
bucket - bucket to query for
conf - configuration to examine
propagateExceptions - should IO exceptions be rethrown?
IllegalArgumentException - bad arguments.
IOException - if propagateExceptions==true and reading a JCEKS file raised an IOE

public static S3AEncryptionMethods getEncryptionAlgorithm(String bucket, org.apache.hadoop.conf.Configuration conf) throws IOException
bucket - bucket to query for
conf - configuration to scan
Returns NONE unless one is set.
IOException - on JCEKS lookup or invalid method/key configuration.

public static EncryptionSecrets buildEncryptionSecrets(String bucket, org.apache.hadoop.conf.Configuration conf) throws IOException
bucket - bucket to query for
conf - configuration to scan
Returns the encryption secrets; the algorithm is NONE unless one is set.
IOException - on JCEKS lookup or invalid method/key configuration.

@Deprecated public static void closeAll(org.slf4j.Logger log, Closeable... closeables)
Deprecated. Use IOUtils.cleanupWithLogger(Logger, Closeable...).
log - the log to log at debug level. Can be null.
closeables - the objects to close

public static void closeAutocloseables(org.slf4j.Logger log, AutoCloseable... closeables)
Close the Closeable objects and ignore any Exception or null pointers
(the SLF4J equivalent of that in IOUtils).
log - the log to log at debug level. Can be null.
closeables - the objects to close

public static void setBucketOption(org.apache.hadoop.conf.Configuration conf, String bucket, String genericKey, String value)
Set a bucket-specific property to a particular value.
If the generic key passed in has an fs.s3a. prefix,
that's stripped off, so that when the bucket properties are propagated
down to the generic values, that value gets copied down.
conf - configuration to set
bucket - bucket name
genericKey - key; can start with "fs.s3a."
value - value to set

public static void clearBucketOption(org.apache.hadoop.conf.Configuration conf, String bucket, String genericKey)
Clear a bucket-specific property.
If the generic key passed in has an fs.s3a. prefix,
that's stripped off, so that when the bucket properties are propagated
down to the generic values, that value gets copied down.
conf - configuration to set
bucket - bucket name
genericKey - key; can start with "fs.s3a."

public static String getBucketOption(org.apache.hadoop.conf.Configuration conf, String bucket, String genericKey)
Get a bucket-specific property.
If the generic key passed in has an fs.s3a. prefix,
that's stripped off.
conf - configuration to set
bucket - bucket name
genericKey - key; can start with "fs.s3a."

public static String maybeAddTrailingSlash(String key)
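The summary for `maybeAddTrailingSlash` states the rule directly: add a trailing "/" unless the key is the root or already ends with one. A minimal sketch of that rule follows, assuming the root is represented by the empty string (as the parameter description `s3 key or ""` suggests). The class name `TrailingSlashSketch` is hypothetical.

```java
/** Hypothetical JDK-only sketch; not the S3AUtils implementation. */
final class TrailingSlashSketch {

  /** Append "/" unless the key is the root ("") or already ends with "/". */
  static String maybeAddTrailingSlash(String key) {
    if (!key.isEmpty() && !key.endsWith("/")) {
      return key + "/";
    }
    return key;
  }

  public static void main(String[] args) {
    System.out.println(maybeAddTrailingSlash("data/2024"));
    System.out.println(maybeAddTrailingSlash("data/2024/"));
    System.out.println(maybeAddTrailingSlash("").isEmpty());
  }
}
```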
key - s3 key or ""

public static String formatRange(long rangeStart, long rangeEnd)
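For `formatRange`, the sketch below assumes the request header in question is the HTTP `Range` header, whose byte-range form is `bytes=start-end` with an inclusive end offset. That assumption and the class name `RangeSketch` are not confirmed by this page.

```java
/** Hypothetical JDK-only sketch; not the S3AUtils implementation. */
final class RangeSketch {

  /** Build an HTTP byte-range value; both offsets are inclusive. */
  static String formatRange(long rangeStart, long rangeEnd) {
    // Plain concatenation avoids locale-dependent digit formatting.
    return "bytes=" + rangeStart + "-" + rangeEnd;
  }

  public static void main(String[] args) {
    // The first kibibyte of an object: offsets 0 through 1023.
    System.out.println(formatRange(0, 1023));
  }
}
```

Because the end offset is inclusive, a read of `n` bytes starting at `offset` would use `formatRange(offset, offset + n - 1)`.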
rangeStart - the start byte offset
rangeEnd - the end byte offset (inclusive)

public static Map<String,String> getTrimmedStringCollectionSplitByEquals(org.apache.hadoop.conf.Configuration configuration, String name)
Get the key-value pairs of the name property, delimited by the equals
sign (=), as a collection of pairs of Strings, trimmed of leading and
trailing whitespace after splitting the value of name by comma and newline
separators.
If no such property is specified then an empty Map is returned.
configuration - the configuration object.
name - property name.
Returns a Map of Strings, or an empty Map.

Copyright © 2008–2024 Apache Software Foundation. All rights reserved.