Create an absolute path from child using the basePath if the child is a relative path. Return child if it is an absolute path.
Base path to prepend to child if child is a relative path.
Note: It is assumed that the basePath does not have any escaped characters and is directly readable by Hadoop APIs.
Child path to append to basePath if child is a relative path.
Note: It is assumed that the child is escaped, that is, all special chars that need escaping by URI standards are already escaped.
Absolute path without escaped chars that is directly readable by Hadoop APIs.
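The resolution rule described above can be sketched with plain strings. This is a minimal illustration, not the library's implementation: the object name `AbsolutePathSketch` is hypothetical, `java.net.URI` stands in for Hadoop's path handling, and both arguments are assumed to be scheme-less paths.

```scala
import java.net.URI

object AbsolutePathSketch {
  // Hypothetical stand-in for the documented method; works on scheme-less paths only.
  def absolutePath(basePath: String, child: String): String = {
    // `child` is assumed to be URI-escaped, so getPath yields the decoded form.
    val decodedChild = new URI(child).getPath
    if (decodedChild.startsWith("/")) {
      // Already absolute: return it (unescaped) without touching basePath.
      decodedChild
    } else {
      // Relative: prepend basePath, which is assumed to contain no escapes.
      basePath.stripSuffix("/") + "/" + decodedChild
    }
  }
}
```

For instance, an escaped relative child such as `part%3D1/file%201.parquet` is decoded and appended to the base, while a child that starts with `/` is returned without the base.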
The default filter for hidden files. File names beginning with _ or . are considered hidden.
true if the file is hidden
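A filter matching this description could look like the sketch below. The object and method names are hypothetical, and the sketch covers only the two prefixes named above; the real filter may treat some names specially.

```scala
object HiddenFileFilterSketch {
  // Returns true when the file name starts with "_" or ".".
  def isHidden(fileName: String): Boolean =
    fileName.startsWith("_") || fileName.startsWith(".")
}
```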
Returns all the levels of subdirectories that path has with respect to base. For example:
getAllSubDirectories("/base", "/base/a/b/c") =>
(Iterator("/base/a", "/base/a/b"), "/base/a/b/c")
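One way to produce the result shown in the example is to split both paths on `/` and rebuild each intermediate prefix. This is a sketch under the assumption that `path` lives under `base` and that `/` is the separator; the object name is hypothetical.

```scala
object SubDirectoriesSketch {
  def getAllSubDirectories(base: String, path: String): (Iterator[String], String) = {
    val baseSplits = base.split("/")
    // Path components that come after the base.
    val pathSplits = path.split("/").drop(baseSplits.length)
    // Every proper prefix below the base; the full path is returned separately.
    val subDirs = Iterator.tabulate(pathSplits.length - 1) { i =>
      (baseSplits ++ pathSplits.take(i + 1)).mkString("/")
    }
    (subDirs, path)
  }
}
```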
Used to record the occurrence of a single event or report detailed, operation-specific statistics.
Used to report the duration as well as the success or failure of an operation.
Recursively lists all the files and directories for the given subDirs in a scalable manner.
The SparkSession
Absolute path of the subdirectories to list
The Hadoop Configuration to get a FileSystem instance
A function that returns true when the file should be considered hidden and excluded from results. Defaults to checking for prefixes of "." or "_".
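The listing contract (descend into each subdirectory, skip anything the hidden-file filter rejects) can be illustrated on a local file system. This is only a single-machine sketch with hypothetical names: it uses `java.io.File` instead of a SparkSession and a Hadoop FileSystem, so it does not show the scalable, distributed part of the method.

```scala
import java.io.File

object RecursiveListSketch {
  // Default filter: names starting with "." or "_" are treated as hidden.
  private val defaultHidden: String => Boolean =
    name => name.startsWith(".") || name.startsWith("_")

  def listRecursively(
      subDirs: Seq[File],
      hiddenFileNameFilter: String => Boolean = defaultHidden): Seq[File] = {
    subDirs.flatMap { dir =>
      // listFiles returns null for non-directories or on I/O errors.
      val children = Option(dir.listFiles()).map(_.toSeq).getOrElse(Seq.empty)
        .filterNot(f => hiddenFileNameFilter(f.getName))
      // Emit this level, then recurse into the visible directories.
      children ++ listRecursively(children.filter(_.isDirectory), hiddenFileNameFilter)
    }
  }
}
```

Hidden directories are dropped before recursion in this sketch, so their contents are never visited.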
Registers a task failure listener that deletes a temp file on a best-effort basis.
Tries deleting a file or directory non-recursively. If the file or folder doesn't exist, that's fine: a separate operation may be deleting files or folders. If a directory is non-empty, we shouldn't delete it. FileSystem implementations throw an IOException in those cases, which we report as a failed delete. Listing on S3 is not consistent after deletes, so if the delete returns false because the file didn't exist, we still return true. Retries on S3 rate limits up to 3 times.
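The success and failure rules above can be sketched against the local file system with `java.nio.file`. The names are hypothetical, the sketch assumes a failed delete is reported as `false`, and it leaves out the S3 rate-limit retries.

```scala
import java.io.IOException
import java.nio.file.{Files, Path}

object TryDeleteSketch {
  def tryDeleteNonRecursive(path: Path): Boolean = {
    try {
      // deleteIfExists returns false when the path is already gone;
      // that case is still treated as success here.
      Files.deleteIfExists(path)
      true
    } catch {
      // For example DirectoryNotEmptyException, which is an IOException.
      case _: IOException => false
    }
  }
}
```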
Given a path child:
  1. Returns child if the path is already relative
  2. Tries relativizing child with respect to basePath
     a) If the child doesn't live within the same base path, returns child as is
     b) If child lives in a different FileSystem, throws an exception
Note that child may physically be pointing to a path within basePath, but may logically belong to a different FileSystem, e.g. DBFS mount points and direct S3 paths.
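The decision steps above can be sketched with `java.net.URI`. This is an assumption-laden illustration rather than the actual method: the names are hypothetical, "different FileSystem" is approximated by comparing scheme and authority, that check runs before the prefix check, and both URIs are assumed to be hierarchical.

```scala
import java.net.URI

object RelativizeSketch {
  def tryRelativizePath(basePath: URI, child: URI): String = {
    val childIsAbsolute = child.getScheme != null || child.getPath.startsWith("/")
    if (!childIsAbsolute) {
      // Step 1: already relative, return as is.
      child.toString
    } else {
      // Step 2b: a scheme or authority mismatch is treated as a different file system.
      val sameFileSystem = child.getScheme == null ||
        (child.getScheme == basePath.getScheme && child.getAuthority == basePath.getAuthority)
      if (!sameFileSystem) {
        throw new IllegalStateException(s"$child is not on the same file system as $basePath")
      }
      val prefix = basePath.getPath.stripSuffix("/") + "/"
      if (child.getPath.startsWith(prefix)) {
        // Step 2: child lives under basePath, so strip the base.
        child.getPath.stripPrefix(prefix)
      } else {
        // Step 2a: outside basePath, return child unchanged.
        child.toString
      }
    }
  }
}
```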
Reports a log entry to indicate that a command is running.
Some utility methods on files, directories, and paths.