String jobName
The name of a job to be executed.
Map<K,V> arguments
Arguments to be passed to the job.
You can specify arguments here that your own job-execution script consumes, as well as arguments that AWS Glue itself consumes.
For information about how to specify and consume your own Job arguments, see the Calling AWS Glue APIs in Python topic in the developer guide.
For information about the key-value pairs that AWS Glue consumes to set up your job, see the Special Parameters Used by AWS Glue topic in the developer guide.
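As a rough illustration of the two kinds of arguments described above, the following sketch builds an arguments map in plain Java. `--job-bookmark-option` is one of the special parameters that AWS Glue consumes; `--input_path`, its value, and the class name `JobArgumentsSketch` are hypothetical and exist only for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class for illustration; not part of the AWS SDK.
final class JobArgumentsSketch {

    // Builds the key-value pairs that would be passed as job arguments.
    static Map<String, String> buildArguments() {
        Map<String, String> arguments = new LinkedHashMap<>();
        // Consumed by AWS Glue itself (a special parameter).
        arguments.put("--job-bookmark-option", "job-bookmark-enable");
        // Consumed by your own script; the name and value are assumptions.
        arguments.put("--input_path", "s3://example-bucket/input/");
        return arguments;
    }

    public static void main(String[] args) {
        buildArguments().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

The resulting map is the kind of value the `arguments` field expects.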
String catalogId
The ID of the catalog in which the partition is to be created. Currently, this should be the AWS account ID.
String databaseName
The name of the metadata database in which the partition is to be created.
String tableName
The name of the metadata table in which the partition is to be created.
List<E> partitionInputList
A list of PartitionInput structures that define the partitions to be created.
String catalogId
The ID of the Data Catalog where the partition to be deleted resides. If none is supplied, the AWS account ID is used by default.
String databaseName
The name of the catalog database in which the table in question resides.
String tableName
The name of the table where the partitions to be deleted are located.
List<E> partitionsToDelete
A list of PartitionInput structures that define the partitions to be deleted.
String catalogId
The ID of the Data Catalog where the table resides. If none is supplied, the AWS account ID is used by default.
String databaseName
The name of the catalog database where the tables to delete reside.
List<E> tablesToDelete
A list of the tables to delete.
String catalogId
The ID of the Data Catalog where the partitions in question reside. If none is supplied, the AWS account ID is used by default.
String databaseName
The name of the catalog database where the partitions reside.
String tableName
The name of the partitions' table.
List<E> partitionsToGet
A list of partition values identifying the partitions to retrieve.
String jobName
The name of the Job in question.
String jobRunId
The JobRunId of the JobRun in question.
ErrorDetail errorDetail
Specifies details about the error that was encountered.
List<E> successfulSubmissions
A list of the JobRuns that were successfully submitted for stopping.
List<E> errors
A list of the errors that were encountered while trying to stop JobRuns, including the JobRunId for which each error was encountered and details about the error.
GrokClassifier grokClassifier
A GrokClassifier object.
XMLClassifier xMLClassifier
An XMLClassifier object.
String name
The name of the connection definition.
String description
Description of the connection.
String connectionType
The type of the connection. Currently, only JDBC is supported; SFTP is not supported.
List<E> matchCriteria
A list of criteria that can be used in selecting this connection.
Map<K,V> connectionProperties
A list of key-value pairs used as parameters for this connection.
PhysicalConnectionRequirements physicalConnectionRequirements
A map of physical connection requirements, such as VPC and SecurityGroup, needed for making this connection successfully.
Date creationTime
The time this connection definition was created.
Date lastUpdatedTime
The last time this connection definition was updated.
String lastUpdatedBy
The user, group or role that last updated this connection definition.
String name
The name of the connection.
String description
Description of the connection.
String connectionType
The type of the connection. Currently, only JDBC is supported; SFTP is not supported.
List<E> matchCriteria
A list of criteria that can be used in selecting this connection.
Map<K,V> connectionProperties
A list of key-value pairs used as parameters for this connection.
PhysicalConnectionRequirements physicalConnectionRequirements
A map of physical connection requirements, such as VPC and SecurityGroup, needed for making this connection successfully.
String name
The crawler name.
String role
The IAM role (or ARN of an IAM role) used to access customer resources, such as data in Amazon S3.
CrawlerTargets targets
A collection of targets to crawl.
String databaseName
The database where metadata is written by this crawler.
String description
A description of the crawler.
List<E> classifiers
A list of custom classifiers associated with the crawler.
SchemaChangePolicy schemaChangePolicy
Sets the behavior when the crawler finds a changed or deleted object.
String state
Indicates whether the crawler is running, or whether a run is pending.
String tablePrefix
The prefix added to the names of tables that are created.
Schedule schedule
For scheduled crawlers, the schedule when the crawler runs.
Long crawlElapsedTime
If the crawler is running, contains the total time elapsed since the last crawl began.
Date creationTime
The time when the crawler was created.
Date lastUpdated
The time the crawler was last updated.
LastCrawlInfo lastCrawl
The status of the last crawl, and potentially error information if an error occurred.
Long version
The version of the crawler.
String configuration
Crawler configuration information. This versioned JSON string allows users to specify aspects of a Crawler's behavior.
You can use this field to force partitions to inherit metadata such as classification, input format, output format, serde information, and schema from their parent table, rather than detect this information separately for each partition. Use the following JSON string to specify that behavior:
Example:
'{ "Version": 1.0, "CrawlerOutput": { "Partitions": { "AddOrUpdateBehavior": "InheritFromTable" } } }'
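A minimal sketch of producing that configuration string from Java follows. The class and method names are hypothetical; only the JSON content comes from the description above.

```java
// Hypothetical helper class for illustration; not part of the AWS SDK.
final class CrawlerConfigurationSketch {

    // Returns the versioned JSON string that asks partitions to inherit
    // their metadata from the parent table.
    static String inheritFromTableConfiguration() {
        return "{ \"Version\": 1.0, "
             + "\"CrawlerOutput\": { \"Partitions\": "
             + "{ \"AddOrUpdateBehavior\": \"InheritFromTable\" } } }";
    }

    public static void main(String[] args) {
        System.out.println(inheritFromTableConfiguration());
    }
}
```

The returned string is what you would supply as the `configuration` value.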
String crawlerName
The name of the crawler.
Double timeLeftSeconds
The estimated time left to complete a running crawl.
Boolean stillEstimating
True if the crawler is still estimating how long it will take to complete this run.
Double lastRuntimeSeconds
The duration of the crawler's most recent run, in seconds.
Double medianRuntimeSeconds
The median duration of this crawler's runs, in seconds.
Integer tablesCreated
The number of tables created by this crawler.
Integer tablesUpdated
The number of tables updated by this crawler.
Integer tablesDeleted
The number of tables deleted by this crawler.
CreateGrokClassifierRequest grokClassifier
A GrokClassifier object specifying the classifier to create.
CreateXMLClassifierRequest xMLClassifier
An XMLClassifier object specifying the classifier to create.
String catalogId
The ID of the Data Catalog in which to create the connection. If none is supplied, the AWS account ID is used by default.
ConnectionInput connectionInput
A ConnectionInput object defining the connection to create.
String name
Name of the new crawler.
String role
The IAM role (or ARN of an IAM role) used by the new crawler to access customer resources.
String databaseName
The AWS Glue database where results are written, such as: arn:aws:daylight:us-east-1::database/sometable/*.
String description
A description of the new crawler.
CrawlerTargets targets
A collection of targets to crawl.
String schedule
A cron expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers). For example, to run something every day at 12:15 UTC, you would specify: cron(15 12 * * ? *).
List<E> classifiers
A list of custom classifiers that the user has registered. By default, all AWS classifiers are included in a crawl, but these custom classifiers always override the default classifiers for a given classification.
String tablePrefix
The table prefix used for catalog tables that are created.
SchemaChangePolicy schemaChangePolicy
Policy for the crawler's update and deletion behavior.
String configuration
Crawler configuration information. This versioned JSON string allows users to specify aspects of a Crawler's behavior.
You can use this field to force partitions to inherit metadata such as classification, input format, output format, serde information, and schema from their parent table, rather than detect this information separately for each partition. Use the following JSON string to specify that behavior:
Example:
'{ "Version": 1.0, "CrawlerOutput": { "Partitions": { "AddOrUpdateBehavior": "InheritFromTable" } } }'
String catalogId
The ID of the Data Catalog in which to create the database. If none is supplied, the AWS account ID is used by default.
DatabaseInput databaseInput
A DatabaseInput object defining the metadata database to create in the catalog.
String endpointName
The name to be assigned to the new DevEndpoint.
String roleArn
The IAM role for the DevEndpoint.
List<E> securityGroupIds
Security group IDs for the security groups to be used by the new DevEndpoint.
String subnetId
The subnet ID for the new DevEndpoint to use.
String publicKey
The public key to use for authentication.
Integer numberOfNodes
The number of AWS Glue Data Processing Units (DPUs) to allocate to this DevEndpoint.
String extraPythonLibsS3Path
Path(s) to one or more Python libraries in an S3 bucket that should be loaded in your DevEndpoint. Multiple values must be complete paths separated by a comma.
Please note that only pure Python libraries can currently be used on a DevEndpoint. Libraries that rely on C extensions, such as the pandas Python data analysis library, are not yet supported.
String extraJarsS3Path
Path to one or more Java Jars in an S3 bucket that should be loaded in your DevEndpoint.
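The comma-separated format required for multiple library paths can be sketched as below. The helper is hypothetical, and treating a "complete path" as one that starts with `s3://` is an assumption made for this example, as are the bucket and file names.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class for illustration; not part of the AWS SDK.
final class ExtraLibsPathSketch {

    // Joins complete S3 paths into the single comma-separated string
    // expected when more than one library is supplied.
    static String joinPaths(List<String> paths) {
        for (String path : paths) {
            // Assumption: a complete path is one with the s3:// scheme.
            if (!path.startsWith("s3://")) {
                throw new IllegalArgumentException("Not a complete S3 path: " + path);
            }
        }
        return String.join(",", paths);
    }

    public static void main(String[] args) {
        System.out.println(joinPaths(Arrays.asList(
                "s3://example-bucket/libs/first.zip",
                "s3://example-bucket/libs/second.zip")));
    }
}
```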
String endpointName
The name assigned to the new DevEndpoint.
String status
The current status of the new DevEndpoint.
List<E> securityGroupIds
The security groups assigned to the new DevEndpoint.
String subnetId
The subnet ID assigned to the new DevEndpoint.
String roleArn
The AWS ARN of the role assigned to the new DevEndpoint.
String yarnEndpointAddress
The address of the YARN endpoint used by this DevEndpoint.
Integer zeppelinRemoteSparkInterpreterPort
The Apache Zeppelin port for the remote Apache Spark interpreter.
Integer numberOfNodes
The number of AWS Glue Data Processing Units (DPUs) allocated to this DevEndpoint.
String availabilityZone
The AWS availability zone where this DevEndpoint is located.
String vpcId
The ID of the VPC used by this DevEndpoint.
String extraPythonLibsS3Path
Path(s) to one or more Python libraries in an S3 bucket that will be loaded in your DevEndpoint.
String extraJarsS3Path
Path to one or more Java Jars in an S3 bucket that will be loaded in your DevEndpoint.
String failureReason
The reason for a current failure in this DevEndpoint.
Date createdTimestamp
The point in time at which this DevEndpoint was created.
String classification
An identifier of the data format that the classifier matches, such as Twitter, JSON, Omniture logs, Amazon CloudWatch Logs, and so on.
String name
The name of the new classifier.
String grokPattern
The grok pattern used by this classifier.
String customPatterns
Optional custom grok patterns used by this classifier.
String name
The name you assign to this job. It must be unique in your account.
String description
Description of the job.
String logUri
This field is reserved for future use.
String role
The name of the IAM role associated with this job.
ExecutionProperty executionProperty
An ExecutionProperty specifying the maximum number of concurrent runs allowed for this job.
JobCommand command
The JobCommand that executes this job.
Map<K,V> defaultArguments
The default arguments for this job.
You can specify arguments here that your own job-execution script consumes, as well as arguments that AWS Glue itself consumes.
For information about how to specify and consume your own Job arguments, see the Calling AWS Glue APIs in Python topic in the developer guide.
For information about the key-value pairs that AWS Glue consumes to set up your job, see the Special Parameters Used by AWS Glue topic in the developer guide.
ConnectionsList connections
The connections used for this job.
Integer maxRetries
The maximum number of times to retry this job if it fails.
Integer allocatedCapacity
The number of AWS Glue data processing units (DPUs) to allocate to this Job. From 2 to 100 DPUs can be allocated; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the AWS Glue pricing page.
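The DPU limits and per-DPU resources stated for allocatedCapacity can be expressed as a small client-side check. This is only a sketch: the class and method names are hypothetical, and the service performs its own validation regardless of what the client does.

```java
// Hypothetical helper class for illustration; not part of the AWS SDK.
final class AllocatedCapacitySketch {

    static final int MIN_DPUS = 2;
    static final int MAX_DPUS = 100;
    static final int DEFAULT_DPUS = 10;
    static final int VCPUS_PER_DPU = 4;
    static final int MEMORY_GB_PER_DPU = 16;

    // Returns the requested capacity, or the default when none is given.
    static int resolve(Integer requested) {
        if (requested == null) {
            return DEFAULT_DPUS;
        }
        if (requested < MIN_DPUS || requested > MAX_DPUS) {
            throw new IllegalArgumentException(
                    "allocatedCapacity must be between " + MIN_DPUS + " and " + MAX_DPUS);
        }
        return requested;
    }

    // Total vCPUs implied by a DPU count.
    static int totalVcpus(int dpus) {
        return dpus * VCPUS_PER_DPU;
    }

    // Total memory in GB implied by a DPU count.
    static int totalMemoryGb(int dpus) {
        return dpus * MEMORY_GB_PER_DPU;
    }

    public static void main(String[] args) {
        int dpus = resolve(null);
        System.out.println(dpus + " DPUs = " + totalVcpus(dpus) + " vCPUs, "
                + totalMemoryGb(dpus) + " GB");
    }
}
```

With the default of 10 DPUs, this works out to 40 vCPUs and 160 GB of memory.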
String name
The unique name that was provided.
String catalogId
The ID of the catalog in which the partition is to be created. Currently, this should be the AWS account ID.
String databaseName
The name of the metadata database in which the partition is to be created.
String tableName
The name of the metadata table in which the partition is to be created.
PartitionInput partitionInput
A PartitionInput structure defining the partition to be created.
String catalogId
The ID of the Data Catalog in which to create the Table. If none is supplied, the AWS account ID is used by default.
String databaseName
The catalog database in which to create the new table.
TableInput tableInput
The TableInput object that defines the metadata table to create in the catalog.
String name
The name of the trigger.
String type
The type of the new trigger.
String schedule
A cron expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers). For example, to run something every day at 12:15 UTC, you would specify: cron(15 12 * * ? *).
This field is required when the trigger type is SCHEDULED.
Predicate predicate
A predicate to specify when the new trigger should fire.
This field is required when the trigger type is CONDITIONAL.
List<E> actions
The actions initiated by this trigger when it fires.
String description
A description of the new trigger.
String name
The name of the trigger.
String catalogId
The ID of the Data Catalog in which to create the function. If none is supplied, the AWS account ID is used by default.
String databaseName
The name of the catalog database in which to create the function.
UserDefinedFunctionInput functionInput
A FunctionInput object that defines the function to create in the Data Catalog.
String classification
An identifier of the data format that the classifier matches.
String name
The name of the classifier.
String rowTag
The XML tag designating the element that contains each record in an XML document being parsed. Note that this cannot identify a self-closing element (closed by />). An empty row element that contains only attributes can be parsed as long as it ends with a closing tag (for example, <row item_a="A" item_b="B"></row> is okay, but <row item_a="A" item_b="B" /> is not).
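To make the self-closing restriction on rowTag concrete, here is a deliberately naive check written in plain Java. It only inspects the end of a single element string and is not an XML parser; the class and method names are hypothetical.

```java
// Hypothetical helper class for illustration; not part of the AWS SDK.
final class RowTagSketch {

    // Returns true when the element text ends with "/>", which is the
    // self-closing form that cannot be identified as a row.
    static boolean isSelfClosing(String element) {
        return element.trim().endsWith("/>");
    }

    public static void main(String[] args) {
        String closed = "<row item_a=\"A\" item_b=\"B\"></row>";
        String selfClosing = "<row item_a=\"A\" item_b=\"B\" />";
        System.out.println(closed + " self-closing? " + isSelfClosing(closed));
        System.out.println(selfClosing + " self-closing? " + isSelfClosing(selfClosing));
    }
}
```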
String name
Name of the database.
String description
Description of the database.
String locationUri
The location of the database (for example, an HDFS path).
Map<K,V> parameters
A list of key-value pairs that define parameters and properties of the database.
Date createTime
The time at which the metadata database was created in the catalog.
String name
Name of the classifier to remove.
String name
Name of the crawler to remove.
String endpointName
The name of the DevEndpoint.
String jobName
The name of the job to delete.
String jobName
The name of the job that was deleted.
String catalogId
The ID of the Data Catalog where the partition to be deleted resides. If none is supplied, the AWS account ID is used by default.
String databaseName
The name of the catalog database in which the table in question resides.
String tableName
The name of the table where the partition to be deleted is located.
List<E> partitionValues
The values that define the partition.
String name
The name of the trigger to delete.
String name
The name of the trigger that was deleted.
String catalogId
The ID of the Data Catalog where the function to be deleted is located. If none is supplied, the AWS account ID is used by default.
String databaseName
The name of the catalog database where the function is located.
String functionName
The name of the function definition to be deleted.
String endpointName
The name of the DevEndpoint.
String roleArn
The AWS ARN of the IAM role used in this DevEndpoint.
List<E> securityGroupIds
A list of security group identifiers used in this DevEndpoint.
String subnetId
The subnet ID for this DevEndpoint.
String yarnEndpointAddress
The YARN endpoint address used by this DevEndpoint.
Integer zeppelinRemoteSparkInterpreterPort
The Apache Zeppelin port for the remote Apache Spark interpreter.
String publicAddress
The public address used by this DevEndpoint.
String status
The current status of this DevEndpoint.
Integer numberOfNodes
The number of AWS Glue Data Processing Units (DPUs) allocated to this DevEndpoint.
String availabilityZone
The AWS availability zone where this DevEndpoint is located.
String vpcId
The ID of the virtual private cloud (VPC) used by this DevEndpoint.
String extraPythonLibsS3Path
Path(s) to one or more Python libraries in an S3 bucket that should be loaded in your DevEndpoint. Multiple values must be complete paths separated by a comma.
Please note that only pure Python libraries can currently be used on a DevEndpoint. Libraries that rely on C extensions, such as the pandas Python data analysis library, are not yet supported.
String extraJarsS3Path
Path to one or more Java Jars in an S3 bucket that should be loaded in your DevEndpoint.
Please note that only pure Java/Scala libraries can currently be used on a DevEndpoint.
String failureReason
The reason for a current failure in this DevEndpoint.
String lastUpdateStatus
The status of the last update.
Date createdTimestamp
The point in time at which this DevEndpoint was created.
Date lastModifiedTimestamp
The point in time at which this DevEndpoint was last modified.
String publicKey
The public key to be used by this DevEndpoint for authentication.
String extraPythonLibsS3Path
Path(s) to one or more Python libraries in an S3 bucket that should be loaded in your DevEndpoint. Multiple values must be complete paths separated by a comma.
Please note that only pure Python libraries can currently be used on a DevEndpoint. Libraries that rely on C extensions, such as the pandas Python data analysis library, are not yet supported.
String extraJarsS3Path
Path to one or more Java Jars in an S3 bucket that should be loaded in your DevEndpoint.
Please note that only pure Java/Scala libraries can currently be used on a DevEndpoint.
Integer maxConcurrentRuns
The maximum number of concurrent runs allowed for a job. The default is 1. An error is returned when this threshold is reached. The maximum value you can specify is controlled by a service limit.
String catalogId
The ID of the catalog to migrate. Currently, this should be the AWS account ID.
CatalogImportStatus importStatus
The status of the specified catalog migration.
String name
Name of the classifier to retrieve.
Classifier classifier
The requested classifier.
Connection connection
The requested connection definition.
String catalogId
The ID of the Data Catalog in which the connections reside. If none is supplied, the AWS account ID is used by default.
GetConnectionsFilter filter
A filter that controls which connections will be returned.
String nextToken
A continuation token, if this is a continuation call.
Integer maxResults
The maximum number of connections to return in one response.
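The nextToken and maxResults fields follow the usual continuation-token pattern: call once without a token, then repeat with the returned token until none comes back. The sketch below models that loop against an in-memory list. `Page`, `fetchPage`, and the use of a start index as the token are assumptions made for this example; the real service treats the token as opaque.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of token-based pagination; not part of the AWS SDK.
final class PaginationSketch {

    // One page of results plus the token for the next page (null when done).
    static final class Page {
        final List<String> items;
        final String nextToken;

        Page(List<String> items, String nextToken) {
            this.items = items;
            this.nextToken = nextToken;
        }
    }

    // In-memory stand-in for a paginated call. The token here is simply
    // the start index of the next page, encoded as a string.
    static Page fetchPage(List<String> all, String nextToken, int maxResults) {
        if (maxResults < 1) {
            throw new IllegalArgumentException("maxResults must be at least 1");
        }
        int start = (nextToken == null) ? 0 : Integer.parseInt(nextToken);
        int end = Math.min(start + maxResults, all.size());
        String token = (end < all.size()) ? String.valueOf(end) : null;
        return new Page(new ArrayList<>(all.subList(start, end)), token);
    }

    // Keeps requesting pages until no continuation token is returned.
    static List<String> fetchAll(List<String> all, int maxResults) {
        List<String> collected = new ArrayList<>();
        String token = null;
        do {
            Page page = fetchPage(all, token, maxResults);
            collected.addAll(page.items);
            token = page.nextToken;
        } while (token != null);
        return collected;
    }

    public static void main(String[] args) {
        System.out.println(fetchAll(Arrays.asList("a", "b", "c", "d", "e"), 2));
    }
}
```

The same loop shape applies to the other requests in this reference that carry a nextToken field.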
String name
Name of the crawler to retrieve metadata for.
Crawler crawler
The metadata for the specified crawler.
Database database
The definition of the specified database in the catalog.
String catalogId
The ID of the Data Catalog from which to retrieve Databases. If none is supplied, the AWS account ID is used by default.
String nextToken
A continuation token, if this is a continuation call.
Integer maxResults
The maximum number of databases to return in one response.
String pythonScript
The Python script to transform.
String endpointName
Name of the DevEndpoint for which to retrieve information.
DevEndpoint devEndpoint
A DevEndpoint definition.
String jobName
The name of the job to retrieve.
Job job
The requested job definition.
JobRun jobRun
The requested job-run metadata.
CatalogEntry source
Specifies the source table.
List<E> sinks
A list of target tables.
Location location
Parameters for the mapping.
String catalogId
The ID of the Data Catalog where the partition in question resides. If none is supplied, the AWS account ID is used by default.
String databaseName
The name of the catalog database where the partition resides.
String tableName
The name of the partition's table.
List<E> partitionValues
The values that define the partition.
Partition partition
The requested information, in the form of a Partition object.
String catalogId
The ID of the Data Catalog where the partitions in question reside. If none is supplied, the AWS account ID is used by default.
String databaseName
The name of the catalog database where the partitions reside.
String tableName
The name of the partitions' table.
String expression
An expression filtering the partitions to be returned.
String nextToken
A continuation token, if this is not the first call to retrieve these partitions.
Segment segment
The segment of the table's partitions to scan in this request.
Integer maxResults
The maximum number of partitions to return in a single response.
List<E> mapping
The list of mappings from a source table to target tables.
CatalogEntry source
The source table.
List<E> sinks
The target tables.
Location location
Parameters for the mapping.
String language
The programming language of the code to perform the mapping.
String catalogId
The ID of the Data Catalog where the table resides. If none is supplied, the AWS account ID is used by default.
String databaseName
The name of the database in the catalog in which the table resides.
String name
The name of the table for which to retrieve the definition.
Table table
The Table object that defines the specified table.
String catalogId
The ID of the Data Catalog where the tables reside. If none is supplied, the AWS account ID is used by default.
String databaseName
The database in the catalog whose tables to list.
String expression
A regular expression pattern. If present, only those tables whose names match the pattern are returned.
String nextToken
A continuation token, included if this is a continuation call.
Integer maxResults
The maximum number of tables to return in a single response.
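The effect of the expression field can be pictured with a local filter over table names. This sketch uses `java.util.regex` with a full-name match, which is an assumption: the description above does not say which regular expression dialect or matching mode the service applies. The class name and table names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical local model of name filtering; not part of the AWS SDK.
final class TableNameFilterSketch {

    // Returns the names that match the expression, or all names when
    // no expression is present.
    static List<String> filter(List<String> tableNames, String expression) {
        if (expression == null) {
            return new ArrayList<>(tableNames);
        }
        Pattern pattern = Pattern.compile(expression);
        List<String> matched = new ArrayList<>();
        for (String name : tableNames) {
            // Assumption: the whole table name must match the pattern.
            if (pattern.matcher(name).matches()) {
                matched.add(name);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("sales_2016", "sales_2017", "users");
        System.out.println(filter(names, "sales_.*"));
    }
}
```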
String catalogId
The ID of the Data Catalog where the tables reside. If none is supplied, the AWS account ID is used by default.
String databaseName
The database in the catalog in which the table resides.
String tableName
The name of the table.
String nextToken
A continuation token, if this is not the first call.
Integer maxResults
The maximum number of table versions to return in one response.
String name
The name of the trigger to retrieve.
Trigger trigger
The requested trigger definition.
String nextToken
A continuation token, if this is a continuation call.
String dependentJobName
The name of the job for which to retrieve triggers. The trigger that can start this job will be returned, and if there is no such trigger, all triggers will be returned.
Integer maxResults
The maximum size of the response.
String catalogId
The ID of the Data Catalog where the function to be retrieved is located. If none is supplied, the AWS account ID is used by default.
String databaseName
The name of the catalog database where the function is located.
String functionName
The name of the function.
UserDefinedFunction userDefinedFunction
The requested function definition.
String catalogId
The ID of the Data Catalog where the functions to be retrieved are located. If none is supplied, the AWS account ID is used by default.
String databaseName
The name of the catalog database where the functions are located.
String pattern
An optional function-name pattern string that filters the function definitions returned.
String nextToken
A continuation token, if this is a continuation call.
Integer maxResults
The maximum number of functions to return in one response.
String name
The name of the classifier.
String classification
An identifier of the data format that the classifier matches, such as Twitter, JSON, Omniture logs, and so on.
Date creationTime
The time this classifier was registered.
Date lastUpdated
The time this classifier was last updated.
Long version
The version of this classifier.
String grokPattern
The grok pattern applied to a data store by this classifier. For more information, see built-in patterns in Writing Custom Classifiers.
String customPatterns
Optional custom grok patterns defined by this classifier. For more information, see custom patterns in Writing Custom Classifiers.
String catalogId
The ID of the catalog to import. Currently, this should be the AWS account ID.
String connectionName
The name of the connection to use to connect to the JDBC target.
String path
The path of the JDBC target.
List<E> exclusions
A list of glob patterns that specify what to exclude from the crawl. For more information, see Catalog Tables with a Crawler.
String name
The name you assign to this job.
String description
Description of this job.
String logUri
This field is reserved for future use.
String role
The name of the IAM role associated with this job.
Date createdOn
The time and date that this job specification was created.
Date lastModifiedOn
The last point in time when this job specification was modified.
ExecutionProperty executionProperty
An ExecutionProperty specifying the maximum number of concurrent runs allowed for this job.
JobCommand command
The JobCommand that executes this job.
Map<K,V> defaultArguments
The default arguments for this job, specified as name-value pairs.
You can specify arguments here that your own job-execution script consumes, as well as arguments that AWS Glue itself consumes.
For information about how to specify and consume your own Job arguments, see the Calling AWS Glue APIs in Python topic in the developer guide.
For information about the key-value pairs that AWS Glue consumes to set up your job, see the Special Parameters Used by AWS Glue topic in the developer guide.
ConnectionsList connections
The connections used for this job.
Integer maxRetries
The maximum number of times to retry this job if it fails.
Integer allocatedCapacity
The number of AWS Glue data processing units (DPUs) allocated to this Job. From 2 to 100 DPUs can be allocated; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the AWS Glue pricing page.
String id
The ID of this job run.
Integer attempt
The number of the attempt to run this job.
String previousRunId
The ID of the previous run of this job. For example, the JobRunId specified in the StartJobRun action.
String triggerName
The name of the trigger that started this job run.
String jobName
The name of the job being run.
Date startedOn
The date and time at which this job run was started.
Date lastModifiedOn
The last time this job run was modified.
Date completedOn
The date and time this job run completed.
String jobRunState
The current state of the job run.
Map<K,V> arguments
The job arguments associated with this run. These override equivalent default arguments set for the job.
You can specify arguments here that your own job-execution script consumes, as well as arguments that AWS Glue itself consumes.
For information about how to specify and consume your own job arguments, see the Calling AWS Glue APIs in Python topic in the developer guide.
For information about the key-value pairs that AWS Glue consumes to set up your job, see the Special Parameters Used by AWS Glue topic in the developer guide.
String errorMessage
An error message associated with this job run.
List<E> predecessorRuns
A list of predecessors to this job run.
Integer allocatedCapacity
The number of AWS Glue data processing units (DPUs) allocated to this JobRun. From 2 to 100 DPUs can be allocated; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the AWS Glue pricing page.
String description
Description of the job.
String logUri
This field is reserved for future use.
String role
The name of the IAM role associated with this job (required).
ExecutionProperty executionProperty
An ExecutionProperty specifying the maximum number of concurrent runs allowed for this job.
JobCommand command
The JobCommand that executes this job (required).
Map<K,V> defaultArguments
The default arguments for this job.
You can specify arguments here that your own job-execution script consumes, as well as arguments that AWS Glue itself consumes.
For information about how to specify and consume your own Job arguments, see the Calling AWS Glue APIs in Python topic in the developer guide.
For information about the key-value pairs that AWS Glue consumes to set up your job, see the Special Parameters Used by AWS Glue topic in the developer guide.
ConnectionsList connections
The connections used for this job.
Integer maxRetries
The maximum number of times to retry this job if it fails.
Integer allocatedCapacity
The number of AWS Glue data processing units (DPUs) to allocate to this Job. From 2 to 100 DPUs can be allocated; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the AWS Glue pricing page.
String status
Status of the last crawl.
String errorMessage
If an error occurred, the error information about the last crawl.
String logGroup
The log group for the last crawl.
String logStream
The log stream for the last crawl.
String messagePrefix
The prefix for a message about this crawl.
Date startTime
The time at which the crawl started.
List<E> values
The values of the partition.
String databaseName
The name of the catalog database where the table in question is located.
String tableName
The name of the table in question.
Date creationTime
The time at which the partition was created.
Date lastAccessTime
The last time at which the partition was accessed.
StorageDescriptor storageDescriptor
Provides information about the physical location where the partition is stored.
Map<K,V> parameters
Partition parameters, in the form of a list of key-value pairs.
Date lastAnalyzedTime
The last time at which column statistics were computed for this partition.
List<E> partitionValues
The values that define the partition.
ErrorDetail errorDetail
Details about the partition error.
List<E> values
The values of the partition.
Date lastAccessTime
The last time at which the partition was accessed.
StorageDescriptor storageDescriptor
Provides information about the physical location where the partition is stored.
Map<K,V> parameters
Partition parameters, in the form of a list of key-value pairs.
Date lastAnalyzedTime
The last time at which column statistics were computed for this partition.
String jobName
The name of the job in question.
JobBookmarkEntry jobBookmarkEntry
The reset bookmark entry.
String path
The path to the Amazon S3 target.
List<E> exclusions
A list of glob patterns that specify what to exclude from the crawl. For more information, see Catalog Tables with a Crawler.
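To get a feel for how glob exclusions behave, the sketch below tests relative paths against a list of patterns using Java's built-in `PathMatcher`. Java's glob syntax is used here as a stand-in and may differ in detail from the syntax the crawler accepts; the class name, paths, and patterns are hypothetical.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical local model of glob exclusions; not part of the AWS SDK.
final class ExclusionGlobSketch {

    // Returns true when the relative path matches any exclusion pattern.
    static boolean isExcluded(String relativePath, List<String> exclusions) {
        for (String glob : exclusions) {
            // Assumption: Java glob semantics approximate the crawler's.
            PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
            if (matcher.matches(Paths.get(relativePath))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> exclusions = Arrays.asList("**.csv");
        System.out.println(isExcluded("data/2017/file.csv", exclusions));
        System.out.println(isExcluded("data/2017/file.json", exclusions));
    }
}
```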
String scheduleExpression
A cron expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers). For example, to run something every day at 12:15 UTC, you would specify: cron(15 12 * * ? *).
String state
The state of the schedule.
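The cron format used by scheduleExpression can be generated with a small helper, sketched here for the once-a-day case from the example above. The class and method names are hypothetical.

```java
// Hypothetical helper class for illustration; not part of the AWS SDK.
final class CronExpressionSketch {

    // Formats a schedule that fires once a day at the given UTC time.
    static String dailyAt(int hourUtc, int minute) {
        if (hourUtc < 0 || hourUtc > 23 || minute < 0 || minute > 59) {
            throw new IllegalArgumentException("hour must be 0-23 and minute 0-59");
        }
        // Field order: minutes, hours, day-of-month, month, day-of-week, year.
        return "cron(" + minute + " " + hourUtc + " * * ? *)";
    }

    public static void main(String[] args) {
        System.out.println(dailyAt(12, 15));
    }
}
```

Calling `dailyAt(12, 15)` yields `cron(15 12 * * ? *)`, the value given in the description.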
List<E> skewedColumnNames
A list of names of columns that contain skewed values.
List<E> skewedColumnValues
A list of values that appear so frequently as to be considered skewed.
Map<K,V> skewedColumnValueLocationMaps
A mapping of skewed values to the columns that contain them.
String name
Name of the crawler to start.
String crawlerName
Name of the crawler to schedule.
String jobName
The name of the job to start.
String jobRunId
The ID of a previous JobRun to retry.
Map<K,V> arguments
The job arguments specifically for this run. They override the equivalent default arguments set for the job itself.
You can specify arguments here that your own job-execution script consumes, as well as arguments that AWS Glue itself consumes.
For information about how to specify and consume your own Job arguments, see the Calling AWS Glue APIs in Python topic in the developer guide.
For information about the key-value pairs that AWS Glue consumes to set up your job, see the Special Parameters Used by AWS Glue topic in the developer guide.
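The override rule described above can be pictured as a simple map merge. The sketch below is only a client-side illustration of that rule, not how the service is implemented; the class name, the method name, and the --input_path argument are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of "run arguments override job defaults".
final class JobArgumentsSketch {
    private JobArgumentsSketch() {}

    /** Returns the job defaults with any run-specific arguments layered on top. */
    static Map<String, String> effectiveArguments(Map<String, String> jobDefaults,
                                                  Map<String, String> runArguments) {
        Map<String, String> merged = new HashMap<>(jobDefaults);
        // A key present in both maps takes the value from runArguments.
        merged.putAll(runArguments);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> defaults = new HashMap<>();
        defaults.put("--TempDir", "s3://example-bucket/tmp/");      // consumed by AWS Glue
        defaults.put("--input_path", "s3://example-bucket/in/");    // hypothetical script argument

        Map<String, String> run = new HashMap<>();
        run.put("--input_path", "s3://example-bucket/in/2018-01/"); // overrides the default

        System.out.println(effectiveArguments(defaults, run));
    }
}
```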
Integer allocatedCapacity
The number of AWS Glue data processing units (DPUs) to allocate to this JobRun. From 2 to 100 DPUs can be allocated; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the AWS Glue pricing page.
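To make the DPU arithmetic concrete, here is a small sketch that applies the limits and per-DPU figures stated above (2 to 100 DPUs, default 10, 4 vCPUs and 16 GB per DPU). The class and method names are hypothetical; the service performs its own validation, so this is only a local sanity check.

```java
// Hypothetical helper for illustration; not part of the AWS SDK.
final class DpuCapacitySketch {
    static final int MIN_DPUS = 2;
    static final int MAX_DPUS = 100;
    static final int DEFAULT_DPUS = 10;
    static final int VCPUS_PER_DPU = 4;
    static final int MEMORY_GB_PER_DPU = 16;

    private DpuCapacitySketch() {}

    /** Returns the requested capacity, or the default when none is given. */
    static int resolve(Integer requested) {
        int dpus = (requested == null) ? DEFAULT_DPUS : requested;
        if (dpus < MIN_DPUS || dpus > MAX_DPUS) {
            throw new IllegalArgumentException(
                "allocatedCapacity must be between " + MIN_DPUS + " and " + MAX_DPUS);
        }
        return dpus;
    }

    static int totalVcpus(int dpus) {
        return dpus * VCPUS_PER_DPU;
    }

    static int totalMemoryGb(int dpus) {
        return dpus * MEMORY_GB_PER_DPU;
    }

    public static void main(String[] args) {
        int dpus = resolve(null); // falls back to the default of 10
        System.out.println(dpus + " DPUs = " + totalVcpus(dpus) + " vCPUs, "
            + totalMemoryGb(dpus) + " GB");
    }
}
```

With the default of 10 DPUs this works out to 40 vCPUs and 160 GB of memory.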
String jobRunId
The ID assigned to this job run.
String name
The name of the trigger to start.
String name
The name of the trigger that was started.
String name
Name of the crawler to stop.
String crawlerName
Name of the crawler whose schedule state to set.
String name
The name of the trigger to stop.
String name
The name of the trigger that was stopped.
List<E> columns
A list of the Columns in the table.
String location
The physical location of the table. By default this takes the form of the warehouse location, followed by the database location in the warehouse, followed by the table name.
String inputFormat
The input format: SequenceFileInputFormat (binary), or TextInputFormat, or a custom format.
String outputFormat
The output format: SequenceFileOutputFormat (binary), or IgnoreKeyTextOutputFormat, or a custom format.
Boolean compressed
True if the data in the table is compressed, or False if not.
Integer numberOfBuckets
Must be specified if the table contains any dimension columns.
SerDeInfo serdeInfo
Serialization/deserialization (SerDe) information.
List<E> bucketColumns
A list of reducer grouping columns, clustering columns, and bucketing columns in the table.
List<E> sortColumns
A list specifying the sort order of each bucket in the table.
Map<K,V> parameters
User-supplied properties in key-value form.
SkewedInfo skewedInfo
Information about values that appear very frequently in a column (skewed values).
Boolean storedAsSubDirectories
True if the table data is stored in subdirectories, or False if not.
String name
Name of the table.
String databaseName
Name of the metadata database where the table metadata resides.
String description
Description of the table.
String owner
Owner of the table.
Date createTime
Time when the table definition was created in the Data Catalog.
Date updateTime
Last time the table was updated.
Date lastAccessTime
Last time the table was accessed. This is usually taken from HDFS, and may not be reliable.
Date lastAnalyzedTime
Last time column statistics were computed for this table.
Integer retention
Retention time for this table.
StorageDescriptor storageDescriptor
A storage descriptor containing information about the physical storage of this table.
List<E> partitionKeys
A list of columns by which the table is partitioned. Only primitive types are supported as partition keys.
String viewOriginalText
If the table is a view, the original text of the view; otherwise null.
String viewExpandedText
If the table is a view, the expanded text of the view; otherwise null.
String tableType
The type of this table (EXTERNAL_TABLE, VIRTUAL_VIEW, etc.).
Map<K,V> parameters
Properties associated with this table, as a list of key-value pairs.
String createdBy
Person or entity who created the table.
String tableName
Name of the table.
ErrorDetail errorDetail
Detail about the error.
String name
Name of the table.
String description
Description of the table.
String owner
Owner of the table.
Date lastAccessTime
Last time the table was accessed.
Date lastAnalyzedTime
Last time column statistics were computed for this table.
Integer retention
Retention time for this table.
StorageDescriptor storageDescriptor
A storage descriptor containing information about the physical storage of this table.
List<E> partitionKeys
A list of columns by which the table is partitioned. Only primitive types are supported as partition keys.
String viewOriginalText
If the table is a view, the original text of the view; otherwise null.
String viewExpandedText
If the table is a view, the expanded text of the view; otherwise null.
String tableType
The type of this table (EXTERNAL_TABLE, VIRTUAL_VIEW, etc.).
Map<K,V> parameters
Properties associated with this table, as a list of key-value pairs.
String name
Name of the trigger.
String id
Reserved for future use.
String type
The type of trigger that this is.
String state
The current state of the trigger.
String description
A description of this trigger.
String schedule
A cron expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers). For example, to run something every day at 12:15 UTC, you would specify: cron(15 12 * * ? *).
List<E> actions
The actions initiated by this trigger.
Predicate predicate
The predicate of this trigger, which defines when it will fire.
String name
Reserved for future use.
String description
A description of this trigger.
String schedule
A cron expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers). For example, to run something every day at 12:15 UTC, you would specify: cron(15 12 * * ? *).
List<E> actions
The actions initiated by this trigger.
Predicate predicate
The predicate of this trigger, which defines when it will fire.
UpdateGrokClassifierRequest grokClassifier
A GrokClassifier object with updated fields.
UpdateXMLClassifierRequest xMLClassifier
An XMLClassifier object with updated fields.
String catalogId
The ID of the Data Catalog in which the connection resides. If none is supplied, the AWS account ID is used by default.
String name
The name of the connection definition to update.
ConnectionInput connectionInput
A ConnectionInput object that redefines the connection in question.
String name
Name of the new crawler.
String role
The IAM role (or ARN of an IAM role) used by the new crawler to access customer resources.
String databaseName
The AWS Glue database where results are stored, such as: arn:aws:daylight:us-east-1::database/sometable/*.
String description
A description of the new crawler.
CrawlerTargets targets
A list of targets to crawl.
String schedule
A cron expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers). For example, to run something every day at 12:15 UTC, you would specify: cron(15 12 * * ? *).
List<E> classifiers
A list of custom classifiers that the user has registered. By default, all classifiers are included in a crawl, but these custom classifiers always override the default classifiers for a given classification.
String tablePrefix
The table prefix used for catalog tables that are created.
SchemaChangePolicy schemaChangePolicy
Policy for the crawler's update and deletion behavior.
String configuration
Crawler configuration information. This versioned JSON string allows users to specify aspects of a Crawler's behavior.
You can use this field to force partitions to inherit metadata such as classification, input format, output format, serde information, and schema from their parent table, rather than detect this information separately for each partition. Use the following JSON string to specify that behavior:
Example:
'{ "Version": 1.0, "CrawlerOutput": { "Partitions": { "AddOrUpdateBehavior": "InheritFromTable" } } }'
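When this value is set from Java code, the JSON has to be written as an escaped string literal. The sketch below shows one way to do that; the class and method names are hypothetical, and the JSON content is copied from the example above without any additional keys.

```java
// Hypothetical helper for illustration; not part of the AWS SDK.
final class CrawlerConfigurationSketch {
    private CrawlerConfigurationSketch() {}

    /** Returns the configuration JSON that makes partitions inherit metadata from their table. */
    static String inheritFromTable() {
        return "{ \"Version\": 1.0, \"CrawlerOutput\": { \"Partitions\": "
             + "{ \"AddOrUpdateBehavior\": \"InheritFromTable\" } } }";
    }

    public static void main(String[] args) {
        // The printed string is the value for the crawler's configuration field.
        System.out.println(inheritFromTable());
    }
}
```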
String crawlerName
Name of the crawler whose schedule to update.
String schedule
The updated cron expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers). For example, to run something every day at 12:15 UTC, you would specify: cron(15 12 * * ? *).
String catalogId
The ID of the Data Catalog in which the metadata database resides. If none is supplied, the AWS account ID is used by default.
String name
The name of the metadata database to update in the catalog.
DatabaseInput databaseInput
A DatabaseInput object specifying the new definition of the metadata database in the catalog.
String endpointName
The name of the DevEndpoint to be updated.
String publicKey
The public key for the DevEndpoint to use.
DevEndpointCustomLibraries customLibraries
Custom Python or Java libraries to be loaded in the DevEndpoint.
Boolean updateEtlLibraries
True if the list of custom libraries to be loaded in the development endpoint needs to be updated, or False otherwise.
String name
The name of the GrokClassifier.
String classification
An identifier of the data format that the classifier matches, such as Twitter, JSON, Omniture logs, Amazon CloudWatch Logs, and so on.
String grokPattern
The grok pattern used by this classifier.
String customPatterns
Optional custom grok patterns used by this classifier.
String jobName
Returns the name of the updated job.
String catalogId
The ID of the Data Catalog where the partition to be updated resides. If none is supplied, the AWS account ID is used by default.
String databaseName
The name of the catalog database in which the table in question resides.
String tableName
The name of the table where the partition to be updated is located.
List<E> partitionValueList
A list of the values defining the partition.
PartitionInput partitionInput
The new partition object with which to update the partition.
String catalogId
The ID of the Data Catalog where the table resides. If none is supplied, the AWS account ID is used by default.
String databaseName
The name of the catalog database in which the table resides.
TableInput tableInput
An updated TableInput object to define the metadata table in the catalog.
String name
The name of the trigger to update.
TriggerUpdate triggerUpdate
The new values with which to update the trigger.
Trigger trigger
The resulting trigger definition.
String catalogId
The ID of the Data Catalog where the function to be updated is located. If none is supplied, the AWS account ID is used by default.
String databaseName
The name of the catalog database where the function to be updated is located.
String functionName
The name of the function.
UserDefinedFunctionInput functionInput
A FunctionInput object that redefines the function in the Data Catalog.
String name
The name of the classifier.
String classification
An identifier of the data format that the classifier matches.
String rowTag
The XML tag designating the element that contains each record in an XML document being parsed. Note that this cannot identify a self-closing element (closed by />). An empty row element that contains only attributes can be parsed as long as it ends with a closing tag (for example, <row item_a="A" item_b="B"></row> is okay, but <row item_a="A" item_b="B" /> is not).
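The closing-tag rule above can be checked locally before data is handed to a classifier. The following is a rough sketch only: the class and method are hypothetical, and the check is a plain string test on a single row element, not the parsing logic the classifier itself uses.

```java
// Hypothetical pre-check for illustration; not part of the AWS SDK.
final class RowTagSketch {
    private RowTagSketch() {}

    /** True if the element starts with <rowTag and ends with an explicit </rowTag>. */
    static boolean hasExplicitClosingTag(String element, String rowTag) {
        String trimmed = element.trim();
        return trimmed.startsWith("<" + rowTag)
            && trimmed.endsWith("</" + rowTag + ">");
    }

    public static void main(String[] args) {
        String closed = "<row item_a=\"A\" item_b=\"B\"></row>";
        String selfClosing = "<row item_a=\"A\" item_b=\"B\" />";
        System.out.println(hasExplicitClosingTag(closed, "row"));
        System.out.println(hasExplicitClosingTag(selfClosing, "row"));
    }
}
```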
String functionName
The name of the function.
String className
The Java class that contains the function code.
String ownerName
The owner of the function.
String ownerType
The owner type.
Date createTime
The time at which the function was created.
List<E> resourceUris
The resource URIs for the function.
String functionName
The name of the function.
String className
The Java class that contains the function code.
String ownerName
The owner of the function.
String ownerType
The owner type.
List<E> resourceUris
The resource URIs for the function.
String name
The name of the classifier.
String classification
An identifier of the data format that the classifier matches.
Date creationTime
The time this classifier was registered.
Date lastUpdated
The time this classifier was last updated.
Long version
The version of this classifier.
String rowTag
The XML tag designating the element that contains each record in an XML document being parsed. Note that this cannot identify a self-closing element (closed by />). An empty row element that contains only attributes can be parsed as long as it ends with a closing tag (for example, <row item_a="A" item_b="B"></row> is okay, but <row item_a="A" item_b="B" /> is not).
Copyright © 2018. All rights reserved.