@Generated(value="com.amazonaws:aws-java-sdk-code-generator") public class S3DataSource extends Object implements Serializable, Cloneable, StructuredPojo
Describes the S3 data source.
Constructor and Description |
---|
S3DataSource() |
Modifier and Type | Method and Description |
---|---|
S3DataSource | clone() |
boolean | equals(Object obj) |
String | getS3DataDistributionType() If you want Amazon SageMaker to replicate the entire dataset on each ML compute instance that is launched for model training, specify FullyReplicated. |
String | getS3DataType() If you choose S3Prefix, S3Uri identifies a key name prefix. |
String | getS3Uri() Depending on the value specified for the S3DataType, identifies either a key name prefix or a manifest. |
int | hashCode() |
void | marshall(ProtocolMarshaller protocolMarshaller) Marshalls this structured data using the given ProtocolMarshaller. |
void | setS3DataDistributionType(String s3DataDistributionType) If you want Amazon SageMaker to replicate the entire dataset on each ML compute instance that is launched for model training, specify FullyReplicated. |
void | setS3DataType(String s3DataType) If you choose S3Prefix, S3Uri identifies a key name prefix. |
void | setS3Uri(String s3Uri) Depending on the value specified for the S3DataType, identifies either a key name prefix or a manifest. |
String | toString() Returns a string representation of this object; useful for testing and debugging. |
S3DataSource | withS3DataDistributionType(S3DataDistribution s3DataDistributionType) If you want Amazon SageMaker to replicate the entire dataset on each ML compute instance that is launched for model training, specify FullyReplicated. |
S3DataSource | withS3DataDistributionType(String s3DataDistributionType) If you want Amazon SageMaker to replicate the entire dataset on each ML compute instance that is launched for model training, specify FullyReplicated. |
S3DataSource | withS3DataType(S3DataType s3DataType) If you choose S3Prefix, S3Uri identifies a key name prefix. |
S3DataSource | withS3DataType(String s3DataType) If you choose S3Prefix, S3Uri identifies a key name prefix. |
S3DataSource | withS3Uri(String s3Uri) Depending on the value specified for the S3DataType, identifies either a key name prefix or a manifest. |
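The methods above are normally chained fluently. The following is a minimal usage sketch, not taken from this page: it assumes the usual AWS SDK for Java 1.x SageMaker model package (com.amazonaws.services.sagemaker.model) and placeholder bucket and prefix names.

import com.amazonaws.services.sagemaker.model.S3DataSource;

public class S3DataSourceSketch {
    public static void main(String[] args) {
        // Point the data source at a key name prefix and replicate the full
        // dataset on every ML compute instance launched for training.
        S3DataSource dataSource = new S3DataSource()
                .withS3DataType("S3Prefix")
                .withS3Uri("s3://bucketname/exampleprefix")
                .withS3DataDistributionType("FullyReplicated");

        // toString() returns a representation useful for testing and debugging.
        System.out.println(dataSource);
    }
}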
public void setS3DataType(String s3DataType)
If you choose S3Prefix, S3Uri identifies a key name prefix. Amazon SageMaker uses all objects with the specified key name prefix for model training. If you choose ManifestFile, S3Uri identifies an object that is a manifest file containing a list of object keys that you want Amazon SageMaker to use for model training.
Parameters:
s3DataType - If you choose S3Prefix, S3Uri identifies a key name prefix. Amazon SageMaker uses all objects with the specified key name prefix for model training. If you choose ManifestFile, S3Uri identifies an object that is a manifest file containing a list of object keys that you want Amazon SageMaker to use for model training.
See Also:
S3DataType
public String getS3DataType()
If you choose S3Prefix, S3Uri identifies a key name prefix. Amazon SageMaker uses all objects with the specified key name prefix for model training. If you choose ManifestFile, S3Uri identifies an object that is a manifest file containing a list of object keys that you want Amazon SageMaker to use for model training.
Returns:
If you choose S3Prefix, S3Uri identifies a key name prefix. Amazon SageMaker uses all objects with the specified key name prefix for model training. If you choose ManifestFile, S3Uri identifies an object that is a manifest file containing a list of object keys that you want Amazon SageMaker to use for model training.
See Also:
S3DataType
public S3DataSource withS3DataType(String s3DataType)
If you choose S3Prefix, S3Uri identifies a key name prefix. Amazon SageMaker uses all objects with the specified key name prefix for model training. If you choose ManifestFile, S3Uri identifies an object that is a manifest file containing a list of object keys that you want Amazon SageMaker to use for model training.
Parameters:
s3DataType - If you choose S3Prefix, S3Uri identifies a key name prefix. Amazon SageMaker uses all objects with the specified key name prefix for model training. If you choose ManifestFile, S3Uri identifies an object that is a manifest file containing a list of object keys that you want Amazon SageMaker to use for model training.
See Also:
S3DataType
public S3DataSource withS3DataType(S3DataType s3DataType)
If you choose S3Prefix, S3Uri identifies a key name prefix. Amazon SageMaker uses all objects with the specified key name prefix for model training. If you choose ManifestFile, S3Uri identifies an object that is a manifest file containing a list of object keys that you want Amazon SageMaker to use for model training.
Parameters:
s3DataType - If you choose S3Prefix, S3Uri identifies a key name prefix. Amazon SageMaker uses all objects with the specified key name prefix for model training. If you choose ManifestFile, S3Uri identifies an object that is a manifest file containing a list of object keys that you want Amazon SageMaker to use for model training.
See Also:
S3DataType
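Both overloads express the same choice between prefix-based and manifest-based input. A brief sketch continuing the example after the method summary; the S3DataType enum constant name is an assumption, since only the String values appear on this page, and the bucket names are placeholders.

// Prefix-based input: SageMaker trains on every object under the key name prefix.
S3DataSource prefixSource = new S3DataSource()
        .withS3DataType(S3DataType.S3Prefix)           // enum constant name assumed
        .withS3Uri("s3://bucketname/exampleprefix");

// Manifest-based input: the object at S3Uri is a manifest listing the object keys to train on.
S3DataSource manifestSource = new S3DataSource()
        .withS3DataType("ManifestFile")                // String overload documented above
        .withS3Uri("s3://bucketname/example.manifest");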
public void setS3Uri(String s3Uri)
Depending on the value specified for the S3DataType, identifies either a key name prefix or a manifest. For example:
A key name prefix might look like this: s3://bucketname/exampleprefix
A manifest might look like this: s3://bucketname/example.manifest
The manifest is an S3 object that is a JSON file with the following format:
[
{"prefix": "s3://customer_bucket/some/prefix/"},
"relative/path/to/custdata-1",
"relative/path/custdata-2",
...
]
The preceding JSON matches the following s3Uris:
s3://customer_bucket/some/prefix/relative/path/to/custdata-1
s3://customer_bucket/some/prefix/relative/path/custdata-2
...
The complete set of s3Uris in this manifest constitutes the input data for the channel for this data source. The object that each s3Uri points to must be readable by the IAM role that Amazon SageMaker uses to perform tasks on your behalf.
Parameters:
s3Uri - Depending on the value specified for the S3DataType, identifies either a key name prefix or a manifest. For example:
A key name prefix might look like this: s3://bucketname/exampleprefix
A manifest might look like this: s3://bucketname/example.manifest
The manifest is an S3 object that is a JSON file with the following format:
[
{"prefix": "s3://customer_bucket/some/prefix/"},
"relative/path/to/custdata-1",
"relative/path/custdata-2",
...
]
The preceding JSON matches the following s3Uris:
s3://customer_bucket/some/prefix/relative/path/to/custdata-1
s3://customer_bucket/some/prefix/relative/path/custdata-2
...
The complete set of s3Uris in this manifest constitutes the input data for the channel for this data source. The object that each s3Uri points to must be readable by the IAM role that Amazon SageMaker uses to perform tasks on your behalf.
public String getS3Uri()
Depending on the value specified for the S3DataType, identifies either a key name prefix or a manifest. For example:
A key name prefix might look like this: s3://bucketname/exampleprefix
A manifest might look like this: s3://bucketname/example.manifest
The manifest is an S3 object that is a JSON file with the following format:
[
{"prefix": "s3://customer_bucket/some/prefix/"},
"relative/path/to/custdata-1",
"relative/path/custdata-2",
...
]
The preceding JSON matches the following s3Uris:
s3://customer_bucket/some/prefix/relative/path/to/custdata-1
s3://customer_bucket/some/prefix/relative/path/custdata-2
...
The complete set of s3Uris in this manifest constitutes the input data for the channel for this data source. The object that each s3Uri points to must be readable by the IAM role that Amazon SageMaker uses to perform tasks on your behalf.
Returns:
Depending on the value specified for the S3DataType, identifies either a key name prefix or a manifest. For example:
A key name prefix might look like this: s3://bucketname/exampleprefix
A manifest might look like this: s3://bucketname/example.manifest
The manifest is an S3 object that is a JSON file with the following format:
[
{"prefix": "s3://customer_bucket/some/prefix/"},
"relative/path/to/custdata-1",
"relative/path/custdata-2",
...
]
The preceding JSON matches the following s3Uris:
s3://customer_bucket/some/prefix/relative/path/to/custdata-1
s3://customer_bucket/some/prefix/relative/path/custdata-2
...
The complete set of s3Uris in this manifest constitutes the input data for the channel for this data source. The object that each s3Uri points to must be readable by the IAM role that Amazon SageMaker uses to perform tasks on your behalf.
public S3DataSource withS3Uri(String s3Uri)
Depending on the value specified for the S3DataType, identifies either a key name prefix or a manifest. For example:
A key name prefix might look like this: s3://bucketname/exampleprefix
A manifest might look like this: s3://bucketname/example.manifest
The manifest is an S3 object that is a JSON file with the following format:
[
{"prefix": "s3://customer_bucket/some/prefix/"},
"relative/path/to/custdata-1",
"relative/path/custdata-2",
...
]
The preceding JSON matches the following s3Uris:
s3://customer_bucket/some/prefix/relative/path/to/custdata-1
s3://customer_bucket/some/prefix/relative/path/custdata-2
...
The complete set of s3Uris in this manifest constitutes the input data for the channel for this data source. The object that each s3Uri points to must be readable by the IAM role that Amazon SageMaker uses to perform tasks on your behalf.
Parameters:
s3Uri - Depending on the value specified for the S3DataType, identifies either a key name prefix or a manifest. For example:
A key name prefix might look like this: s3://bucketname/exampleprefix
A manifest might look like this: s3://bucketname/example.manifest
The manifest is an S3 object that is a JSON file with the following format:
[
{"prefix": "s3://customer_bucket/some/prefix/"},
"relative/path/to/custdata-1",
"relative/path/custdata-2",
...
]
The preceding JSON matches the following s3Uris:
s3://customer_bucket/some/prefix/relative/path/to/custdata-1
s3://customer_bucket/some/prefix/relative/path/custdata-2
...
The complete set of s3Uris in this manifest constitutes the input data for the channel for this data source. The object that each s3Uri points to must be readable by the IAM role that Amazon SageMaker uses to perform tasks on your behalf.
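In practice the S3Uri is consumed by wrapping this object in a training input channel. A brief sketch, assuming companion DataSource and Channel types from the same SageMaker model package; those class and method names are not defined on this page and are assumptions here.

// Point the data source at the manifest object. Every object key listed in the
// manifest must be readable by the IAM role that Amazon SageMaker assumes.
S3DataSource s3Source = new S3DataSource()
        .withS3DataType("ManifestFile")
        .withS3Uri("s3://bucketname/example.manifest");

// Assumed companion types: DataSource.withS3DataSource(...) and Channel.withDataSource(...).
DataSource trainingData = new DataSource().withS3DataSource(s3Source);
Channel trainingChannel = new Channel()
        .withChannelName("train")
        .withDataSource(trainingData);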
public void setS3DataDistributionType(String s3DataDistributionType)
If you want Amazon SageMaker to replicate the entire dataset on each ML compute instance that is launched for model training, specify FullyReplicated.
If you want Amazon SageMaker to replicate a subset of data on each ML compute instance that is launched for model training, specify ShardedByS3Key. If there are n ML compute instances launched for a training job, each instance gets approximately 1/n of the number of S3 objects. In this case, model training on each machine uses only the subset of training data.
Don't choose more ML compute instances for training than available S3 objects. If you do, some nodes won't get any data and you will pay for nodes that aren't getting any training data. This applies in both FILE and PIPE modes. Keep this in mind when developing algorithms.
In distributed training, where you use multiple ML compute EC2 instances, you might choose ShardedByS3Key. If the algorithm requires copying training data to the ML storage volume (when TrainingInputMode is set to File), this copies 1/n of the number of objects.
Parameters:
s3DataDistributionType - If you want Amazon SageMaker to replicate the entire dataset on each ML compute instance that is launched for model training, specify FullyReplicated.
If you want Amazon SageMaker to replicate a subset of data on each ML compute instance that is launched for model training, specify ShardedByS3Key. If there are n ML compute instances launched for a training job, each instance gets approximately 1/n of the number of S3 objects. In this case, model training on each machine uses only the subset of training data.
Don't choose more ML compute instances for training than available S3 objects. If you do, some nodes won't get any data and you will pay for nodes that aren't getting any training data. This applies in both FILE and PIPE modes. Keep this in mind when developing algorithms.
In distributed training, where you use multiple ML compute EC2 instances, you might choose ShardedByS3Key. If the algorithm requires copying training data to the ML storage volume (when TrainingInputMode is set to File), this copies 1/n of the number of objects.
See Also:
S3DataDistribution
public String getS3DataDistributionType()
If you want Amazon SageMaker to replicate the entire dataset on each ML compute instance that is launched for model training, specify FullyReplicated.
If you want Amazon SageMaker to replicate a subset of data on each ML compute instance that is launched for model training, specify ShardedByS3Key. If there are n ML compute instances launched for a training job, each instance gets approximately 1/n of the number of S3 objects. In this case, model training on each machine uses only the subset of training data.
Don't choose more ML compute instances for training than available S3 objects. If you do, some nodes won't get any data and you will pay for nodes that aren't getting any training data. This applies in both FILE and PIPE modes. Keep this in mind when developing algorithms.
In distributed training, where you use multiple ML compute EC2 instances, you might choose ShardedByS3Key. If the algorithm requires copying training data to the ML storage volume (when TrainingInputMode is set to File), this copies 1/n of the number of objects.
Returns:
If you want Amazon SageMaker to replicate the entire dataset on each ML compute instance that is launched for model training, specify FullyReplicated.
If you want Amazon SageMaker to replicate a subset of data on each ML compute instance that is launched for model training, specify ShardedByS3Key. If there are n ML compute instances launched for a training job, each instance gets approximately 1/n of the number of S3 objects. In this case, model training on each machine uses only the subset of training data.
Don't choose more ML compute instances for training than available S3 objects. If you do, some nodes won't get any data and you will pay for nodes that aren't getting any training data. This applies in both FILE and PIPE modes. Keep this in mind when developing algorithms.
In distributed training, where you use multiple ML compute EC2 instances, you might choose ShardedByS3Key. If the algorithm requires copying training data to the ML storage volume (when TrainingInputMode is set to File), this copies 1/n of the number of objects.
See Also:
S3DataDistribution
public S3DataSource withS3DataDistributionType(String s3DataDistributionType)
If you want Amazon SageMaker to replicate the entire dataset on each ML compute instance that is launched for model training, specify FullyReplicated.
If you want Amazon SageMaker to replicate a subset of data on each ML compute instance that is launched for model training, specify ShardedByS3Key. If there are n ML compute instances launched for a training job, each instance gets approximately 1/n of the number of S3 objects. In this case, model training on each machine uses only the subset of training data.
Don't choose more ML compute instances for training than available S3 objects. If you do, some nodes won't get any data and you will pay for nodes that aren't getting any training data. This applies in both FILE and PIPE modes. Keep this in mind when developing algorithms.
In distributed training, where you use multiple ML compute EC2 instances, you might choose ShardedByS3Key. If the algorithm requires copying training data to the ML storage volume (when TrainingInputMode is set to File), this copies 1/n of the number of objects.
Parameters:
s3DataDistributionType - If you want Amazon SageMaker to replicate the entire dataset on each ML compute instance that is launched for model training, specify FullyReplicated.
If you want Amazon SageMaker to replicate a subset of data on each ML compute instance that is launched for model training, specify ShardedByS3Key. If there are n ML compute instances launched for a training job, each instance gets approximately 1/n of the number of S3 objects. In this case, model training on each machine uses only the subset of training data.
Don't choose more ML compute instances for training than available S3 objects. If you do, some nodes won't get any data and you will pay for nodes that aren't getting any training data. This applies in both FILE and PIPE modes. Keep this in mind when developing algorithms.
In distributed training, where you use multiple ML compute EC2 instances, you might choose ShardedByS3Key. If the algorithm requires copying training data to the ML storage volume (when TrainingInputMode is set to File), this copies 1/n of the number of objects.
See Also:
S3DataDistribution
public S3DataSource withS3DataDistributionType(S3DataDistribution s3DataDistributionType)
If you want Amazon SageMaker to replicate the entire dataset on each ML compute instance that is launched for model training, specify FullyReplicated.
If you want Amazon SageMaker to replicate a subset of data on each ML compute instance that is launched for model training, specify ShardedByS3Key. If there are n ML compute instances launched for a training job, each instance gets approximately 1/n of the number of S3 objects. In this case, model training on each machine uses only the subset of training data.
Don't choose more ML compute instances for training than available S3 objects. If you do, some nodes won't get any data and you will pay for nodes that aren't getting any training data. This applies in both FILE and PIPE modes. Keep this in mind when developing algorithms.
In distributed training, where you use multiple ML compute EC2 instances, you might choose ShardedByS3Key. If the algorithm requires copying training data to the ML storage volume (when TrainingInputMode is set to File), this copies 1/n of the number of objects.
Parameters:
s3DataDistributionType - If you want Amazon SageMaker to replicate the entire dataset on each ML compute instance that is launched for model training, specify FullyReplicated.
If you want Amazon SageMaker to replicate a subset of data on each ML compute instance that is launched for model training, specify ShardedByS3Key. If there are n ML compute instances launched for a training job, each instance gets approximately 1/n of the number of S3 objects. In this case, model training on each machine uses only the subset of training data.
Don't choose more ML compute instances for training than available S3 objects. If you do, some nodes won't get any data and you will pay for nodes that aren't getting any training data. This applies in both FILE and PIPE modes. Keep this in mind when developing algorithms.
In distributed training, where you use multiple ML compute EC2 instances, you might choose ShardedByS3Key. If the algorithm requires copying training data to the ML storage volume (when TrainingInputMode is set to File), this copies 1/n of the number of objects.
See Also:
S3DataDistribution
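For distributed training, sharding is the usual alternative to full replication. A short sketch using the String overload documented above, with placeholder bucket and prefix names:

// With n ML compute instances, ShardedByS3Key gives each instance roughly 1/n
// of the S3 objects instead of a full copy of the dataset.
S3DataSource shardedSource = new S3DataSource()
        .withS3DataType("S3Prefix")
        .withS3Uri("s3://bucketname/exampleprefix")
        .withS3DataDistributionType("ShardedByS3Key");
// Don't launch more training instances than there are S3 objects under the
// prefix; otherwise some nodes receive no data while still incurring cost.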
public String toString()
Returns a string representation of this object; useful for testing and debugging.
Overrides:
toString in class Object
See Also:
Object.toString()
public S3DataSource clone()
public void marshall(ProtocolMarshaller protocolMarshaller)
Description copied from interface: StructuredPojo
Marshalls this structured data using the given ProtocolMarshaller.
Specified by:
marshall in interface StructuredPojo
Parameters:
protocolMarshaller - Implementation of ProtocolMarshaller used to marshall this object's data.