Did this page help you?

   Yes   No   Tell us about it...

com.amazonaws.services.elasticmapreduce.util
Class StreamingStep

java.lang.Object
  extended by com.amazonaws.services.elasticmapreduce.util.StreamingStep

public class StreamingStep
extends Object

Class that makes it easy to define Hadoop Streaming steps.

See also: Hadoop Streaming

 AWSCredentials credentials = new BasicAWSCredentials(accessKey, secretKey);
 AmazonElasticMapReduce emr = new AmazonElasticMapReduceClient(credentials);

 HadoopJarStepConfig config = new StreamingStep()
     .withInputs("s3://elasticmapreduce/samples/wordcount/input")
     .withOutput("s3://my-bucket/output/")
     .withMapper("s3://elasticmapreduce/samples/wordcount/wordSplitter.py")
     .withReducer("aggregate")
     .toHadoopJarStepConfig();

 StepConfig wordCount = new StepConfig()
     .withName("Word Count")
     .withActionOnFailure("TERMINATE_JOB_FLOW")
     .withHadoopJarStep(config);

 RunJobFlowRequest request = new RunJobFlowRequest()
     .withName("Word Count")
     .withSteps(wordCount)
     .withLogUri("s3://log-bucket/")
     .withInstances(new JobFlowInstancesConfig()
         .withEc2KeyName("keypairt")
         .withHadoopVersion("0.20")
         .withInstanceCount(5)
         .withKeepJobFlowAliveWhenNoSteps(true)
         .withMasterInstanceType("m1.small")
         .withSlaveInstanceType("m1.small"));

 RunJobFlowResult result = emr.runJobFlow(request);
 


Constructor Summary
StreamingStep()
          Creates a new default StreamingStep.
 
Method Summary
 Map<String,String> getHadoopConfig()
          Get the Hadoop config overrides (-D values).
 List<String> getInputs()
          Get list of step input paths.
 String getMapper()
          Get the mapper.
 String getOutput()
          Get output path.
 String getReducer()
          Get the reducer
 void setHadoopConfig(Map<String,String> hadoopConfig)
          Set the Hadoop config overrides (-D values).
 void setInputs(Collection<String> inputs)
          Set the list of step input paths.
 void setMapper(String mapper)
          Set the mapper.
 void setOutput(String output)
          Set the output path for this step.
 void setReducer(String reducer)
          Set the reducer
 HadoopJarStepConfig toHadoopJarStepConfig()
          Creates the final HadoopJarStepConfig once you are done configuring the step.
 StreamingStep withHadoopConfig(String key, String value)
          Add a Hadoop config override (-D value).
 StreamingStep withInputs(String... inputs)
          Add more input paths to this step.
 StreamingStep withMapper(String mapper)
          Set the mapper
 StreamingStep withOutput(String output)
          Set the output path for this step.
 StreamingStep withReducer(String reducer)
          Set the reducer
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

StreamingStep

public StreamingStep()
Creates a new default StreamingStep.

Method Detail

getInputs

public List<String> getInputs()
Get list of step input paths.

Returns:
List of step inputs

setInputs

public void setInputs(Collection<String> inputs)
Set the list of step input paths.

Parameters:
inputs - List of step inputs.

withInputs

public StreamingStep withInputs(String... inputs)
Add more input paths to this step.

Parameters:
inputs - A list of inputs to this step.
Returns:
A reference to this updated object so that method calls can be chained together.

getOutput

public String getOutput()
Get output path.

Returns:
Output path.

setOutput

public void setOutput(String output)
Set the output path for this step.

Parameters:
output - Output path.

withOutput

public StreamingStep withOutput(String output)
Set the output path for this step.

Parameters:
output - Output path
Returns:
A reference to this updated object so that method calls can be chained together.

getMapper

public String getMapper()
Get the mapper.

Returns:
Mapper.

setMapper

public void setMapper(String mapper)
Set the mapper.

Parameters:
mapper - Mapper

withMapper

public StreamingStep withMapper(String mapper)
Set the mapper

Parameters:
mapper - Mapper
Returns:
A reference to this updated object so that method calls can be chained together.

getReducer

public String getReducer()
Get the reducer

Returns:
Reducer

setReducer

public void setReducer(String reducer)
Set the reducer

Parameters:
reducer - Reducer

withReducer

public StreamingStep withReducer(String reducer)
Set the reducer

Parameters:
reducer - Reducer
Returns:
A reference to this updated object so that method calls can be chained together.

getHadoopConfig

public Map<String,String> getHadoopConfig()
Get the Hadoop config overrides (-D values).

Returns:
Hadoop config.

setHadoopConfig

public void setHadoopConfig(Map<String,String> hadoopConfig)
Set the Hadoop config overrides (-D values).

Parameters:
hadoopConfig - Hadoop config.

withHadoopConfig

public StreamingStep withHadoopConfig(String key,
                                      String value)
Add a Hadoop config override (-D value).

Parameters:
key - Hadoop configuration key.
value - Configuration value.
Returns:
A reference to this updated object so that method calls can be chained together.

toHadoopJarStepConfig

public HadoopJarStepConfig toHadoopJarStepConfig()
Creates the final HadoopJarStepConfig once you are done configuring the step. You can use this as you would any other HadoopJarStepConfig.

Returns:
HadoopJarStepConfig representing this streaming step.


Copyright © 2010 Amazon Web Services, Inc. All Rights Reserved.