com.amazonaws.services.elasticmapreduce.util
Class ResizeJobFlowStep

java.lang.Object
  extended by com.amazonaws.services.elasticmapreduce.util.ResizeJobFlowStep

public class ResizeJobFlowStep
extends Object

This class provides helper methods for creating a Resize Job Flow step as part of your job flow. The resize step adjusts the composition of your cluster while it is running. For example, if you have a large workflow whose steps have different compute requirements, you can use this step to add a task instance group automatically before your most compute-intensive step.

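 import com.amazonaws.auth.AWSCredentials;
 import com.amazonaws.auth.BasicAWSCredentials;
 import com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduce;
 import com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduceClient;
 import com.amazonaws.services.elasticmapreduce.model.HadoopJarStepConfig;
 import com.amazonaws.services.elasticmapreduce.model.JobFlowInstancesConfig;
 import com.amazonaws.services.elasticmapreduce.model.RunJobFlowRequest;
 import com.amazonaws.services.elasticmapreduce.model.RunJobFlowResult;
 import com.amazonaws.services.elasticmapreduce.model.StepConfig;
 import com.amazonaws.services.elasticmapreduce.util.ResizeJobFlowStep;
 import com.amazonaws.services.elasticmapreduce.util.ResizeJobFlowStep.AddInstanceGroup;
 import com.amazonaws.services.elasticmapreduce.util.ResizeJobFlowStep.ModifyInstanceGroup;
 import com.amazonaws.services.elasticmapreduce.util.ResizeJobFlowStep.OnArrested;
 import com.amazonaws.services.elasticmapreduce.util.ResizeJobFlowStep.OnFailure;

 // accessKey and secretKey are placeholders for your own AWS credentials.
 AWSCredentials credentials = new BasicAWSCredentials(accessKey, secretKey);
 AmazonElasticMapReduce emr = new AmazonElasticMapReduceClient(credentials);

 // Resize the core group to 10 instances and add a task group of 10 instances.
 HadoopJarStepConfig config = new ResizeJobFlowStep()
     .withResizeAction(new ModifyInstanceGroup()
         .withInstanceGroup("core")
         .withInstanceCount(10))
     .withResizeAction(new AddInstanceGroup()
         .withInstanceGroup("task")
         .withInstanceCount(10)
         .withInstanceType("m1.small"))
     .withOnArrested(OnArrested.Continue)
     .withOnFailure(OnFailure.Continue)
     .toHadoopJarStepConfig();

 // Wrap the resize configuration in a step of the job flow.
 StepConfig resizeJobFlow = new StepConfig()
     .withName("Resize job flow")
     .withActionOnFailure("TERMINATE_JOB_FLOW")
     .withHadoopJarStep(config);

 // Start a job flow of 5 instances that runs the resize step.
 RunJobFlowRequest request = new RunJobFlowRequest()
     .withName("Resize job flow")
     .withSteps(resizeJobFlow)
     .withLogUri("s3://log-bucket/")
     .withInstances(new JobFlowInstancesConfig()
         .withEc2KeyName("keypair")
         .withHadoopVersion("0.20")
         .withInstanceCount(5)
         .withKeepJobFlowAliveWhenNoSteps(true)
         .withMasterInstanceType("m1.small")
         .withSlaveInstanceType("m1.small"));

 RunJobFlowResult result = emr.runJobFlow(request);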
 


Nested Class Summary
static class ResizeJobFlowStep.AddInstanceGroup
          Class representing creating a new instance group.
static class ResizeJobFlowStep.ModifyInstanceGroup
          Class representing a change to an existing instance group.
static class ResizeJobFlowStep.OnArrested
          The action to take if the instance group enters the Arrested state while your step is waiting for it to start.
static class ResizeJobFlowStep.OnFailure
          Action to take if there is a failure modifying your cluster composition.
static interface ResizeJobFlowStep.ResizeAction
           
 
Constructor Summary
ResizeJobFlowStep()
          Creates a new ResizeJobFlowStep using the default Elastic MapReduce bucket (us-east-1.elasticmapreduce) for the default (us-east-1) region.
ResizeJobFlowStep(String bucket)
          Creates a new ResizeJobFlowStep using the specified Amazon S3 bucket to load resources.
 
Method Summary
 HadoopJarStepConfig toHadoopJarStepConfig()
          Creates the final HadoopJarStepConfig once you are done configuring the step.
 ResizeJobFlowStep withOnArrested(ResizeJobFlowStep.OnArrested onArrested)
          Sets the action this step takes if any instance group modification causes the instance group to enter the Arrested state.
 ResizeJobFlowStep withOnFailure(ResizeJobFlowStep.OnFailure onFailure)
          Sets the action this step takes if the modification fails.
 ResizeJobFlowStep withResizeAction(ResizeJobFlowStep.ResizeAction resizeAction)
          Adds a new action for this step to perform.
 ResizeJobFlowStep withWait(boolean wait)
          Specifies whether the step waits for the modification to complete or continues on to the next step as soon as the modification request is received.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ResizeJobFlowStep

public ResizeJobFlowStep()
Creates a new ResizeJobFlowStep using the default Elastic MapReduce bucket (us-east-1.elasticmapreduce) for the default (us-east-1) region.


ResizeJobFlowStep

public ResizeJobFlowStep(String bucket)
Creates a new ResizeJobFlowStep using the specified Amazon S3 bucket to load resources.

The official bucket format is "<region>.elasticmapreduce", so if you're using the us-east-1 region, you should use the bucket "us-east-1.elasticmapreduce".

Parameters:
bucket - The Amazon S3 bucket from which to load resources.
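
As a minimal sketch of the bucket naming rule above, the helper below builds the "<region>.elasticmapreduce" name from a region string. EmrBucketNames and bucketForRegion are hypothetical names used for illustration and are not part of the AWS SDK. The sketch assumes the caller passes a valid region identifier and does not check that the bucket exists.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not part of the AWS SDK.
final class EmrBucketNames {

    private EmrBucketNames() {
        // Static utility class; no instances.
    }

    // Builds the "<region>.elasticmapreduce" bucket name for a region.
    // Assumes the region string is a valid region identifier.
    static String bucketForRegion(String region) {
        if (region == null || region.trim().isEmpty()) {
            throw new IllegalArgumentException("region must not be empty");
        }
        return region.trim().toLowerCase(Locale.ROOT) + ".elasticmapreduce";
    }

    public static void main(String[] args) {
        // Prints "us-east-1.elasticmapreduce"
        System.out.println(bucketForRegion("us-east-1"));
    }
}
```

The returned string can then be passed to the ResizeJobFlowStep(String bucket) constructor, for example new ResizeJobFlowStep(EmrBucketNames.bucketForRegion("us-east-1")).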
Method Detail

withResizeAction

public ResizeJobFlowStep withResizeAction(ResizeJobFlowStep.ResizeAction resizeAction)
Adds a new action for this step to perform. An action can either modify an existing instance group or add a new one. This step supports multiple actions, but requires at least one to be specified.

Parameters:
resizeAction - An instance of ResizeAction defining the change.
Returns:
A reference to this updated object so that method calls can be chained together.

withWait

public ResizeJobFlowStep withWait(boolean wait)
Specifies whether the step waits for the modification to complete or continues on to the next step as soon as the modification request is received. Defaults to true.

Parameters:
wait - Whether this step should wait for the modification to complete.
Returns:
A reference to this updated object so that method calls can be chained together.

withOnArrested

public ResizeJobFlowStep withOnArrested(ResizeJobFlowStep.OnArrested onArrested)
Sets the action this step takes if any instance group modification causes the instance group to enter the Arrested state. This can happen when the bootstrap actions on newly launched instances fail repeatedly.

Parameters:
onArrested - Enum specifying which action to take.
Returns:
A reference to this updated object so that method calls can be chained together.

withOnFailure

public ResizeJobFlowStep withOnFailure(ResizeJobFlowStep.OnFailure onFailure)
Sets the action this step takes if the modification fails. This can happen when you request an invalid action, such as shrinking a core instance group.

Parameters:
onFailure - Enum specifying which action to take.
Returns:
A reference to this updated object so that method calls can be chained together.

toHadoopJarStepConfig

public HadoopJarStepConfig toHadoopJarStepConfig()
Creates the final HadoopJarStepConfig once you are done configuring the step. You can use this as you would any other HadoopJarStepConfig.

Returns:
HadoopJarStepConfig configured to perform the specified actions.


Copyright © 2010 Amazon Web Services, Inc. All Rights Reserved.