feat(stepfunctions-tasks): add step concurrency level to EmrCreateCluster #15242

Merged: 26 commits, Oct 5, 2021

@MousaZeidBaker MousaZeidBaker commented Jun 22, 2021

Added support for step concurrency when creating EMR clusters through Step Functions. This feature allows users to run multiple steps in parallel on a cluster created through SFN.

closes #15223.

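As a byproduct, adds validation for releaseLabel to ensure that it follows the correct format laid out in the EMR Release Guide: https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-release-components.html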
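
A minimal sketch of the kind of checks described above, not the code from this PR. The helper names are hypothetical, the `emr-x.x.x` pattern is an assumption about the release label format, and the 5.28.0 minimum and [1, 256] range are taken from the doc comment quoted later in this thread.

```typescript
// Hypothetical helpers for illustration only; these are not part of the CDK API.

// Parse a label such as "emr-5.28.0" into [major, minor, patch].
// Assumes the label format is "emr-x.x.x".
function parseReleaseLabel(label: string): [number, number, number] {
  const match = /^emr-(\d+)\.(\d+)\.(\d+)$/.exec(label);
  if (!match) {
    throw new Error(`The release label must be in the format 'emr-x.x.x' but got ${label}`);
  }
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// Step concurrency is assumed to need release 5.28.0 or above.
function supportsStepConcurrency(label: string): boolean {
  const [major, minor] = parseReleaseLabel(label);
  return major > 5 || (major === 5 && minor >= 28);
}

// Reject levels outside [1, 256], and levels above 1 on releases that are too old.
function validateStepConcurrencyLevel(level: number, releaseLabel?: string): void {
  if (!Number.isInteger(level) || level < 1 || level > 256) {
    throw new Error(`Step concurrency level must be an integer in range [1, 256], but got ${level}`);
  }
  if (level > 1 && releaseLabel !== undefined && !supportsStepConcurrency(releaseLabel)) {
    throw new Error(`Step concurrency requires release 5.28.0 or above, but got ${releaseLabel}`);
  }
}
```

Failing at construction time like this would surface a bad label or level before the state machine is deployed, rather than when the cluster is created.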


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license

mergify bot commented Jun 22, 2021

Title does not follow the guidelines of Conventional Commits. Please adjust title before merge.

@MousaZeidBaker changed the title from "feat(aws-stepfunctions-tasks) set StepConcurrencyLevel when creating EMR cluster" to "feat(aws-stepfunctions-tasks) add StepConcurrencyLevel property" on Jun 22, 2021

@shivlaks shivlaks left a comment

@MousaZeidBaker thanks for opening a contribution.

I think we also need to add some tests to go with this added feature.

@mergify mergify bot dismissed shivlaks’s stale review June 22, 2021 19:58

Pull request has been modified.

@MousaZeidBaker

Thanks for your comments @shivlaks. They should now all be resolved.

@mergify mergify bot dismissed BenChaimberg’s stale review July 12, 2021 18:49

Pull request has been modified.

@MousaZeidBaker

Thanks for the suggestions, @BenChaimberg; they should now all be committed. Do you know why the AWS CodeBuild build is failing?

@BenChaimberg

Click "Build Logs" on the aws-cdk-automation comment. You have a compilation error.

@BenChaimberg changed the title from "feat(aws-stepfunctions-tasks) add StepConcurrencyLevel property" to "feat(stepfunctions-tasks): add step concurrency level to EmrCreateCluster" on Jul 21, 2021

@BenChaimberg BenChaimberg left a comment

Please add a small blurb to the README.

@HsiehShuJeng

Anyone who needs this feature before the PR is merged could use the following class as a stopgap:

import * as cdk from '@aws-cdk/core';
import * as tasks from '@aws-cdk/aws-stepfunctions-tasks';

interface ExtendedEmrCreateClusterProps extends tasks.EmrCreateClusterProps {
  /**
   * Specifies the step concurrency level to allow multiple steps to run in parallel.
   *
   * Requires EMR release label 5.28.0 or above.
   * Must be in range [1, 256].
   *
   * @default 1 - no step concurrency allowed
   */
  readonly stepConcurrencyLevel?: number;
}

/**
 * A Step Functions Task to create an EMR Cluster.
 *
 * The ClusterConfiguration is defined as Parameters in the state machine definition.
 *
 * OUTPUT: the ClusterId.
 *
 * @see https://github.com/aws/aws-cdk/issues/14408
 * @see https://github.com/aws/aws-cdk/issues/15223
 * @see https://github.com/aws/aws-cdk/pull/15242/
 */
class ExtendedEmrCreateCluster extends tasks.EmrCreateCluster {
  protected readonly stepConcurrencyLevel: number;

  constructor(scope: cdk.Construct, id: string, props: ExtendedEmrCreateClusterProps) {
    super(scope, id, props);
    // Fall back to 1 (steps run one at a time) when no level is given.
    this.stepConcurrencyLevel = props.stepConcurrencyLevel ?? 1;
  }

  protected _renderTask(): any {
    const rendered = super._renderTask();
    // Add StepConcurrencyLevel to the rendered Parameters. Keys already rendered
    // by the base class are spread last, so they take precedence.
    return {
      ...rendered,
      Parameters: {
        StepConcurrencyLevel: cdk.numberToCloudFormation(this.stepConcurrencyLevel),
        ...rendered.Parameters,
      },
    };
  }
}
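
To illustrate the merge performed in `_renderTask`, here is a sketch in plain TypeScript with no CDK dependency. `TaskJson` and `withStepConcurrency` are hypothetical names, and the sample state is made up for illustration; the real rendered task contains many more fields.

```typescript
// Hypothetical shape of a rendered task state; only Parameters matters here.
type TaskJson = { Parameters?: Record<string, unknown>; [key: string]: unknown };

// Return a copy of the rendered state with StepConcurrencyLevel added.
// Existing Parameters are spread last, so a value that is already present wins.
function withStepConcurrency(rendered: TaskJson, level: number): TaskJson {
  return {
    ...rendered,
    Parameters: { StepConcurrencyLevel: level, ...rendered.Parameters },
  };
}

// Made-up input standing in for what the base class would render.
const rendered: TaskJson = {
  Type: 'Task',
  Parameters: { Name: 'Cluster', ReleaseLabel: 'emr-5.28.0' },
};

const patched = withStepConcurrency(rendered, 4);
console.log(JSON.stringify(patched.Parameters));
```

Because the existing parameters are spread last, the workaround should stop overriding anything once the base class starts rendering `StepConcurrencyLevel` itself.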

@kaizencc kaizencc requested a review from a team September 30, 2021 17:26
@kaizencc

Cleaning up a stale PR inherited from @BenChaimberg. I can't request your review Ben, but I'm happy to take comments if you have them.

@BenChaimberg BenChaimberg left a comment

One small comment, rest looks great!

@madeline-k madeline-k left a comment

Looks great! Nice work, @kaizen3031593 and @MousaZeidBaker!

mergify bot commented Oct 4, 2021

Thank you for contributing! Your pull request will be updated from master and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).

@aws-cdk-automation

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject89A8053A-LhjRyN9kxr8o
  • Commit ID: 4e75b74
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@mergify mergify bot merged commit 1deea90 into aws:master Oct 5, 2021
xykkong added a commit to xykkong/aws-cdk that referenced this pull request Oct 6, 2021
* '15588' of https://github.com/xykkong/aws-cdk: (47 commits)
  chore: rollback `GenericSSMParameterImage` deprecation (backport aws#16798) (aws#16800)
  chore(deps): bump actions/setup-node from 2.4.0 to 2.4.1 (aws#16778)
  Update CHANGELOG.md
  chore(release): 1.126.0
  feat(assertions): matcher support for `templateMatches()` API (aws#16789)
  feat(stepfunctions-tasks): add step concurrency level to EmrCreateCluster (aws#15242)
  docs(s3): correct heading levels Object Ownership / Bucket deletion (aws#16790)
  chore(individual-pkg-gen): fix bug in setting alpha package visibility (aws#16787)
  fix(s3): setting `autoDeleteObjects` to `false` empties the bucket (aws#16756)
  fix(iam): `User.fromUserArn` does not work for ARNs that include a path (aws#16269)
  fix(cli): progress bar overshoots count by 1 for stack updates (aws#16168)
  fix(config): add SourceAccount condition to Lambda permission (aws#16617)
  docs(events): add a note about not using `EventPattern` with `CfnRule` (aws#16715)
  docs(core): fix reference to nonexistant enum value (aws#16716)
  chore(s3-deployments): update python version on BucketDeployment handler (aws#16771)
  chore: set response-requested length to 2 and closing-soon to 5 (aws#16763)
  fix(revert): "fix: CDK does not honor NO_PROXY settings (aws#16751)" (aws#16761)
  docs(GitHub issue templates): Upgrade to GitHub Issues v2 (aws#16592)
  chore: reset jsii-rosetta worker count to default (aws#16755)
  fix: CDK does not honor NO_PROXY settings (aws#16751)
  ...
njlynch pushed a commit that referenced this pull request Oct 11, 2021
…ster (#15242)

Added support for step concurrency when creating EMR clusters through Step Functions. This feature allows users to run multiple steps in parallel on a cluster created through SFN.

closes #15223.

As a byproduct, adds validation for `releaseLabel` to ensure that it follows the correct format laid out [here](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-release-components.html).

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
TikiTDO pushed a commit to TikiTDO/aws-cdk that referenced this pull request Feb 21, 2022
…ster (aws#15242)
Successfully merging this pull request may close these issues.

(aws-stepfunctions-tasks): Set StepConcurrencyLevel when creating EMR clusters through Step Functions