Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(glue): validate maxCapacity, workerCount, and workerType #26241

Merged
merged 3 commits into from
Jul 10, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 10 additions & 0 deletions packages/@aws-cdk/aws-glue-alpha/lib/job.ts
Original file line number Diff line number Diff line change
Expand Up @@ -719,6 +719,16 @@ export class Job extends JobBase {
}
}

if (props.maxCapacity !== undefined && (props.workerType && props.workerCount !== undefined)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for not thinking of this earlier, but I have one more thing I think we should add. There is a small chance that maxCapacity and workerCount could be Tokens. Could we wrap this validation in an if statement that first checks if both maxCapacity and workerCount are resolved. As an example:

if (!Token.isUnresolved(props.maxCapacity) && !Token.isUnresolved(props.workerCount)) {
  // your code here
}

We should probably also include an associated unit test for this as well. I hope this makes sense. Let me know if you have any questions.

Copy link
Contributor Author

@go-to-k go-to-k Jul 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@colifran

There is a small chance that maxCapacity and workerCount could be Tokens.

I see. But in what case would props.maxCapacity and props.workerCount become Unresolved, since I assume they are input values from the user, not values generated from the resource? Depending on that, I will consider new unit tests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@go-to-k I realize it's a small chance, but it's better to add the check for it now since we're adding new code rather than having to go back and fix a bug. Basically, this would only occur if someone were to provide those as Lazy values. As an example:

{
  workerCount: cdk.Lazy.number({ produce: () => 5 }),
  maxCapacity: cdk.Lazy.number({ produce: () => 5 }),
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or when using a Parameter it can also happen.

Copy link
Contributor Author

@go-to-k go-to-k Jul 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@colifran

Thanks, I understand the Lazy case, but I may still not agree with validation only when they are not Token.

In the following code, when maxCapacity and workerCount are Token (Unresolved), they are not validated.

if (!Token.isUnresolved(props.maxCapacity) && !Token.isUnresolved(props.workerCount)) {
  if (props.maxCapacity !== undefined && (props.workerType && props.workerCount !== undefined)) {
    throw new Error('maxCapacity cannot be used when setting workerType and workerCount');
  }
}

However, even if the value is a Token and Unresolved, the fact that you are specifying that combination of the parameters is itself a problem (i.e., specifying both maxCapacity and workerCount (and workerType), regardless of the value, is itself a problem).

Therefore, shouldn't they be validated regardless of whether they are tokens or not? In fact, wouldn't it also validate and prevent cases like the following?

new glue.Job(stack, 'Job', {
  workerType: glue.WorkerType.G_1X,
  workerCount: cdk.Lazy.number({ produce: () => 5 }),
  maxCapacity: cdk.Lazy.number({ produce: () => 5 }),
})

I think we can check if the parameters are undefined or not, even if encoded numbers as Token.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@go-to-k good catch. I agree with this. We're checking for just the existence of the properties and not actually validating the value of the properties. That said, I think this is good to go. Thanks for your effort on this!

throw new Error('maxCapacity cannot be used when setting workerType and workerCount');
}
if (props.maxCapacity !== undefined && ![GlueVersion.V0_9, GlueVersion.V1_0].includes(executable.glueVersion)) {
throw new Error('maxCapacity cannot be used when GlueVersion 2.0 or later');
}
if ((!props.workerType && props.workerCount !== undefined) || (props.workerType && props.workerCount === undefined)) {
throw new Error('Both workerType and workerCount must be set');
}

const jobResource = new CfnJob(this, 'Resource', {
name: props.jobName,
description: props.description,
Expand Down
48 changes: 48 additions & 0 deletions packages/@aws-cdk/aws-glue-alpha/test/job.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -1049,5 +1049,53 @@ describe('Job', () => {
}));
});
});

describe('validation for maxCapacity and workerType', () => {
test('maxCapacity with workerType and workerCount should throw', () => {
expect(() => new glue.Job(stack, 'Job', {
executable: glue.JobExecutable.pythonEtl({
glueVersion: glue.GlueVersion.V1_0,
pythonVersion: glue.PythonVersion.THREE,
script,
}),
maxCapacity: 10,
workerType: glue.WorkerType.G_1X,
workerCount: 10,
})).toThrow('maxCapacity cannot be used when setting workerType and workerCount');
});

test('maxCapacity with GlueVersion 2.0 or later should throw', () => {
expect(() => new glue.Job(stack, 'Job', {
executable: glue.JobExecutable.pythonEtl({
glueVersion: glue.GlueVersion.V2_0,
pythonVersion: glue.PythonVersion.THREE,
script,
}),
maxCapacity: 10,
})).toThrow('maxCapacity cannot be used when GlueVersion 2.0 or later');
});

test('workerType without workerCount should throw', () => {
expect(() => new glue.Job(stack, 'Job', {
executable: glue.JobExecutable.pythonEtl({
glueVersion: glue.GlueVersion.V2_0,
pythonVersion: glue.PythonVersion.THREE,
script,
}),
workerType: glue.WorkerType.G_1X,
})).toThrow('Both workerType and workerCount must be set');
});

test('workerCount without workerType should throw', () => {
expect(() => new glue.Job(stack, 'Job', {
executable: glue.JobExecutable.pythonEtl({
glueVersion: glue.GlueVersion.V2_0,
pythonVersion: glue.PythonVersion.THREE,
script,
}),
workerCount: 10,
})).toThrow('Both workerType and workerCount must be set');
});
});
});
});