Updated the EKS CloudFormation node pool template to the latest official available state #3300
I decided to use […]. Also thought about trying to define the […]. We still have possible workarounds for the future, but all of those go beyond the original intent of this PR (updating the template to the latest official state), so I decided to avoid introducing the […]. I'm more than happy to resolve this […].
Fixed.
Created #3309 for further efforts in this direction.
Fixed […]. At a high level, this means defaulting to no surge and no ensured in-service instances when the request specifies no surge, instead of the previous "use the last used surge / ensured in-service instance value" behavior. Context: the original behavior would have worked well if we wanted to support a "set once, use always afterwards" use case, but it would have meant that the surge / ensured instance count could not be reduced to 0 once it had been non-zero. I think the explicit "we use what you specified in this update request" approach is much cleaner, even if it requires a bit more complexity (a per-request surge setting) from the user.
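The difference between the two behaviors described above can be sketched in a few lines of Go. This is only an illustration under assumptions, not the code in this PR: the `updateOptions` struct, the `lastUsed` argument, and the three functions are hypothetical, and the sketch assumes that an unspecified surge arrives as the zero value. Only the `MaxSurge` field name and the `NodeAutoScalingGroupMinInstancesInService` parameter name are taken from the PR description.

```go
package main

import (
	"fmt"
	"strconv"
)

// updateOptions is a hypothetical stand-in for the per-request update options.
// Assumption: an unspecified surge is indistinguishable from an explicit 0.
type updateOptions struct {
	MaxSurge int
}

// stickySurge models the previous behavior: when the request carries no
// surge (zero value), fall back to the last used value.
func stickySurge(opts updateOptions, lastUsed int) int {
	if opts.MaxSurge == 0 {
		return lastUsed
	}
	return opts.MaxSurge
}

// explicitSurge models the new behavior: use exactly what the request
// specified, so an absent value means no surge.
func explicitSurge(opts updateOptions) int {
	return opts.MaxSurge
}

// stackParameters shows how the resolved value could be passed on as the
// CloudFormation parameter (parameter values are strings).
func stackParameters(opts updateOptions) map[string]string {
	return map[string]string{
		"NodeAutoScalingGroupMinInstancesInService": strconv.Itoa(explicitSurge(opts)),
	}
}

func main() {
	noSurge := updateOptions{} // request without a surge setting

	// Previous behavior: the value used earlier (2) sticks and cannot be reset to 0.
	fmt.Println(stickySurge(noSurge, 2)) // prints 2

	// New behavior: the request decides, so the result is 0.
	fmt.Println(explicitSurge(noSurge)) // prints 0

	fmt.Println(stackParameters(updateOptions{MaxSurge: 3}))
}
```

With the sticky variant there is no request that brings the value back to 0 after it has been non-zero, which is the limitation the comment above describes.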
After checking the current enterprise implementation, I determined that it requires no change to the input parameters, which is a delight. On the OSS Pipeline side we currently use 2 different input types for the EKS and PKE/AWS node pool update activities, but the enterprise Pipeline side uses a common input type for both. Because the fields of the two differ, updating the enterprise side as well would have required a smaller refactor around that type. In this case we have the luxury of Cadence passing input types around in a weak representation (marshaling with default values / allowing non-existing keys). (And the enterprise workflow doing the […])
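The "weak representation" point can be illustrated with plain `encoding/json`. This sketch assumes a JSON-style encoding of activity inputs; it does not call Cadence itself. The `senderInput` and `receiverInput` types, their fields, and `roundTrip` are hypothetical and only show that a key missing from the payload decodes into the zero value instead of causing an error.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// senderInput is a hypothetical input type that lacks the newer field.
type senderInput struct {
	StackName string `json:"stackName"`
}

// receiverInput is a hypothetical input type that expects one more field.
type receiverInput struct {
	StackName string `json:"stackName"`
	MaxSurge  int    `json:"maxSurge"`
}

// roundTrip encodes the sender's type and decodes it into the receiver's type.
func roundTrip(in senderInput) (receiverInput, error) {
	var out receiverInput

	payload, err := json.Marshal(in)
	if err != nil {
		return out, err
	}

	// Keys missing from the payload leave the target field at its zero value;
	// keys unknown to the target type are ignored by default.
	err = json.Unmarshal(payload, &out)
	return out, err
}

func main() {
	out, err := roundTrip(senderInput{StackName: "pool-1"})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", out) // prints {StackName:pool-1 MaxSurge:0}
}
```

Under that assumption, a side that has not been updated yet simply yields `MaxSurge` equal to 0 on the receiving side, which matches the "no surge unless specified" default.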
Supposing we are disallowing fallback deliberately.
We are using preinitialized NodeInstanceRoleID which can also be specified by the end user.
We are using preinitialized node security group, possibly specified by the end user.
Added unused NodeAutoScalingGroupDesiredCapacity according to source. It cannot replace InitSize yet, because we are postponing the migration implementation and because of backward compatibility issues.
Added an option for disabling instance metadata service v1. It is turned off by default, so everything is supposed to work as before.
What's in this PR?
Updated the EKS CloudFormation node pool template to the latest available version.
Due to the number of changes, see the detailed change list under "Additional context".
Why?
To keep the corresponding CloudFormation template up to date with the latest official recommended one.
Additional context
The changes are well separated into dedicated commits, so we can easily revert any of them if we wish to do so for any reason. Reviewing the PR commit by commit is advised, because the changes were introduced iteratively and it is much easier to follow them that way than by comparing the original state to the result.
Detailed change list:
- […] NodeAutoScalingGroupMaxBatchSize.
- […] NodeAutoScalingGroupDesiredCapacity parameter. Previously, the official template had no such attribute describing the size of the autoscaling group at the moment (neither current/actual nor desired).
- […] DisableIMDSv1 to be able to disable the version 1 instance metadata service. Currently it is not in use, but the corresponding NodeLaunchTemplate is ready to use it in case we would like to turn it on.
- […] NodeGroup (other outputs are commented out, but this seemed to be a relevant result, as the node group is created by the stack initialized from this template).
- […] MinInstancesInService attribute to the NodeGroup's UpdatePolicy and introduced a corresponding optional parameter NodeAutoScalingGroupMinInstancesInService (defaults to 0), which controls the minimum number of serving (functional) nodes at any moment during a node autoscaling group update. The implementation has been updated to set this to updateOptions.MaxSurge. Also created "Revise EKS node pool surge node logic now that NASGMinInstancesInService is available in CF templates" (#3299) to revise the custom surge logic implementation we have in relation to this change. I deliberately wanted to separate the template update from the potential implementation changes branching off of the possibilities added by the update. I think this way it is easier to keep the integrity and functionality intact.
Checklist
- OpenAPI and Postman files updated (if needed)
- User guide and development docs updated (if needed)
- Related Helm chart(s) updated (if needed)
To Do
- Enterprise PR updating the UpdateNodeGroupActivity input parameters and usage.
- Making sure the OSS version is released together with the enterprise version to avoid any potential activity input type mismatch (it should be able to handle a non-existing key and parse it into the zero value, but it's best to make sure). (Is there any alternative to releasing a version from both after both PRs are merged?)