We maintain a CloudFormation custom resource provider for Amazon Connect. The provider has grown organically, and as new features were added, the default role policy has become large.
The provider can do simple low-security tasks like associateLambda, or complex tasks like createInstance, which requires access to security-sensitive resources like kms and iam.
During a recent security review, we discovered that the same role policy was being used across all provider instances. This meant that if we used a low-security operation, such as associateLambda, the role would be granted access to high-security resources like kms and iam.
Solution 1 - Inject a Pre-Built Role
For the current project, we resolved the issue by introducing an optional role prop. This allowed the developer to select specific IAM permissions.
// PSEUDO-CODE
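As a rough illustration of the optional role prop, here is a minimal sketch in plain TypeScript. The Role shape, DEFAULT_ROLE, the role names, and the listed IAM actions are hypothetical stand-ins for this example; the real construct would work with CDK role objects and the provider's actual policy.

```typescript
// Simplified stand-in for an IAM role; a real construct would accept the CDK's iam.IRole.
interface Role {
  readonly roleName: string;
  readonly actions: readonly string[];
}

interface ConnectProviderProps {
  // Optional pre-built role. When omitted, the provider keeps its broad default.
  readonly role?: Role;
}

// Illustrative only: stands in for the large default policy described above.
const DEFAULT_ROLE: Role = {
  roleName: "ConnectProviderDefaultRole",
  actions: ["connect:*", "kms:*", "iam:*"],
};

class ConnectProvider {
  readonly role: Role;

  constructor(props: ConnectProviderProps = {}) {
    // Backward compatible: apps that pass no role behave exactly as before.
    this.role = props.role ?? DEFAULT_ROLE;
  }
}

// An app that only associates Lambda functions injects a narrow role.
const lambdaOnlyRole: Role = {
  roleName: "AssociateLambdaOnly",
  actions: ["connect:AssociateLambdaFunction", "connect:DisassociateLambdaFunction"],
};
const lockedDownProvider = new ConnectProvider({ role: lambdaOnlyRole });
console.log(lockedDownProvider.role.actions);
```

Because the prop is optional, nothing changes for apps that do not pass a role, which is also why every dependent app has to opt in by hand.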
Pros
- We were able to quickly patch the current app
Cons
- Each dependent app would have to be updated manually. We have A LOT!
- The app developer must know exactly which IAM permissions are required.
Solution 2 - Dynamically Generate the Role
I updated the custom resource constructs to dynamically build up the policy based on which resources are used, so I could roll out the update in a backward-compatible way.
// PSEUDO-CODE
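The idea of building the policy up from the operations in use can be sketched roughly as follows. DynamicPolicy, register, and the REQUIRED_ACTIONS mapping are hypothetical names for this example, and the action lists are illustrative rather than the provider's real permissions.

```typescript
type Operation = "associateLambda" | "createInstance";

// Illustrative mapping only; the provider's real action lists are not shown in this post.
const REQUIRED_ACTIONS: Record<Operation, readonly string[]> = {
  associateLambda: [
    "connect:AssociateLambdaFunction",
    "connect:DisassociateLambdaFunction",
  ],
  createInstance: [
    "connect:CreateInstance",
    "connect:DeleteInstance",
    "kms:CreateGrant",
    "iam:CreateServiceLinkedRole",
  ],
};

class DynamicPolicy {
  private readonly actions = new Set<string>();

  // Called by each custom resource construct when it is added to the stack.
  register(operation: Operation): void {
    REQUIRED_ACTIONS[operation].forEach((action) => this.actions.add(action));
  }

  // The action list covering only the operations actually in use.
  toActions(): string[] {
    return Array.from(this.actions).sort();
  }
}

const policy = new DynamicPolicy();
policy.register("associateLambda");
policy.register("associateLambda"); // registering twice does not duplicate actions
console.log(policy.toActions());
```

An app that only ever registers associateLambda ends up with a policy that contains no kms or iam actions, without the app developer having to know which permissions are involved.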
Pros
- No manual intervention is needed for dependent apps. Simply upgrade the NPM package and redeploy.
Cons
- Resource deletion does not work properly.
  - If you had a custom resource like associateLambda, everything works fine because the role policy is updated before the resource is created.
  - But if you remove the custom resource in a future release, CloudFormation will update the role policy first (and remove the associated permission) before cleaning up the resource.
  - As a result, you encounter a permission error when cleaning up the associateLambda resource.
- Circular dependencies
  - If you used the provider to createInstance and then used the instance ARN in another construct like associateLambda, you will encounter a circular reference.
  - Details
    - Invoke createInstance and get instance ARN
    - Invoke associateLambda using instance ARN
    - Instance ARN is used in the dynamic policy, resulting in a circular reference
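The deletion problem comes down to ordering: during a stack update, CloudFormation applies updates first and only deletes removed resources in the cleanup phase. The toy model below walks through that sequence. It is a simulation with hypothetical names, not the CloudFormation API, and the single action it tracks is illustrative.

```typescript
// Toy model only: function names and the tracked action are illustrative.
const DISASSOCIATE = "connect:DisassociateLambdaFunction";

function isAllowed(policy: readonly string[], action: string): boolean {
  return policy.indexOf(action) !== -1;
}

function removeAssociateLambda(policyBefore: readonly string[]): string[] {
  const timeline: string[] = [];

  // Update phase: the regenerated policy no longer contains the actions
  // of the custom resource that was removed from the template.
  const policyAfter = policyBefore.filter((action) => action !== DISASSOCIATE);
  timeline.push("UPDATE role policy");

  // Cleanup phase: only now is the removed custom resource deleted, so its
  // delete handler runs under the already-shrunk policy.
  timeline.push(
    isAllowed(policyAfter, DISASSOCIATE)
      ? "DELETE associateLambda: ok"
      : "DELETE associateLambda: AccessDenied"
  );
  return timeline;
}

const timeline = removeAssociateLambda([
  "connect:AssociateLambdaFunction",
  DISASSOCIATE,
]);
console.log(timeline);
```

The delete step fails because the permission it needs was removed one step earlier, which matches the permission error described above.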
Solution 3 - Mix of both
In the end, I decided to use a combination of both solutions. I created a ConnectProviderRoleBuilder to make it easier for developers to build the role.
I also updated the ConnectProvider to automatically use the builder if a role is not provided.
This means that we can update existing apps without any manual intervention. If the app encounters the issues described in Solution 2 during ongoing development, the team can use the ConnectProviderRoleBuilder to generate an appropriate role quickly.
// PSEUDO-CODE
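A minimal sketch of how the two pieces could fit together is shown below. Only the class names ConnectProviderRoleBuilder and ConnectProvider come from the post; the method names (allowAssociateLambda, allowCreateInstance, build, resolveRole), the Role shape, and the action lists are assumptions made for this example.

```typescript
// Simplified stand-in for an IAM role.
interface Role {
  readonly actions: readonly string[];
}

class ConnectProviderRoleBuilder {
  private readonly actions = new Set<string>();

  allowAssociateLambda(): this {
    return this.add("connect:AssociateLambdaFunction", "connect:DisassociateLambdaFunction");
  }

  allowCreateInstance(): this {
    // Illustrative action list, not the provider's real permissions.
    return this.add("connect:CreateInstance", "connect:DeleteInstance", "kms:CreateGrant", "iam:CreateServiceLinkedRole");
  }

  build(): Role {
    return { actions: Array.from(this.actions).sort() };
  }

  private add(...newActions: string[]): this {
    newActions.forEach((action) => this.actions.add(action));
    return this;
  }
}

class ConnectProvider {
  private readonly injectedRole?: Role;
  private readonly builder = new ConnectProviderRoleBuilder();

  constructor(props: { role?: Role } = {}) {
    this.injectedRole = props.role;
  }

  associateLambda(): void {
    // Only grow the policy automatically when no role was injected.
    if (!this.injectedRole) {
      this.builder.allowAssociateLambda();
    }
  }

  resolveRole(): Role {
    return this.injectedRole ?? this.builder.build();
  }
}

// Existing app: no role passed, so the policy is built automatically.
const autoProvider = new ConnectProvider();
autoProvider.associateLambda();

// App that hit the Solution 2 issues: build a fixed role up front and inject it.
const fixedRole = new ConnectProviderRoleBuilder()
  .allowAssociateLambda()
  .allowCreateInstance()
  .build();
const injectedProvider = new ConnectProvider({ role: fixedRole });
injectedProvider.associateLambda();
```

With an injected role the policy no longer changes as custom resources come and go, so the deletion ordering and circular reference problems from Solution 2 do not arise for that app.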
Conclusion
The simplest solution would have been to force the developer to inject a role, but that would have created unnecessary developer friction because:
- My app used to deploy fine, but now I have to manually create a new role.
- I have no idea what is happening under the hood and which permissions are required, resulting in even more friction.
This solution was certainly more work, but it solved the problem with the least effort from the downstream developers.
Now, go build secure and elegant tools!