Manage Your AWS Lambda Feature Flags Like a Boss

Dec 22, 2021

This blog post is the third installment in my feature flags series. You can find updated blogs at my new blogging website https://www.ranthebuilder.cloud.

In this blog post, I’ll provide a working process for getting a feature flag right in AWS Lambda: from design to implementation, testing, and all the way to production and retirement.

While the code examples are written in Python, the concepts I describe are the same for any feature flag SDK written in any programming language and any feature flag solution that you might choose.

For a complete working Python project that deploys to AWS AppConfig and uses feature flags at runtime, click here.

Background

In my first post, I analyzed the different possible implementations of feature flags, and why I chose to deploy them to AWS AppConfig instead of using a third-party solution. A simple Python implementation was also provided.

In the second blog post, I took the implementation even further and provided a rule engine SDK (which I later donated to AWS Lambda Powertools). The rule engine SDK lets you fetch and evaluate feature flags according to session context (customer id, username, time of day, etc.).

However, as I continued deploying feature flags to production, I realized that my testing matrix and maintenance overhead had increased dramatically.

It was time to define a proper process for developing and managing feature flags.

CI/CD Infinity Loop

I decided to use the proven CI/CD infinity loop and see how it fits the feature flags use case.

Ownership

I believe that the development team owns the feature flags process from start to end. The team and its lead are required to implement the feature flag and manage it throughout its lifecycle. The development team is the source of truth for the design. It has the best understanding of the feature’s side effects, implementation, points of failure, and how to overcome them.

Plan, Code, and Build

Once a feature has been deemed large or risky enough, a feature flag should be considered.

The most straightforward implementation is a single point of execution with a single ‘if-else’ statement. Consider the following pseudo-code example:
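A minimal sketch of such a single point of execution might look like the following. Here ‘get_my_feature_flag’ is only a stand-in for whatever SDK call you use, and ‘handle_my_feature’, ‘handle_regular_flow’, and the ‘my_feature_enabled’ key are hypothetical names chosen for illustration.

```python
from typing import Any, Dict


def get_my_feature_flag(event: Dict[str, Any]) -> bool:
    """Hypothetical stand-in for the feature flag SDK call.

    A real implementation would fetch and evaluate the flag from a
    configuration store; here it simply reads a key from the event.
    """
    return bool(event.get("my_feature_enabled", False))


def handle_my_feature(event: Dict[str, Any]) -> str:
    """Hypothetical new logic guarded by the feature flag."""
    return "new flow"


def handle_regular_flow(event: Dict[str, Any]) -> str:
    """Hypothetical existing logic that runs when the flag is off."""
    return "regular flow"


def my_handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    # Single point of execution: the flag is evaluated once, in one place.
    if get_my_feature_flag(event):
        body = handle_my_feature(event)
    else:
        body = handle_regular_flow(event)
    return {"statusCode": 200, "body": body}
```

Keeping the check in one place means there is exactly one branch to test for the enabled case and one for the disabled case.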

In my previous post, I presented an implementation for fetching and evaluating feature flags. It’s based on AWS AppConfig. However, any third-party solution may work as well. Choose the solution that best fits your needs.

Click here to learn why I chose AWS AppConfig and how ‘get_my_feature_flag’ is implemented.

Test

Most people focus their tests on the obvious use case: enable the feature and test the new logic surrounding it. They will check the side effects and that the business logic is handled properly. If they are thorough, they will also make sure that the code coverage remains as high as possible.

However, it is also critical to verify that the feature’s logic is NOT run when the feature flag’s value is False. This might seem obvious, but running a feature’s logic when it’s not supposed to run can have horrific results. It can be caused by bugs in the code, edge cases, or unhandled flows.

Test with Mocks

The tests in this example are run locally and not on AWS as they use Pytest mocks. In the following example, ‘my_handler’ is tested when ‘my_feature’ is enabled. The handler is called with a generated event and mocked feature flag SDK. Please note that although the test runs locally with Pytest mocks, the handler can still call real services and not just mocks. Define mocks and return values that simulate the test edge case.
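A test along these lines could be sketched as follows. This is an illustration under stated assumptions, not the original test: ‘FeatureFlags’ and ‘MyFeature’ are hypothetical stand-ins for the feature flag SDK and the feature’s entry point, and the mocks use the standard library’s ‘unittest.mock’, which runs under Pytest as-is.

```python
from typing import Any, Dict
from unittest.mock import patch


class FeatureFlags:
    """Hypothetical stand-in for the feature flag SDK."""

    @staticmethod
    def get_my_feature_flag() -> bool:
        raise RuntimeError("the real SDK call should be mocked in local tests")


class MyFeature:
    """Hypothetical entry point of the new feature's logic."""

    @staticmethod
    def handle(event: Dict[str, Any]) -> str:
        return "new flow"


def my_handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    # Single point of execution for the feature flag.
    if FeatureFlags.get_my_feature_flag():
        body = MyFeature.handle(event)
    else:
        body = "regular flow"
    return {"statusCode": 200, "body": body}


def test_my_handler_feature_enabled() -> None:
    event = {"customer_id": "test-customer"}  # generated test event
    # Enable the feature by mocking the SDK, and replace the feature's
    # entry point with a mock so that calls to it can be observed.
    with patch.object(FeatureFlags, "get_my_feature_flag", return_value=True), \
            patch.object(MyFeature, "handle", return_value="new flow") as feature_mock:
        response = my_handler(event, None)
    # Verify the handler's output and that the feature logic ran once.
    assert response == {"statusCode": 200, "body": "new flow"}
    feature_mock.assert_called_once_with(event)
```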

The test enables the feature through the mocked feature flag SDK and then triggers the Lambda handler with a mocked event. Finally, it verifies that the handler’s output and the side effects of the feature (when enabled) are as expected.

And now, the test is run again but the feature is disabled:
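A sketch of the disabled case, under the same kind of assumptions: ‘FeatureFlags’ and ‘MyFeature’ are hypothetical stand-ins for the feature flag SDK and the feature’s entry point, mocked with the standard library’s ‘unittest.mock’.

```python
from typing import Any, Dict
from unittest.mock import patch


class FeatureFlags:
    """Hypothetical stand-in for the feature flag SDK."""

    @staticmethod
    def get_my_feature_flag() -> bool:
        raise RuntimeError("the real SDK call should be mocked in local tests")


class MyFeature:
    """Hypothetical entry point of the new feature's logic."""

    @staticmethod
    def handle(event: Dict[str, Any]) -> str:
        return "new flow"


def my_handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    # Single point of execution for the feature flag.
    if FeatureFlags.get_my_feature_flag():
        body = MyFeature.handle(event)
    else:
        body = "regular flow"
    return {"statusCode": 200, "body": body}


def test_my_handler_feature_disabled() -> None:
    event = {"customer_id": "test-customer"}  # generated test event
    # Disable the feature by mocking the SDK to return False.
    with patch.object(FeatureFlags, "get_my_feature_flag", return_value=False), \
            patch.object(MyFeature, "handle") as feature_mock:
        response = my_handler(event, None)
    assert response == {"statusCode": 200, "body": "regular flow"}
    # The feature's entry point must never be called when the flag is off.
    feature_mock.assert_not_called()
```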

This time, the mocked feature flag SDK disables the feature. The test verifies that the entry point of the feature flag handler is not called by mistake.

The feature is indeed disabled as it only has a SINGLE point of execution.

E2E tests

As a rule of thumb, E2E tests should not be used to test a single feature flag’s logic. The feature flag’s logic was already tested in the mocked tests above. E2E tests should only simulate the “happy” flows of the entire service. Edge cases and mocked failures are tested before the E2E stage.

I believe that when a feature flag is ‘disabled’, all E2E tests are expected to pass and serve as a ‘sanity test’ baseline. They also check that the feature flags SDK works as expected and that it evaluates the feature flag’s value properly.

However, once the feature flag is enabled in the ‘dev’ environment, some E2E tests might fail. This failure suggests that the mocked tests missed an edge case/use case. Fix the mocked tests accordingly and continue to the next step — release and deploy.

Release and Deploy

Assuming you don’t deploy straight to production, it’d be wise to first deploy the feature flag as ‘disabled’ to all non-‘dev’ environments.

When you are ready and have the capacity to test and debug, release the feature flag in at least one environment that simulates a real production environment, such as ‘staging’. This might cause E2E tests to fail if your mocked tests are not good enough or have missed some use case. Add the missing tests and continue.

Release to production

There are numerous deployment strategies. Choose the one that fits your needs. Common deployment strategies include ‘all at once’, ‘canary’ deployment, and ‘dark launching’.

Read more about deployment strategies for AWS AppConfig here.

Read more about Dark Launching on Martin Fowler’s blog.

Monitor and Operate

Once the feature goes live, there’s no turning back. If there’s a problem, you are required to act fast:

Disable the feature flag ASAP.
Update tests, add missing use cases.
Deploy and re-release.
Conduct a retro meeting and try to identify additional overlooked use cases.

But how do I know if there’s a problem?

Monitoring depends heavily on your chosen feature flags solution. I use AWS AppConfig, which can monitor CloudWatch alarms as automatic health checks. It is also possible to define an automatic configuration rollback in case a CloudWatch alarm is triggered. Read more about it here.
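As a rough sketch of the AppConfig side of this, the function below attaches a CloudWatch alarm to an AppConfig environment through the boto3 ‘create_environment’ call and its ‘Monitors’ parameter. The function name and its arguments are assumptions made for illustration, and the client is passed in rather than created, so the example only shows the shape of the request.

```python
from typing import Any, Dict


def create_monitored_environment(
    appconfig_client: Any,
    application_id: str,
    environment_name: str,
    alarm_arn: str,
    alarm_role_arn: str,
) -> Dict[str, Any]:
    """Create an AppConfig environment that watches one CloudWatch alarm.

    `appconfig_client` is expected to be `boto3.client("appconfig")`.
    AppConfig watches the listed alarms while a configuration is being
    deployed to the environment and rolls the deployment back if one fires.
    """
    return appconfig_client.create_environment(
        ApplicationId=application_id,
        Name=environment_name,
        Monitors=[
            {
                "AlarmArn": alarm_arn,  # alarm that signals an unhealthy service
                "AlarmRoleArn": alarm_role_arn,  # role AppConfig assumes to read the alarm
            }
        ],
    )
```

Passing the client as a parameter also makes the function easy to exercise in local tests with a mocked client.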

Other solutions such as AWS CloudWatch Evidently provide graphs and insights into a flag once it is enabled.

Keep in mind that defining AWS CloudWatch alarms that monitor your service’s health and status is considered a best practice regardless of feature flags usage.

Plan to Retire

Feature flags are powerful and addictive. However, the more you add, the more your code complexity and testing overhead increase.

Nevertheless, you are not bound forever to those feature flags.

It’s ok and even recommended to give them the ‘axe’ once they reach maturity and stability. The key principle here is visibility. Make sure you are able to view ALL your feature flags in ALL your AWS accounts in one single view. You need to be able to understand the complete state of feature flags across all AWS accounts at a glance. If your tool of choice does not provide such a view, create it yourself. It can be a simple dashboard or a simple UI.

The development team should schedule a one-hour-long meeting per month to review the current state of feature flags and decide what features can be retired.

Here are my rules of thumb for selecting a proper retirement candidate:

The feature has been deployed to 100% of customers for ‘X’ weeks. Use common sense to define ‘X’.
The feature has been stable for ‘X’ weeks — no known issues/bugs.
Customer feedback is positive and there are no open issues.
The feature is not expected to undergo any refactors/additions.
Your product team no longer uses the flag or no longer believes it is required.
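The rules above can be sketched as a simple checklist that the team fills in before the monthly review. This is only an illustration: the ‘FlagStatus’ fields and the ‘is_retirement_candidate’ function are hypothetical names, and treating the rules as "all must hold" is an assumption.

```python
from dataclasses import dataclass


@dataclass
class FlagStatus:
    """Hypothetical snapshot of one feature flag, filled in by the team."""

    name: str
    weeks_at_full_rollout: int  # weeks deployed to 100% of customers
    weeks_without_known_issues: int  # weeks with no known issues or bugs
    customer_feedback_positive: bool
    open_customer_issues: int
    changes_planned: bool  # refactors or additions expected
    product_still_needs_flag: bool


def is_retirement_candidate(flag: FlagStatus, min_weeks: int) -> bool:
    """Return True only when every rule of thumb holds for the flag."""
    return (
        flag.weeks_at_full_rollout >= min_weeks
        and flag.weeks_without_known_issues >= min_weeks
        and flag.customer_feedback_positive
        and flag.open_customer_issues == 0
        and not flag.changes_planned
        and not flag.product_still_needs_flag
    )
```

Here ‘min_weeks’ plays the role of ‘X’, so it can be tuned per team or per feature.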

Special thanks go to Roy Ben Yosef, Alon Sadovski, Alexy Grabov, and Mauro Rois.