August 18, 2019

CI/CD pipeline for AWS Lambda (Python runtime)

Continuous integration and continuous delivery are powerful practices that let you release software faster and with higher quality. This post walks through the steps to implement a CI/CD pipeline for a small Lambda function that calculates square roots by:

  • getting a message from SQS that contains the number to calculate the square root of
  • checking whether the calculation was done before by querying DynamoDB
  • if there is no cached answer in DynamoDB - calculating the square root and saving the result
  • printing the result so it’s visible in CloudWatch logs
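
A minimal sketch of such a handler might look like this (the message body format, the table name, and the attribute names are my assumptions, not the exact code from the repo):

import json
import math
import os

import boto3

# Hypothetical names - in the real project they come from the Lambda config.
TABLE_NAME = os.environ.get("CACHE_TABLE", "sqrt-cache")

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table(TABLE_NAME)


def handler(event, context):
    # An SQS-triggered Lambda receives a batch of records per invocation.
    for record in event["Records"]:
        number = json.loads(record["body"])["number"]  # assumed body: {"number": 42}
        key = {"number": str(number)}
        cached = table.get_item(Key=key).get("Item")
        if cached:
            result = cached["sqrt"]
        else:
            # Store the result as a string to avoid DynamoDB float restrictions.
            result = str(math.sqrt(number))
            table.put_item(Item={**key, "sqrt": result})
        print(f"sqrt({number}) = {result}")  # visible in CloudWatch logs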

Things I’d like the pipeline to do:

  • create all the resources - SQS and Dynamo
  • subscribe to any changes committed to the master branch of the GitHub repo
  • run tests - I’m going to run unit tests, but since the resources are there, you can run integration/end-to-end tests too
  • build the package for Lambda with all the Python dependencies
  • deploy the package

Pipeline architecture:

[Pipeline architecture diagram]

The initial CloudFormation template, along with all the code, can be found in this GitHub repo.

The pipeline.yaml file (see the aws folder in the repo) contains the CloudFormation template that will create the SQS queue, the DynamoDB table, and the CodePipeline with all its stages:

  • a Source step to get the source code from GitHub
  • CodeTest (of CodeBuild type) - a container that runs the tests
  • CodeBuild - a container that prepares the build: a zip file on S3 that Lambda can digest
  • CodeDeploy - the step that deploys the newly built Lambda.

The first thing to do is to create a GitHub OAuth token - just follow steps 1-6 from this AWS doc.

Next, you need to create a stack from the AWS console - go to CloudFormation and click Create Stack. It will ask you to fill in the stack parameters:

  • name - used as a reference for the stack’s resources
  • GitHub token, repo owner and repo name

The newly created pipeline appears in the CodePipeline console right after that. If you open it, you will see the Source, CodeTest, CodeBuild and CodeDeploy stages.

All the additional resources will be created as well:

  • an SQS queue that will feed the Lambda
  • a DynamoDB table with pay-per-request billing
  • an S3 bucket for pipeline artifacts - the mechanism CodePipeline stages use to pass results to each other
  • an S3 bucket that will hold the zip file with the packaged Lambda code

The Source step of the pipeline is pretty autonomous. AWS will monitor the repository and start an execution of the pipeline once there is a push to the master branch. There is a limit on how many repositories it can monitor, so the alternative is to implement a GitHub webhook that triggers a separate Lambda, which in turn starts the pipeline execution (a sketch of such a trigger is shown below).
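
A minimal sketch of such a trigger Lambda, assuming the pipeline name is my-lambda-pipeline (a placeholder) and the webhook plumbing, e.g. API Gateway, is already in place:

import boto3

codepipeline = boto3.client("codepipeline")


def handler(event, context):
    # Called by the GitHub webhook on push events; it simply kicks off
    # a new execution of the existing pipeline by name.
    response = codepipeline.start_pipeline_execution(name="my-lambda-pipeline")
    print("Started execution:", response["pipelineExecutionId"])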

CodeTest is a step of CodeBuild type. It runs the unit tests. Usually, unit tests are run by developers individually, e.g. via pre-commit hooks on the local machine, but this step guarantees they are executed for every change that reaches the repository. It can also run tests from higher levels of the testing pyramid.
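
For illustration, a unit test this stage could run might look like the sketch below (the module and function names are hypothetical, assuming the square-root logic is factored out of the handler):

# test_sqrt.py
import math

from my_package.calc import calculate_sqrt  # hypothetical module layout


def test_calculate_sqrt():
    assert calculate_sqrt(9) == 3
    assert math.isclose(calculate_sqrt(2), math.sqrt(2))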

CodeBuild uses the chalice package to do a few things:

  • create the CloudFormation template to deploy the Lambda
  • package the Lambda code
  • create the Lambda policies

The important part of this stage is the image the container uses. Some Python packages are wrappers around C libraries that are compiled when we run pip install, so the OS where we run pip install should be similar to the OS that will run the code and use these packages. I found the amazonlinux:latest image on Docker Hub, which resembles the Lambda runtime. Dependencies are installed into a virtual environment, and all the site-packages then go to a vendor folder:

- python3 -m venv v-env && . v-env/bin/activate && pip install --upgrade pip && pip install -r requirements/requirements.txt && deactivate
- mkdir vendor
- cp -R v-env/lib/python3.7/site-packages/. vendor
- cp -R v-env/lib64/python3.7/site-packages/. vendor

The next thing is the code itself - it should be placed in the vendor folder as per the chalice docs. I don’t like having it in my project structure, so I create the folder in codebuild.yaml (which is referenced in CodeTest as the buildspec - a script to run) and copy everything into it:

- cp -R my_package vendor

There is another difficulty - the Lambda IAM policy. Ideally, it should be as restrictive as possible while still granting access to the SQS queue and the DynamoDB table we already created. In this stage I’m passing a number of different environment variables to the container:

  • the ones that start with LAMBDA_ENV go into the Lambda config to be available at runtime
  • the ones starting with POLICY_ENV are used to generate the policy document (the chalice policy generator is not yet good enough for that)

There is a config_generator.py script that reads these variables and puts them in the proper places. LAMBDA_ENV variables are written to .chalice/config.json and POLICY_ENV variables go to .chalice/policy.json - both will be used later by chalice to generate the CloudFormation template.
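
A minimal sketch of what config_generator.py might do - the prefix handling and the policy template file are my assumptions; the real script is in the repo:

# config_generator.py (sketch)
import json
import os


def collect(prefix):
    # Gather environment variables with the given prefix, stripping it.
    return {k[len(prefix):].lstrip("_"): v
            for k, v in os.environ.items() if k.startswith(prefix)}


lambda_env = collect("LAMBDA_ENV")
policy_env = collect("POLICY_ENV")

# LAMBDA_ENV values become Lambda environment variables in the chalice config.
with open(".chalice/config.json") as f:
    config = json.load(f)
config.setdefault("environment_variables", {}).update(lambda_env)
with open(".chalice/config.json", "w") as f:
    json.dump(config, f, indent=2)

# POLICY_ENV values (e.g. queue and table ARNs) are substituted into a
# policy template to produce the final restrictive policy document.
with open("policy_template.json") as f:  # hypothetical template file
    policy_doc = f.read()
for name, value in policy_env.items():
    policy_doc = policy_doc.replace("${%s}" % name, value)
with open(".chalice/policy.json", "w") as f:
    f.write(policy_doc)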

Finally, the size of the unpacked build should be less than 250 MB (the Lambda deployment package limit), so we delete extra files (boto3 is needed for the build, but it’s available in the Lambda runtime, so there is no need to take it with us):

- echo 'Size of build' $(du -sm --exclude=./v-env .) 'MB'
- find . -name "*.pyc" -exec rm -f {} \;
- find ./vendor  -name 'boto3' -prune -type d -exec rm -rf {} \;
- find ./vendor  -name 'botocore' -prune -type d -exec rm -rf {} \;
- echo 'Size of build after cleaning' $(du -sm --exclude=./v-env .) 'MB'

In the end, the chalice package command creates a zip file along with a sam.json file. An AWS CLI command then prepares the final template for CloudFormation - transformed.yaml - that will drive the deploy stage:

- . v-env/bin/activate && python config_generator.py && chalice package /tmp/packaged && deactivate
- aws cloudformation package --template-file /tmp/packaged/sam.json --s3-bucket ${APP_S3_BUCKET} --output-template-file transformed.yaml

After all that, every change to the master branch of your repo will be automatically tested, (ideally) integrated, and deployed.

If you use Python in a serverless environment on AWS, or use CI/CD for such applications, connect with me on LinkedIn.

Code for this post is available here.
