We’ve been using Terraform at Vistaprint for a while, and we love the flexibility it provides for managing our infrastructure as code. Terraform lets us make changes to our infrastructure, and to make those changes quickly and safely. It also gives us the flexibility to capture repeatable patterns as reusable modules.

While that’s the good stuff Terraform offers, we were disappointed by how Terraform manages environments. We wanted something that would let us deploy the same infrastructure into isolated environments, and that would allow individual developers to create short-lived infrastructure that looks and functions the same way as our production environments. This is where Terraform Dev Kit (TDK) comes in.

TDK is a Ruby library that takes the hassle out of managing environments in Terraform. It also has a number of handy features, such as pre and post apply hooks that let us run coordination tasks around launching our infrastructure in a cloud provider. On top of that, we’re able to run tests against our infrastructure, ensuring any changes we make will work as we intend in production. Finally, TDK manages our remote and local state files.

In the next sections, we’ll take a deeper look at these three features and walk through an example application that uses all of them.

Terraform Dev Kit features

Environment-based Terraform state management

One of the important benefits of setting up your infrastructure as code is that anyone can check it out and recreate an environment. However, when multiple people are making changes to the same environment, it gets tricky to maintain a single, unified state.

To handle this use case, Terraform can manage state remotely. When using remote state, Terraform first acquires a lock and then begins modifying the infrastructure. When it’s finished, it releases the lock, and the next process that wants to modify the infrastructure acquires the lock and continues against the new state of the environment. Terraform doesn’t set any of this up for you, though, so we quickly found ourselves repeating the same setup for every project.

TDK takes care of deciding whether to use local or remote state. If the environment is a permanent one that will be maintained by multiple people, TDK uses remote state. The name of the environment drives the decision: any infrastructure created under prod or test gets remote state stored in an S3 bucket with a DynamoDB lock table. Any other environment name creates the infrastructure with local state, as the environment is considered a development one and therefore disposable.
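As a rough illustration, the rule TDK applies based on the environment name boils down to something like the following sketch (this is not TDK’s actual code, just the decision it encodes):

# Illustrative sketch only: permanent environments get remote state,
# everything else is treated as disposable and kept local.
PERMANENT_ENVIRONMENTS = %w[prod test].freeze

def remote_state?(env_name)
  PERMANENT_ENVIRONMENTS.include?(env_name)
end

remote_state?('prod')           # => true  (remote state: S3 bucket + DynamoDB lock table)
remote_state?('my_feature_env') # => false (local, disposable state)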

pre_ and post_ actions

Sometimes it is useful to run some pre or post actions when creating infrastructure. For example, you may need to package Python code into a Lambda package so that Terraform has the latest artifact ready to upload when you deploy. Creating the Python payload as part of the infrastructure build guarantees we’re always deploying the latest changes with a single command. As a result, setting up a CI/CD pipeline is as easy as calling rake apply[env_name].

custom_prepare and custom_test

Just as pre and post actions allow us to run tasks before and after a terraform apply or destroy, the custom prepare hook allows us to run actions before any terraform command runs.
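For instance, the Python packaging mentioned above could live in such a hook. The exact hook names are described in the TDK documentation; assuming the same rake-task signature as the custom_test task defined later in this post, a minimal sketch might look like the following (the file names are hypothetical, and the example application below lets Terraform’s archive provider do the zipping instead):

# Hypothetical custom prepare hook; the task signature is assumed to match
# the custom_test task defined later in this post.
task :custom_prepare, [:env] do |_, args|
  puts "Packaging Lambda code for environment: #{args.env}"
  # Requires the zip command to be available on the system.
  sh 'zip -j lambda_payload.zip lambda_function.py'
end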

Using the custom test hook, we’re able to run Ruby specs to ensure our environment is set up as we expect after the apply task has run. Combined with the excellent awspec gem, this lets us write tests that document how we expect the environment to be set up.

This gives us rapid feedback about an environment: if a test fails, we know something is wrong and can act quickly to resolve it. TDK has several other hooks for before and after applying and destroying infrastructure.
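To make the testing side concrete, here is a minimal sketch of the kind of awspec assertion such a spec can contain. The bucket name is hypothetical, and the sketch assumes the awspec gem is installed and AWS credentials are configured:

# spec/infrastructure_spec.rb -- minimal illustrative example
require 'awspec'

# 'my-project-dev-assets' is a hypothetical bucket name used for illustration.
describe s3_bucket('my-project-dev-assets') do
  it { should exist }
end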

End-to-end testing during development

Maintaining multiple environments is time consuming and tricky, and it’s also expensive to keep several environments running. We like to run end-to-end tests in isolated environments that are as close to production as possible, and then tear them down and throw them away when we’re finished.

TDK helps us create ephemeral environments for running end-to-end tests. We can launch the entire infrastructure, run our application, run all of our integration tests, and get quick feedback that our applications work as expected when deployed on our infrastructure. We use a Jenkins CI/CD pipeline for this: we create an environment when a new pull request is opened and make sure all tests pass end to end before merging it.


Example: Using Terraform Dev Kit to develop and deploy a simple application on AWS

In this example we will build a REST API using AWS API Gateway and AWS Lambda. The application will be very simple (just an endpoint that doubles the received value), as the goal of the example is to describe the different features of TDK rather than to build a complex application.

Initial setup

As TDK is built with Ruby, the first step is to install Ruby, TDK itself and all the necessary dependencies. The following instructions have been tested on Ubuntu 18.04, but they will most likely work on many other Linux distributions. On macOS you can install Ruby using Homebrew. Refer to the Ruby site for more information on installing Ruby.

sudo sh -c "apt-get update && apt-get install ruby ruby-rspec git curl"
sudo gem install TerraformDevKit

The next step is to create a directory in which to store all the files we will create during this example. We will use git to easily manage multiple versions of the application. The following commands create a directory in your home folder and initialize a git repository:

WORK_FOLDER=${HOME}/terraform-devkit-example
mkdir ${WORK_FOLDER} && cd ${WORK_FOLDER}
git init

Let’s also create a .gitignore file with the following content:

bin
envs

To complete the example application, you need to set up a policy with sufficient permissions to create all the required resources on AWS. We provide a template to create a user with an inline policy; you just need to replace the <ACCOUNT-ID> placeholders with your AWS account identifier. Follow the AWS documentation if you need help creating the inline policy. This is certainly not the only way to set up the permissions, so feel free to use any other method you prefer.

Scaffolding the application

Rakefile

TDK relies on rake to provide a command-line interface that exposes all the different actions (e.g., plan, apply or test), so we will add a rakefile to the project. Create a file named Rakefile with the following content:

ROOT_PATH = File.dirname(File.expand_path(__FILE__))

spec = Gem::Specification.find_by_name 'TerraformDevKit'
load "#{spec.gem_dir}/tasks/devkit.rake"

At this point you can list the different actions supported in TDK by executing:

rake -T

Attempting to run any action, however, will fail as we have not added a configuration file yet. Let’s do that!

Configuration

First, create a directory named config, and then create within that directory a file named config-dev.yml with the following content:

terraform-version: 0.11.8
project-name: TdkExample
aws:
  profile: tdkexample
  region: eu-west-1

This configuration file will be used for all development environments. The file has several important fields. The required Terraform version is specified first; TDK will download Terraform into the current folder if the exact version is not found on the system. The project name is used in some of the template files described later. Finally, the configuration file contains the AWS profile and region used to create all the application resources.

TDK currently requires the AWS credentials to be specified in a shared credentials file, although we have plans to make this more flexible. In any case, this only affects the credentials required to store the remote state on AWS. Any other resources can be created on any of the cloud providers supported by Terraform, and the credentials needed to create them can be specified in any form Terraform supports.

For the example application, add a profile named tdkexample to the AWS shared credentials file:

[tdkexample]
aws_access_key_id = YOUR_AWS_ACCESS_KEY_ID
aws_secret_access_key = YOUR_AWS_SECRET_ACCESS_KEY

Consider reading AWS documentation on how to create a shared credentials file if you need more details to add the profile.

Terraform templated files

Next, we will add a terraform file describing the required providers and the remote backend configuration. Create a file named provider.tf.mustache with the following content:

data "aws_caller_identity" "current" {}

locals {
  profile    = "{{Profile}}"
  region     = "{{Region}}"
  env        = "{{Environment}}"
  account_id = "${data.aws_caller_identity.current.account_id}"
}

provider "aws" {
  profile = "${local.profile}"
  region  = "${local.region}"
}

provider "template" {}
provider "archive" {}

terraform {
  {{#LocalBackend}}backend "local" {}{{/LocalBackend}}
  {{^LocalBackend}}backend "s3" {
    bucket         = "{{ProjectName}}-{{Environment}}-state"
    key            = "{{ProjectAcronym}}-{{Environment}}.tfstate"
    dynamodb_table = "{{ProjectAcronym}}-{{Environment}}-lock-table"
    encrypt        = false
    profile        = "{{Profile}}"
    region         = "{{Region}}"
  }
  {{/LocalBackend}}
}

This file is actually a templated terraform file. TDK uses mustache to inject several values into the terraform files. Some of the values injected by default include:

  • Environment: the environment specified by the user
  • Profile: AWS profile
  • Region: AWS region
  • ProjectName: the project name specified in the configuration file
  • LocalBackend: a value set to true when the user has selected an environment backed up by a local backend — any environment except prod and test, which use a remote backend to store the infrastructure state
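To make the templating mechanism concrete, here is a small standalone sketch using the mustache gem (purely illustrative; TDK performs this rendering for you whenever you run an action):

require 'mustache'  # gem install mustache

template = '{{#LocalBackend}}backend "local" {}{{/LocalBackend}}' \
           '{{^LocalBackend}}backend "s3" { bucket = "{{ProjectName}}-{{Environment}}-state" }{{/LocalBackend}}'

# A development environment renders the local backend...
puts Mustache.render(template, LocalBackend: true, Environment: 'myenv', ProjectName: 'TdkExample')
# => backend "local" {}

# ...while prod and test render the S3 backend instead.
puts Mustache.render(template, LocalBackend: false, Environment: 'prod', ProjectName: 'TdkExample')
# => backend "s3" { bucket = "TdkExample-prod-state" }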

At this point it is already possible to execute TDK actions such as plan or apply (only for development environments, as we have not created configuration files for test or production). However, no resources would be created yet, as we have only added the provider description and the backend configuration.

Application implementation

Everything is now in place to start adding the different AWS resources needed to implement the example application.

Before starting to add more files, let’s make a commit and create a new branch where we will develop the first version of the application:

git add . && git commit -m "Initial commit"
git checkout -b double_endpoint

First, we will add a Python file with the AWS lambda’s code. Create a file named tdkexample.py with the following content:

def double(event, _):
    value = int(event['value'])
    return value * 2

As you can see, the application logic is minimal. This allows us to keep the focus on describing TDK and its features. At Vistaprint, however, we have used TDK to successfully develop, test and deploy larger, multi-cloud applications.

In the next step we will define the REST API for the application using AWS API Gateway. Defining a REST API using only the resources available in Terraform is rather tedious. That is why at Vistaprint we developed some open-source Terraform modules to ease the process of creating a REST API as well as other resources. This example application will use these modules. Terraform will download the modules automatically, so there is no need for any additional setup.

Create a file named apigw.tf.mustache with the following content:

resource "aws_api_gateway_rest_api" "api" {
  name = "tdkexample_${local.env}_api"
}

module "double_path" {
  source = "github.com/vistaprint/TerraformModules?ref=v0.0.20//modules/api_path/path2"

  api    = "${aws_api_gateway_rest_api.api.id}"
  parent = "${aws_api_gateway_rest_api.api.root_resource_id}"
  path   = ["double", "{value}"]
}

module "double_method" {
  source = "github.com/vistaprint/TerraformModules?ref=v0.0.20//modules/api_method"
  api    = "${aws_api_gateway_rest_api.api.id}"
  parent = "${element(module.double_path.path_resource_id, 1)}"

  request = {
    type = "AWS"
    uri  = "arn:aws:apigateway:${local.region}:lambda:path/2015-03-31/functions/${aws_lambda_function.double_lambda.arn}/invocations"

    template = <<EOF
{
  "value": "$input.params('value')"
}    
EOF
  }

  responses = {
    "200" = {
      selection_pattern = ""
      template          = "$input.path('$')"
    }
  }
}

module "deployment" {
  source = "github.com/vistaprint/TerraformModules?ref=v0.0.20//modules/api_deployment"
  api    = "${aws_api_gateway_rest_api.api.id}"

  depends_id = [
    "${module.double_method.depends_id}",
  ]

  default_stage = {
    name        = "Default"
    description = "Default stage"
  }
}

output "api_url" {
  value = "${module.deployment.api_url}"
}

Finally, we just need to define the AWS Lambda function to complete the application. The resources described next will zip the Python file and create the Lambda function on AWS, setting up the necessary roles and permissions. Create a file named lambda.tf.mustache with the following content:

resource "aws_iam_role" "iam_for_lambda" {
  name = "tdkexample_${local.env}_iam_for_lambda"

  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Principal": {
        "Service": "lambda.amazonaws.com"
      },
      "Effect": "Allow",
      "Sid": ""
    }
  ]
}
EOF
}

data "archive_file" "lambda" {
  type        = "zip"
  source_file = "../../tdkexample.py"
  output_path = "tdkexample_lambda.zip"
}

resource "aws_lambda_function" "double_lambda" {
  filename         = "${data.archive_file.lambda.output_path}"
  function_name    = "tdkexample_${local.env}_double_lambda"
  role             = "${aws_iam_role.iam_for_lambda.arn}"
  handler          = "tdkexample.double"
  source_code_hash = "${data.archive_file.lambda.output_base64sha256}"
  runtime          = "python3.6"
}

resource "aws_lambda_permission" "allow_api_gateway" {
  function_name = "${aws_lambda_function.double_lambda.function_name}"
  statement_id  = "AllowExecutionFromApiGateway"
  action        = "lambda:InvokeFunction"
  principal     = "apigateway.amazonaws.com"
  source_arn    = "arn:aws:execute-api:${local.region}:${local.account_id}:${aws_api_gateway_rest_api.api.id}/*/*"
}

After fully describing the application, we can now deploy it. Run the following action to see the creation plan for a development environment named double_endpoint; no resources will be created yet. The plan action creates a directory named envs/double_endpoint containing the rendered versions of the templated Terraform files, with all the necessary values injected (e.g., environment name and AWS credentials), and then executes terraform plan within that directory.

rake plan[double_endpoint]

The previous action outputs all the resources that will be created once the apply action is executed. Let’s do just that, so that we can finally use the example application.

rake apply[double_endpoint]

After a few seconds all the resources should be created, and the URL to access the application will be printed on the screen. Now we can use curl (or a browser) to access the application. Execute the following command after replacing <BASE-URL> with the value obtained in the previous step.

curl https://<BASE-URL>/Default/double/10

The response from the REST API should be 20. We have successfully created our first application with TDK!

Creating another environment to test changes to the application

Imagine we want to add a new endpoint that divides a given value by two. With TDK we could make the changes to the application and deploy it using another environment, preserving the environment created in the previous steps.

Before making any further changes, let’s commit what we have so far and switch to another branch:

git add . && git commit -m "Implementation for double_endpoint"
git checkout -b half_endpoint

Add the following method to tdkexample.py:

def half(event, _):
    value = int(event['value'])
    return value / 2

Then add the following additional resources to apigw.tf.mustache:

module "half_path" {
  source = "github.com/vistaprint/TerraformModules?ref=v0.0.20//modules/api_path/path2"

  api    = "${aws_api_gateway_rest_api.api.id}"
  parent = "${aws_api_gateway_rest_api.api.root_resource_id}"
  path   = ["half", "{value}"]
}

module "half_method" {
  source = "github.com/vistaprint/TerraformModules?ref=v0.0.20//modules/api_method"
  api    = "${aws_api_gateway_rest_api.api.id}"
  parent = "${element(module.half_path.path_resource_id, 1)}"

  request = {
    type = "AWS"
    uri  = "arn:aws:apigateway:${local.region}:lambda:path/2015-03-31/functions/${aws_lambda_function.half_lambda.arn}/invocations"

    template = <<EOF
{
  "value": "$input.params('value')"
}    
EOF
  }

  responses = {
    "200" = {
      selection_pattern = ""
      template          = "$input.path('$')"
    }
  }
}

And the following ones to lambda.tf.mustache:

resource "aws_lambda_function" "half_lambda" {
  filename         = "${data.archive_file.lambda.output_path}"
  function_name    = "tdkexample_${local.env}_half_lambda"
  role             = "${aws_iam_role.iam_for_lambda.arn}"
  handler          = "tdkexample.half"
  source_code_hash = "${data.archive_file.lambda.output_base64sha256}"
  runtime          = "python3.6"
}

resource "aws_lambda_permission" "half_allow_api_gateway" {
  function_name = "${aws_lambda_function.half_lambda.function_name}"
  statement_id  = "AllowExecutionFromApiGateway"
  action        = "lambda:InvokeFunction"
  principal     = "apigateway.amazonaws.com"
  source_arn    = "arn:aws:execute-api:${local.region}:${local.account_id}:${aws_api_gateway_rest_api.api.id}/*/*"
}

Let’s now deploy the new application using an environment named half_endpoint:

rake apply[half_endpoint]

Once all the resources have been created on AWS, the URL for the new API Gateway will appear on the screen. Use curl to test the modified application, and feel free to verify that the application deployed in the double_endpoint environment is still accessible as well.

Testing the application

So far we have used TDK to create the infrastructure that supports the example application, but TDK also lets us run tests against the created infrastructure. We will use some rspec tests to validate that the application works as expected.

Let’s first run the following command to set rspec up in the current directory:

rspec --init

Then, create a file in the spec folder named tdkexample_spec.rb with the following content:

require 'spec_helper'

require 'open-uri'
require 'openssl'

def fetch(path)
  url = ENV.fetch('BASE_URL') + path
  yield open(url, ssl_verify_mode: OpenSSL::SSL::VERIFY_NONE)
rescue OpenURI::HTTPError => error
  yield error.io
end

describe 'tdkexample' do
  it 'doubles the given value' do
    fetch('/double/10') do |response|
      status_code = response.status[0].to_i
      content = response.read.to_i

      expect(status_code).to eq(200)
      expect(content).to eq(20)
    end
  end

  it 'halves the given value' do
    fetch('/half/10') do |response|
      status_code = response.status[0].to_i
      content = response.read.to_i

      expect(status_code).to eq(200)
      expect(content).to eq(5)
    end
  end
end

This rspec file contains two tests, one for each endpoint. Each test makes an HTTP request to its corresponding endpoint and checks that the expected status code and content are returned for the given input. Additional and more complex tests could be added to the test suite, but we leave that as an exercise for the reader.

The rakefile needs to be extended so that the tests are executed when users run the test or preflight tasks. Add the following to the end of the Rakefile:

require 'uri'

task :custom_test, [:env] do |_, args|
  output = Dir.chdir("envs/#{args.env}") do
    %x[terraform output]
  end

  base_url = URI.extract(output)[0]

  system({'BASE_URL' => base_url}, 'rspec')
end

Let’s now run the tests against the second version of the application. Execute the test action:

rake test[half_endpoint]

This would execute the apply action first to create the infrastructure, if necessary. In our case that will not happen, as the infrastructure was already created in the previous steps.

Temporary test environments

Another feature we added to TDK is the ability to create temporary test environments. These are environments with a unique name, created with the sole purpose of checking whether the tests pass against the infrastructure; after that, the environment is torn down. We use this feature frequently as part of our CI/CD pipelines.

Execute the following command to test this feature:

rake preflight

Deleting an environment

Now that we have an enhanced application, we may decide there is no reason to keep the first implementation. Let’s first commit all the changes and switch back to the double_endpoint branch:

git add . && git commit -m "Implementation for half_endpoint"
git checkout double_endpoint

Now we can execute the following command to fully destroy the application deployed using the double_endpoint environment:

rake destroy[double_endpoint]

After a few seconds, all the resources in the specified environment will be destroyed.

To fully clean up all the resources used in this example, repeat the equivalent steps for the half_endpoint environment:

git checkout half_endpoint
rake destroy[half_endpoint]

Next steps

This example has demonstrated the most common features of Terraform Dev Kit. Check the project’s documentation to find out about additional features that might be useful when building more complex applications, such as support for production and test environments, injecting custom variables into the template system, or specifying additional directories from which to copy files into the environment folder.

For convenience, all the code and configuration files used in this post can be found in a GitHub repository.

