This tutorial shows you how to deploy a web app on AWS in a reliable way (similar to how we do it at Transcend.io). We use the following stack, which we will dive into more deeply in future sections:
brew install node
brew install terraform
Negative : I am looking to start using a terraform version manager in place of the brew version. https://github.com/tfutils/tfenv
brew install docker
brew install awscli
Positive : If you work at Transcend, it may be helpful to familiarize yourself with these services using our internal Notion docs.
This tutorial assumes basic familiarity with web applications and hosting
We will create a basic web app that can run on localhost.
1. Create a package.json
file to track dependencies: npm init
2. Install Express
as our web server: npm install --save express
3. Create a file named app.js
for the web server with the following contents:
const app = require('express')();
app.get('/', (req, res) => {
res.send('Hello, World!\n');
});
app.listen(3000, '0.0.0.0');
This file, as you might expect, starts a web server on port 3000. (If unfamiliar, 0.0.0.0 means the server accepts connections on all network interfaces, which includes localhost.) Run it with:
node app.js
In the last step, we used a package.json
file to explicitly remember our dependencies. This is a common approach for ensuring that other team members or cloud environments can easily use the same libraries as we do.
However, it is not always enough. The previous step lacked:
Reproducibility: while package.json
lists javascript dependencies, it does not list all dependencies, such as Node
itself, or even that the app must run on some operating system.
A build process: while package.json
may have helpful scripts such as npm run
for running the app, there is no single set of commands that will properly build all apps.
With Docker, you can solve both problems: an image pins its entire environment, down to the operating system, and every image is built and run with the same few commands.
Create a file named Dockerfile
. In it, paste the following code:
FROM node:10
WORKDIR /usr/src/app
COPY package*.json ./
RUN npm i
COPY . .
EXPOSE 3000
CMD [ "node", "app.js" ]
Let's examine this code line by line:
FROM node:10
This says that your app will depend on Node
version 10.
WORKDIR /usr/src/app
This says that for any other commands in your docker file, you will be in the /usr/src/app
directory of the virtual file system.
COPY package*.json ./
This tells the image to copy over your package.json
and package-lock.json
files. The COPY
command takes its first argument from your current directory (outside of docker) and its second argument relative to your WORKDIR
inside docker.
RUN npm i
This installs the dependencies you listed in the package.json
file.
COPY . .
This copies over all the other files from your current workspace, minus those in your .dockerignore
file. We do not want to copy over our node_modules
because we already ran npm i
last step. So we can create a new file named .dockerignore
with the contents:
node_modules
npm-debug.log
This also saves a bit of time as node_modules
can be quite large and slow to copy.
EXPOSE 3000
By default, docker will not give any external process access to the inside of the container. We want to allow one port to be exposed so that outsiders can access the webapp. This exposes port 3000
, which happens to be the port we hosted our Node
app on locally earlier. (EXPOSE is mostly documentation; the port is actually published at runtime with docker's -p flag, which we'll use shortly.)
CMD [ "node", "app.js" ]
The last step is to tell the container to run the app.js
file we copied over using the node
binary, which hosts the app.
Let's build an image!
docker build -t some-image-name .
This will build an image to your local machine named some-image-name
.
In the output, you may notice that it also tags your image as some-image-name:latest
. Tags are a way to version and, well, tag your images in case you ever want multiple images related to the same app/job.
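A tag is just the part of the image reference after the colon. Here is a quick sketch of the NAME:TAG convention — plain string manipulation, nothing docker-specific:

```shell
# An image reference has the form NAME:TAG;
# `docker build -t some-image-name .` implicitly produced some-image-name:latest.
IMAGE="some-image-name:latest"
NAME="${IMAGE%%:*}"   # everything before the first colon
TAG="${IMAGE##*:}"    # everything after the last colon
echo "$NAME"   # some-image-name
echo "$TAG"    # latest
```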
Now, let's run it on localhost:
docker run -p 12345:3000 -d some-image-name
This command says to run the some-image-name
image locally. The -p 12345:3000
flag is an example of port forwarding. Your local machine and the docker container have their own, distinct sets of ports. Port forwarding enables you to say "Whenever someone asks for my 12345 port, send them to some docker container's 3000 port instead." This is kind of like a proxy server, if that helps you.
Visit http://localhost:12345 in your browser to view the app.
Let's clean up, as we no longer want to run this web app locally.
Running
docker ps
will give you an overview of the currently running docker containers. Copy the name of the container you just started, and run
docker kill <container_name>
to stop it. You may want to run docker ps
one last time to verify it has stopped.
Terraform lets you declare infrastructure as code. This is a pretty popular paradigm predicated on the idea that code is easier to version, share, and change than infrastructure made through web consoles. Here are a few more specific benefits of Terraform:
terraform plan
shows you what infra will change before you make any changes, so you can easily see exactly how it will change.
terraform graph
gives you a visual representation of how your infra fits together.
Applies are idempotent: if you run terraform apply
10 times, you'll still only have one server, not 10. It automatically cleans up old, no-longer-necessary resources to save some money and confusion.

ECR is the Amazon Elastic Container Registry. If you have ever used Docker Hub, it is basically the same thing. At Transcend, we use ECR because it gives us cheap/free private repos, unlike Docker Hub (this is my best guess — verify current pricing for yourself).
Its entire job is to host Docker images in repos. You use it similarly to using S3, where you create a repo (instead of an S3 bucket) and then can upload images (instead of files) to that repo. Just like S3, it keeps track of versions and tags for you.
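Each repo gets a URL whose registry host encodes your account id and region, with the repo name as the path. A sketch of the shape, with a made-up account id:

```shell
# ECR registry host: <account_id>.dkr.ecr.<region>.amazonaws.com
ACCOUNT_ID="123456789012"   # placeholder account id for illustration
REGION="eu-west-1"
REPO_NAME="ecr_example_repo"
REPO_URL="$ACCOUNT_ID.dkr.ecr.$REGION.amazonaws.com/$REPO_NAME"
echo "$REPO_URL"   # 123456789012.dkr.ecr.eu-west-1.amazonaws.com/ecr_example_repo
```

You'll see this URL shape again when tagging and pushing images later.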
Positive : You should have already set up an AWS profile locally that you have permissions to deploy AWS resources with. You can also use a personal account (which I did), or just follow along without actually deploying anything if you won't be working on Terraform changes very often.
Create a new folder named deployment
to store your terraform code and cd
into it.
To start, create a file named provider.tf
. In this file, we specify that we want to deploy to AWS (out of the many cloud providers Terraform supports). This looks like:
provider "aws" {
region = "eu-west-1"
profile = "test"
}
This says that all deploys will be in the eu-west-1
region. It also says that I would like to use my test
profile in the awscli
, which I set up to be my personal account.
Now, create a file named ecr.tf
with the contents:
resource "aws_ecr_repository" "ecr_repo" {
name = "ecr_example_repo"
}
This follows the syntax:
resource "some aws resource" "some terraform name that lets you reference this resource from other resources in terraform" {
name = "the name that will appear in the AWS console for this resource"
...other args...
}
To find a list of usable aws resource names and the arguments they take, check out the docs.
Run the command
terraform init
to initialize your directory as containing terraform code. This will download the plugins for the aws provider you listed in provider.tf.
Next, run
terraform plan
This step is optional, but highly recommended anytime you change infrastructure.
You should see output that looks something like:
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
# aws_ecr_repository.ecr_repo will be created
+ resource "aws_ecr_repository" "ecr_repo" {
+ arn = (known after apply)
+ id = (known after apply)
+ image_tag_mutability = "MUTABLE"
+ name = "ecr_example_repo"
+ registry_id = (known after apply)
+ repository_url = (known after apply)
}
Plan: 1 to add, 0 to change, 0 to destroy.
That looks good. It shows us that an aws_ecr_repository
will be created. As this matches our expectation, we can run:
terraform apply
After confirming the plan, you can go to your ECR page on your AWS account and will see that an empty repository was made!
Negative : If you don't see the repo, ensure that you are looking in the correct region.
We have a repo, now we need to make sure we are authenticated to it so we can push and pull images.
This can be done by using the following commands:
ACCOUNT_ID=$(aws sts get-caller-identity | jq -r ".Account")
aws ecr get-login-password --region eu-west-1 | docker login --username AWS --password-stdin "$ACCOUNT_ID.dkr.ecr.eu-west-1.amazonaws.com"
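To see what the first line is doing: aws sts get-caller-identity returns a small JSON blob, and jq -r ".Account" extracts the account id from it. A sketch with a made-up response (using sed here only so the example runs without jq installed):

```shell
# Hypothetical sample of what `aws sts get-caller-identity` returns
# (account id and ARN made up for illustration):
JSON='{"UserId": "AIDAEXAMPLE", "Account": "123456789012", "Arn": "arn:aws:iam::123456789012:user/example"}'
# jq -r ".Account" pulls out just the Account field; the same extraction
# with sed, to show what the pipeline is doing:
ACCOUNT_ID=$(printf '%s' "$JSON" | sed -n 's/.*"Account": *"\([0-9]*\)".*/\1/p')
echo "$ACCOUNT_ID"   # 123456789012
```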
Now that we're authenticated, we can push our local docker image to the remote repo. This is done in two steps, tagging our local image and pushing our changes.
Find your repository URL on the ECR page of your AWS console, and copy it. Then, run:
docker tag some-image-name:latest <repo_url>:latest
This is kind of similar to a git remote add origin
in git.
Then, run:
docker push <repo_url>:latest
to upload the image to the remote repo. This is similar to a git push
in git.
Head back to your AWS console, and verify you can see the image you uploaded.
The remaining step is to deploy the ECR image to AWS, which requires quite a few aws services, each with some terraform code to specify it.
Negative : Terraform can be very verbose for simple examples, which you'll see in this section. With great control comes annoying levels of specification. This is as basic an example as I could think of, with no logging, load balancer, etc.
Let's start with the fun stuff, permissions and roles!
Create a file named iam.tf
with the contents:
resource "aws_iam_role" "ecs_role" {
name = "ecs_role_example_app"
assume_role_policy = <<POLICY
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "",
"Effect": "Allow",
"Principal": {
"Service": "ecs-tasks.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}
POLICY
}
resource "aws_iam_role_policy_attachment" "ecs_policy_attachment" {
role = "${aws_iam_role.ecs_role.name}"
// This policy adds logging + ecr permissions
policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
}
This creates a new IAM role named ecs_role_example_app
with an attached AmazonECSTaskExecutionRolePolicy
. This policy ensures that the role will be able to pull from ECR.
Next, create a file network.tf
that contains:
resource "aws_vpc" "vpc_example_app" {
cidr_block = "10.0.0.0/16"
enable_dns_hostnames = true
enable_dns_support = true
}
resource "aws_subnet" "public_a" {
vpc_id = "${aws_vpc.vpc_example_app.id}"
cidr_block = "10.0.1.0/24"
availability_zone = "${var.aws_region}a"
}
resource "aws_subnet" "public_b" {
vpc_id = "${aws_vpc.vpc_example_app.id}"
cidr_block = "10.0.2.0/24"
availability_zone = "${var.aws_region}b"
}
resource "aws_internet_gateway" "internet_gateway" {
vpc_id = "${aws_vpc.vpc_example_app.id}"
}
resource "aws_route" "internet_access" {
route_table_id = "${aws_vpc.vpc_example_app.main_route_table_id}"
destination_cidr_block = "0.0.0.0/0"
gateway_id = "${aws_internet_gateway.internet_gateway.id}"
}
resource "aws_security_group" "security_group_example_app" {
name = "security_group_example_app"
description = "Allow inbound traffic to the app port (3000)"
vpc_id = "${aws_vpc.vpc_example_app.id}"
ingress {
from_port = 3000
to_port = 3000
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
}
This creates a VPC that other resources can go into. It has public subnets (one in each of two availability zones) that can connect to the internet via an internet gateway.
For security reasons, we specify that only port 3000 should be exposed to the public, while outgoing traffic from our resources is unrestricted.
Note that ${var.aws_region}
refers to a terraform variable named aws_region; we'll cover declaring variables in a later section.
If this is confusing (it was for me at first), then I would recommend this youtube playlist.
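If nothing else, the CIDR arithmetic is mechanical: a /N block fixes the first N bits of the address, leaving 2^(32-N) addresses in the block. A sketch for the /16 VPC and /24 subnets above:

```shell
# The CIDR suffix /N fixes the first N bits; the remaining 32-N bits
# are free, giving 2^(32-N) addresses in the block.
vpc_prefix=16      # 10.0.0.0/16
subnet_prefix=24   # 10.0.1.0/24 and 10.0.2.0/24
vpc_size=$((1 << (32 - vpc_prefix)))
subnet_size=$((1 << (32 - subnet_prefix)))
echo "$vpc_size"     # 65536 addresses in the VPC
echo "$subnet_size"  # 256 addresses per subnet
```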
Fargate is the final, and most exciting, step. It is a service that deploys Docker containers for us, which means we're finally at the point of having our simple Node.js app running on AWS infrastructure!
Create a file fargate.tf
with the contents:
resource "aws_ecs_task_definition" "backend_task" {
family = "backend_example_app_family"
// Fargate is a type of ECS that requires awsvpc network_mode
requires_compatibilities = ["FARGATE"]
network_mode = "awsvpc"
// Valid sizes are shown here: https://aws.amazon.com/fargate/pricing/
memory = "512"
cpu = "256"
// Fargate requires task definitions to have an execution role ARN to support ECR images
execution_role_arn = "${aws_iam_role.ecs_role.arn}"
container_definitions = <<EOT
[
{
"name": "example_app_container",
"image": "<your_ecr_repo_url>:latest",
"memory": 512,
"essential": true,
"portMappings": [
{
"containerPort": 3000,
"hostPort": 3000
}
]
}
]
EOT
}
resource "aws_ecs_cluster" "backend_cluster" {
name = "backend_cluster_example_app"
}
resource "aws_ecs_service" "backend_service" {
name = "backend_service"
cluster = "${aws_ecs_cluster.backend_cluster.id}"
task_definition = "${aws_ecs_task_definition.backend_task.arn}"
launch_type = "FARGATE"
desired_count = 1
network_configuration {
subnets = ["${aws_subnet.public_a.id}", "${aws_subnet.public_b.id}"]
security_groups = ["${aws_security_group.security_group_example_app.id}"]
assign_public_ip = true
}
}
Fill in your ECR repository URL where <your_ecr_repo_url> is specified in the container definition above.
Fargate is a type of the Elastic Container Service, which has three concepts:
Task definitions: blueprints that describe which container images to run and what CPU/memory they need.
Clusters: logical groupings of the compute that tasks run on.
Services: long-running sets of tasks that ECS keeps at a desired count, replacing them if they fail.
It should be pretty easy to map those concepts to the three terraform resource blocks above.
There are quite a few arguments I won't go over in detail here, but they mostly relate to wiring the service into the subnets and security group we created in network.tf.
Find the public IP address on the task page in your AWS console, and go to http://<public_ip>:3000
to view your super scalable hello world application!
Sometimes you need to put sensitive data in your terraform code, or otherwise you need to repeat the same values over and over (such as with an AWS region). That's where variables come in.
This page is a summary of the official terraform docs on input variables.
To declare a variable, you can write a variable
block:
variable "aws_region" {
default = "eu-west-1"
description = "Which region should the resources be deployed into?"
}
Anywhere you want to use the value of that variable in your resource
or provider
blocks, you can just enter something like:
provider "aws" {
region = "${var.aws_region}"
}
and the variable will be injected.
You can specify a variable in a terraform plan
or terraform apply
command by running something like
terraform apply -var="aws_region=us-east-1"
You can store your secrets in a file, and then load them all in with the -var-file
flag.
Example vars.tfvars
file:
region = "us-east-1"
family = "some_other_var"
Usage:
terraform apply -var-file="vars.tfvars"
If you have sensitive data in this file, make sure it is in your .gitignore.
If you don't specify a default value, running terraform plan
or terraform apply
will ask you for an input before running.
Any env var with the prefix TF_VAR_
will be picked up automatically.
From the terminal, type:
export TF_VAR_aws_region="us-east-1"
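The convention is purely name-based: terraform strips the TF_VAR_ prefix and treats the rest as the variable name. A sketch of that mapping, matching the aws_region variable declared earlier:

```shell
# Terraform reads any TF_VAR_<name> env var as the value for variable <name>.
env_name="TF_VAR_aws_region"
var_name="${env_name#TF_VAR_}"   # strip the TF_VAR_ prefix
echo "$var_name"   # aws_region
```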
Datadog is a tool for collecting metrics about your apps, and provides the options to add dashboards and alerts to stay on top of out of line metrics. It even has some fancy ML code that watches over your stats and looks for anomalies. Some examples of useful questions Datadog can answer for you are:
How long does the /some/url/endpoint
url take to return a response, on average? And many more.
Datadog data collection is often automatic once you install the Datadog Agent
, but can also require installation of an integration. They have integrations for dozens of popular services, including:
and more. Most of these integrations require a few short lines of code to add in, and are rather painless.
Let's start by installing the agent, which is software that runs on your servers and sends the metrics to Datadog. You don't have to manually send data ever, the agent simply runs in the background and sends the data for you without blocking your tasks. How neat is that? That's pretty neat.
In your fargate.tf
file from earlier, add the following json into your task definition. We are using the publicly available datadog agent Docker image from Docker Hub and are running it in the same task as our webapp. By doing so, the agent will examine Fargate for us and will give us useful slices in our dashboard by Docker image, EC2 server, etc. Because we are using Fargate, the ECS_FARGATE
flag must be set to true so that auto discovery can happen. The agent also needs your API key so that it can publish the metrics it collects to your dashboard.
{
"name": "datadog-agent",
"image": "datadog/agent:latest",
"essential": true,
"environment": [
{
"name": "DD_API_KEY",
"value": "${var.datadog_api_key}"
},
{
"name": "ECS_FARGATE",
"value": "true"
}
]
}
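Note that ${var.datadog_api_key} assumes a matching variable declaration, which we have not added yet. A minimal sketch, using the variable syntax from the variables section (pass the real key at apply time with -var or a var file rather than committing it):

```
variable "datadog_api_key" {
  description = "API key the Datadog agent uses to publish metrics"
}
```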
After running terraform apply
, you should see metrics about your Fargate cluster appear in Datadog within 5 minutes or so :)
StatsD is a daemon for aggregating arbitrary stats. Datadog supports it as an easy to install integration.
So why would you use it?
Say you want to keep track of how many times a specific line of code has run. At Transcend, an example is that we keep track of how many times a user submits a DSR (Data Subject Request).
Let's create a new express route where we will keep track of how many times it is requested (this is a simple example as Datadog already tracks this, but the concept can be used anywhere).
First, we need to install hot-shots, a DogStatsD client for Node:
npm install --save hot-shots
Then, we need to initialize the stats client:
const StatsD = require('hot-shots');
const dogstatsd = new StatsD();
Lastly, we can use the client from our routes:
app.get('/one', (req, res) => {
dogstatsd.increment('page.views.one');
res.send('one');
});
I encourage you to be very liberal with counters, histograms, and any other supported statsd data types you want a metric for. They are great anytime you want to track a metric that doesn't have an existing out-of-the-box integration from Datadog. As we'll see later, it is very easy to set up alerts in the datadog console for when thresholds are crossed.
It's important to clean up any resources you created in this codelab so that we don't get charged for them going forward.
To do so, all it takes is a:
terraform destroy
When prompted, type yes
and all the resources will magically disappear.