Matt Andrews
Software engineer making apps – that aren’t apps – and more at the FT. Speaks Chinese.
How to securely configure Cloudflare for static website hosting on AWS S3

Goal: to create a static website hosted on S3 and Cloudflare securely — by which I mean restrict access to the contents of the bucket to Cloudflare only.

Step 1 - Create your bucket

First we’ll create your S3 bucket. Log in to the AWS console, head over to the S3 service, and click the big blue + Create Bucket button to start the wizard that will guide you through creating your bucket.

The first thing it’ll ask you for is a name. The name you enter needs to match the hostname of the website you plan to host. I’m going to create testing.mattandre.ws, so I’ve named the bucket that:-

Leave the rest as default and hit Next.
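
If you prefer the command line, the bucket can also be created with the AWS CLI. This is a sketch using my bucket name and region (eu-west-1); substitute your own:

aws s3api create-bucket \
  --bucket testing.mattandre.ws \
  --region eu-west-1 \
  --create-bucket-configuration LocationConstraint=eu-west-1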

Step 2 - Configure Options

On the Configure options screen you can leave everything as default.

Step 3 - Set permissions

Again leave this as default (Block all public access), which at first might not seem intuitive as we’re making a public website. But don’t worry, in a later step we will grant permission just to Cloudflare to access the content.

Step 4 - Review & Create bucket

Assuming you’ve followed everything correctly so far — go ahead and hit Create Bucket.

Step 5 - Enable static website hosting

Staying in the AWS console for the S3 service, click into your newly created empty bucket, and click Properties.

Then, click the Static website hosting box — and choose Use this bucket to host a website.

You’ll need to enter the Index document, which I’ll set to be their suggestion, index.html.

Make sure to take note of the Endpoint — you’ll need that later! In my case this is http://testing.mattandre.ws.s3-website-eu-west-1.amazonaws.com/.
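
If you’d rather script this step, the CLI equivalent is a one-liner (again with my bucket name):

aws s3 website s3://testing.mattandre.ws/ --index-document index.html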

Step 6 - Grant access to your bucket from Cloudflare

Next, click the Permissions tab of your S3 bucket (between Properties and Management) and click Bucket Policy.

Take a copy of the following JSON, taking care to replace testing.mattandre.ws (my bucket name) with your bucket name, then paste it into the Bucket policy editor. Then click Save.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::testing.mattandre.ws/*",
      "Condition": {
        "IpAddress": {
          "aws:SourceIp": [
            "2400:cb00::/32",
            "2405:8100::/32",
            "2405:b500::/32",
            "2606:4700::/32",
            "2803:f800::/32",
            "2c0f:f248::/32",
            "2a06:98c0::/29",
            "103.21.244.0/22",
            "103.22.200.0/22",
            "103.31.4.0/22",
            "104.16.0.0/12",
            "108.162.192.0/18",
            "131.0.72.0/22",
            "141.101.64.0/18",
            "162.158.0.0/15",
            "172.64.0.0/13",
            "173.245.48.0/20",
            "188.114.96.0/20",
            "190.93.240.0/20",
            "197.234.240.0/22",
            "198.41.128.0/17"
          ]
        }
      }
    }
  ]
}

(This grants Cloudflare’s IP ranges read-only access to your S3 bucket contents)
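
Alternatively, save the JSON above to a file (say, policy.json) and apply it with the AWS CLI:

aws s3api put-bucket-policy \
  --bucket testing.mattandre.ws \
  --policy file://policy.json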

Extra step - Upload an index.html file to your bucket

Strictly speaking this isn’t necessary, but it’s useful for checking that everything is working.

I uploaded an index.html file with the contents <h1>Hello world</h1> to my S3 bucket.
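
Scripted, that upload looks like this (aws s3 cp infers the text/html Content-Type from the file extension):

echo '<h1>Hello world</h1>' > index.html
aws s3 cp index.html s3://testing.mattandre.ws/index.html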

Step 7 - Add the DNS hostname to your Cloudflare DNS console

  • Login to your Cloudflare dashboard
  • Click on the domain that you’ll be setting up a static website on (in my case mattandre.ws)
  • Then click DNS
  • Click Add record
  • Set the Type to be CNAME
  • Set the Name to be the subdomain (or leave blank if you are creating a static website on the root of your website)
  • Set the Target to be equal to the Endpoint that you were given in step 5 but remove the http:// prefix and / suffix. In my case this is testing.mattandre.ws.s3-website-eu-west-1.amazonaws.com
  • Leave TTL and Proxy Status to be their defaults (Auto and Proxied)
  • Click Save
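
If you’d rather script this step too, the same record can be created through Cloudflare’s API. A sketch, assuming an API token with DNS edit permissions; $ZONE_ID and $CLOUDFLARE_API_TOKEN are placeholders you’d fill in from your dashboard, and the name/content values here are mine:

curl -X POST "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/dns_records" \
  -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  -H "Content-Type: application/json" \
  --data '{
    "type": "CNAME",
    "name": "testing",
    "content": "testing.mattandre.ws.s3-website-eu-west-1.amazonaws.com",
    "proxied": true
  }'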

Finished!

You’re done — load up your domain to see the fruits of your hard work: https://testing.mattandre.ws.

Semver as a Service

I’ve been continuing learning bits and pieces with mini projects… This time: Semver as a Service built with AWS Lambda, AWS API Gateway, AWS CloudFormation and Golang.

What is ‘Semver as a Service’?

https://github.com/matthew-andrews/semver-as-a-service/

Semver as a Service is a simple API that will look at any GitHub repository’s releases/tags, sort them, and tell you the highest version or, if you specify a constraint, the highest version that meets it.

Try it out here:-
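
For example, asking for the latest release of s3up (this is the same endpoint the install script below uses; judging by that script, the response body is the bare version tag):

curl -sf https://api.mattandre.ws/semver/github/matthew-andrews/s3up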

Why?

Well, the main purpose was to learn Go, AWS, etc, but it’s also handy for writing install scripts. For example, this could be a simple script to install the latest version of s3up on your Mac:-

curl -sf https://api.mattandre.ws/semver/github/matthew-andrews/s3up \
| xargs -I '{}' curl -sfL https://github.com/matthew-andrews/s3up/releases/download/{}/s3up_darwin_386 -o /usr/local/bin/s3up \
&& chmod +x /usr/local/bin/s3up
Catching All Errors in AWS Lambda and API Gateway

When building applications with AWS Lambda and API Gateway I’ve found error handling quite difficult to work with.

You first define what status codes your API method is able to serve (200, 404 and 500, for example). You are encouraged to choose 200 as the default. Then you can write regular expressions that match against ‘Lambda Errors’.

According to Amazon’s documentation:-

For Lambda error regex […] type a regular expression to specify which Lambda function error strings (for a Lambda function) […] map to this output mapping.

Note
The error patterns are matched against the errorMessage property in the Lambda response, which is populated by context.fail(errorMessage) in Node.js or by throw new MyException(errorMessage) in Java.
Be aware of the fact that the .* pattern will not match any newline (\n).

This seems simple enough.

Lambda functions that have run successfully shouldn’t have errorMessages so I should be able to:-

  1. Set a Lambda Error Regex that looks for .*404 Not Found.* and maps that to 404 errors — this works fine
  2. and then I should be able to map all other errors to 500 with (\n|.)* (the \n is there because I heeded the warning in the documentation above, in case one of my errors contains a newline).

Whilst the Lambda Error Regex does indeed now map all errors to 500 responses, unfortunately it also maps all the successful Lambda responses to 500s as well.

Lambda Error Regex

WARNING: THE LAMBDA ERROR REGEX WILL TRY TO MATCH AGAINST SUCCESSFUL RESPONSES FROM LAMBDA FUNCTIONS AS WELL AS FAILED ONES.

So, how do we fix it?

Easy. Whilst the Lambda Error Regex is compared against successful Lambda responses too, in those cases errorMessage is set to something like an empty string, so a pattern that requires at least one character won’t match them.

Just set the Lambda Error Regex that you want to match to your ‘catch all’ error response to (\n|.)+.

Like this:-
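
For reference, here’s roughly how that looks when configured with the AWS CLI rather than the console. This is a sketch: the REST API ID, resource ID and HTTP method are placeholders for your own API Gateway setup.

# map the specific 404 errors first
aws apigateway put-integration-response \
  --rest-api-id abc123 --resource-id def456 --http-method GET \
  --status-code 404 \
  --selection-pattern '.*404 Not Found.*'

# then catch every remaining error (successful responses have an
# empty errorMessage, so this pattern won't match them)
aws apigateway put-integration-response \
  --rest-api-id abc123 --resource-id def456 --http-method GET \
  --status-code 500 \
  --selection-pattern '(\n|.)+'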

Thoughts

I’m really surprised that this is so difficult and that none of the documentation encourages (or helps) developers to write Lambda Error Regexes that match against all possible errors.

If I had to write regular expressions against all the errors I anticipated having to handle I would never feel 100% confident that I got them all and would have needlessly risked returning 200 responses containing errors to users.

Uploading static files & websites to Amazon S3 efficiently with s3up

I’ve been using Amazon S3 at work and at home a lot recently and have grown to really like its features. Versioning, lifecycle rules and event streams can be used in really cool ways to make rock solid and super performant websites.

When it comes to actually uploading files to S3 there are plenty of choices for command line tools, but they all seemed to do a bit more than I wanted or not quite enough, and I’m learning Go at the moment so…

Introducing s3up!

https://s3up.mattandre.ws

A new cross platform command line tool for uploading files to S3.

If you’d like to try it out or report bugs, installation instructions and more information are up on GitHub.

Features

  • Optimised for uploading static websites
  • Uploads multiple files concurrently (or can be set to upload one at a time — this can be controlled via the --concurrency option)
  • Only uploads files that are new or have changed
  • Automatically detects and sets an appropriate Content-Type for each file uploaded
  • Allows for easy configuration of ACLs and Cache-Control headers for files
  • Splits large files up and uploads them in smaller pieces
  • Written in Go and compiled for all platforms, which means it is fast, can be installed quickly, and is standalone — it does not rely on other dependencies (like Python or Node)
  • Allows manipulation of the path that files get uploaded to
  • Has a --dry-run so that the changes it will make to objects in S3 can be previewed

Manipulating upload path

When deploying a static website to S3 it’s useful to be able to upload files from a different local directory than the one you’re working in or to a directory other than the root in the S3 bucket.

With s3up, files can be uploaded into subdirectories via the --prefix option, and leading components can be stripped off file names via --strip (for example, a generated index.html in a dist folder can be uploaded to the root of an S3 bucket like this: s3up --strip 1 dist/index.html --bucket s3up-test).
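
Putting those options together, a deploy might look something like this (a sketch using only the flags described above; the bucket and file names are illustrative):

# preview what would change without touching the bucket
s3up --dry-run --strip 1 --bucket s3up-test dist/index.html dist/style.css

# then upload for real
s3up --strip 1 --bucket s3up-test dist/index.html dist/style.css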

I hope you like it and find it useful. Please report bugs if you find them.

Building a half-a-second website

I’ve spent the past couple of weekends rebuilding my website. Previously it was a really old, slow, out-of-date WordPress site running on ridiculously expensive (for what it was) GoDaddy shared hosting. Converting it to a statically generated (Jekyll or similar) site had been on my to-do list for years…

This is it.

Tools and architecture

  • It’s built with Hexo.io (although I swapped out the Sass compilation with one we developed for the Financial Times and removed the client-side JavaScript entirely).
  • It’s hosted on S3 (provisioned with CloudFormation).
  • Circle CI runs the builds and pushes to production on green (when linting passes and the pages build).
  • It’s behind a CDN (CloudFlare) who provide SSL for free (thank you CloudFlare <3). They also support HTTP2 and have a nice API that you can use to do some clever cache optimisations with…

Purge on deploy

Currently the CDN in front of https://mattandre.ws is configured to store everything for up to 1 month (and I’m talking to CloudFlare to see if I can increase this to a year) but to instruct users’ browsers to cache pages for no more than 30 minutes. Then, I have set things up to call the CloudFlare API to automatically purge the files that have changed — and only the files that have changed.
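
Purging the changed files comes down to a single call to Cloudflare’s purge_cache endpoint, roughly like this. The zone ID and token are placeholders, and the file list here is illustrative (in my setup it is derived from the changed S3 objects):

curl -X POST "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/purge_cache" \
  -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  -H "Content-Type: application/json" \
  --data '{"files": ["https://mattandre.ws/", "https://mattandre.ws/example-page/"]}'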

Now clearly since Circle CI is already running all my build steps for me and knows what files have changed it could easily coordinate purging of the CDN. Indeed, we use this pattern a lot at the FT. But that was nowhere near over-engineered enough to qualify for a weekend hack project.

Instead, I created a Lambda function that was connected to my website’s S3 bucket’s ObjectRemoved and ObjectCreated streams. Each change in the S3 bucket generates an event that then triggers a Lambda function (written in Go) that purges the CDN for the associated pages. See the code.

Making this change caused the cache hit ratio to jump, and even though the website was already fast before, it’s now faster still. Pages no longer need to travel all the way from Ireland (where my S3 bucket is) to reach every user — it’s as if the site had servers in cities all around the world.

HTTP2 + S3 + CDN make a very fast website

When you add together HTTP2, S3 and smart use of a CDN you get a very performant website.

The above image shows that, occasionally, pages take almost the same amount of time to load in production (right) as they do on my local machine (left). Production isn’t always this quick (a few very unscientific and statistically invalid spot checks of pages on https://mattandre.ws show that most of the site loads in about half a second, but it is sometimes as slow as 800ms), but it does show that a crazy level of performance is possible.

And there’s so much more left to optimise.