Matt Andrews
Software engineer making apps – that aren’t apps – and more at the FT. Speaks Chinese.
Semver as a Service

I’ve been continuing to learn bits and pieces with mini projects… This time: Semver as a Service, built with AWS Lambda, AWS API Gateway, AWS CloudFormation and Golang.

What is ‘Semver as a Service’?

https://github.com/matthew-andrews/semver-as-a-service/

Semver as a Service is a simple API that will look at any GitHub repository’s releases/tags, sort them and tell you the highest version or, if you specify a constraint, the highest version that meets a constraint.
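The repository above contains the real implementation, but the core idea is small enough to sketch in Go. This is a minimal sketch, assuming the Masterminds/semver package (the library the service actually uses may differ):

package main

import (
	"fmt"
	"sort"

	"github.com/Masterminds/semver"
)

// highest returns the highest tag that satisfies the given constraint.
func highest(tags []string, constraint string) (string, error) {
	c, err := semver.NewConstraint(constraint)
	if err != nil {
		return "", err
	}

	versions := make([]*semver.Version, 0, len(tags))
	for _, tag := range tags {
		v, err := semver.NewVersion(tag)
		if err != nil {
			continue // skip tags that aren't valid semver
		}
		versions = append(versions, v)
	}

	// Sort ascending, then walk backwards to find the highest match.
	sort.Sort(semver.Collection(versions))
	for i := len(versions) - 1; i >= 0; i-- {
		if c.Check(versions[i]) {
			return versions[i].Original(), nil
		}
	}
	return "", fmt.Errorf("no version satisfies %q", constraint)
}

func main() {
	tags := []string{"v1.0.0", "v1.2.3", "v2.0.0-rc.1", "not-a-version"}
	v, err := highest(tags, ">= 1.0.0, < 2.0.0")
	if err != nil {
		panic(err)
	}
	fmt.Println(v) // v1.2.3
}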

Try it out here: https://api.mattandre.ws/semver/github/matthew-andrews/s3up

Why?

Well, the main purpose was to learn Go, AWS, etc., but it’s also handy for writing install scripts. For example, this could be a simple script to install the latest version of s3up on your Mac:-

curl -sf https://api.mattandre.ws/semver/github/matthew-andrews/s3up \
| xargs -I '{}' curl -sfL https://github.com/matthew-andrews/s3up/releases/download/{}/s3up_darwin_386 -o /usr/local/bin/s3up \
&& chmod +x /usr/local/bin/s3up
Catching All Errors in AWS Lambda and API Gateway

When building applications with AWS Lambda and API Gateway I’ve found error handling quite difficult to work with.

You first define what status codes your API method is able to serve (200, 404 and 500, for example). You are encouraged to choose 200 as the default. Then you can write regular expressions that match against ‘Lambda Errors’.

According to Amazon’s documentation:-

For Lambda error regex […] type a regular expression to specify which Lambda function error strings (for a Lambda function) […] map to this output mapping.

Note
The error patterns are matched against the errorMessage property in the Lambda response, which is populated by context.fail(errorMessage) in Node.js or by throw new MyException(errorMessage) in Java.
Be aware of the fact that the .* pattern will not match any newline (\n).

This seems simple enough.
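The note covers Node.js and Java; the Go equivalent is simply returning an error from the handler, which populates errorMessage with the error’s message. A minimal sketch, assuming the official aws-lambda-go runtime (this service may have been wired up differently) and a hypothetical Request type:

package main

import (
	"fmt"

	"github.com/aws/aws-lambda-go/lambda"
)

// Request is a hypothetical input shape, for illustration only.
type Request struct {
	Repo string `json:"repo"`
}

func handler(req Request) (string, error) {
	if req.Repo == "" {
		// The error's message becomes the errorMessage property of the
		// Lambda response, which is what Lambda Error Regexes are
		// matched against, so prefix it with something matchable.
		return "", fmt.Errorf("404 Not Found: no repository specified")
	}
	return "v1.2.3", nil
}

func main() {
	lambda.Start(handler)
}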

Lambda functions that have run successfully shouldn’t have errorMessages so I should be able to:-

  1. Set a Lambda Error Regex that looks for .*404 Not Found.* and maps that to 404 errors — this works fine
  2. and then map all other errors to 500 with (\n|.)* (the \n is there because I heeded the warning in the documentation above, in case one of my errors contains a newline).

Whilst the Lambda Error Regex does indeed now map all errors to 500 responses, unfortunately it also maps all the successful Lambda responses to 500s as well.

Lambda Error Regex

WARNING: THE LAMBDA ERROR REGEX WILL TRY TO MATCH AGAINST SUCCESSFUL RESPONSES FROM LAMBDA FUNCTIONS AS WELL AS FAILED ONES.

So, how do we fix it?

Easy. Whilst the Lambda Error Regex is compared against successful Lambda responses too, in those cases errorMessage is effectively an empty string.

Just set the Lambda Error Regex that you want to map to your ‘catch all’ error response to (\n|.)+. Unlike (\n|.)*, it can never match the empty errorMessage of a successful response.

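Here’s a quick way to convince yourself of the difference, using Go’s regexp package (anchored to mimic matching the whole errorMessage):

package main

import (
	"fmt"
	"regexp"
)

func main() {
	star := regexp.MustCompile(`^(\n|.)*$`)
	plus := regexp.MustCompile(`^(\n|.)+$`)

	// A successful invocation has, in effect, an empty errorMessage.
	success := ""
	failure := "500 Internal Server Error\nsomething broke"

	fmt.Println(star.MatchString(success)) // true: the bug described above
	fmt.Println(plus.MatchString(success)) // false: successes pass through
	fmt.Println(star.MatchString(failure)) // true
	fmt.Println(plus.MatchString(failure)) // true
}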

Thoughts

I’m really surprised that this is so difficult and that none of the documentation encourages (or helps) developers to write Lambda Error Regexes that match against all possible errors.

If I had to write regular expressions against all the errors I anticipated having to handle, I would never feel 100% confident that I’d caught them all, and would have needlessly risked returning 200 responses containing errors to users.

Uploading static files & websites to Amazon S3 efficiently with s3up

I’ve been using Amazon S3 at work and at home a lot recently and have grown to really like its features. Versioning, lifecycle rules and event streams can be used in really cool ways to make rock solid and super performant websites.

When it comes to actually uploading files to S3 there are plenty of choices for command line tools, but they all seemed to do a bit more than I wanted or not quite enough, and I’m learning Go at the moment so…

Introducing s3up!

https://s3up.mattandre.ws

A new cross platform command line tool for uploading files to S3.

If you’d like to try it out or report bugs, installation instructions and more information are up on GitHub.

Features

  • Optimised for uploading static websites
  • Uploads multiple files concurrently (or can be set to upload one at a time — this can be controlled via the --concurrency option)
  • Only uploads files that are new or have changed (see the sketch after this list)
  • Automatically detects and sets an appropriate Content-Type for each file uploaded
  • Allows for easy configuration of ACLs and Cache-Control headers for files
  • Splits large files up and uploads them in smaller pieces
  • Written in Go and compiled for all platforms, which means it is fast, can be installed quickly, and is standalone — it does not rely on other dependencies (like Python or Node)
  • Allows manipulation of the path that files get uploaded to
  • Has a --dry-run so that the changes it will make to objects in S3 can be previewed
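One way to implement the ‘only uploads files that are new or have changed’ check is to compare each local file’s MD5 with the object’s ETag in S3, which for single-part uploads is the hex MD5 of the content. This is a sketch of that approach using aws-sdk-go, not the exact code from s3up:

package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"io/ioutil"
	"strings"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
)

// needsUpload reports whether the local file differs from the S3 object.
// Caveat: multipart uploads produce ETags that are not plain MD5s, so a
// real tool needs a fallback for large files.
func needsUpload(svc *s3.S3, bucket, key, path string) (bool, error) {
	body, err := ioutil.ReadFile(path)
	if err != nil {
		return false, err
	}
	sum := md5.Sum(body)
	local := hex.EncodeToString(sum[:])

	head, err := svc.HeadObject(&s3.HeadObjectInput{
		Bucket: aws.String(bucket),
		Key:    aws.String(key),
	})
	if err != nil {
		// Simplification: treat any HEAD failure as a missing object.
		return true, nil
	}

	remote := strings.Trim(aws.StringValue(head.ETag), `"`)
	return local != remote, nil
}

func main() {
	svc := s3.New(session.Must(session.NewSession()))
	changed, err := needsUpload(svc, "s3up-test", "index.html", "dist/index.html")
	if err != nil {
		panic(err)
	}
	fmt.Println("needs upload:", changed)
}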

Manipulating upload path

When deploying a static website to S3 it’s useful to be able to upload files from a different local directory than the one you’re working in or to a directory other than the root in the S3 bucket.

With s3up, files can be uploaded into subdirectories via the --prefix option, and leading path components can be stripped off file names with --strip (for example, a generated index.html in a dist folder can be uploaded to the root of an S3 bucket like this: s3up --strip 1 dist/index.html --bucket s3up-test).

I hope you like it and find it useful. Please report bugs if you find them.

Building a half-a-second website

I’ve spent the past couple of weekends rebuilding my website. Previously it was a really old, slow, out-of-date WordPress site running on ridiculously expensive (for what it was) GoDaddy shared hosting. Converting it to a statically generated (Jekyll or similar) site had been on my to-do list for years…

This is it.

Tools and architecture

  • It’s built with Hexo.io (although I swapped out the Sass compilation for one we developed for the Financial Times and removed the client-side JavaScript entirely).
  • It’s hosted on S3 (provisioned with CloudFormation).
  • Circle CI runs the builds and pushes to production on green (when linting passes and the pages build).
  • It’s behind a CDN (CloudFlare), who provide SSL for free (thank you CloudFlare <3). They also support HTTP2 and have a nice API that you can use for some clever cache optimisations…

Purge on deploy

Currently the CDN in front of https://mattandre.ws is configured to store everything for up to 1 month (and I’m talking to CloudFlare to see if I can increase this to a year), but to instruct users’ browsers to cache pages for only up to 30 minutes. Then, I have set things up to call the CloudFlare API to automatically purge the files that have changed — and only the files that have changed.

Now, since Circle CI is already running all my build steps for me and knows which files have changed, it could clearly coordinate purging of the CDN itself. Indeed, we use this pattern a lot at the FT. But that was nowhere near over-engineered enough to qualify for a weekend hack project.

Instead, I created a Lambda function that was connected to my website’s S3 bucket’s ObjectRemoved and ObjectCreated streams. Each change in the S3 bucket generates an event that then triggers a Lambda function (written in Go) that purges the CDN for the associated pages. See the code.
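The shape of that function, condensed and with details assumed (the aws-lambda-go event types and CloudFlare’s v4 purge_cache endpoint with an API token; the linked code is the real thing), is roughly:

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"os"

	"github.com/aws/aws-lambda-go/events"
	"github.com/aws/aws-lambda-go/lambda"
)

// handler receives ObjectCreated/ObjectRemoved events from the website's
// S3 bucket and purges the corresponding URLs from CloudFlare's cache.
func handler(event events.S3Event) error {
	urls := make([]string, 0, len(event.Records))
	for _, record := range event.Records {
		// Keys arrive URL-encoded; a real version would decode them.
		urls = append(urls, "https://mattandre.ws/"+record.S3.Object.Key)
	}

	body, err := json.Marshal(map[string][]string{"files": urls})
	if err != nil {
		return err
	}

	// CLOUDFLARE_ZONE and CLOUDFLARE_TOKEN are hypothetical names for
	// this sketch.
	req, err := http.NewRequest(
		http.MethodPost,
		"https://api.cloudflare.com/client/v4/zones/"+os.Getenv("CLOUDFLARE_ZONE")+"/purge_cache",
		bytes.NewReader(body),
	)
	if err != nil {
		return err
	}
	req.Header.Set("Authorization", "Bearer "+os.Getenv("CLOUDFLARE_TOKEN"))
	req.Header.Set("Content-Type", "application/json")

	res, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer res.Body.Close()
	if res.StatusCode != http.StatusOK {
		return fmt.Errorf("purge failed: %s", res.Status)
	}
	return nil
}

func main() {
	lambda.Start(handler)
}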

Making this change caused the cache hit ratio to jump and, even though the website was already fast before, it’s now faster still. Pages no longer need to travel all the way from Ireland (where my S3 bucket is) to reach every user — it’s as if the site had servers in cities all around the world.

HTTP2 + S3 + CDN make a very fast website

When you add together HTTP2, S3 and smart use of a CDN you get a very performant website.

The above image shows that, occasionally, pages take almost the same amount of time to load in production (right) as they do on my local machine (left). Production isn’t always this quick (a few very unscientific and statistically invalid spot checks of pages on https://mattandre.ws show that most of the site loads in about half a second but is sometimes as slow as 800ms), but it does show that a crazy level of performance is possible.

And there’s so much more left to optimise.

Makefile reuse and inheritance

Reusable Makefile

Trends like microservices and technologies like AWS Lambda mean the way applications are built is changing. Before, a single monolithic application may have been contained within a single repository, but now one product might be delivered by dozens or even hundreds of little sub-applications, each in its own repository and each responsible for a small number of tasks.

Why you might do this is a topic for another time, but this approach makes maintaining build automation tools across an application harder.

Front end build automation tools like gulp and grunt have mature plugin frameworks. That means it’s easy to share solutions to common problems. You can create plugins that are downloaded at build time and can be shared across different repositories.

There is no package manager or registry for Makefiles. But there’s nothing to stop you putting Makefiles in npm (or bower, nuget, composer, etc) modules. You could even use git submodules.

Once you have a shared bit of Makefile, you can use include to pull it into another Makefile.

include

The include filenames… ‘directive’ of make reads the contents of all the named files into the parent Makefile before continuing.

filenames… can also be shell patterns.

For example, adding include *.mk will include all files in the same directory that end with .mk.

The Financial Times’ Makefile for front end applications

The pattern we adopted was to have a single, centralised Makefile that gets committed to each of our applications in a file called n.Makefile. That file is then included into each repository’s actual Makefile by an include n.Makefile line added at the top. That centralised Makefile contains a simple update script that allows each repository to be upgraded to the latest version by running make update-tools.

That centralised Makefile contains standard patterns for installing dependencies, building assets, deploying and linting.

Part of the philosophy behind it was to ensure that it was easy to override any part of it.

For example, if a developer wanted their repository to use the default install task provided by the central Makefile (run by typing make install), they wouldn’t need to do anything besides include n.Makefile. It would be provided by default.

However, if a developer wants to write their own install task, all they need to do is implement one in their Makefile. They can even call make install-super anywhere in that task to run the shared Makefile’s install task as well.

This might appear at first glance to be quite similar to inheritance except that make does not support inheritance. So how did we achieve this?

We basically hacked inheritance into make by exploiting wildcards. In our shared Makefile, instead of defining an install task, we define an instal% task.

This means that if there isn’t a task called install in the project’s Makefile, running make install will run the steps defined in the instal% task of the shared Makefile. Similarly, if a developer adds an install task to their project’s Makefile, that will run instead whenever they run make install.

For example:-

n.Makefile

instal%:
	echo "shared install"

Makefile

include n.Makefile

install:
	echo "repo install"
	make install-super

Will do this:-

$ make install
echo "repo install"
repo install
echo "shared install"
shared install

We use -super as a suffix by convention, but the way we’ve achieved inheritance actually means install-super could be swapped for any string that matches instal% and isn’t equal to install.