In a previous post, I described how to protect a statically rendered site with SSO (in our case, Keycloak). This requirement arose because we’re building a product for businesses and want to enforce access restrictions on its documentation.

Furthermore, access also depends on which parts of the product suite a customer bought. Each customer gets their own environment that we host on our cloud account. Content pages themselves do not differ, but some customers get access to the “full” documentation (including SDK documentation), whereas others only receive a subset. There is also an internal version, which additionally includes “Getting Started” documentation for development.

To summarize: We have a documentation repository containing a bunch of Markdown files and want to render various subsets of it per environment.

Defining the environments

While Jekyll supports environments out of the box, we wanted to avoid cumbersome, long-winded switches in the Liquid markup:

{% if jekyll.environment == "development" %}
  {% include index_development.md %}
{% elsif jekyll.environment == "full" %}
  {% include index_full.md %}
{% elsif ... %}
  ...
{% endif %}

Moreover, for some environments, some files were not supposed to be included in the output at all. Jekyll provides the configuration options include and exclude for controlling the set of files that are generated. But those have to be specified in _config.yml, which does not support any sort of environment switching:

markdown: kramdown
theme: just-the-docs
# ...
exclude:
  - "README.md"
  - "docker-compose.yml"
  - ".nginx"
  - "scripts"
  - "vendor"
  - ".environments"
  - "NOTICE"

Our plan was to declare the environments in external configuration files. Those files should have the same structure as the main Jekyll configuration and be merged into it during load time.

For example, the development environment configuration should be empty, since all pages are supposed to be included. But the full environment should look as follows:

exclude:
  - "devel/"

include:
  - "devel/database-schema.md"

This means: Exclude all files under devel, except for the database schema.
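
To make the intended interplay concrete, here is a minimal Ruby sketch of the rule "an explicit include wins over an exclude". It is a simplified model written for illustration and is not Jekyll's actual filtering code; the helper names and the path devel/setup.md are hypothetical.

```ruby
# Simplified model of the include/exclude interplay (NOT Jekyll's real
# entry filter). Entries ending in "/" are treated as directory prefixes,
# all other entries must match the path exactly.
def matches?(path, entries)
  entries.any? do |entry|
    entry.end_with?("/") ? path.start_with?(entry) : path == entry
  end
end

# A path is generated if it is explicitly included, or if it is not excluded.
def generated?(path, excluded:, included:)
  matches?(path, included) || !matches?(path, excluded)
end

# The "full" environment from above.
full = { excluded: ["devel/"], included: ["devel/database-schema.md"] }

# devel/setup.md is a hypothetical file used only for this example.
%w[index.md devel/setup.md devel/database-schema.md].each do |path|
  puts "#{path}: #{generated?(path, **full)}"
end
```

In this model, only devel/setup.md is dropped; the database schema survives because the include entry takes precedence.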

Writing a plugin

In Jekyll, plugins can either be loaded through the Gemfile or by dropping Ruby files into the _plugins folder. The earliest point at which a Jekyll plugin can run in the generation pipeline is just after the site has been initialized. The documentation describes the :after_init hook as follows:

Just after the site initializes. Good for modifying the configuration of the site. Triggered once per build / serve session

Looks like a great match!

The necessary Ruby code to make this happen is fairly short, assuming that environment configuration files reside under the .environments folder:

require "psych"
require "deep_merge"

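# Merge the environment-specific configuration into the main configuration.
Jekyll::Hooks.register :site, :after_init do |jekyll|
  puts "Loading environment '#{Jekyll.env}' ..."
  file = ".environments/#{Jekyll.env}.yml"
  # An empty YAML file (e.g. for development) does not load as a Hash,
  # so fall back to an empty configuration.
  env_config = Psych.safe_load_file(file) || {}

  main_config = jekyll.config.clone
  main_config.deep_merge!(env_config, {:merge_nil_values => true})

  # Reassign so that the site picks up the merged settings.
  jekyll.config = main_config
end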

The Jekyll.env property contains the value of the JEKYLL_ENV environment variable, or development if that variable is not specified. The plugin loads the YAML file declaring environment-specific configuration and merges it with deep_merge! into the main configuration. This would even allow overriding other configuration keys from _config.yml, such as the title or the theme of the site.
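
The effect of the merge can be sketched without Jekyll. The snippet below uses a small hand-rolled recursive merge, not the deep_merge gem, whose options and edge cases may differ; merge_config and the title values are hypothetical and only serve the illustration.

```ruby
# Hand-rolled recursive merge for illustration only. Nested hashes are
# merged key by key, arrays are concatenated without duplicates, and any
# other value from the overrides replaces the base value.
def merge_config(base, overrides)
  base.merge(overrides) do |_key, old, new|
    if old.is_a?(Hash) && new.is_a?(Hash)
      merge_config(old, new)
    elsif old.is_a?(Array) && new.is_a?(Array)
      (old + new).uniq
    else
      new
    end
  end
end

main_config = {
  "title"   => "Docs",               # hypothetical value
  "theme"   => "just-the-docs",
  "exclude" => ["README.md", "vendor"]
}

env_config = {
  "title"   => "Docs (full)",        # hypothetical override
  "exclude" => ["devel/"],
  "include" => ["devel/database-schema.md"]
}

merged = merge_config(main_config, env_config)
p merged["exclude"]
p merged["include"]
p merged["title"]
```

Under these assumptions, the environment file extends the exclude list instead of replacing it, adds the include list, and overrides scalar keys such as the title.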

To run this, our developers can use, for example, the following command:

JEKYLL_ENV=full bundle exec jekyll serve -wl

Note that Jekyll’s build-on-save functionality does not pick up configuration changes, so the server has to be restarted after editing an environment file. Such changes rarely happen anyway.

Building and packaging

Since we host the documentation ourselves, we can bundle all different documentation flavours (currently, there are five) into the same Docker image. We just need to tell NGINX which one to pick.

For that, I wrote a small shell script running in CI that builds all flavours by enumerating the files in .environments:

build_environment()
{
  local env="$1"
  if [[ ! "$env" =~ ^[a-zA-Z]+$ ]]; then
    die "Illegal environment name $env"
  fi
  mkdir -p "$dest/$env"
  JEKYLL_ENV="$env" /usr/bin/env bundle exec jekyll build -d "$dest/$env"
}

for file in .environments/*.yml; do
  # Strip the directory and the .yml suffix to get the environment name.
  build_environment "$(basename "$file" .yml)"
done
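
For readers who prefer Ruby, the enumeration and validation step could be sketched as follows. This is an illustrative equivalent and not part of our CI script; environment_names is a hypothetical helper, and the demo runs against a temporary directory rather than a real repository.

```ruby
require "tmpdir"
require "fileutils"

# Derive environment names from the YAML files in a directory and reject
# names that are not purely alphabetic (the same rule as in the shell script).
def environment_names(dir)
  Dir.glob(File.join(dir, "*.yml")).sort.map do |file|
    name = File.basename(file, ".yml")
    unless name.match?(/\A[a-zA-Z]+\z/)
      raise ArgumentError, "Illegal environment name #{name}"
    end
    name
  end
end

# Demo with a throwaway directory containing two environment files.
Dir.mktmpdir do |tmp|
  dir = File.join(tmp, ".environments")
  FileUtils.mkdir_p(dir)
  %w[development full].each { |n| FileUtils.touch(File.join(dir, "#{n}.yml")) }
  p environment_names(dir)
end
```

The important detail is that the file name, not the full path, is what gets validated and passed on as the environment name.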

The shell script creates one subfolder per environment in $dest. Our Dockerfile then copies all files from $dest into /usr/share/nginx/html/. As I described in my previous post, we can pass environment variables to the NGINX Docker image, which runs envsubst during startup. I then only had to change the root directive in our NGINX configuration as follows:

root   /usr/share/nginx/html/${JEKYLL_ENV};

… and NGINX would set the correct root path according to the JEKYLL_ENV environment variable.

On the operations side, our Kubernetes administrators need to ensure that the container is started with the correct JEKYLL_ENV for each environment. If the variable is unset, the container fails to start; this avoids accidental information leakage.
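
How exactly the start-up failure is enforced is not shown here. Purely as an illustration of the fail-fast idea, the following Ruby sketch substitutes ${NAME} placeholders and refuses to continue when a referenced variable is missing or empty. The render_template helper is hypothetical and is not what envsubst or the NGINX image actually runs.

```ruby
# Replace ${NAME} placeholders using a hash of variables. Raises if a
# referenced variable is missing or empty, so a misconfigured container
# would stop instead of serving the wrong directory.
# Hypothetical guard for illustration; this is not envsubst.
def render_template(template, vars)
  template.gsub(/\$\{([A-Za-z_][A-Za-z0-9_]*)\}/) do
    name = Regexp.last_match(1)
    value = vars[name]
    raise KeyError, "#{name} must be set" if value.nil? || value.empty?
    value
  end
end

template = "root   /usr/share/nginx/html/${JEKYLL_ENV};"
puts render_template(template, { "JEKYLL_ENV" => "full" })
```

Calling render_template with an empty hash raises a KeyError, which is the behaviour we want from the container when JEKYLL_ENV is absent.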

Local development experience

As a last step, I also wrote a Docker Compose file for team members who didn’t want to install a local Ruby toolchain.

version: "3.9"
services:
  jekyll:
    image: docs-jekyll-dev
    build:
      context: .
      dockerfile: scripts/Dockerfile-jekyll-dev
    environment:
      JEKYLL_ENV: "${JEKYLL_ENV:-development}"
    volumes:
      - type: bind
        source: ./
        target: /src/
      - type: volume
        source: jekyll-generated
        target: /dest/
  nginx:
    image: docs-nginx-dev
    build:
      context: .
      dockerfile: scripts/Dockerfile-nginx-dev
    environment:
      SIGN_OUT_URL: "#"
      JEKYLL_ENV: "${JEKYLL_ENV:-development}"
    ports:
      - "8080:80"
    volumes:
      - type: volume
        source: jekyll-generated
        target: /usr/share/nginx/html/

volumes:
  jekyll-generated:

This file declares two Docker containers:

  1. a container that continuously builds the site using Jekyll (including file watching and incremental builds),
  2. a container that serves the generated files with NGINX.

The jekyll-generated volume acts as the common storage for the two containers.

To reduce code duplication, I added a flag to the aforementioned shell script that builds just one flavour. Using those containers is almost as easy as using Jekyll directly:

$ docker compose build
$ JEKYLL_ENV=full docker compose up

By defining the environment variable JEKYLL_ENV as ${JEKYLL_ENV:-development}, we emulate the defaulting behaviour of Jekyll.env.
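
The two defaulting rules can be compared side by side in Ruby. This is a sketch based on my understanding that Jekyll.env falls back to development only when the variable is not set, whereas the ${VAR:-default} form also falls back when the variable is set but empty. Both helper names are hypothetical.

```ruby
# Fallback in the style of Jekyll.env: use "development" only when
# JEKYLL_ENV is not set at all (an empty string is kept as-is).
def resolve_env(env = ENV)
  env["JEKYLL_ENV"] || "development"
end

# Fallback in the style of ${JEKYLL_ENV:-development}: use "development"
# when JEKYLL_ENV is unset OR empty.
def resolve_env_shell_style(env = ENV)
  value = env["JEKYLL_ENV"]
  value.nil? || value.empty? ? "development" : value
end

# Plain hashes stand in for the process environment in this demo.
[{}, { "JEKYLL_ENV" => "" }, { "JEKYLL_ENV" => "full" }].each do |env|
  puts "#{env.inspect}: #{resolve_env(env).inspect} vs #{resolve_env_shell_style(env).inspect}"
end
```

The two rules only disagree for an empty JEKYLL_ENV, where the Compose default is the more forgiving of the two.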

Conclusion

A few tricks are sufficient to turn Jekyll and NGINX into an “almost” CMS. The advantages of not using a full-blown CMS lie in the better workflow for developers and generally lower operational overhead. Of course, more complex authorization mechanisms (such as ACLs) cannot be easily mapped to this model.