Before we dive into the details of setting up a development environment using Docker, we should take a moment to think about what we would like to achieve. How can we use Docker’s capabilities, and which problems can we solve with them? In my opinion, the most noticeable benefit is that Docker makes it easy and fast to provide a controlled and isolated development environment for a project. Since Docker images can be created in a repeatable process from a versionable Dockerfile, we have full control over, and full information about, the content of the environment we are building. All dependencies can be clearly spelled out, either using the mechanisms of the platform on which we base our environment or via any build or configuration management tool we choose to integrate.
Another major effect of the isolation provided by Docker is that we shield ourselves from the host system we’re using. We no longer need to install the servlet containers or app servers of choice directly on our development machine, or make sure that we switch to the correct JDK before we build the project. There is also no chance that a careless typo in a cleanup script removes vital parts of the OS on the host: not a common occurrence, but scary nonetheless. There are further benefits, such as the ability to easily provide a defined state of a test database, but for the purpose of this article we will concentrate on the isolation aspect.
Of course, all these advantages could also be achieved using standard virtualization technology. Docker’s main advantage over these solutions is its much more lightweight approach. In theory this is only a quantitative difference, but in practice it changes the way you work: you can start literally dozens of containers on a machine that was hard pressed to run five or six VMs before, and have them available in one second or less instead of 30 seconds or more.
So if we decide to set up an environment using Docker, what functionality should be provided? The details will vary with the type of project, but some basic principles will apply in most or all cases:
- On the most basic level, the environment needs to provide all tools and libraries that are needed to set up a new project and work with it. If we’re looking at a project that targets the JVM, we will at least need a corresponding JDK. A tool that automatically manages dependencies and handles builds and tests is basically mandatory as well.
- There must be a way to create new source code and other resources and to work with existing ones. It should be possible to put these artefacts under version control, using tools installed either on the host or within the environment.
- It should be possible to cache automatically downloaded dependencies and re-use them across different invocations of the environment. It also would be nice if it were possible to share them between different projects to save resources.
- Any developer who has checked out the project sources should be able to start working with the environment without further setup.
- Working with the development environment should not be more difficult than working with a project that has been set up the classical way. Ideally the developer should be shielded from any complexity that is introduced by using the development environment.
Having established the basics, we can now take a look at how these ideas can be put into practice. This example provides a Docker image that can be used to set up projects using the Typesafe Activator. The Activator can be used to bootstrap and control a project with Akka or the Play Framework. It provides a shell that can be used to build, test and run the project, as well as a REPL for interactive work. The setup scripts and configuration for the sample image can be found in its GitHub repo.
First let’s take a look at how the image is created. Since our project will be JVM-based, we will base the image on one of the base images provided by the dockerfile/java repository on the Docker Hub Registry. We could use any of the flavours available in the repository, but for this example we use the one that includes the Oracle JDK 8. We will extend this image in a couple of ways:
- The current version of the Activator needs to be fetched and integrated into the environment. The Typesafe website provides a zip file, but there is a better way. When started, the Activator grabs a JSON file containing information about the latest version from the Typesafe website to determine whether an update is available. The build.sh script provided in the GitHub repo uses the same JSON file to get the location of the latest zip file and downloads it to the downloads directory if necessary before kicking off the actual build. In the Dockerfile the zip file is copied to the image being created, extracted and installed at a defined location.
- It is bad practice and a potential security problem to run software as root inside a Docker container. Therefore we create a new user ‘develop’.
- The containers that we will be starting from this image will be transient, i.e., they will automatically be deleted after each session. Any data that needs to be persisted between sessions needs to be located in a volume that can be mounted to a directory on the host or be provided by a second container. The first volume that we define will be used to hold the project(s) we will create. The second volume is intended to hold all cached dependencies that are automatically downloaded by the Activator and the tools it uses. By default these dependencies are stored in the home directory of the user running the Activator, so for simplicity’s sake we’ll just expose this. That means that we can’t place the volume for the projects under the home directory, so we put it under /var/projects. It should be possible to change the location of the cache directories via system properties, but I haven’t been able to get that to work correctly yet.
- The last piece of the puzzle is the script that is run when the environment is started. It is named ‘start-activator.sh’ and is located in the resources directory. It provides two modes: when started with the parameter ‘--new’, it creates a new project. When started with the name of a previously created project, it locates the corresponding startup script and executes it.
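To make the two modes of the start script more tangible, here is a minimal sketch of how such a dispatch could look. It is not the actual start-activator.sh from the repo: the function name `plan_start` is made up for illustration, and it assumes that new projects are created with `activator new` and that each generated project contains its own `activator` launcher. Instead of executing anything, it only prints the command it would run.

```shell
#!/bin/sh
# Hypothetical sketch of the two modes of a start script; not the real
# start-activator.sh. The projects volume path is taken from the article.
PROJECTS_DIR=/var/projects

# Print the command that would be run for the given argument.
# A real script would execute the command instead of printing it.
plan_start() {
    case "$1" in
        "")
            echo "usage: start-activator.sh --new | <project-name>" >&2
            return 1
            ;;
        --new)
            # Mode 1: create a new project (name and template are prompted for).
            echo "cd $PROJECTS_DIR && activator new"
            ;;
        *)
            # Mode 2: run the launcher of an existing project
            # (assumes the project ships its own ./activator script).
            echo "cd $PROJECTS_DIR/$1 && ./activator"
            ;;
    esac
}
```

Calling `plan_start --new` prints the command for the first mode, while `plan_start my-project` prints the one for the second; an empty argument yields a usage message and a non-zero exit status.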
The whole setup looks a tad complex, but for ease of use the necessary calls can be wrapped in two shell scripts:
First use ‘build.sh’ to create a new image with the current version of the Activator and place it in your local Docker repository. It does not require any parameters, so if you are OK with the default name of ‘jpreissler/activator’, just run it and be done. When a new version of the Activator is released, simply re-run the script to update your local image.
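The "download only if necessary" step of such a build script could be sketched roughly as follows. This is an illustration under stated assumptions, not the real build.sh: the helper names `latest_url` and `needs_download` are invented, and the JSON layout (a single-line document with a `url` field) is a guess, since the actual format of the Typesafe JSON file is not shown here.

```shell
#!/bin/sh
# Hypothetical sketch of the download check in a build script.
# Assumption: the version JSON is a single line containing a "url" field.

# Read a one-line JSON document from stdin and print the value of "url".
latest_url() {
    sed -n 's/.*"url"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Succeed if the zip file named by the URL is not yet present in the
# given downloads directory, i.e. if it still has to be fetched.
needs_download() {
    url="$1"
    dir="$2"
    [ ! -f "$dir/$(basename "$url")" ]
}
```

A complete script would fetch the JSON, download the zip when `needs_download` succeeds, and then build the image, for example with `docker build -t jpreissler/activator .` in the directory containing the Dockerfile.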
Now you can start working with projects using this image. For this purpose the script ‘d.activator’ is provided. Just put this on your path and use the following steps to get up and running:
- Create a new directory with two sub-directories:
  - ‘projects’ will hold any projects that you create. You might want to put it under version control using tools installed on your host, since the Docker container does not contain provisions for that (yet?).
  - ‘cache’ will hold all cached dependencies in various hidden sub-directories. Its contents can be safely scrubbed if you feel like it; they will be re-downloaded as needed.
- Change to the new directory and create your first project:
- Choose a name and template for the new project when prompted.
- Start up the shell for the new project:
Perhaps the quickest way to check whether things are working correctly is to choose the play-scala template first and enter the command ‘run’ once you have started the shell. d.activator exposes port 9000 from the container to your host, so if you open a browser at http://localhost:9000, you should be greeted with the initial Play project page once all downloads and compilation have finished.
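For readers who wonder what a wrapper like d.activator has to do under the hood, here is a rough sketch. It is not the script from the repo: the function name `docker_cmd` is invented, and the container-side cache path /home/develop is an assumption derived from the ‘develop’ user described above. The image name, the port and the /var/projects volume are taken from this article. The function only prints the Docker command instead of running it.

```shell
#!/bin/sh
# Hypothetical sketch of a wrapper in the spirit of d.activator.
# Assumption: the home directory of user 'develop' is /home/develop.
IMAGE="jpreissler/activator"

# Print the docker command for a base directory (containing 'projects'
# and 'cache') followed by the arguments for the start script.
docker_cmd() {
    base="$1"
    shift
    # --rm removes the container after the session, -it attaches a terminal,
    # -p publishes the Play port, -v mounts the two host directories.
    echo "docker run --rm -it" \
        "-p 9000:9000" \
        "-v $base/projects:/var/projects" \
        "-v $base/cache:/home/develop" \
        "$IMAGE $*"
}
```

Running `docker_cmd "$PWD" --new` from the directory that contains ‘projects’ and ‘cache’ shows the call for creating a project, and `docker_cmd "$PWD" my-project` shows the one for opening the shell of an existing project. Since this version only prints, paths containing spaces would need proper quoting before the command is actually executed.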
That concludes this first introduction to Docker images for developers. There are a couple of next steps that could be taken. The most obvious one would be to provide images for other tools and technologies; Maven certainly looks like a candidate. But it would also be interesting to examine how the software development lifecycle could be changed through the use of these images. How feasible is it to use images as deliverables from development to QA? Can we use this approach to easily put microservices into production? Another topic that might be worthwhile to look at is how the container-based approach can be enhanced. How about using additional containers to provide persistence or dependencies such as databases? Can we use this somehow to find a better way to provide defined test environments? If you want to participate in any of this or just want to voice your interest, please do so.