One reason for this is that containers are simply processes running on a Linux host. They are not virtualized, only isolated. This isolation is achieved with Linux kernel features such as kernel namespaces, chroot, uid_map, gid_map and cgroups. Despite the isolation, however, containers use the host’s kernel directly. If the code executed in a container is malicious or has been compromised, it could exploit kernel vulnerabilities to gain more privileges than it was granted.
A container is also an application that is packaged together with its runtime environment. If the image is based on one of the usual Linux distributions, vulnerabilities may exist there as well. How problematic this is depends on which parts your own application actually uses.
If you use the container orchestrator Kubernetes, further aspects have to be considered in the context of security. On the one hand these are the processes of Kubernetes itself, on the other hand the configuration of what containers are and are not allowed to do.
In general, questions about security on a Kubernetes cluster can be divided into the following areas:
Host / Kubernetes
- Encrypted and authenticated communication of the processes of the Kubernetes cluster itself (API server, etcd, …)
- Kernel vulnerabilities of the host
- Authorized communication through Role-Based Access Control
  - By the user with kubectl
  - By pods via service accounts
- Security-relevant aspects of the pod specification (YAML)
- Vulnerabilities of the container image
- Control of the application’s IP communication through a distributed firewall (within the virtual network)
- Encryption, authentication and authorization of the application’s services among each other (see for example SPIFFE)
This post is intended to give a brief overview of some of the most important points.
Kubernetes Process Security
A Kubernetes cluster consists of five processes (kubelet, kube-proxy, API server, scheduler, controller manager) and an etcd cluster. Users communicate with the API server via the CLI kubectl. If you do not use a managed cluster but set up the cluster yourself and/or implement your own user management, the following points should be considered:
- Use Transport Layer Security (TLS) for all communication with the API server and the etcd cluster
- API server authentication
  - Small clusters can use a simple certificate or a static bearer token. Larger clusters can integrate an existing OIDC or LDAP server.
  - All API clients must be authenticated, including those that are part of the Kubernetes infrastructure, such as nodes, kube-proxy, kubelet or even volume plugins.
- Restrict access to etcd
  - Read or write access to etcd by other components is equivalent to granting cluster admin rights.
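As a rough illustration of these points, the fragment below sketches how the relevant kube-apiserver flags might appear in a static pod manifest. It is not a complete manifest. The file paths follow the layout that kubeadm uses by default and are assumptions, as are the etcd endpoint, the OIDC issuer URL and the client ID (hypothetical values).

```yaml
# Fragment of a kube-apiserver static pod manifest (sketch only).
# Image, volumes and probes are omitted; all paths, URLs and IDs are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - name: kube-apiserver
    command:
    - kube-apiserver
    # Serve the API over TLS
    - --tls-cert-file=/etc/kubernetes/pki/apiserver.crt
    - --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
    # Authenticate API clients (users, kubelets, ...) by client certificate
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    # Optional for larger clusters: delegate user authentication to an OIDC provider
    - --oidc-issuer-url=https://idp.example.com
    - --oidc-client-id=kubernetes
    # Talk to etcd over TLS, authenticating with a dedicated client certificate
    - --etcd-servers=https://127.0.0.1:2379
    - --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
    - --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
    - --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key
```

Ideally the API server is the only component that holds an etcd client certificate, so that no other process can read or write cluster state directly.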
Role Based Access Control
For authorization with the API server, Kubernetes has integrated Role-Based Access Control (RBAC), which bundles permissions into roles and binds these roles to users, groups of users or service accounts. The permissions combine verbs (get, create, delete, update, …) with resources (pods, ingresses, deployments, …) and can refer to a single namespace or to the entire cluster. RBAC controls not only the rights of users, but also those of pods that communicate with the API server, e.g. according to the operator or controller pattern.
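A minimal sketch of how verbs and resources are combined into a role and then bound to subjects could look like this. The namespace `my-app`, the user `jane` and the service account `my-operator` are hypothetical names chosen for illustration.

```yaml
# Role: read-only access to pods in the namespace "my-app"
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: my-app
  name: pod-reader
rules:
- apiGroups: [""]                  # "" denotes the core API group
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
# RoleBinding: grants the role to a user and to a service account
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: my-app
  name: read-pods
subjects:
- kind: User
  name: jane                       # hypothetical user
  apiGroup: rbac.authorization.k8s.io
- kind: ServiceAccount
  name: my-operator                # hypothetical service account used by a pod
  namespace: my-app
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```

The effect can be checked with `kubectl auth can-i list pods --namespace my-app --as jane`; for the service account, the identity would be `system:serviceaccount:my-app:my-operator`.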
When is RBAC necessary?
Without RBAC, every user with a valid kubeconfig has full access to the cluster, i.e. cluster admin rights. In addition, every pod deployed in the system has the same full rights when communicating with the API server: it could monitor other pods and, for example, start processes in them or read data from them. Clusters without RBAC are therefore only recommended for demo or test installations.
Network Policies
A network policy is a specification of how groups of pods may communicate with each other and with other network endpoints. Network policies do not implement a firewall at node level, but at a logical level for pods and namespaces. They are usually enforced by the virtual network that a network plug-in (e.g. Canal) provides during cluster installation.
The project Kubernetes Network Policy Recipes offers a very good overview of different use cases and the recipes for them. For example, one of the recipes prevents network access from other namespaces (e.g. “other”) to all pods in the namespace “my-app”.
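A policy of that kind could look roughly like the following sketch. It is modeled on that use case and is not necessarily identical to the recipe in the project; the policy name is an assumption.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  namespace: my-app
  name: deny-from-other-namespaces   # assumed name
spec:
  podSelector: {}          # empty selector: applies to all pods in "my-app"
  ingress:
  - from:
    - podSelector: {}      # allow traffic only from pods in the same namespace
```

Because the only ingress rule selects pods of the policy’s own namespace, traffic from pods in any other namespace is no longer accepted. This only takes effect if the installed network plug-in enforces network policies.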
When are network policies necessary?
The fewer clusters you operate, the more efficiently Kubernetes can be used. A single cluster, for example, makes the most effective use of resources such as hardware or cluster administration staff. If you follow this approach, you need to protect running pods from unwanted access and, if necessary, prevent them from establishing connections to the outside.
Use cases for network policies include:
- Isolation of different clients (tenants) on a shared cluster
- Isolation of a production namespace from various test namespaces (which contain the same components and should not inadvertently access, e.g., the wrong persistence layer)
- Whitelisting of incoming network traffic
- Blocking outbound connection attempts of an application
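The last use case, blocking outbound connections, might be sketched as follows. The label `app: web` and the policy name are hypothetical.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  namespace: my-app
  name: deny-egress        # assumed name
spec:
  podSelector:
    matchLabels:
      app: web             # hypothetical label of the pods to restrict
  policyTypes:
  - Egress
  egress: []               # no egress rules: all outbound traffic is denied
```

Note that this also blocks DNS lookups from the selected pods. If name resolution is still required, an egress rule for port 53 (UDP and TCP) would have to be added.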
Pod Security Policies
Containers are nothing more than isolated processes running on a host. What such a container is allowed to do is defined by the settings of the pod specification (the YAML). There you can, for example, allow a container to access the host file system or all host devices. Since the software developer is usually responsible for this YAML, the settings in it cannot always be checked automatically. This applies in particular when installing external components via Helm charts or ksonnet.
Pod Security Policies serve the purpose of making such deployments more secure. They control security-related aspects of the pod specification and define a set of conditions under which a pod must run in order to be admitted to the cluster (for a good overview of the security functions of the Linux kernel, see also here).
This is realized by special pod security policy resources that are bound to service accounts via role and role binding objects. The pods created via these service accounts are then validated against the policies.
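The binding could be sketched as follows, assuming that a pod security policy named `restricted` already exists (a hypothetical name) and that the cluster still serves the PodSecurityPolicy API, which was removed in Kubernetes 1.25. The role and binding names are assumptions as well.

```yaml
# ClusterRole that permits the use of one specific pod security policy
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: psp-restricted-user          # assumed name
rules:
- apiGroups: ["policy"]
  resources: ["podsecuritypolicies"]
  verbs: ["use"]
  resourceNames: ["restricted"]      # hypothetical policy name
---
# RoleBinding: pods created via the default service account of "my-app"
# are validated against the policy above
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: my-app
  name: psp-restricted-default       # assumed name
subjects:
- kind: ServiceAccount
  name: default
  namespace: my-app
roleRef:
  kind: ClusterRole
  name: psp-restricted-user
  apiGroup: rbac.authorization.k8s.io
```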
A pod security policy allows a very broad and fine-grained configuration of what is allowed for a container, e.g. by the following parameters:
The most important security policies
- Privileged: Determines whether a container in a pod can enable privileged mode. By default, a container may not access any devices on the host. A privileged container can, and thus has almost the same access as a process running directly on the host.
- Host Network: Controls whether the pod is allowed to use the node’s network. Containers of such pods have access to the node’s loopback device, to which processes listening on localhost are bound, can see the network traffic of other pods on the same node and may not be subject to the network policies of the virtual network.
- Volumes: Allows configuration of a whitelist of permitted volume types. This makes it possible, for example, to exclude hostPath volumes altogether.
- Allowed Host Paths: If a hostPath volume is required, a whitelist of usable paths can be specified.
- Capabilities: Linux divides the privileges traditionally associated with the superuser into distinct units called capabilities. They can be enabled or disabled independently of each other.
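Put together, the parameters above might be combined in a policy like the following sketch. It assumes a cluster that still serves the `policy/v1beta1` PodSecurityPolicy API (removed in Kubernetes 1.25); the policy name, the permitted volume types and the commented-out path are illustrative choices, not recommendations from the original text.

```yaml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted             # hypothetical name
spec:
  privileged: false            # no privileged containers
  hostNetwork: false           # pods may not use the node's network
  volumes:                     # whitelist of volume types; hostPath is not listed
  - configMap
  - secret
  - emptyDir
  - persistentVolumeClaim
  # If hostPath had to be permitted, the usable paths could be restricted:
  # allowedHostPaths:
  # - pathPrefix: /var/log
  #   readOnly: true
  requiredDropCapabilities:
  - ALL                        # containers must drop all capabilities
  # The following strategies are required fields of the policy
  runAsUser:
    rule: MustRunAsNonRoot
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
```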
Further information about Pod Security Policies can be found here.
When are security policies necessary?
Allowing a container unlimited access to a host’s file system provides many ways to escalate privileges. Sensitive data (e.g. logs, credentials, etc.) from other containers or processes or from the host itself can be read. Excluding hostPath volumes (or at least whitelisting paths) is certainly a good idea.
As with the host file system, you will want to protect the host’s devices and the host network (pods should only use the virtual network of the network plug-in). Allowing such access only for specific pods, e.g. all pods in the namespace kube-system, is recommended.
Many existing containers run as root or require the capabilities that Docker grants its containers by default (see also the Docker documentation). These include, for example, NET_BIND_SERVICE (allows binding a socket to a privileged port below 1024) or SETUID (allows arbitrary manipulation of the process UIDs). Whether it makes sense to drop a capability from a container or to add individual ones depends, of course, on the application that is to be operated. In general, however, fewer capabilities are always better than more. Pod security policies are a good way to prescribe which capabilities may be used and which are prohibited, up to prohibiting all of them (although in this radical case most of the more complex applications will no longer start).
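On the side of the pod specification, dropping and adding capabilities could look like this sketch. The pod name, the image and the assumption that the application needs to bind to port 80 are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web                                  # hypothetical pod
  namespace: my-app
spec:
  containers:
  - name: web
    image: registry.example.com/web:1.0      # hypothetical image
    ports:
    - containerPort: 80
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]                        # remove all default capabilities
        add: ["NET_BIND_SERVICE"]            # re-add only what is needed for port 80
```

Starting from `drop: ["ALL"]` and adding back individual capabilities makes it explicit which privileges an application actually depends on.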
Securing the Kubernetes processes is by now just as much common practice as ensuring reliable API server authentication (whether via simple certificates, tokens or an OIDC server) and authorization (RBAC for users or pods). However, it is often forgotten that containers contain their runtime environment and are simple processes that run on a host system without virtualization or sandboxing. Not only the container images have to be updated continuously, but also the kernel and the Linux distribution of the host, in order to eliminate potential vulnerabilities.
Kubernetes (or Docker) allows you to extend the rights of a container in the pod specification. The less control you have over pod specifications or deployed containers, the more you will want to impose general restrictions. Pod security policies are an easy way to do this.
But the applications themselves must also be secured. Nobody wants the applications of a test namespace to communicate with the production database, or the HTTP UI of client A to accidentally access the backend service of client B. However, using a separate cluster for each client, team, project or test/QA/staging environment is not always a viable and productive approach. Network policies allow you to configure the necessary restrictions for egress and ingress, so that you do not necessarily have to invest in the effort of operating several clusters.
Docker and Kubernetes do not work miracles, and there are still some security aspects to be considered. However, both offer enough possibilities to make containers and clusters as secure as possible.