Introduction
To empower developers to manage their applications all the way from a commit to the repository to the production deployment, many companies follow the idea of self-service: capabilities are provided that reduce the need for interaction between developers and the platform team. One part of this is the introduction of container technologies and Kubernetes, which brings many advantages over the classic delivery approach:
- Containers simplify the packaging of applications with all corresponding dependencies and delivery via a standardized format
- Kubernetes provides the management of deployments and operational aspects without having direct access to or knowledge about the underlying hardware
- Kubernetes provides abstraction in the form of resource definitions and a common API to interact with. This is not only for containerized workloads but also other operational means like load balancing, exposing applications via gateways or service configuration
With Kubernetes in place, the level of interaction shifts. There is no need for installation manuals, manually set-up servers with all required dependencies, or a handover of artifacts with installation scripts.
The platform team can focus on hardening the platform and improving the developer experience to make it easier to use. Developer teams, on the other hand, can autonomously decide (see the sketch after this list):
- What to deploy and in which version
- How to roll out, as a rolling update or a full recreate
- The number of replicas
- When to scale and how
- How to make the application available to the outside world
- How to configure and provide secrets to the application
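A minimal sketch of what this looks like in practice (all names and the image are illustrative): a single Deployment manifest already covers the version to deploy, the rollout strategy, the replica count, and the secrets handed to the application.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3                      # number of replicas
  strategy:
    type: RollingUpdate            # or "Recreate" for a full recreate
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: registry.example.com/my-app:1.4.2   # what to deploy and in which version
          envFrom:
            - secretRef:
                name: my-app-secrets                 # secrets provided to the application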
This is a good step towards self-management, but the lifecycle of an application also includes external resources such as databases or caches. Platform teams still struggle to provide a way for developers to manage these resources on their own. Some of the problems they face are:
- Infrastructure is provisioned using the cloud provider’s API directly
- This API typically has different semantics from what developers know from Kubernetes
- Another API also means another identity management system, so the platform team must maintain access control twice, for Kubernetes and for the cloud provider API, which can be complicated
- The tooling to automate the API calls is also different, typically Terraform or a comparable tool from the cloud provider itself
- The developer teams require credentials to interact with the cloud provider. These credentials are usually broad in scope and, since they must be managed outside the cluster, can lead to security leaks
- Developers can be overwhelmed by the options, e.g. how to set up a database, and it is hard for the platform team to establish governance or best practices to avoid misconfiguration. Setting up a database correctly is part of the expected knowledge of a platform team, but not of the developer teams
So even when everything is automated, there is still a dependency on platform teams to provision infrastructure, and this often involves a manual process.
This slows down the setup and stops the developer teams from being fully autonomous.
The Operator Pattern
One of Kubernetes’ strengths is its extensibility, which has shaped a new set of tools that move much of the management into the cluster. These tools are generally called “operators” and typically consist of:
- A set of custom resource definitions (CRDs) to extend the Kubernetes API with new resources
- A controller application that interacts with the Kubernetes API and acts whenever a resource of one of these custom types is created, updated, or deleted (see the sketch after this list)
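To make this concrete, here is a sketch of a custom resource such an operator could handle; the API group, kind, and fields are purely hypothetical. The developer declares the desired state, and the operator’s controller reconciles it, for example by calling the cloud provider’s API.
# Hypothetical custom resource handled by a database operator
apiVersion: databases.example.com/v1alpha1
kind: PostgresDatabase
metadata:
  name: orders-db
spec:
  version: "12"       # desired engine version
  storageGB: 20       # desired storage size
# The operator watches resources of this kind and creates, updates,
# or deletes the backing database via the external API.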
There are several operators already available which can also interact with external APIs on the team’s behalf.
This already solves many issues:
- Developers only need to interact with Kubernetes
- Platform teams only have to handle the access management for one API and can decide who can create / update / delete external resources
- Credentials to interact with the cloud provider are only needed by the operator in the secured environment of the cluster. The developers do not need to have any credentials
- There is a common resource structure for internal and external resources
But some topics are still open:
- An operator is needed for each API the teams work with. For example, there are several AWS operators for different kinds of resources
- Different operators may have different semantics, e.g. how a database is defined
- As the operators provide CRDs that mirror all options of the cloud provider’s original resources, the number of possibilities is still overwhelming
- Additional tooling like admission controllers would be required to enforce best practices and governance
What Does Crossplane Bring To The Mix?
Crossplane offers a framework to address these shortcomings: easy-to-use resources can be defined and mapped to specific resources of the cloud provider without writing any code. Crossplane helps to create a control plane for all kinds of external and internal resources. Complicated setups are abstracted into higher-level resources that provide a simple interface for developers. Crossplane is more a meta-framework for operators than merely an operator itself.
This sounds vague, so let’s have a look at some of the features Crossplane provides.
Provide Access To External APIs By Provider Packages
Crossplane manages the installation and configuration of operators (controllers) that can interact with external APIs in the form of so-called “providers”. Providers are Crossplane packages that include the controller and the CRDs defining the external resources, the so-called “managed resources”. Crossplane already ships with a number of providers, covering the big cloud providers among others.
How provider packages are distributed and installed in a cluster is managed by Crossplane. A new provider can be installed by defining a provider resource in the cluster.
Let’s demonstrate how this works by setting up a local test cluster and installing Crossplane with the help of Helm.
kind create cluster
helm repo add crossplane-stable https://charts.crossplane.io/stable
helm repo update
helm upgrade --install \
crossplane crossplane-stable/crossplane \
-n crossplane-system \
--create-namespace \
--wait
After the installation, you can see the Crossplane control plane pods and the newly registered CRDs (among them the Provider resource).
kubectl -n crossplane-system get pods
NAME READY STATUS RESTARTS AGE
crossplane-545c58944d-gwzl9 1/1 Running 0 10m
crossplane-rbac-manager-85fd5c9f6c-4zm4m 1/1 Running 0 10m
kubectl get crds
NAME CREATED AT
compositeresourcedefinitions.apiextensions.crossplane.io 2022-06-28T13:07:13Z
compositionrevisions.apiextensions.crossplane.io 2022-06-28T13:07:13Z
compositions.apiextensions.crossplane.io 2022-06-28T13:07:13Z
configurationrevisions.pkg.crossplane.io 2022-06-28T13:07:13Z
configurations.pkg.crossplane.io 2022-06-28T13:07:13Z
controllerconfigs.pkg.crossplane.io 2022-06-28T13:07:13Z
locks.pkg.crossplane.io 2022-06-28T13:07:13Z
providerrevisions.pkg.crossplane.io 2022-06-28T13:07:13Z
providers.pkg.crossplane.io 2022-06-28T13:07:13Z
storeconfigs.secrets.crossplane.io 2022-06-28T13:07:13Z
Now we have the infrastructure to install providers for the different cloud providers or external APIs we want to manage. An example for the AWS provider looks like this:
apiVersion: pkg.crossplane.io/v1
kind: Provider
metadata:
name: provider-aws
spec:
package: "crossplane/provider-aws:master"
The Provider resource references the package, which Crossplane then installs. The CRDs belonging to the package are registered, and the controller is installed and started.
kubectl -n crossplane-system get pods
NAME READY STATUS RESTARTS AGE
crossplane-545c58944d-gwzl9 1/1 Running 0 16m
crossplane-rbac-manager-85fd5c9f6c-4zm4m 1/1 Running 0 16m
provider-aws-84fe593bd332-f6ff57c86-86qt9 1/1 Running 0 65s
kubectl get crds
NAME CREATED AT
accesskeys.iam.aws.crossplane.io 2022-06-28T13:22:20Z
accesspoints.efs.aws.crossplane.io 2022-06-28T13:22:21Z
activities.sfn.aws.crossplane.io 2022-06-28T13:22:20Z
addons.eks.aws.crossplane.io 2022-06-28T13:22:21Z
addresses.ec2.aws.crossplane.io 2022-06-28T13:22:17Z
aliases.kms.aws.crossplane.io 2022-06-28T13:22:18Z
apikeys.apigateway.aws.crossplane.io 2022-06-28T13:22:23Z
apimappings.apigatewayv2.aws.crossplane.io 2022-06-28T13:22:22Z
apis.apigatewayv2.aws.crossplane.io 2022-06-28T13:22:18Z
authorizers.apigateway.aws.crossplane.io 2022-06-28T13:22:24Z
authorizers.apigatewayv2.aws.crossplane.io 2022-06-28T13:22:19Z
backups.dynamodb.aws.crossplane.io 2022-06-28T13:22:19Z
basepathmappings.apigateway.aws.crossplane.io 2022-06-28T13:22:22Z
brokers.mq.aws.crossplane.io 2022-06-28T13:22:19Z
bucketpolicies.s3.aws.crossplane.io 2022-06-28T13:22:23Z
buckets.s3.aws.crossplane.io 2022-06-28T13:22:16Z
cacheclusters.cache.aws.crossplane.io 2022-06-28T13:22:21Z
cacheparametergroups.elasticache.aws.crossplane.io 2022-06-28T13:22:22Z
cachepolicies.cloudfront.aws.crossplane.io 2022-06-28T13:22:19Z
cachesubnetgroups.cache.aws.crossplane.io 2022-06-28T13:22:18Z
certificateauthorities.acmpca.aws.crossplane.io 2022-06-28T13:22:18Z
...
configurations.pkg.crossplane.io 2022-06-28T13:07:13Z
connections.glue.aws.crossplane.io 2022-06-28T13:22:23Z
controllerconfigs.pkg.crossplane.io 2022-06-28T13:07:13Z
crawlers.glue.aws.crossplane.io 2022-06-28T13:22:17Z
databases.glue.aws.crossplane.io 2022-06-28T13:22:22Z
dbclusterparametergroups.docdb.aws.crossplane.io 2022-06-28T13:22:22Z
dbclusterparametergroups.rds.aws.crossplane.io 2022-06-28T13:22:18Z
dbclusters.docdb.aws.crossplane.io 2022-06-28T13:22:18Z
dbclusters.neptune.aws.crossplane.io 2022-06-28T13:22:17Z
dbclusters.rds.aws.crossplane.io 2022-06-28T13:22:17Z
dbinstanceroleassociations.rds.aws.crossplane.io 2022-06-28T13:22:21Z
dbinstances.docdb.aws.crossplane.io 2022-06-28T13:22:23Z
dbinstances.rds.aws.crossplane.io 2022-06-28T13:22:21Z
dbparametergroups.rds.aws.crossplane.io 2022-06-28T13:22:19Z
...
As you can see, the list of new resources is quite extensive, covering all the resources the provider’s API makes available.
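To get a feeling for how many options a single managed resource exposes, you can inspect its schema directly in the cluster; for example, the following should list the fields below spec.forProvider of an RDSInstance (the resource we will use next):
kubectl explain rdsinstance.spec.forProvider --api-version=database.aws.crossplane.io/v1beta1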
Configuring The Providers
With the setup above, developers can already interact with these managed resources and, for example, create a new database like this:
apiVersion: database.aws.crossplane.io/v1beta1
kind: RDSInstance
metadata:
name: my-db
spec:
forProvider:
region: eu-central-1
dbInstanceClass: db.t2.small
masterUsername: master
allocatedStorage: 10
engine: postgres
engineVersion: "12"
skipFinalSnapshotBeforeDeletion: true
writeConnectionSecretToRef:
namespace: default
name: my-db-credentials
When applied, we can see that the resource gets picked up by the provider, but since the provider has no credentials to interact with the external API, it gets stuck.
kubectl apply -n crossplane-system -f database.yaml
rdsinstance.database.aws.crossplane.io/my-db created
kubectl get -n crossplane-system rdsinstance
NAME READY SYNCED STATE ENGINE VERSION AGE
my-db False postgres 12 34s
kubectl describe -n crossplane-system rdsinstances.database.aws.crossplane.io my-db
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning CannotConnectToProvider 28s (x6 over 57s) managed/rdsinstance.database.aws.crossplane.io cannot get referenced Provider: ProviderConfig.aws.crossplane.io "default" not found
To make this work, we need to provide credentials in the form of a Kubernetes secret and a ProviderConfig referencing it. First, we need a file with the credentials; for AWS it looks like this:
[default]
aws_access_key_id=XXXXXXXXXXXXXXX
aws_secret_access_key=XXXXXXXXXXXXXXXXXXXXXXXXX
To create a secret in Kubernetes based on this file, the following command must be run:
kubectl create secret generic aws-creds -n crossplane-system --from-file=creds=./creds.conf
Subsequently, we can create a ProviderConfig referencing this secret and make it available to the provider.
apiVersion: aws.crossplane.io/v1beta1
kind: ProviderConfig
metadata:
name: default
spec:
credentials:
source: Secret
secretRef:
namespace: crossplane-system
name: aws-creds
key: creds
Once it is deployed, the RDSInstance gets picked up and a corresponding instance is created in AWS.
kubectl apply -n crossplane-system -f aws-provider-config.yaml
providerconfig.aws.crossplane.io/default created
kubectl get -n crossplane-system rdsinstance
NAME READY SYNCED STATE ENGINE VERSION AGE
my-db False True creating postgres 12.8 21s
The Kubernetes state will now be synced to AWS.
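If you want to block until the instance is actually available, you can wait for the Ready condition that Crossplane sets on managed resources; something along these lines should work (the timeout is arbitrary):
kubectl wait --for=condition=Ready --timeout=20m rdsinstance/my-db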
After a while, the database is fully started and the corresponding secrets are available within the cluster.
kubectl get -n default secrets
NAME TYPE DATA AGE
my-db-credentials connection.crossplane.io/v1alpha1 4 4m46s
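A minimal sketch of how an application could consume these connection details (the Deployment and the environment variable names are hypothetical); the provider writes the keys username, password, endpoint, and port into the secret, and the remaining keys can be mapped the same way:
# Excerpt from the pod template of a hypothetical Deployment
env:
  - name: DB_HOST
    valueFrom:
      secretKeyRef:
        name: my-db-credentials
        key: endpoint
  - name: DB_PASSWORD
    valueFrom:
      secretKeyRef:
        name: my-db-credentials
        key: password
  # DB_USER and DB_PORT follow the same pattern with the keys username and port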
With this setup, we have functionality similar to what was described for operators, but with a common structure for packaging and distributing providers and with Crossplane managing these packages in the cluster. Just as shown for AWS, providers can be set up for GCP and Azure as well as for other products that expose an API. For a full list, see the corresponding description on Crossplane’s homepage.
Creating Your Resource Abstractions And Compositions
With this setup, we have only recreated the status quo of operators. It is probably better managed than a loose collection of operators, but we still face the issues of complexity and governance: the CRDs delivered by the providers are still a one-to-one copy of all the options the cloud provider’s API offers.
Crossplane gives us the possibility to define our own CRDs, which are then mapped to one or many managed resources provided by a provider. The structure of these resources is completely up to us; we only need to define the resource we want and a mapping from this resource to the managed resources. These new resources are called “composite resources” (XRs) in Crossplane, and they are described by CompositeResourceDefinitions (XRDs).
So Crossplane needs a new CRD and a mapping. For example, we want to create a CRD for the developer with the following properties:
- Storage size in GB
- Postgres engine version to be used (the developer can only choose between 11 and 12 and if nothing is set, it defaults to 12)
All other properties like dbInstanceClass, engine, or region are not configurable and will be defined by the platform team.
We can create a new resource called Database with only a limited set of configurations like this:
apiVersion: apiextensions.crossplane.io/v1
kind: CompositeResourceDefinition
metadata:
name: xdatabases.innoq.com
spec:
group: innoq.com
names:
kind: XDatabase
plural: xdatabases
defaultCompositionRef:
name: xdatabases.aws.innoq.com
claimNames:
kind: DatabaseClaim
plural: databaseclaims
connectionSecretKeys:
- username
- password
- endpoint
- port
versions:
- name: v1alpha1
served: true
referenceable: true
schema:
openAPIV3Schema:
type: object
properties:
spec:
type: object
properties:
parameters:
type: object
properties:
storageGB:
type: integer
version:
type: string
enum:
- "11"
- "12"
default: "12"
required:
- storageGB
required:
- parameters
This creates a new CRD for us, which defines only two fields that can be set by the developer (storageGB and version). The CompositeResourceDefinition uses an OpenAPI v3 schema in the same way as CRDs do, so we can define constraints or defaults, as shown for the version property.
kubectl apply -f database-composite-resource.yaml
compositeresourcedefinition.apiextensions.crossplane.io/xdatabases.innoq.com created
kubectl get crds | grep innoq
databaseclaims.innoq.com 2022-06-29T08:32:23Z
xdatabases.innoq.com 2022-06-29T08:32:23Z
The second part needed is the mapping.
apiVersion: apiextensions.crossplane.io/v1
kind: Composition
metadata:
name: xdatabases.aws.innoq.com
spec:
writeConnectionSecretsToNamespace: crossplane-system
compositeTypeRef:
apiVersion: innoq.com/v1alpha1
kind: XDatabase
resources:
- name: rdsinstance
base:
apiVersion: database.aws.crossplane.io/v1beta1
kind: RDSInstance
spec:
forProvider:
region: eu-central-1
dbInstanceClass: db.t2.small
masterUsername: masteruser
engine: postgres
skipFinalSnapshotBeforeDeletion: true
publiclyAccessible: true
writeConnectionSecretToRef:
namespace: crossplane-system
patches:
- fromFieldPath: "metadata.uid"
toFieldPath: "spec.writeConnectionSecretToRef.name"
transforms:
- type: string
string:
fmt: "%s-postgresql"
- fromFieldPath: "spec.parameters.storageGB"
toFieldPath: "spec.forProvider.allocatedStorage"
- fromFieldPath: "spec.parameters.version"
toFieldPath: "spec.forProvider.engineVersion"
connectionDetails:
- fromConnectionSecretKey: username
- fromConnectionSecretKey: password
- fromConnectionSecretKey: endpoint
- fromConnectionSecretKey: port
The composition describes how the new CRD and its properties shall be mapped to a list of managed resources. Here it is only one resource, but we could, for example, create an IAM role, a user, or a dedicated subnet together with the RDS instance, using default values set by the platform team. The final interface to the developer looks like this:
apiVersion: innoq.com/v1alpha1
kind: DatabaseClaim
metadata:
name: my-db
namespace: default
spec:
parameters:
storageGB: 20
version: "11"
writeConnectionSecretToRef:
name: db-conn
The developer can only set the storage size and the engine version. All other properties are hidden and managed by the platform team thanks to the composition.
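Assuming the claim above is saved as database-claim.yaml, the developer applies it like any other Kubernetes resource and can follow the provisioning through the claim and the underlying composite resource:
kubectl apply -f database-claim.yaml
kubectl get databaseclaims -n default
kubectl get xdatabases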
Crossplane brings all the tools needed to define new CRDs and to establish an easier, less error-prone interface between the platform team and the developer teams, purely through configuration and without writing our own operators.
Package And Distribute Your Own Resources
At some point, we will have set up a lot of our own composite resources and compositions that we want to version and share across several clusters. Crossplane has the concept of packages to distribute configuration between Crossplane installations. There are two types of packages:
- Provider packages
- Configuration packages
We already referenced a provider package when we set up the cluster.
apiVersion: pkg.crossplane.io/v1
kind: Provider
metadata:
name: provider-aws
spec:
package: "crossplane/provider-aws:master"
The spec.package field points to the package that will be installed for this provider. Packages are just OCI-compatible container images, but instead of application code they contain a set of YAML files that Crossplane reads to register CRDs and install the corresponding controller. An example provider package can be found on the Crossplane website.
A similar approach is used to distribute configurations, i.e., our own set of composite resources and compositions. To distribute them, we package them and upload them to a container registry. We can also declare dependencies on provider packages so that Crossplane takes care of installing the corresponding provider before the configuration is installed.
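As a sketch of what packaging could look like (the image name and the version constraints are illustrative): a crossplane.yaml metadata file next to our composite resource and composition definitions declares the dependency on the provider.
apiVersion: meta.pkg.crossplane.io/v1
kind: Configuration
metadata:
  name: infra-configuration
spec:
  crossplane:
    version: ">=v1.8"
  dependsOn:
    - provider: crossplane/provider-aws
      version: ">=v0.29.0"
The package can then be built and pushed as an OCI image, for example with the Crossplane kubectl plugin:
kubectl crossplane build configuration
kubectl crossplane push configuration innoq/infra-configuration:latest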
Let’s say we packaged our configuration into an image called innoq/infra-configuration. If we want Crossplane to download the configuration and register all CRDs and providers automatically, we can create a resource in our cluster like this:
apiVersion: pkg.crossplane.io/v1
kind: Configuration
metadata:
name: innoq-infra
spec:
package: innoq/infra-configuration:latest
packagePullPolicy: IfNotPresent
revisionActivationPolicy: Automatic
revisionHistoryLimit: 1
When deployed, Crossplane will:
- download the configuration package
- check whether the Crossplane version required by the package matches the installed one, and stop otherwise
- check whether the providers declared as dependencies in the package are already installed in the cluster in a matching version, and install or update them if not
- register all composite resources and compositions found in the package
So, it provides a simple process to share common configurations without the need for additional infrastructure. For a full description, have a look at the corresponding documentation.