When building cloud platforms for our development teams, our top priority is automating the application operation process and empowering teams to handle deployments independently. This approach is a step in the right direction for achieving team autonomy and bridging the gap between development and operations. A typical setup for our platform would involve:
- Development teams requesting infrastructure from the platform team via tickets or other channels. Provisioning tools like Terraform are often used to manage infrastructure.
- When creating components like databases, the corresponding credentials must be securely shared with the development teams. One common approach is to set up secrets in the cluster, or allow development teams to create their own secrets.
- Development teams can manage their required resources for the cluster, such as deployments or services, in their own repository using variant management tools like Helm or Kustomize.
- Pipelines are set up for the development teams to push changes to their applications into the cluster. These pipelines use access tokens to interact with the cluster.
- Development teams typically have read access to view the state of their deployments in the cluster.
- Platform teams provide template repositories that development teams copy to start a new service.
While this setup is an improvement over the manual handover process of software from development to operations, it still has its own challenges as described below.
“Kubernetes is for people building platforms. If you are a developer building your own platform (AppEngine, Cloud Foundry, or Heroku clone), then Kubernetes is for you.”
Kelsey Hightower, Developer Advocate @ Google
Kubernetes is an excellent tool for running applications, as it enables us to define our operational needs and constraints declaratively, in the form of resources. This approach is a game-changer, as it allows us to make constraints explicit and abstract away low-level interactions with the hardware behind an API. However, as Kelsey Hightower has noted, it may not be the best API for development teams to use directly. Understanding Kubernetes internals such as routing, placement, and resource capacities, and learning how to control them with Kubernetes primitives, can distract from the team’s core objective of rolling out new functionality and running applications efficiently.
Many platforms do not provide a way to manage secrets in the same way as other resources required to run applications. Whether the secrets are managed manually or automatically generated from a pipeline, they are often treated differently than normal deployments, requiring extra tooling or processes. Furthermore, teams may need access to the secrets in order to generate the secret resource themselves, which can lead to storing them in less secure environments such as the build environment.
The example setup necessitates granting the build pipeline access to the cluster, with far-reaching permissions that allow it to create, update, or delete resources within the cluster. The access token for the pipeline must be stored in the build environment and carefully safeguarded to prevent misuse. Furthermore, team members require access to the cluster in order to monitor the health of their own application or perform failure analysis. Managing the appropriate permissions and access can be a complicated and error-prone process, with the potential to create security vulnerabilities if not done correctly.
Request for infrastructure
While infrastructure creation can be automated, there is still a trust boundary between development teams and the platform team. As a result, organizational processes are required to manage requests that cross this boundary. This can take the form of a ticketing system or other means of communication. Ultimately, it is the platform team’s responsibility to set up the infrastructure in order to hide the underlying complexity. For example, setting up the infrastructure may require different tooling such as Terraform, which should not be a burden for the development teams to learn and master. Additionally, governance is necessary to ensure that the infrastructure is set up correctly with the appropriate permissions, default values, and in the right region or network. The platform team is also responsible for managing the budget, which may affect infrastructure decisions in ways the development team is not aware of. As a result, team autonomy is limited when it comes to infrastructure decisions, and it is hard to extend that autonomy without sacrificing the platform team’s governance.
Discoverability and best practices
Breaking down our systems into tens or hundreds of smaller services, even with automation and a common runtime platform, can make it challenging to maintain an overview. It becomes difficult to answer questions such as: what services exist? Who is responsible for each service? Where can I find the source code and description of each service? Is the service still active? This information is often scattered across multiple wikis, systems, and individuals, making it challenging to obtain a comprehensive understanding of the system.
On the other hand, it is crucial to have a system in place to provide best practices when creating a new service. Although Service Template repositories can be used as a baseline for the code, there is more to creating a new service than just the code. Other steps, such as setting up a pipeline, granting access to the cluster, and other necessary configuration, need to be completed before a new service can be deployed. Therefore, it is essential to have a comprehensive system in place to guide the creation of new services, beyond just providing code templates.
What solutions can we introduce to address the challenges discussed above?
KubeVela provides a simpler way to define applications, among other benefits. Although it is agnostic to the runtime infrastructure, it has excellent integration with Kubernetes. Rather than defining all operational constraints separately, we can use KubeVela to describe a simple application in a single, compact definition.
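A minimal sketch of such a definition, assuming KubeVela's built-in `webservice` component type and `scaler` trait are available in the cluster. The application name, image, and port are hypothetical placeholders and not taken from a real setup.

```yaml
# Hypothetical application; name, image, and port are illustrative assumptions.
apiVersion: core.oam.dev/v1beta1
kind: Application
metadata:
  name: orders-service
spec:
  components:
    - name: orders-api
      type: webservice            # built-in component type for long-running services
      properties:
        image: registry.example.com/orders-api:1.0.0
        ports:
          - port: 8080
            expose: true          # asks KubeVela to create a Service for this port
      traits:
        - type: scaler            # operational concern mixed in as a trait
          properties:
            replicas: 2
```

The development team describes only the component and the traits it needs. The Deployment and Service that would otherwise be written by hand are derived from this single resource.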
KubeVela is based on the Open Application Model and offers many features, but its main advantage is providing a better interface for development teams. With KubeVela, teams can focus on the various aspects of the application and mix in traits as needed. By defining the application, KubeVela handles the corresponding Kubernetes resources, saving teams time and effort.
ExternalSecrets simplifies the integration of external secret management solutions such as AWS Secrets Manager, Azure Key Vault, or Google Secret Manager. It maps externally stored secrets to secret resources in the Kubernetes cluster using a dedicated resource. This provides an easy way to manage secrets across multiple environments, without having to keep the secret values in Git or in the build environment.
Since an ExternalSecret resource only references a secret and does not contain any credentials or sensitive information itself, it can be safely stored in Git along with other resources. Developers do not need to know the secrets and do not have to store them separately in the build environment. If access is required for debugging purposes, it can be granted directly through the external secrets management platform, which eliminates the need to provide secrets in a less secure manner, such as via email.
ExternalSecrets is just one of several solutions that allow secret definitions to be kept safely in Git. Another option is Sealed Secrets, which encrypts the secret itself so that the encrypted form can be committed, and therefore does not require an external source for secrets.
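As an illustration, the following sketch shows what such a resource might look like with the External Secrets Operator. It assumes a `SecretStore` named `aws-secrets-manager` has already been configured by the platform team; the store name, the remote key, and the target secret name are hypothetical, and the `apiVersion` may differ depending on the installed operator version.

```yaml
# Contains only references, no secret values, so it can live in Git.
apiVersion: external-secrets.io/v1beta1   # assumption: adjust to the installed operator version
kind: ExternalSecret
metadata:
  name: orders-db-credentials
spec:
  refreshInterval: 1h                     # how often the external value is re-read
  secretStoreRef:
    name: aws-secrets-manager             # hypothetical store set up by the platform team
    kind: SecretStore
  target:
    name: orders-db-credentials           # Kubernetes Secret that the operator creates
  data:
    - secretKey: password                 # key inside the generated Secret
      remoteRef:
        key: prod/orders/db               # hypothetical path in the external secret manager
        property: password
```

The application consumes the generated Secret like any other, while the value itself is only ever held in the external secret manager and in the cluster.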
ArgoCD is a powerful GitOps tool that simplifies deployment by reversing the flow of deployment. Instead of actively pushing changes into the cluster, ArgoCD runs within the cluster and pulls changes available in Git. This allows teams to focus on their code and use their existing Git workflows to manage deployments.
With ArgoCD, the pipeline renders the final resource files, also known as manifests, and pushes them into a Git repository. ArgoCD then pulls the latest changes from this repository and applies them to the cluster. This makes Git the single source of truth, ensuring that what is defined in Git is what is deployed in the cluster.
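A minimal sketch of how this pull-based flow could be declared, using ArgoCD's `Application` resource. The repository URL, path, and namespaces are hypothetical; automated sync with pruning and self-healing is one possible policy, not a requirement.

```yaml
# Tells ArgoCD which Git location to watch and where to apply it.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: orders-service
  namespace: argocd                       # assumption: ArgoCD runs in the argocd namespace
spec:
  project: default
  source:
    repoURL: https://git.example.com/platform/orders-manifests.git   # hypothetical repository
    targetRevision: main
    path: manifests                       # directory the pipeline pushes rendered manifests to
  destination:
    server: https://kubernetes.default.svc   # the cluster ArgoCD itself runs in
    namespace: orders
  syncPolicy:
    automated:
      prune: true                         # remove resources that were deleted from Git
      selfHeal: true                      # revert manual changes made in the cluster
```

With a policy like this, a commit to the manifest repository is the only action needed to trigger a deployment.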
By implementing ArgoCD, we can significantly enhance the security and efficiency of our cluster. Pulling changes from Git, instead of pushing them into the cluster, provides several benefits:
- We can easily track and audit changes made to the application via Git history.
- Developers no longer need direct access to the cluster to make changes, improving security.
- The ArgoCD UI allows developers to see the health of their applications.
- We can rebuild the cluster’s state just based on the information stored in Git.
- The build pipeline only requires access to Git, not the cluster, reducing the need for excessive permissions.
- Git credentials are stored in the cluster, instead of cluster credentials being stored in the build environment, which keeps cluster access out of a less protected place.
ArgoCD is just one of many GitOps tools available. Flux is another popular option. Both ArgoCD and Flux simplify the access management to the cluster by eliminating the need for developers to have direct access. This approach improves security and accountability.
Crossplane provides a universal control plane for managing not just internal resources, but also external resources from different cloud providers in a consistent manner. It uses the Kubernetes resource model to model external resource claims and requirements, which can be applied in the same way as all other resources like deployments or PVCs.
In addition to raw mappings to external resources, Crossplane also enables the platform team to define their own abstracted resources, providing a simpler interface for developers. For example, a platform team could define a custom resource for a database with specific configurations, and developers could use this abstracted resource in their applications without needing to worry about the underlying cloud provider or the specifics of the database.
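To make this concrete, here is a sketch of what a developer might commit when requesting such a database. The API group, the kind, and the `parameters` fields are entirely hypothetical: in Crossplane they would be defined by the platform team through a composite resource definition, and the exact shape also depends on the Crossplane version in use.

```yaml
# Hypothetical abstraction offered by the platform team; not a built-in Crossplane kind.
apiVersion: platform.example.org/v1alpha1
kind: PostgreSQLInstance
metadata:
  name: orders-db
  namespace: orders
spec:
  parameters:
    storageGB: 20                  # one of the few knobs exposed to developers
  writeConnectionSecretToRef:
    name: orders-db-connection     # Secret that receives the connection details
```

Region, network, instance class, and similar settings stay in the composition maintained by the platform team, so they do not appear in the developer's request.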
This makes it easier for teams to manage their cloud infrastructure in a consistent way, and reduces the complexity and overhead of managing multiple cloud provider APIs separately. Because the resources Crossplane manages are ordinary Kubernetes resources, they can also be checked by Kubernetes policy engines such as OPA Gatekeeper or Kyverno to ensure that they are provisioned in compliance with company policies and guidelines.
By moving the external resource state to the cluster and managing it as any other resource, Crossplane simplifies the management of external resources and makes it more consistent with the Kubernetes resource model. It also allows for the use of GitOps to have Git as the single source of truth for the external resources such as databases or message middlewares, which can improve the security and auditing of these resources.
Depending on the level of trust in the development teams, these resources can be directly managed in the application’s repository, or a separate Git repository with restricted permissions can be used to keep control in the hands of the platform team.
If Crossplane is already in place, it can make KubeVela unnecessary, since Crossplane's resource abstraction feature can be used to define a custom application resource that serves the same purpose.
Backstage is an open-source platform for building developer portals. It provides a centralised platform for developers to discover and reuse shared components, services, and knowledge within an organisation.
With Backstage, developers can view and manage the tools and services that they need to develop, test, and deploy their applications. It provides a modular architecture that allows teams to create plugins for their specific needs, such as integrating with CI/CD pipelines or cloud providers.
In this way, Backstage aims to improve developer productivity by simplifying the management of shared services and components, reducing duplication of effort, and providing a consistent user experience across teams.
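The discoverability questions raised earlier (what services exist, who owns them, where the source lives, whether they are still active) map onto the descriptor file that Backstage's software catalog reads from each repository. A minimal sketch, with a hypothetical service name, owner, and link:

```yaml
# catalog-info.yaml in the service repository; all values are illustrative.
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: orders-service
  description: Accepts and tracks customer orders
  links:
    - url: https://git.example.com/shop/orders-service   # hypothetical source location
      title: Source code
spec:
  type: service
  lifecycle: production      # answers whether the service is still active
  owner: team-orders         # answers who is responsible
```

Because the file lives next to the code, ownership and status are updated through the same review process as the service itself.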
One of the core features of Backstage is its workflow templates, which enable teams to quickly set up new repositories with everything needed to get started. These templates include a base setup for a service, including the Git repository and code, as well as the setup for a pipeline and the necessary Kubernetes resources. Additionally, they help to ensure that the correct permissions are in place for teams.
While setting up a new service is one example of a workflow that can be automated with Backstage, the workflow templates can be extended with plugins to automate all kinds of workflows and integrate with external systems.
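Backstage calls these workflow templates Software Templates. The sketch below shows what one for a new service might look like, assuming the built-in `fetch:template`, `publish:github`, and `catalog:register` scaffolder actions are installed. The template name, owner, skeleton path, and allowed host are hypothetical.

```yaml
# Hypothetical template; the actions used are Backstage scaffolder built-ins.
apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
  name: new-service
  title: New Service
spec:
  owner: platform-team
  type: service
  parameters:
    - title: Service details
      required: [name, repoUrl]
      properties:
        name:
          type: string
          title: Service name
        repoUrl:
          type: string
          title: Repository location
          ui:field: RepoUrlPicker
          ui:options:
            allowedHosts: [github.com]
  steps:
    - id: fetch
      name: Copy service skeleton
      action: fetch:template
      input:
        url: ./skeleton                      # code, pipeline, and manifests to copy
        values:
          name: ${{ parameters.name }}
    - id: publish
      name: Create repository
      action: publish:github
      input:
        repoUrl: ${{ parameters.repoUrl }}
    - id: register
      name: Register in catalog
      action: catalog:register
      input:
        repoContentsUrl: ${{ steps.publish.output.repoContentsUrl }}
        catalogInfoPath: /catalog-info.yaml
```

Further steps, for example creating the GitOps application or requesting infrastructure, could be added as custom actions provided by plugins.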
Overall, Backstage helps to streamline the development process and promote collaboration between teams.