Operating

Get started with using tenancy on Kubernetes

1 - Setup

Setting up your environment for Capsule

1.1 - Installation

Installing the Capsule Controller

Requirements

  • Helm 3 is required when installing the Capsule Operator chart. Follow Helm’s official documentation to install Helm on your operating system.
  • A Kubernetes cluster 1.16+ with the following admission controllers enabled:
    • PodNodeSelector
    • LimitRanger
    • ResourceQuota
    • MutatingAdmissionWebhook
    • ValidatingAdmissionWebhook
  • A Kubeconfig file accessing the Kubernetes cluster with cluster admin permissions.
  • Cert-Manager is recommended but not required
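
The version requirement above can be checked before installing. The following is a minimal sketch for a POSIX shell; `version_at_least` is a hypothetical helper, not part of Capsule or Helm, and it compares only the major and minor components.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when version $1 is at least version $2.
# Only major.minor are compared; a leading "v" and suffixes such as "+" or ".3" are ignored.
version_at_least() {
  have=${1#v}
  want=${2#v}
  have_major=${have%%.*}
  have_rest=${have#*.}
  have_minor=${have_rest%%[!0-9]*}
  want_major=${want%%.*}
  want_rest=${want#*.}
  want_minor=${want_rest%%[!0-9]*}
  if [ "$have_major" -gt "$want_major" ]; then
    return 0
  fi
  [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]
}

# Example usage against a live cluster (requires kubectl and jq):
#   server=$(kubectl version -o json | jq -r '.serverVersion.gitVersion')
#   version_at_least "$server" 1.16 || echo "Kubernetes $server is too old for Capsule"
```

The live-cluster lines are left as comments so the helper can be sourced without cluster access.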

Installation

We officially support installing Capsule only via the Helm chart. The chart itself handles the installation and upgrade of the required CustomResourceDefinitions. The official chart repositories are listed on Artifact Hub.

Perform the following steps to install the Capsule Operator:

  1. Add repository:

     helm repo add projectcapsule https://projectcapsule.github.io/charts
    
  2. Install Capsule:

     helm install capsule projectcapsule/capsule --version 0.10.6 -n capsule-system --create-namespace
    

    or (OCI)

     helm install capsule oci://ghcr.io/projectcapsule/charts/capsule --version 0.10.6 -n capsule-system --create-namespace
    
  3. Show the status:

     helm status capsule -n capsule-system
    
  4. Upgrade the Chart

     helm upgrade capsule projectcapsule/capsule -n capsule-system
    

    or (OCI)

     helm upgrade capsule oci://ghcr.io/projectcapsule/charts/capsule --version 0.10.7 -n capsule-system
    
  5. Uninstall the Chart

     helm uninstall capsule -n capsule-system
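
The install and upgrade steps above can be combined into a single idempotent command with `helm upgrade --install`, which installs the release when it is absent and upgrades it otherwise. This is a minimal sketch, assuming the `projectcapsule` repository from step 1 has been added. The `capsule_deploy_cmd` name is hypothetical, and the sketch only prints the command so it can be reviewed before running.

```shell
#!/bin/sh
# Hypothetical helper: prints one Helm command that installs or upgrades Capsule
# at a pinned chart version in the capsule-system namespace.
capsule_deploy_cmd() {
  chart_version=$1
  printf 'helm upgrade --install capsule projectcapsule/capsule --version %s -n capsule-system --create-namespace\n' "$chart_version"
}

# Print the command for the chart version used in this guide.
capsule_deploy_cmd 0.10.6
# To execute it instead of printing it: eval "$(capsule_deploy_cmd 0.10.6)"
```

Pinning the version in both the install and upgrade path keeps the deployed chart reproducible.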
    

Considerations

Here are some key considerations to keep in mind when installing Capsule. Also check out the Best Practices for more information.

Certificate Management

We recommend using cert-manager to manage the TLS certificates for Capsule. This ensures that your Capsule installation is secure and that the certificates are renewed automatically. Capsule requires a valid TLS certificate for its admission webhook server. By default, Capsule reconciles its own TLS certificate. To use cert-manager instead, set the following values:

certManager:
  generateCertificates: true
tls:
  enableController: false
  create: false
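
One way to apply these values is to keep them in a file and pass it to Helm with the `-f` flag. A sketch, where the file name `capsule-values.yaml` is chosen for illustration only:

```shell
#!/bin/sh
# Write the cert-manager related values to a file (file name is illustrative).
cat > capsule-values.yaml <<'EOF'
certManager:
  generateCertificates: true
tls:
  enableController: false
  create: false
EOF

# Then pass the file to Helm, for example:
#   helm upgrade --install capsule projectcapsule/capsule -n capsule-system -f capsule-values.yaml
```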

Webhooks

Capsule makes use of webhooks for admission control. Ensure that your cluster supports webhooks and that they are properly configured. The webhooks are automatically created by Capsule during installation. However, some of these webhooks will cause problems when Capsule is not running (this is especially problematic in single-node clusters). Here are the webhooks you need to watch out for.

Generally, we recommend using matchConditions for all the webhooks to avoid problems when Capsule is not running. You should exclude your system-critical components from the Capsule webhooks. For namespaced resources (pods, services, etc.), the webhooks select only namespaces which are part of a Capsule Tenant. If your system-critical components are not part of a Capsule Tenant, they will not be affected by the webhooks. However, if you have system-critical components which are part of a Capsule Tenant, you should exclude them from the Capsule webhooks, either with matchConditions or with more specific namespaceSelectors/objectSelectors. Doing so can also improve performance.

Refer to the webhook values.

The Webhooks below are the most important ones to consider.

Nodes

There is a webhook which catches interactions with the Node resource. This webhook is mainly of interest when you make use of Node Metadata. In any other case it will just cause you problems. By default the webhook is enabled, but you can disable it by setting the following value:

webhooks:
  hooks:
    nodes:
      enabled: false

Or you could at least consider setting the failure policy to Ignore:

webhooks:
  hooks:
    nodes: 
      failurePolicy: Ignore

If you still want to use the feature, you could exclude requests from the kubelets and from service accounts in the kube-system namespace (or any other namespace you want to exclude) from the webhook by setting the following value:

webhooks:
  hooks:
    nodes: 
      matchConditions:
      - name: 'exclude-kubelet-requests'
        expression: '!("system:nodes" in request.userInfo.groups)'
      - name: 'exclude-kube-system'
        expression: '!("system:serviceaccounts:kube-system" in request.userInfo.groups)'
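
Kubernetes sends a request to a webhook only when every matchCondition evaluates to true. The combined effect of the two conditions above can be sketched in shell as follows. This illustrates the logic only and is not how the API server evaluates CEL; `webhook_applies` is a hypothetical helper that takes the requesting user's groups as a comma-separated list.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when a request from a user in the given groups
# would still reach the Node webhook, i.e. when neither excluded group is present.
webhook_applies() {
  list=",$1,"
  case "$list" in
    *,system:nodes,*) return 1 ;;                          # exclude-kubelet-requests
    *,system:serviceaccounts:kube-system,*) return 1 ;;    # exclude-kube-system
  esac
  return 0
}

# Example: a tenant user is checked by the webhook, a kubelet is not.
webhook_applies "system:authenticated,oidc:kubernetes-users" && echo "tenant user: webhook called"
webhook_applies "system:nodes,system:authenticated" || echo "kubelet: webhook skipped"
```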

Namespaces

Namespaces are the most important resource in Capsule. The Namespace webhook is responsible for enforcing the Capsule Tenant boundaries. It is enabled by default and should not be disabled. However, you may change the matchConditions to exclude certain namespaces from the Capsule Tenant boundaries. For example, you can exclude the kube-system namespace by setting the following value:

webhooks:
  hooks:
    namespaces: 
      matchConditions:
      - name: 'exclude-kube-system'
        expression: '!("system:serviceaccounts:kube-system" in request.userInfo.groups)'

Compatibility

Kubernetes compatibility is announced for each release. Generally, we keep up to date with the latest upstream Kubernetes version. Note that the Capsule project offers support only for the latest minor version of Kubernetes. Backwards compatibility with older versions of Kubernetes and OpenShift is offered by vendors.

GitOps

There are no specific requirements for using Capsule with GitOps tools like ArgoCD or FluxCD. You can manage Capsule resources as you would with any other Kubernetes resource.

ArgoCD

Manifests to get you started with ArgoCD.

---
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: capsule
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: system
  source:
    repoURL: ghcr.io/projectcapsule/charts
    targetRevision: 0.10.6
    chart: capsule
    helm:
      valuesObject:
        crds:
          install: true
        certManager:
          generateCertificates: true
        tls:
          enableController: false
          create: false
        manager: 
          options:  
            capsuleConfiguration: default
            ignoreUserGroups:
              - oidc:administators
            capsuleUserGroups:
              - oidc:kubernetes-users
              - system:serviceaccounts:capsule-argo-addon
        webhooks:
          hooks:
            nodes: 
              failurePolicy: Ignore
        serviceMonitor:
          enabled: true
        proxy:
          enabled: true
          webhooks:
            enabled: true
          certManager:
            generateCertificates: true
          options:
            generateCertificates: false
            oidcUsernameClaim: "email"
            extraArgs:
            - "--feature-gates=ProxyClusterScoped=true"
            - "--feature-gates=ProxyAllNamespaced=true"

  destination:
    server: https://kubernetes.default.svc
    namespace: capsule-system

  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
    - ServerSideApply=true
    - CreateNamespace=true 
    - PrunePropagationPolicy=foreground 
    - PruneLast=true
    - RespectIgnoreDifferences=true 
    retry:
      limit: 5
      backoff:
        duration: 5s 
        factor: 2 
        maxDuration: 3m
---
apiVersion: v1
kind: Secret
metadata:
  name: capsule-repo
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: repository
stringData:
  url: ghcr.io/projectcapsule/charts
  name: capsule
  project: system
  type: helm
  enableOCI: "true"

FluxCD

apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: capsule
  namespace: flux-system
spec:
  serviceAccountName: kustomize-controller
  targetNamespace: "capsule-system"
  interval: 10m
  releaseName: "capsule"
  chart:
    spec:
      chart: capsule
      version: "0.10.6"
      sourceRef:
        kind: HelmRepository
        name: capsule
      interval: 24h
  install:
    createNamespace: true
  upgrade:
    remediation:
      remediateLastFailure: true
  driftDetection:
    mode: enabled
  values:
    crds:
      install: true
    certManager:
      generateCertificates: true
    tls:
      enableController: false
      create: false
    manager: 
      options:  
        capsuleConfiguration: default
        ignoreUserGroups:
          - oidc:administators
        capsuleUserGroups:
          - oidc:kubernetes-users
          - system:serviceaccounts:capsule-argo-addon
    webhooks:
      hooks:
        nodes: 
          failurePolicy: Ignore
    serviceMonitor:
      enabled: true
    proxy:
      enabled: true
      webhooks:
        enabled: true
      certManager:
        generateCertificates: true
      options:
        generateCertificates: false
        oidcUsernameClaim: "email"
        extraArgs:
        - "--feature-gates=ProxyClusterScoped=true"
        - "--feature-gates=ProxyAllNamespaced=true"
---
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: capsule
  namespace: flux-system
spec:
  type: "oci"
  interval: 12h0m0s
  url: oci://ghcr.io/projectcapsule/charts

Security

See all available Artifacts

Signature

To verify artifacts you need to have cosign installed. This guide assumes you are using v2.x of cosign. All of the signatures are created using keyless signing. You can set the environment variable COSIGN_REPOSITORY to point to this repository. For example:

# Docker Image
export COSIGN_REPOSITORY=ghcr.io/projectcapsule/capsule

# Helm Chart
export COSIGN_REPOSITORY=ghcr.io/projectcapsule/charts/capsule

To verify the signature of the docker image, run the following command. Replace <release_tag> with an available release tag:

COSIGN_REPOSITORY=ghcr.io/projectcapsule/capsule cosign verify ghcr.io/projectcapsule/capsule:<release_tag> \
  --certificate-identity-regexp="https://github.com/projectcapsule/capsule/.github/workflows/docker-publish.yml@refs/tags/*" \
  --certificate-oidc-issuer="https://token.actions.githubusercontent.com" | jq

To verify the signature of the Helm chart, run the following command. Replace <release_tag> with an available release tag:

COSIGN_REPOSITORY=ghcr.io/projectcapsule/charts/capsule cosign verify ghcr.io/projectcapsule/charts/capsule:<release_tag> \
  --certificate-identity-regexp="https://github.com/projectcapsule/capsule/.github/workflows/helm-publish.yml@refs/tags/*" \
  --certificate-oidc-issuer="https://token.actions.githubusercontent.com" | jq

Provenance

Capsule creates and attests to the provenance of its builds using the SLSA standard and meets the SLSA Level 3 specification. The attested provenance may be verified using the cosign tool.

Verify the provenance of the Docker image. Replace <release_tag> with an available release tag:

cosign verify-attestation --type slsaprovenance \
  --certificate-identity-regexp="https://github.com/slsa-framework/slsa-github-generator/.github/workflows/generator_container_slsa3.yml@refs/tags/*" \
  --certificate-oidc-issuer="https://token.actions.githubusercontent.com" \
  ghcr.io/projectcapsule/capsule:<release_tag> | jq .payload -r | base64 --decode | jq

Verify the provenance of the Helm chart. Replace <release_tag> with an available release tag:

cosign verify-attestation --type slsaprovenance \
  --certificate-identity-regexp="https://github.com/slsa-framework/slsa-github-generator/.github/workflows/generator_container_slsa3.yml@refs/tags/*" \
  --certificate-oidc-issuer="https://token.actions.githubusercontent.com" \
  ghcr.io/projectcapsule/charts/capsule:<release_tag> | jq .payload -r | base64 --decode | jq
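
The trailing `jq .payload -r | base64 --decode` in the commands above is needed because cosign prints a JSON envelope whose `payload` field holds the base64-encoded attestation statement. The decoding step can be tried offline with the sketch below. The envelope here is built locally with made-up content, and `decode_payload` is a hypothetical helper that uses `sed` only to avoid a jq dependency; jq remains the more robust choice for real output.

```shell
#!/bin/sh
# Hypothetical helper: reads a compact JSON envelope on stdin, extracts the
# "payload" field with sed and base64-decodes it.
decode_payload() {
  sed -n 's/.*"payload":"\([^"]*\)".*/\1/p' | base64 --decode
}

# Illustrative envelope built locally; real envelopes come from cosign.
statement='{"example":"statement"}'
envelope=$(printf '{"payloadType":"application/vnd.in-toto+json","payload":"%s"}' "$(printf '%s' "$statement" | base64)")
printf '%s\n' "$envelope" | decode_payload
```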

Software Bill of Materials (SBOM)

An SBOM (Software Bill of Materials) in CycloneDX JSON format is published for each release, including pre-releases. You can set the environment variable COSIGN_REPOSITORY to point to this repository. For example:

# Docker Image
export COSIGN_REPOSITORY=ghcr.io/projectcapsule/capsule

# Helm Chart
export COSIGN_REPOSITORY=ghcr.io/projectcapsule/charts/capsule

To inspect the SBOM of the docker image, run the following command. Replace <release_tag> with an available release tag:

COSIGN_REPOSITORY=ghcr.io/projectcapsule/capsule cosign download sbom ghcr.io/projectcapsule/capsule:<release_tag>

To inspect the SBOM of the Helm chart, run the following command. Replace <release_tag> with an available release tag:

COSIGN_REPOSITORY=ghcr.io/projectcapsule/charts/capsule cosign download sbom ghcr.io/projectcapsule/charts/capsule:<release_tag>

1.2 - OpenShift

How to install Capsule and the Capsule Proxy on OpenShift

Introduction

Capsule is a Kubernetes multi-tenancy operator that enables secure namespace-as-a-service in Kubernetes clusters. When combined with OpenShift’s robust security model, it provides an excellent platform for multi-tenant environments.

This guide demonstrates how to deploy Capsule and Capsule Proxy on OpenShift using the nonroot-v2 and restricted-v2 SecurityContextConstraint (SCC), ensuring tenant owners operate within OpenShift’s security boundaries.

Why Capsule on OpenShift

While OpenShift can already be configured to be quite multi-tenant (for example together with Kyverno), Capsule takes it a step further and makes it easier to manage.

When people talk about a multi-tenant Kubernetes cluster, they often expect to get one or two namespaces inside a cluster, with few privileges. Capsule is different. As a tenant owner, you can create as many namespaces as you want. RBAC is much easier and less error-prone, since Capsule handles it. Resource quota is not set per namespace but spread across the whole tenant, which makes management easy. The Capsule Proxy also solves the RBAC issues that arise when listing cluster-wide resources. Even some operators can be installed inside a tenant thanks to the Capsule Proxy: add the operator's service account as a tenant owner, and set the KUBERNETES_SERVICE_HOST environment variable of the operator deployment to the Capsule Proxy URL. The operator then behaves as if it were admin, while living completely inside the tenant.

Prerequisites

Before starting, ensure you have:

  • OpenShift cluster with cluster-admin privileges
  • kubectl CLI configured
  • Helm 3.x installed
  • cert-manager installed

Limitations

There are a few known limitations when using OpenShift with Capsule:

  • A tenant owner cannot create a namespace/project in the OpenShift GUI. This must be done with kubectl.
  • When copying the login token from the OpenShift GUI, the server address is always that of the Kubernetes API instead of the Capsule Proxy. An RFE has been filed with Red Hat to make this URL configurable (RFE-7592). If you have a support contract with Red Hat, consider opening a support request (SR) asking for this feature as well. The more requests there are, the more likely it will be implemented.

Capsule Installation

Remove the self-provisioners rolebinding

By default, OpenShift comes with a self-provisioner role and a self-provisioners rolebinding. This role lets all users create namespaces. For the Capsule use case, this should be removed. The Red Hat documentation can be found here. Remove the subjects from the rolebinding:

kubectl patch clusterrolebinding.rbac self-provisioners -p '{"subjects": null}'

Also set autoupdate to false, so the rolebinding doesn’t get reverted by OpenShift.

kubectl patch clusterrolebinding.rbac self-provisioners -p '{ "metadata": { "annotations": { "rbac.authorization.kubernetes.io/autoupdate": "false" } } }'

Extend the admin role

In this example, we will add the default kubernetes admin role to the tenant owner, so it gets admin privileges on the namespaces that are in their tenant. This role should be extended.

  • Add the finalizers so users can create/edit resources that are managed by Capsule.
  • Add the SCCs that tenant owners can use. In this example, these are restricted-v2 and nonroot-v2.

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: extend-admin-role
  labels:
    rbac.authorization.k8s.io/aggregate-to-admin: 'true'
rules:
  - verbs:
      - update
    apiGroups:
      - capsule.clastix.io
    resources:
      - '*/finalizers'
  - apiGroups:
      - security.openshift.io
    resources:
      - securitycontextconstraints
    resourceNames:
      - restricted-v2
      - nonroot-v2
    verbs:
      - 'use'

Helm Chart values

The jobs that Capsule uses can run with the restricted-v2 SCC. For this, the securityContext and podSecurityContext of the jobs must be disabled. For Capsule itself, we leave them enabled, because Capsule runs with nonroot-v2, which is still a very secure SCC. Also, always set pullPolicy: Always on a multi-tenant cluster, to make sure you are running the images you intended. The following chart values can be used:

  podSecurityContext:
    enabled: true
  securityContext:
    enabled: true
  jobs:
    podSecurityContext:
      enabled: false
    securityContext:
      enabled: false
    image:
      pullPolicy: Always
  manager:
    image:
      pullPolicy: Always

Deploy the Capsule Helm chart with (at least) these values.

Example tenant

A minimal example tenant can look like the following:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: sun
spec:
  imagePullPolicies:
    - Always
  owners:
    - clusterRoles:
        - admin
        - capsule-namespace-deleter
      kind: Group
      name: sun-admin-group
  priorityClasses:
    allowed:
      - openshift-user-critical

Capsule Proxy

The same principles that apply to Capsule also apply to Capsule Proxy. That means that all (pod)SecurityContexts should be disabled for the job. In this example we enable the ProxyAllNamespaced feature, because that is one of the areas where the Proxy really shines. The following Helm values can be used as a template:

  securityContext:
    enabled: true
  podSecurityContext:
    enabled: true
  options:
    generateCertificates: false #set to false, since we are using cert-manager in .Values.certManager.generateCertificates
    enableSSL: true
    extraArgs:
      - '--feature-gates=ProxyAllNamespaced=true'
      - '--feature-gates=ProxyClusterScoped=false'
  image:
    pullPolicy: Always
  global:
    jobs:
      kubectl:
        securityContext:
          enabled: true
  webhooks:
    enabled: true
  certManager:
    generateCertificates: true
  ingress:
    enabled: true
    annotations:
      route.openshift.io/termination: "reencrypt"
      route.openshift.io/destination-ca-certificate-secret: capsule-proxy-root-secret
    hosts:
    - host: "capsule-proxy.example.com"
      paths: ["/"]

That is basically all the configuration needed for the Capsule Proxy.

Console Customization

The OpenShift console can be customized. For example, the capsule-proxy can be added as a shortcut on the top right application menu with the ConsoleLink CR:

apiVersion: console.openshift.io/v1
kind: ConsoleLink
metadata:
  name: capsule-proxy-consolelink
spec:
  applicationMenu:
    imageURL: 'https://github.com/projectcapsule/capsule/raw/main/assets/logo/capsule.svg'
    section: 'Capsule'
  href: 'https://capsule-proxy.example.com'
  location: ApplicationMenu
  text: 'Capsule Proxy Kubernetes API'

It’s also possible to add links specific to certain namespaces, which are shown on the Namespace/Project overview. These can also be made tenant-specific by adding a namespaceSelector:

apiVersion: console.openshift.io/v1
kind: ConsoleLink
metadata:
  name: namespaced-consolelink-sun
spec:
  text: "Sun Docs"
  href: "https://linktothesundocs.com"
  location: "NamespaceDashboard"
  namespaceDashboard:
    namespaceSelector:
      matchExpressions:
        - key: capsule.clastix.io/tenant
          operator: In
          values:
            - sun

A custom logo can also be provided, for example the Capsule logo. First create a ConfigMap containing the logo:

kubectl create configmap console-capsule-logo --from-file capsule-logo.png -n openshift-config

Then add these config lines to the existing cluster CR Console:

apiVersion: operator.openshift.io/v1
kind: Console
metadata:
  name: cluster
spec:
  customization:
    customLogoFile:
      key: capsule-logo.png
      name: console-capsule-logo
    customProductName: Capsule OpenShift Cluster

Conclusion

After this section, you have a ready-to-go Capsule and Capsule Proxy setup on OpenShift, with some nice customizations in the OpenShift console, ready to ship to the development teams!

1.3 - Rancher

How to install Capsule and the Capsule Proxy on Rancher

The integration between Rancher and Capsule aims to provide a multi-tenant Kubernetes service to end users, enabling:

  • a self-service approach
  • access to cluster-wide resources

Tenant users will have the ability to access Kubernetes resources through:

  • Rancher UI
  • Rancher Shell
  • Kubernetes CLI

On the other side, administrators need to manage the Kubernetes clusters through Rancher.

Rancher provides a feature called Projects to segregate resources inside a common domain. However, Projects don’t provide a way to segregate Kubernetes cluster-scoped resources.

Capsule, a project created to provide a framework for multi-tenant platforms, integrates with Rancher Projects, enhancing the experience with Tenants.

Capsule allows tenant isolation and resource control in a declarative way, while enabling a self-service experience for tenants. With Capsule Proxy, users can also access cluster-wide resources, as configured by administrators at the Tenant custom resource level.

You can read in detail how the integration works and how to configure it, in the following guides.

Tenants and Projects

This guide explains how to set up the integration between Capsule and Rancher Projects.

It then explains how access to Kubernetes resources is transparent for the tenant user.

Pre-requisites

  • An authentication provider in Rancher, e.g. an OIDC identity provider
  • A Tenant Member Cluster Role in Rancher

Configure an identity provider for Kubernetes

You can follow this general guide to configure an OIDC authentication for Kubernetes.

For a Keycloak-specific setup you can check this resources list.

Known issues

Keycloak's new URLs without /auth make Rancher crash

Create the Tenant Member Cluster Role

A custom Rancher Cluster Role is needed to allow Tenant users to read cluster-scoped resources, since Rancher doesn’t provide a built-in Cluster Role with this tailored set of privileges.

When logged-in to the Rancher UI as administrator, from the Users & Authentication page, create a Cluster Role named Tenant Member with the following privileges:

  • get, list, watch operations over IngressClasses resources.
  • get, list, watch operations over StorageClasses resources.
  • get, list, watch operations over PriorityClasses resources.
  • get, list, watch operations over Nodes resources.
  • get, list, watch operations over RuntimeClasses resources.

Configuration (administration)

Tenant onboarding

When onboarding tenants, the administrator needs to create the following, in order to bind the Project with the Tenant:

  • In Rancher, create a Project.

  • In the target Kubernetes cluster, create a Tenant, with the following specification:

    kind: Tenant
    ...
    spec:
      namespaceOptions:
        additionalMetadata:
          annotations:
            field.cattle.io/projectId: ${CLUSTER_ID}:${PROJECT_ID}
          labels:
            field.cattle.io/projectId: ${PROJECT_ID}
    

    where $CLUSTER_ID and $PROJECT_ID can be retrieved, assuming a valid $CLUSTER_NAME, as:

    CLUSTER_NAME=foo
    CLUSTER_ID=$(kubectl get cluster -n fleet-default ${CLUSTER_NAME} -o jsonpath='{.status.clusterName}')
    PROJECT_IDS=$(kubectl get projects -n $CLUSTER_ID -o jsonpath="{.items[*].metadata.name}")
    for project_id in $PROJECT_IDS; do echo "${project_id}"; done
    

    More on declarative Projects here.

  • In the identity provider, create a user with correct OIDC claim of the Tenant.

  • In Rancher, add the new user to the Project with the Read-only Role.

  • In Rancher, add the new user to the Cluster with the Tenant Member Cluster Role.
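
The Tenant from the second step can be rendered from the retrieved IDs. This is a minimal sketch: the `render_tenant` helper, the tenant name `oil`, the owner group `oil-users` and the example IDs are assumptions for illustration, while the annotation and label keys are the ones shown above.

```shell
#!/bin/sh
# Hypothetical helper: prints a Tenant manifest that binds its Namespaces
# to a Rancher Project. Arguments: tenant name, cluster ID, project ID.
render_tenant() {
  cat <<EOF
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: $1
spec:
  owners:
    - kind: Group
      name: $1-users   # assumed owner group, adjust to your identity provider
  namespaceOptions:
    additionalMetadata:
      annotations:
        field.cattle.io/projectId: "$2:$3"
      labels:
        field.cattle.io/projectId: "$3"
EOF
}

# Example with made-up IDs; pipe the output to "kubectl apply -f -" to create the Tenant.
render_tenant oil c-m-example p-example
```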

Create the Tenant Member Project Role

A custom Project Role is needed to give Tenant users a minimum set of privileges, plus the ability to create and delete Namespaces.

Create a Project Role named Tenant Member that inherits the privileges from the following Roles:

  • read-only
  • create-ns

Usage

When the configuration administrative tasks have been completed, the tenant users are ready to use the Kubernetes cluster transparently.

For example, they can create Namespaces in a self-service mode, which would otherwise be impossible with Rancher Projects alone.

Namespace creation

From the tenant user perspective both CLI and the UI are valid interfaces to communicate with.

From CLI

  • The Tenant user logs in to the OIDC provider via kubectl.
  • The Tenant user creates a Namespace, as a valid OIDC-discoverable user.

The Namespace is now part of both the Tenant and the Project.

As administrator, you can verify with:

kubectl get tenant ${TENANT_NAME} -o jsonpath='{.status}'
kubectl get namespace -l field.cattle.io/projectId=${PROJECT_ID}

From UI

  • The Tenant user logs in to Rancher, with a valid OIDC-discoverable user (in a valid Tenant group).
  • The Tenant user creates a valid Namespace.

The Namespace is now part of both the Tenant and the Project.

As administrator, you can verify with:

kubectl get tenant ${TENANT_NAME} -o jsonpath='{.status}'
kubectl get namespace -l field.cattle.io/projectId=${PROJECT_ID}

Additional administration

Project monitoring

Before proceeding, it is recommended to read the official Rancher documentation about Project Monitors.

In summary, the setup is composed of a cluster-level Prometheus and Prometheus Federator, through which each Project-level Prometheus federates with the cluster-level one.

Network isolation

Before proceeding, it is recommended to read the official Capsule documentation about NetworkPolicy at Tenant level.

Network isolation and Project Monitor

As Rancher’s Project Monitor deploys the Prometheus stack in a Namespace that is part of neither the Project nor the Tenant Namespaces, it is important to apply the label selectors in the NetworkPolicy ingress rules to the Namespace created by Project Monitor.

That Project monitoring Namespace will be named as cattle-project-<PROJECT_ID>-monitoring.

For example, if the NetworkPolicy is configured to allow all ingress traffic from Namespace with label capsule.clastix.io/tenant=foo, this label is to be applied to the Project monitoring Namespace too.
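
The labeling step can be sketched as follows. The `monitoring_namespace` helper and the project ID `p-example` are hypothetical; the namespace naming pattern and the label come from the text above. The sketch prints the `kubectl label` command for review rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: derives the Project Monitor namespace name from a project ID.
monitoring_namespace() {
  printf 'cattle-project-%s-monitoring\n' "$1"
}

# Print the labeling command for review (project ID and tenant name are examples).
printf 'kubectl label namespace %s capsule.clastix.io/tenant=%s\n' "$(monitoring_namespace p-example)" foo
```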

Then, a NetworkPolicy can be applied at Tenant level with Capsule GlobalTenantResources. For example, a minimal policy can be applied for the oil Tenant:

apiVersion: capsule.clastix.io/v1beta2
kind: GlobalTenantResource
metadata:
  name: oil-networkpolicies
spec:
  tenantSelector:
    matchLabels:
      capsule.clastix.io/tenant: oil
  resyncPeriod: 360s
  pruningOnDelete: true
  resources:
    - namespaceSelector:
        matchLabels:
          capsule.clastix.io/tenant: oil
      rawItems:
      - apiVersion: networking.k8s.io/v1
        kind: NetworkPolicy
        metadata:
          name: oil-minimal
        spec:
          podSelector: {}
          policyTypes:
            - Ingress
            - Egress
          ingress:
            # Intra-Tenant
            - from:
              - namespaceSelector:
                  matchLabels:
                    capsule.clastix.io/tenant: oil
            # Rancher Project Monitor stack
            - from:
              - namespaceSelector:
                  matchLabels:
                    role: monitoring
            # Kubernetes nodes
            - from:
              - ipBlock:
                  cidr: 192.168.1.0/24
          egress:
            # Kubernetes DNS server
            - to:
              - namespaceSelector: {}
                podSelector:
                  matchLabels:
                    k8s-app: kube-dns
                ports:
                  - port: 53
                    protocol: UDP
            # Intra-Tenant
            - to:
              - namespaceSelector:
                  matchLabels:
                    capsule.clastix.io/tenant: oil
            # Kubernetes API server
            - to:
              - ipBlock:
                  cidr: 10.43.0.1/32
                ports:
                  - port: 443

Capsule Proxy and Rancher Projects

This guide explains how to set up the integration between Capsule Proxy and Rancher Projects.

It then explains how access to Kubernetes cluster-wide resources is transparent for the tenant user.

Rancher Shell and Capsule

In order to integrate the Rancher Shell with Capsule, the Kubernetes API requests made from the shell need to be routed via Capsule Proxy.

The capsule-rancher-addon allows the integration transparently.

Install the Capsule addon

Add the Clastix Helm repository https://clastix.github.io/charts.

After updating the cache with Clastix’s Helm repository, a Helm chart named capsule-rancher-addon is available.

Install it, paying attention to the following Helm values:

  • proxy.caSecretKey: the Secret key that contains the CA certificate used to sign the Capsule Proxy TLS certificate (it should be "ca.crt" when Capsule Proxy has been configured with certificates generated with Cert Manager).
  • proxy.servicePort: the port configured for the Capsule Proxy Kubernetes Service (443 in this setup).
  • proxy.serviceURL: the name of the Capsule Proxy Service (by default "capsule-proxy.capsule-system.svc" when installed in the capsule-system Namespace).

Rancher Cluster Agent

In both CLI and dashboard use cases, the Cluster Agent is responsible for the two-way communication between Rancher and the downstream cluster.

In a standard setup, the Cluster Agent communicates with the API server. In this setup it communicates with Capsule Proxy instead, to ensure filtering of cluster-scoped resources for Tenants.

The Cluster Agent accepts as configuration:

  • KUBERNETES_SERVICE_HOST environment variable
  • KUBERNETES_SERVICE_PORT environment variable

which will be set, at cluster import-time, to the values of the Capsule Proxy Service. For example:

  • KUBERNETES_SERVICE_HOST=capsule-proxy.capsule-system.svc
  • (optional) KUBERNETES_SERVICE_PORT=9001. You can skip it by installing Capsule Proxy with Helm value service.port=443.

The expected CA is the one for which the certificate is inside the kube-root-ca ConfigMap in the same Namespace of the Cluster Agent (cattle-system).

Capsule Proxy

Capsule Proxy needs to provide an x509 certificate for which the root CA is trusted by the Cluster Agent. This can be achieved either by using the Kubernetes CA to sign its certificate, or by using a dedicated root CA.

With the Kubernetes root CA

Note: this can be achieved when the Kubernetes root CA keypair is accessible. For example, this is likely to be possible with an on-premise setup, but not with managed Kubernetes services.

With this approach, Cert Manager signs certificates with the Kubernetes root CA, which needs to be provided as a Secret.

kubectl create secret tls -n capsule-system kubernetes-ca-key-pair --cert=/path/to/ca.crt --key=/path/to/ca.key

When installing Capsule Proxy with the Helm chart, specify that the Capsule Proxy certificates are to be generated by Cert Manager with an external ClusterIssuer:

  • certManager.externalCA.enabled=true
  • certManager.externalCA.secretName=kubernetes-ca-key-pair
  • certManager.generateCertificates=true

and disable the job for generating the certificates without Cert Manager:

  • options.generateCertificates=false
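
Passed on the command line, the values above might look like the following sketch. The release name `capsule-proxy`, the chart reference `projectcapsule/capsule-proxy` and the `capsule-system` namespace are assumptions; adjust them to your installation. The `proxy_external_ca_cmd` helper is hypothetical and only prints the command.

```shell
#!/bin/sh
# Hypothetical helper: prints a Helm command that enables Cert Manager with an
# external CA secret ($1) and disables the built-in certificate generation job.
proxy_external_ca_cmd() {
  printf '%s\n' "helm upgrade --install capsule-proxy projectcapsule/capsule-proxy -n capsule-system \
--set certManager.generateCertificates=true \
--set certManager.externalCA.enabled=true \
--set certManager.externalCA.secretName=$1 \
--set options.generateCertificates=false"
}

# Print the command for the Secret created earlier in this section.
proxy_external_ca_cmd kubernetes-ca-key-pair
```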

Enable tenant users to access cluster resources

In order to allow tenant users to list cluster-scope resources, like Nodes, Tenants need to be configured with proper proxySettings, for example:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: oil
spec:
  owners:
  - kind: User
    name: alice
    proxySettings:
    - kind: Nodes
      operations:
      - List
[...]

Also, in order to assign or filter nodes per Tenant, the nodes must carry a label that can be selected:

kubectl label node worker-01 capsule.clastix.io/tenant=oil

and a node selector at Tenant level:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: oil
spec:
  nodeSelector:
    capsule.clastix.io/tenant: oil
[...]

The final manifest is:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: oil
spec:
  owners:
  - kind: User
    name: alice
    proxySettings:
    - kind: Nodes
      operations:
      - List
  nodeSelector:
    capsule.clastix.io/tenant: oil

The same applies for:

  • Nodes
  • StorageClasses
  • IngressClasses
  • PriorityClasses

More on this in the official documentation.

Configure OIDC authentication with Keycloak

Pre-requisites

  • Keycloak realm for Rancher
  • Rancher OIDC authentication provider

Keycloak realm for Rancher

These instructions are specific to a setup that uses Keycloak as the OIDC identity provider.

Mappers

  • Add to userinfo Group Membership type, claim name groups
  • Add to userinfo Audience type, claim name client audience
  • Add to userinfo, full group path, Group Membership type, claim name full_group_path

More on this in the official guide.

Rancher OIDC authentication provider

Configure an OIDC authentication provider with a Client whose issuer and return URLs match the Keycloak setup.

Use old and Rancher-standard paths with /auth subpath (see issues below).

Add custom paths, remove /auth subpath in return and issuer URLs.

Configuration

Configure Tenant users

  1. In Rancher, configure OIDC authentication with Keycloak to use with Rancher.
  2. In Keycloak, Create a Group in the rancher Realm: capsule.clastix.io.
  3. In Keycloak, Create a User in the rancher Realm, member of capsule.clastix.io Group.
  4. In the Kubernetes target cluster, update the CapsuleConfiguration by adding the "keycloakoidc_group://capsule.clastix.io" Kubernetes Group.
  5. Login to Rancher with Keycloak with the new user.
  6. In Rancher as an administrator, assign the user a custom role that grants get on Cluster.
  7. In Rancher as an administrator, add the Rancher user ID of the just-logged in user as Owner of a Tenant.
  8. (optional) configure proxySettings for the Tenant to enable tenant users to access cluster-wide resources.
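Step 4 above can be sketched as the following CapsuleConfiguration. This is an illustration only: it assumes the configuration resource is named default (the default of the controller's --configuration-name flag), that it uses the same v1beta2 API version as the Tenant examples, and that the capsule.clastix.io group was already present in the list.

```yaml
apiVersion: capsule.clastix.io/v1beta2
kind: CapsuleConfiguration
metadata:
  name: default   # assumption: default configuration name
spec:
  userGroups:
  # Assumed to be already configured
  - capsule.clastix.io
  # Group name as presented by Rancher for the Keycloak group
  - keycloakoidc_group://capsule.clastix.io
```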

1.4 - Managed Kubernetes

Capsule on managed Kubernetes offerings

Capsule Operator can be easily installed on a managed Kubernetes service. Since you do not have access to the Kubernetes API server, you should check with the service provider that:

  • the default cluster-admin ClusterRole is accessible
  • the following Admission Controllers are enabled on the API server:

    • PodNodeSelector
    • LimitRanger
    • ResourceQuota
    • MutatingAdmissionWebhook
    • ValidatingAdmissionWebhook

AWS EKS

This is an example of how to set up an AWS EKS cluster with one user managed by Capsule. It is based on Using IAM Groups to manage Kubernetes access.

Create EKS cluster:

export AWS_DEFAULT_REGION="eu-west-1"
export AWS_ACCESS_KEY_ID="xxxxx"
export AWS_SECRET_ACCESS_KEY="xxxxx"

eksctl create cluster \
--name=test-k8s \
--managed \
--node-type=t3.small \
--node-volume-size=20 \
--kubeconfig=kubeconfig.conf

Create the AWS user alice using CloudFormation, then create the AWS access files and a kubeconfig for that user:

# Quote the delimiter so the shell does not expand ${ClusterName} or ${AWS::AccountId}
cat > cf.yml << 'EOF'
Parameters:
  ClusterName:
    Type: String
Resources:
  UserAlice:
    Type: AWS::IAM::User
    Properties:
      UserName: !Sub "alice-${ClusterName}"
      Policies:
      - PolicyName: !Sub "alice-${ClusterName}-policy"
        PolicyDocument:
          Version: "2012-10-17"
          Statement:
          - Sid: AllowAssumeOrganizationAccountRole
            Effect: Allow
            Action: sts:AssumeRole
            Resource: !GetAtt RoleAlice.Arn
  AccessKeyAlice:
    Type: AWS::IAM::AccessKey
    Properties:
      UserName: !Ref UserAlice
  RoleAlice:
    Type: AWS::IAM::Role
    Properties:
      Description: !Sub "IAM role for the alice-${ClusterName} user"
      RoleName: !Sub "alice-${ClusterName}"
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
        - Effect: Allow
          Principal:
            AWS: !Sub "arn:aws:iam::${AWS::AccountId}:root"
          Action: sts:AssumeRole
Outputs:
  RoleAliceArn:
    Description: The ARN of the Alice IAM Role
    Value: !GetAtt RoleAlice.Arn
    Export:
      Name:
        Fn::Sub: "${AWS::StackName}-RoleAliceArn"
  AccessKeyAlice:
    Description: The AccessKey for Alice user
    Value: !Ref AccessKeyAlice
    Export:
      Name:
        Fn::Sub: "${AWS::StackName}-AccessKeyAlice"
  SecretAccessKeyAlice:
    Description: The SecretAccessKey for Alice user
    Value: !GetAtt AccessKeyAlice.SecretAccessKey
    Export:
      Name:
        Fn::Sub: "${AWS::StackName}-SecretAccessKeyAlice"
EOF

eval aws cloudformation deploy --capabilities CAPABILITY_NAMED_IAM \
  --parameter-overrides "ClusterName=test-k8s" \
  --stack-name "test-k8s-users" --template-file cf.yml

AWS_CLOUDFORMATION_DETAILS=$(aws cloudformation describe-stacks --stack-name "test-k8s-users")
ALICE_ROLE_ARN=$(echo "${AWS_CLOUDFORMATION_DETAILS}" | jq -r ".Stacks[0].Outputs[] | select(.OutputKey==\"RoleAliceArn\") .OutputValue")
ALICE_USER_ACCESSKEY=$(echo "${AWS_CLOUDFORMATION_DETAILS}" | jq -r ".Stacks[0].Outputs[] | select(.OutputKey==\"AccessKeyAlice\") .OutputValue")
ALICE_USER_SECRETACCESSKEY=$(echo "${AWS_CLOUDFORMATION_DETAILS}" | jq -r ".Stacks[0].Outputs[] | select(.OutputKey==\"SecretAccessKeyAlice\") .OutputValue")

eksctl create iamidentitymapping --cluster="test-k8s" --arn="${ALICE_ROLE_ARN}" --username alice --group capsule.clastix.io

cat > aws_config << EOF
[profile alice]
role_arn=${ALICE_ROLE_ARN}
source_profile=alice
EOF

cat > aws_credentials << EOF
[alice]
aws_access_key_id=${ALICE_USER_ACCESSKEY}
aws_secret_access_key=${ALICE_USER_SECRETACCESSKEY}
EOF

eksctl utils write-kubeconfig --cluster=test-k8s --kubeconfig="kubeconfig-alice.conf"
cat >> kubeconfig-alice.conf << EOF
      - name: AWS_PROFILE
        value: alice
      - name: AWS_CONFIG_FILE
        value: aws_config
      - name: AWS_SHARED_CREDENTIALS_FILE
        value: aws_credentials
EOF

Export “admin” kubeconfig to be able to install Capsule:

export KUBECONFIG=kubeconfig.conf

Install Capsule and create a tenant where alice has ownership. Use the default Tenant example:

kubectl apply -f https://raw.githubusercontent.com/clastix/capsule/master/config/samples/capsule_v1beta1_tenant.yaml
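The referenced sample uses the older v1beta1 API. As a rough equivalent, a minimal Tenant in the v1beta2 API used elsewhere in this guide could look like the sketch below. The tenant name oil is an assumption; the owner name alice matches the username mapped with eksctl above.

```yaml
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: oil   # assumption: any tenant name works here
spec:
  owners:
  # Must match the --username given to eksctl create iamidentitymapping
  - kind: User
    name: alice
```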

Based on the tenant configuration above, the user alice should be able to create a namespace. Switch to a new terminal and try to create a namespace as user alice:

# Unset AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY if defined
unset AWS_ACCESS_KEY_ID
unset AWS_SECRET_ACCESS_KEY
kubectl create namespace test --kubeconfig="kubeconfig-alice.conf"

Azure AKS

This reference implementation introduces the recommended starting (baseline) infrastructure architecture for implementing a multi-tenancy Azure AKS cluster using Capsule. See CoAKS.

Charmed Kubernetes

Canonical Charmed Kubernetes is a Kubernetes distribution coming with out-of-the-box tools that support deployments and operational management and make microservice development easier. Combined with Capsule, Charmed Kubernetes allows users to further reduce the operational overhead of Kubernetes setup and management.

The Charm package for Capsule is available to Charmed Kubernetes users via Charmhub.io.

1.5 - Controller Options

Understand the Capsule configuration options and how to use them.

CapsuleConfiguration

Capsule is configured via its dedicated configuration Custom Resource. You can inspect the configuration options and their descriptions with kubectl explain:

kubectl explain capsuleConfiguration.spec

enableTLSReconciler

Toggles the TLS reconciler, the controller that generates the CA and certificates for the webhooks. Disable it when using an already provided CA and certificate, or when these are managed externally, for example with Vault or cert-manager.

forceTenantPrefix

Enforces the Tenant owner, during Namespace creation, to name it using the selected Tenant name as prefix, separated by a dash. This is useful to avoid Namespace name collision in a public CaaS environment.

nodeMetadata

Sets the forbidden metadata for worker nodes that could otherwise be patched by a Tenant. This applies only if the Tenant has an active NodeSelector and the Owner has the right to patch their nodes.

overrides

Allows setting names other than the canonical ones for the Capsule configuration objects, such as the webhook secret or webhook configurations.

protectedNamespaceRegex

Disallows the creation of namespaces whose names match this regular expression.

userGroups

Names of the groups for Capsule users. Users must have this group to be considered for the Capsule tenancy. If a user does not have any group mentioned here, they are not recognized as a Capsule user.

userNames

Names of the users for Capsule users. Users must have this name to be considered for the Capsule tenancy. If userGroups are set, the properties are ORed, meaning that a user can be recognized as a Capsule user if they have one of the groups or one of the names.

ignoreUserWithGroups

Defines groups which, when found in a user’s request, cause the request to be ignored by Capsule. This might be useful if you have one group that contains all users, but you want to separate administrators from normal users with additional groups.
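To show how these options fit together, here is a hedged sketch of a CapsuleConfiguration that sets several of them. The resource name default follows the controller's --configuration-name default; the regular expression, user name, and group names are purely illustrative, and the exact field types should be confirmed with kubectl explain on your installed version.

```yaml
apiVersion: capsule.clastix.io/v1beta2
kind: CapsuleConfiguration
metadata:
  name: default
spec:
  # Let Capsule generate and reconcile its own webhook certificates
  enableTLSReconciler: true
  # Namespaces must be named <tenant>-<suffix>
  forceTenantPrefix: true
  # Illustrative pattern: block namespaces starting with kube- or capsule-
  protectedNamespaceRegex: "^(kube|capsule)-.*$"
  # A user matching any of these groups OR names is a Capsule user
  userGroups:
  - capsule.clastix.io
  userNames:
  - alice
  # Hypothetical administrators group that Capsule should ignore
  ignoreUserWithGroups:
  - platform-admins
```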

Controller Options

Depending on the version of the Capsule Controller, the configuration options may vary. You can view the options for a given version by executing the controller locally:

$ docker run ghcr.io/projectcapsule/capsule:v0.6.0-rc0 -h
2024/02/25 13:21:21 maxprocs: Leaving GOMAXPROCS=4: CPU quota undefined
Usage of /ko-app/capsule:
      --configuration-name string         The CapsuleConfiguration resource name to use (default "default")
      --enable-leader-election            Enable leader election for controller manager. Enabling this will ensure there is only one active controller manager.
      --metrics-addr string               The address the metric endpoint binds to. (default ":8080")
      --version                           Print the Capsule version and exit
      --webhook-port int                  The port the webhook server binds to. (default 9443)
      --zap-devel                         Development Mode defaults(encoder=consoleEncoder,logLevel=Debug,stackTraceLevel=Warn). Production Mode defaults(encoder=jsonEncoder,logLevel=Info,stackTraceLevel=Error)
      --zap-encoder encoder               Zap log encoding (one of 'json' or 'console')
      --zap-log-level level               Zap Level to configure the verbosity of logging. Can be one of 'debug', 'info', 'error', or any integer value > 0 which corresponds to custom debug levels of increasing verbosity
      --zap-stacktrace-level level        Zap Level at and above which stacktraces are captured (one of 'info', 'error', 'panic').
      --zap-time-encoding time-encoding   Zap time encoding (one of 'epoch', 'millis', 'nano', 'iso8601', 'rfc3339' or 'rfc3339nano'). Defaults to 'epoch'.

2 - Best Practices

Best Practices when running Capsule in production

2.1 - Architecture

Architecture references and considerations

Ownership

In Capsule, we introduce a new persona called the Tenant Owner. The goal is to enable Cluster Administrators to delegate tenant management responsibilities to Tenant Owners. Here’s how it works:

  • Tenant Owners: They manage the namespaces within their tenants and perform administrative tasks confined to their tenant boundaries. This delegation allows teams to operate more autonomously while still adhering to organizational policies.
  • Cluster Administrators: They provision tenants, essentially determining the size and resource allocation of each tenant within the entire cluster. Think of it as defining how big each piece of cake (Tenant) should be within the whole cake (Cluster).

Capsule provides robust tools to strictly enforce tenant boundaries, ensuring that each tenant operates within its defined limits. This separation of duties promotes both security and efficient resource management.

Key Decisions

Introducing a new separation of duties can lead to a significant paradigm shift. This has technical implications and may also impact your organizational structure. Therefore, when designing a multi-tenant platform pattern, carefully consider the following aspects. As Cluster Administrator, ask yourself:

  • 🔑 How much ownership can be delegated to Tenant Owners (Platform Users)?

The answer to this question may be influenced by the following aspects:

  • Are the Cluster Administrators willing to grant permissions to Tenant Owners?

    • You might have a problem with know-how, and your organisation is probably not yet pushing Kubernetes itself enough as a key strategic platform. The key here is enabling Platform Users through good UX and know-how transfers.
  • Who is responsible for the deployed workloads within the Tenants?

    • If Platform Administrators are still handling this, a true “shift left” has not yet been achieved.
  • Who gets paged during a production outage within a Tenant’s application?

    • You’ll need robust monitoring that enables Tenant Owners to clearly understand and manage what’s happening inside their own tenant.
  • Are your customers technically capable of working directly with the Kubernetes API?

    • If not, you may need to build a more user-friendly platform with better UX — for example, a multi-tenant ArgoCD setup, or UI layers like Headlamp.

Layouts

Let’s discuss different Tenant layouts that could be used. These are just approaches we have seen; you might also find a combination of them that fits your use case.

Tenant As A Service

With this approach, you essentially just provide your customers with a Tenant on your cluster. The rest is their responsibility. This results in a shared responsibility model, which works when the Tenant Owners are also responsible for everything they provision within their Tenant’s namespaces.

Resourcepool Dashboard

Scheduling

Workload distribution across your compute infrastructure can be approached in various ways, depending on your specific priorities. Regardless of the use case, it’s essential to preserve maximum flexibility for your platform administrators. This means ensuring that:

  • Nodes can be drained or deleted at any time.
  • Cluster updates can be performed at any time.
  • The number of worker nodes can be scaled up or down as needed.

If your cluster architecture prevents any of these capabilities, or if certain applications block the enforcement of these policies, you should reconsider your approach.

Dedicated

This approach provides strong tenant isolation, ensuring that any noisy neighbor effects remain confined within individual tenants (tenant responsibility). It may involve higher administrative overhead and costs compared to shared compute. It also provides enhanced security by dedicating nodes to a single customer/application. It is recommended, at a minimum, to separate the cluster’s operator workload from customer workloads.

Dedicated Nodepool

Shared

With this approach, you share the nodes amongst all Tenants, therefore giving you more potential for optimizing resources on a node level. It’s a common pattern to separate the controllers needed to power your Distribution (operators) from the actual workload. This ensures smooth operations for the cluster.

Overview:

  • ✅ Designed for cost efficiency.
  • ✅ Suitable for applications that typically experience low resource fluctuations and run with multiple replicas.
  • ❌ Not ideal for applications that are not cloud-native ready, as they may adversely affect the operation of other applications or the maintenance of node pools.
  • ❌ Not ideal if strong isolation is required.

Shared Nodepool

We provide the concept of ResourcePools to manage resources across namespaces. There are some further aspects you must think about with shared approaches:

2.2 - General Advice

General advice to consider when planning your Kubernetes distribution

This is general advice you should consider when planning your Kubernetes distribution. Parts of it are relevant for multi-tenancy with Capsule.

Authentication

User authentication for the platform should be handled via a central OIDC-compatible identity provider system (e.g., Keycloak, Azure AD, Okta, or any other OIDC-compliant provider). The rationale is that other central platform components — such as ArgoCD, Grafana, Headlamp, or Harbor — should also integrate with the same authentication mechanism. This enables a unified login experience and reduces administrative complexity in managing users and permissions.

Capsule relies on native Kubernetes RBAC, so it’s important to consider how the Kubernetes API handles user authentication.

OCI Pull-Cache

By default, Kubernetes clusters pull images directly from upstream registries like docker.io, quay.io, ghcr.io, or gcr.io. In production environments, this can lead to issues — especially because Docker Hub enforces rate limits that may cause image pull failures with just a few nodes or frequent deployments (e.g., when pods are rescheduled).

To ensure availability, performance, and control over container images, it’s essential to provide an on-premise OCI mirror. This mirror should be configured via the CRI (Container Runtime Interface) by defining it as a mirror endpoint in registries.conf for default registries (e.g., docker.io). This way, all nodes automatically benefit from caching without requiring developers to change image URLs.

Secrets Management

In more complex environments with multiple clusters and applications, managing secrets manually via YAML or Helm is no longer practical. Instead, a centralized secrets management system should be established — such as Vault, AWS Secrets Manager, Azure Key Vault, or the CNCF project OpenBao (formerly the Vault community fork).

To integrate these external secret stores with Kubernetes, the External Secrets Operator (ESO) is a recommended solution. It automatically syncs defined secrets from external sources as Kubernetes secrets, and supports dynamic rotation, access control, and auditing.
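As an illustration of the sync described above, an ExternalSecret could look roughly like the following sketch. All names and paths are hypothetical, a ClusterSecretStore named vault-backend is assumed to exist already, and the apiVersion depends on the installed External Secrets Operator release (older releases serve v1beta1, newer ones v1).

```yaml
apiVersion: external-secrets.io/v1beta1   # or external-secrets.io/v1 on newer releases
kind: ExternalSecret
metadata:
  name: database-credentials        # hypothetical
  namespace: solar-production       # a tenant namespace
spec:
  # How often the external value is re-read, which enables rotation
  refreshInterval: 1h
  secretStoreRef:
    kind: ClusterSecretStore
    name: vault-backend             # hypothetical, must already exist
  target:
    # Name of the Kubernetes Secret that ESO creates and keeps in sync
    name: database-credentials
  data:
  - secretKey: password             # key inside the Kubernetes Secret
    remoteRef:
      key: solar/database           # hypothetical path in the external store
      property: password
```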

If no external secret store is available, there should at least be a secure way to store sensitive data in Git. In our ecosystem, we provide a solution based on SOPS (Secrets OPerationS) for this use case.

👉 Demonstration

2.3 - Workloads

Control the security of the workloads running in the tenant namespaces

User Namespaces

A process running as root in a container can run as a different (non-root) user in the host; in other words, the process has full privileges for operations inside the user namespace, but is unprivileged for operations outside the namespace. Read More

Kubelet

On your Kubelet, you must enable the following feature gates:

  • UserNamespacesSupport
  • UserNamespacesPodSecurityStandards (Optional)
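In a kubelet configuration file, the feature gates listed above could be enabled as in the sketch below. How this file reaches your nodes depends on your distribution and is not covered here.

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  # Required for pods to run in their own user namespace
  UserNamespacesSupport: true
  # Optional, as noted in the list above
  UserNamespacesPodSecurityStandards: true
```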

Sysctls

user.max_user_namespaces: "11255"

Admission (Kyverno)

To make sure all workloads are forced to use dedicated User Namespaces, we recommend mutating pods at admission. See the following examples.

Kyverno

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-hostusers-spec
  annotations:
    policies.kyverno.io/title: Add HostUsers
    policies.kyverno.io/category: Security
    policies.kyverno.io/subject: Pod,User Namespace
    kyverno.io/kubernetes-version: "1.31"
    policies.kyverno.io/description: >-
      Do not use the host's user namespace. A new userns is created for the pod. 
      Setting false is useful for mitigating container breakout vulnerabilities even allowing users to run their containers as root
      without actually having root privileges on the host. This field is
      alpha-level and is only honored by servers that enable the
      UserNamespacesSupport feature.      
spec:
  rules:
  - name: add-host-users
    match:
      any:
      - resources:
          kinds:
          - Pod
          namespaceSelector:
            matchExpressions:
            - key: capsule.clastix.io/tenant
              operator: Exists
    preconditions:
      all:
      - key: "{{request.operation || 'BACKGROUND'}}"
        operator: AnyIn
        value:
          - CREATE
          - UPDATE
    mutate:
      patchStrategicMerge:
        spec:
          hostUsers: false
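With the policy above in place, a pod created in a tenant namespace would be expected to end up with hostUsers: false in its spec. The following sketch shows such a mutated pod; the pod name, image, and namespace are illustrative.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  namespace: solar-production   # a namespace carrying the capsule.clastix.io/tenant label
spec:
  # Added by the add-host-users rule at admission
  hostUsers: false
  containers:
  - name: nginx
    image: nginx
```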

Pod Security Standards

In Kubernetes, by default, workloads run with administrative access, which might be acceptable if there is only a single application running in the cluster or a single user accessing it. This is seldom required and you’ll consequently suffer a noisy neighbour effect along with large security blast radiuses.

Many of these concerns were addressed initially by PodSecurityPolicies which have been present in the Kubernetes APIs since the very early days.

Pod Security Policies were deprecated in Kubernetes 1.21 and removed entirely in 1.25. As a replacement, Pod Security Standards and Pod Security Admission have been introduced. Capsule supports the new standard for tenants under its control, as well as the older approach.

One of the issues with Pod Security Policies is that it is difficult to apply restrictive permissions on a granular level, increasing security risk. Also the Pod Security Policies get applied when the request is submitted and there is no way of applying them to pods that are already running. For these, and other reasons, the Kubernetes community decided to deprecate the Pod Security Policies.

With Pod Security Policies deprecated and removed, the Pod Security Standards are used in their place. They define three different policies to broadly cover the security spectrum. These policies are cumulative and range from highly-permissive to highly-restrictive:

  • Privileged: unrestricted policy, providing the widest possible level of permissions.
  • Baseline: minimally restrictive policy which prevents known privilege escalations.
  • Restricted: heavily restricted policy, following current Pod hardening best practices.

Kubernetes provides a built-in Admission Controller to enforce the Pod Security Standards at either:

  1. cluster level which applies a standard configuration to all namespaces in a cluster
  2. namespace level, one namespace at a time

For the first case, the cluster admin has to configure the Admission Controller and pass the configuration to the kube-apiserver by means of the --admission-control-config-file extra argument, for example:

apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
- name: PodSecurity
  configuration:
    apiVersion: pod-security.admission.config.k8s.io/v1beta1
    kind: PodSecurityConfiguration
    defaults:
      enforce: "baseline"
      enforce-version: "latest"
      warn: "restricted"
      warn-version: "latest"
      audit: "restricted"
      audit-version: "latest"
    exemptions:
      usernames: []
      runtimeClasses: []
      namespaces: [kube-system]

For the second case, the cluster admin can simply assign labels to the namespace where the policy should be enforced, since the Pod Security Admission Controller is enabled by default starting from Kubernetes 1.23:

apiVersion: v1
kind: Namespace
metadata:
  labels:
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted
  name: development

Capsule

According to the regular Kubernetes segregation model, the cluster admin has to operate either at cluster level or at namespace level. Since Capsule introduces a further segregation level (the Tenant abstraction), the cluster admin can implement Pod Security Standards at tenant level by simply forcing specific labels on all the namespaces created in the tenant.

You can distribute these profiles per namespace. Here’s what this could look like:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  namespaceOptions:
    additionalMetadataList:
    - namespaceSelector:
        matchExpressions:
          - key: projectcapsule.dev/low_security_profile
            operator: NotIn
            values: ["system"]
      labels:
        pod-security.kubernetes.io/enforce: baseline
        pod-security.kubernetes.io/warn: restricted
        pod-security.kubernetes.io/audit: restricted
    - namespaceSelector:
        matchExpressions:
          - key: company.com/env
            operator: In
            values: ["system"]
      labels:
        pod-security.kubernetes.io/enforce: privileged
        pod-security.kubernetes.io/warn: privileged
        pod-security.kubernetes.io/audit: privileged

All namespaces created by the tenant owner will inherit the Pod Security labels:

apiVersion: v1
kind: Namespace
metadata:
  labels:
    capsule.clastix.io/tenant: solar
    kubernetes.io/metadata.name: solar-development
    name: solar-development
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted
  name: solar-development
  ownerReferences:
  - apiVersion: capsule.clastix.io/v1beta2
    blockOwnerDeletion: true
    controller: true
    kind: Tenant
    name: solar

and the regular Pod Security Admission Controller does the magic:

kubectl --kubeconfig alice-solar.kubeconfig apply -f - << EOF
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  namespace: solar-production
spec:
  containers:
  - image: nginx
    name: nginx
    ports:
    - containerPort: 80
    securityContext:
      privileged: true
EOF

The request gets denied:

Error from server (Forbidden): error when creating "STDIN":
pods "nginx" is forbidden: violates PodSecurity "baseline:latest": privileged
(container "nginx" must not set securityContext.privileged=true)

If the tenant owner tries to change or delete the above labels, Capsule will reconcile them to the original tenant manifest set by the cluster admin.

As an additional security measure, the cluster admin can also prevent the tenant owner from misusing the above labels:

kubectl annotate tenant solar \
  capsule.clastix.io/forbidden-namespace-labels-regexp="pod-security.kubernetes.io\/(enforce|warn|audit)"

In that case, the tenant owner gets denied if she tries to use the labels:

kubectl --kubeconfig alice-solar.kubeconfig label ns solar-production \
    pod-security.kubernetes.io/enforce=restricted \
    --overwrite

Error from server (Label pod-security.kubernetes.io/audit is forbidden for namespaces in the current Tenant ...

Pod Security Policies

As stated in the documentation, “PodSecurityPolicies enable fine-grained authorization of pod creation and updates. A Pod Security Policy is a cluster-level resource that controls security sensitive aspects of the pod specification. The PodSecurityPolicy objects define a set of conditions that a pod must run with in order to be accepted into the system, as well as defaults for the related fields.”

Using the Pod Security Policies, the cluster admin can impose limits on pod creation, for example the types of volume that can be consumed, the linux user that the process runs as in order to avoid running things as root, and more. From multi-tenancy point of view, the cluster admin has to control how users run pods in their tenants with a different level of permission on tenant basis.

Assume the Kubernetes cluster has been configured with Pod Security Policy Admission Controller enabled in the APIs server: --enable-admission-plugins=PodSecurityPolicy

The cluster admin creates a PodSecurityPolicy:

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: psp:restricted
spec:
  privileged: false
  # Required to prevent escalations to root.
  allowPrivilegeEscalation: false

Then the cluster admin creates a ClusterRole that grants use of this policy:

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: psp:restricted
rules:
- apiGroups: ['policy']
  resources: ['podsecuritypolicies']
  resourceNames: ['psp:restricted']
  verbs: ['use']

The cluster admin can assign this role to all namespaces in a tenant by setting it in the tenant manifest:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: alice
    kind: User
  additionalRoleBindings:
  - clusterRoleName: psp:restricted
    subjects:
    - kind: "Group"
      apiGroup: "rbac.authorization.k8s.io"
      name: "system:authenticated"

With the given specification, Capsule will ensure that all tenant namespaces contain a RoleBinding for the specified Cluster Role:

kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: 'capsule-solar-psp:restricted'
  namespace: solar-production
  labels:
    capsule.clastix.io/tenant: solar
subjects:
  - kind: Group
    apiGroup: rbac.authorization.k8s.io
    name: 'system:authenticated'
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: 'psp:restricted'

The psp:restricted policy bound through the above Cluster Role forbids the tenant owner from running privileged pods in the solar-production namespace and from performing privilege escalation.

As tenant owner, create a namespace:

kubectl --kubeconfig alice-solar.kubeconfig create ns solar-production

and create a pod with privileged permissions:

kubectl --kubeconfig alice-solar.kubeconfig apply -f - << EOF
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  namespace: solar-production
spec:
  containers:
  - image: nginx
    name: nginx
    ports:
    - containerPort: 80
    securityContext:
      privileged: true
EOF

Since the assigned PodSecurityPolicy explicitly disallows privileged containers, the tenant owner will see her request rejected by the Pod Security Policy Admission Controller.

2.4 - Networking

Multi-Tenant Networking considerations

Network-Policies

It’s a best practice to not allow any traffic outside of a tenant (or a tenant’s namespace). For this, we can use Tenant Replications to ensure NetworkPolicies are in place in every namespace.

The following NetworkPolicy is distributed to all namespaces which belong to a Capsule tenant:

apiVersion: capsule.clastix.io/v1beta2
kind: GlobalTenantResource
metadata:
  name: default-networkpolicies
  namespace: solar-system
spec:
  resyncPeriod: 60s
  resources:
    - rawItems:
        - apiVersion: networking.k8s.io/v1
          kind: NetworkPolicy
          metadata:
            name: default-policy
          spec:
            # Apply to all pods in this namespace
            podSelector: {}
            policyTypes:
              - Ingress
              - Egress
            ingress:
              # Allow traffic from the same namespace (intra-namespace communication)
              - from:
                  - podSelector: {}

              # Allow traffic from all namespaces within the tenant
              - from:
                  - namespaceSelector:
                      matchLabels:
                        capsule.clastix.io/tenant: "{{tenant.name}}"

              # Allow ingress from other namespaces labeled (System Namespaces, eg. Monitoring, Ingress)
              - from:
                  - namespaceSelector:
                      matchLabels:
                        company.com/system: "true"

            egress:
              # Allow DNS to kube-dns service IP (might be different in your setup)
              - to:
                  - ipBlock:
                      cidr: 10.96.0.10/32
                ports:
                  - protocol: UDP
                    port: 53
                  - protocol: TCP
                    port: 53

              # Allow traffic to all namespaces within the tenant
              - to:
                  - namespaceSelector:
                      matchLabels:
                        capsule.clastix.io/tenant: "{{tenant.name}}"

Deny Namespace Metadata

In the above example we allow traffic from namespaces with the label company.com/system: "true". This is meant for Kubernetes operators that need to, for example, scrape the workloads within a tenant. However, without further enforcement any namespace can set this label and thereby gain access to any tenant namespace. To prevent this, we must restrict who can set this label on namespaces.

We can deny such labels on a per-tenant basis. In this scenario, every tenant should disallow the use of these labels on its namespaces:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  namespaceOptions:
    forbiddenLabels:
      denied:
        - company.com/system

Alternatively, you can implement a Kyverno policy that enforces the same restriction.

Non-Native Network-Policies

The same principle can be applied with alternative CNI solutions. In this example we are using Cilium:

apiVersion: capsule.clastix.io/v1beta2
kind: GlobalTenantResource
metadata:
  name: default-networkpolicies
  namespace: solar-system
spec:
  resyncPeriod: 60s
  resources:
    - rawItems:
        - apiVersion: cilium.io/v2
          kind: CiliumNetworkPolicy
          metadata:
            name: default-policy
          spec:
            endpointSelector: {}  # Apply to all pods in the namespace
            ingress:
              - fromEndpoints:
                  - matchLabels: {}  # Same namespace pods (intra-namespace)
              - fromEntities:
                  - cluster  # For completeness; can be used to allow internal cluster traffic if needed
              - fromEndpoints:
                  - matchLabels:
                      capsule.clastix.io/tenant: "{{tenant.name}}"  # Pods in other namespaces with same tenant
              - fromNamespaces:
                  - matchLabels:
                      company.com/system: "true"  # System namespaces (monitoring, ingress, etc.)
          
            egress:
              - toCIDR:
                  - 10.96.0.10/32  # kube-dns IP
                toPorts:
                  - ports:
                      - port: "53"
                        protocol: UDP
                      - port: "53"
                        protocol: TCP
          
              - toNamespaces:
                  - matchLabels:
                      capsule.clastix.io/tenant: "{{tenant.name}}"  # Egress to all tenant namespaces

2.5 - Container Images

Multi-Tenant Container Images considerations

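Until this issue is resolved (possibly in Kubernetes 1.34), it’s recommended to use the imagePullPolicy Always for images from private registries on shared nodes. With Always, the kubelet checks with the registry on every container start, using the pod’s own pull credentials, so a tenant cannot run a private image merely because another tenant already pulled it to the node.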
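To check which workloads already follow this recommendation, a small filter like the following may help. It is a minimal sketch and not part of Capsule: the function name non_always_pods and the namespace solar-prod are illustrative assumptions, and only regular containers (not init containers) are inspected.

```shell
# Hypothetical helper: reads lines of the form "<pod>\t<policy> <policy> ..."
# on stdin and prints every pod with at least one container whose
# imagePullPolicy is not Always.
non_always_pods() {
  awk -F '\t' '{
    n = split($2, policies, " ")
    for (i = 1; i <= n; i++) {
      if (policies[i] != "Always") { print $1; break }
    }
  }'
}

# Example usage against a tenant namespace (assumed name: solar-prod):
# kubectl get pods -n solar-prod \
#   -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{range .spec.containers[*]}{.imagePullPolicy}{" "}{end}{"\n"}{end}' \
#   | non_always_pods
```

The filter only reads text, so it can be tried on saved output before it is pointed at a live cluster.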

3 - Authentication

Integrate Capsule with Authentication of your Kubernetes cluster

Capsule does not care about the authentication strategy used in the cluster; all Kubernetes authentication methods are supported. The only requirement for using Capsule is to assign tenant users to the group defined by the userGroups option in the CapsuleConfiguration, which defaults to projectcapsule.dev.

OIDC

In the following guide, we’ll use Keycloak, an open source Identity and Access Management server capable of authenticating users via OIDC and issuing JWT tokens as proof of authentication.

Configuring OIDC Server

Configure Keycloak as OIDC server:

  • Add a realm called caas, or use any existing realm instead
  • Add a group projectcapsule.dev
  • Add a user alice assigned to group projectcapsule.dev
  • Add an OIDC client called kubernetes (Public)
  • Add an OIDC client called kubernetes-auth (Confidential (Client Secret))

For the kubernetes client, create protocol mappers called groups and audience. If everything is done correctly, you should now be able to authenticate in Keycloak and see the user groups in the JWT tokens. Use the following snippet to authenticate in Keycloak as the user alice:

$ KEYCLOAK=sso.clastix.io
$ REALM=caas
$ OIDC_ISSUER=${KEYCLOAK}/realms/${REALM}

$ curl -k -s https://${OIDC_ISSUER}/protocol/openid-connect/token \
     -d grant_type=password \
     -d response_type=id_token \
     -d scope=openid \
     -d client_id=${OIDC_CLIENT_ID} \
     -d client_secret=${OIDC_CLIENT_SECRET} \
     -d username=${USERNAME} \
     -d password=${PASSWORD} | jq

The result will include an ACCESS_TOKEN, a REFRESH_TOKEN, and an ID_TOKEN. The access token can generally be disregarded for Kubernetes: it would be used if the identity provider managed roles and permissions for the users, but that is done in Kubernetes itself with RBAC. The ID token is short-lived, while the refresh token has a longer expiration. The refresh token is used to fetch a new ID token when the current one expires.

{  
   "access_token":"ACCESS_TOKEN",
   "refresh_token":"REFRESH_TOKEN",
   "id_token": "ID_TOKEN",
   "token_type":"bearer",
   "scope": "openid groups profile email"
}

To introspect the ID_TOKEN, run:

$ curl -k -s https://${OIDC_ISSUER}/protocol/openid-connect/introspect \
     -d token=${ID_TOKEN} \
     --user kubernetes-auth:${OIDC_CLIENT_SECRET} | jq

The result will be like the following:

{
  "exp": 1601323086,
  "iat": 1601322186,
  "aud": "kubernetes",
  "typ": "ID",
  "azp": "kubernetes",
  "preferred_username": "alice",
  "email_verified": false,
  "acr": "1",
  "groups": [
    "projectcapsule.dev"
  ],
  "client_id": "kubernetes",
  "username": "alice",
  "active": true
}
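The claims can also be inspected locally, without calling the introspection endpoint, because the payload of a JWT is just base64url-encoded JSON. The following is a minimal sketch under a few assumptions: the function name jwt_payload is hypothetical, base64 -d is assumed to be available (GNU coreutils), and the signature is not verified, so this is suitable for debugging only.

```shell
# Hypothetical helper: print the decoded payload (claims) of a JWT.
# The signature is NOT verified; use this for inspection only.
jwt_payload() {
  local payload pad
  # The payload is the second dot-separated segment, encoded as base64url.
  payload=$(printf '%s' "$1" | cut -d '.' -f 2 | tr '_-' '/+')
  # base64url drops the trailing "=" padding; restore it before decoding.
  pad=$(( (4 - ${#payload} % 4) % 4 ))
  while [ "$pad" -gt 0 ]; do
    payload="${payload}="
    pad=$((pad - 1))
  done
  printf '%s' "$payload" | base64 -d
}

# Example usage: jwt_payload "${ID_TOKEN}"
```

If the groups claim in the output does not contain the group configured in userGroups, Capsule will not treat the user as a tenant user.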

Configuring Kubernetes API Server

Configuring Kubernetes for OIDC authentication requires adding several parameters to the API server. Please refer to the Kubernetes documentation for details and examples. There are two ways to do this: a structured authentication configuration file, or the legacy OIDC flags in the kube-apiserver.yaml manifest.

The configuration file approach allows you to configure multiple JWT authenticators, each with a unique issuer.url and issuer.discoveryURL. The configuration file even allows you to specify CEL expressions to map claims to user attributes, and to validate claims and user information.

apiVersion: apiserver.config.k8s.io/v1beta1
kind: AuthenticationConfiguration
jwt:
- issuer:
    url: https://${OIDC_ISSUER}
    # certificateAuthority is a field of the issuer block
    certificateAuthority: <PEM encoded CA certificates>
    audiences:
    - kubernetes
    - kubernetes-auth
    audienceMatchPolicy: MatchAny
  claimMappings:
    username:
      claim: 'email'
      prefix: ""
    groups:
      claim: 'groups'
      prefix: ""

This file must be present and consistent across all kube-apiserver instances in the cluster. Add the following flag to the kube-apiserver manifest:

spec:
  containers:
  - command:
    - kube-apiserver
    ...
    - --authentication-configuration-file=/etc/kubernetes/authentication/authentication.yaml

OIDC Flags (Legacy)

spec:
  containers:
  - command:
    - kube-apiserver
    ...
    - --oidc-issuer-url=https://${OIDC_ISSUER}
    - --oidc-ca-file=/etc/kubernetes/oidc/ca.crt
    - --oidc-client-id=kubernetes
    - --oidc-username-claim=preferred_username
    - --oidc-groups-claim=groups
    - --oidc-username-prefix=-

KinD

As a reference, here is an example KinD configuration for OIDC authentication, which can be useful for local testing:

apiVersion: kind.x-k8s.io/v1alpha4
kind: Cluster
nodes:
  - role: control-plane
    kubeadmConfigPatches:
     - |
       kind: ClusterConfiguration
       apiServer:
           extraArgs:
             oidc-issuer-url: https://${OIDC_ISSUER}
             oidc-username-claim: preferred_username
             oidc-client-id: kubernetes
             oidc-username-prefix: "keycloak:"
             oidc-groups-claim: groups
             oidc-groups-prefix: "keycloak:"
             enable-admission-plugins: PodNodeSelector       

Configuring kubectl

There are two options to use kubectl with OIDC:

  • OIDC Authenticator
  • Use the --token option

Plugin

One way to use OIDC authentication is the use of a kubectl plugin. The Kubelogin Plugin for kubectl simplifies the process of obtaining an OIDC token and configuring kubectl to use it. Follow the link to obtain installation instructions.

kubectl oidc-login setup \
	--oidc-issuer-url=https://${OIDC_ISSUER} \
	--oidc-client-id=kubernetes-auth \
	--oidc-client-secret=${OIDC_CLIENT_SECRET}

Manual

To use the OIDC Authenticator, add an oidc user entry to your kubeconfig file:

$ kubectl config set-credentials oidc \
    --auth-provider=oidc \
    --auth-provider-arg=idp-issuer-url=https://${OIDC_ISSUER} \
    --auth-provider-arg=idp-certificate-authority=/path/to/ca.crt \
    --auth-provider-arg=client-id=kubernetes-auth \
    --auth-provider-arg=client-secret=${OIDC_CLIENT_SECRET} \
    --auth-provider-arg=refresh-token=${REFRESH_TOKEN} \
    --auth-provider-arg=id-token=${ID_TOKEN} \
    --auth-provider-arg=extra-scopes=groups

To use the --token option:

$ kubectl config set-credentials oidc --token=${ID_TOKEN}

Point kubectl to the URL where the Kubernetes API server is reachable:

$ kubectl config set-cluster mycluster \
    --server=https://kube.projectcapulse.io:6443 \
    --certificate-authority="${HOME}/.kube/ca.crt"

If your API server is reachable through the capsule-proxy, make sure to use the URL of the capsule-proxy.

Create a new context for the OIDC authenticated users:

$ kubectl config set-context alice-oidc@mycluster \
    --cluster=mycluster \
    --user=oidc

As user alice, you should be able to use kubectl to create some namespaces:

$ kubectl --context alice-oidc@mycluster create namespace oil-production
$ kubectl --context alice-oidc@mycluster create namespace oil-development
$ kubectl --context alice-oidc@mycluster create namespace gas-marketing

Warning: once your ID_TOKEN expires, the kubectl OIDC Authenticator will automatically attempt to refresh it using the REFRESH_TOKEN. If the OIDC server uses a self-signed CA certificate, make sure to specify it with the idp-certificate-authority option in your kubeconfig file; otherwise you will not be able to refresh the tokens.
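When requests start failing with authentication errors, it can help to check whether the ID_TOKEN has simply expired. The sketch below compares the exp claim (seconds since the Unix epoch, as shown in the introspection result) with the current time. The function names claim_exp and is_expired are hypothetical, and the sed-based extraction is a simplification that assumes exp appears as a plain integer in the JSON.

```shell
# Hypothetical helper: print the "exp" claim from a JSON claims document on stdin.
claim_exp() {
  sed -n 's/.*"exp"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: succeed (exit 0) when a token with expiry $1 is
# expired at time $2 (defaults to the current time).
is_expired() {
  local exp now
  exp=$1
  now=${2:-$(date +%s)}
  [ "$now" -ge "$exp" ]
}

# Example usage, with the claims JSON saved in claims.json (assumed file name):
# if is_expired "$(claim_exp < claims.json)"; then echo "ID_TOKEN expired"; fi
```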

4 - Monitoring

Monitoring Capsule Items and Tenants

The Capsule dashboard allows you to track the health and performance of the Capsule manager and tenants, with particular attention to resource saturation, server responses, and latencies. Prometheus and Grafana are required for monitoring Capsule.

ResourcePools

Instrumentation for ResourcePools.

Dashboards

Dashboards can be deployed via the Helm chart by enabling the following values:

monitoring:
  dashboards:
    enabled: true

Capsule / ResourcePools

A dashboard that provides a detailed overview of the ResourcePools.


Rules

Example rules to give you an idea of what’s possible.

  1. Alert on ResourcePools usage
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: resourcepool-usage-alert
spec:
  groups:
    - name: capsule-pool-usage.rules
      rules:
        - alert: CapsulePoolHighUsageWarning
          expr: |
            capsule_pool_usage_percentage > 90
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: High resource usage in Resourcepool
            description: |
              Resource {{ $labels.resource }} in pool {{ $labels.pool }} is at {{ $value }}% usage for the last 10 minutes.

        - alert: CapsulePoolHighUsageCritical
          expr: |
            capsule_pool_usage_percentage > 95
          for: 10m
          labels:
            severity: critical
          annotations:
            summary: Critical resource usage in Resourcepool
            description: |
              Resource {{ $labels.resource }} in pool {{ $labels.pool }} has exceeded 95% usage for the last 10 minutes.

Metrics

The following Metrics are exposed and can be used for monitoring:

# HELP capsule_claim_condition The current condition status of a claim.
# TYPE capsule_claim_condition gauge
capsule_claim_condition{condition="Bound",name="compute",pool="solar-compute",reason="Succeeded",target_namespace="solar-prod"} 1
capsule_claim_condition{condition="Bound",name="compute-10",pool="solar-compute",reason="PoolExhausted",target_namespace="solar-prod"} 0
capsule_claim_condition{condition="Bound",name="compute-2",pool="solar-compute",reason="Succeeded",target_namespace="solar-prod"} 1
capsule_claim_condition{condition="Bound",name="compute-3",pool="solar-compute",reason="Succeeded",target_namespace="solar-prod"} 1
capsule_claim_condition{condition="Bound",name="compute-4",pool="solar-compute",reason="Succeeded",target_namespace="solar-test"} 1
capsule_claim_condition{condition="Bound",name="compute-5",pool="solar-compute",reason="PoolExhausted",target_namespace="solar-test"} 0
capsule_claim_condition{condition="Bound",name="compute-6",pool="solar-compute",reason="PoolExhausted",target_namespace="solar-test"} 0
capsule_claim_condition{condition="Bound",name="pods",pool="solar-size",reason="Succeeded",target_namespace="solar-test"} 1

# HELP capsule_claim_resource The given amount of resources from the claim
# TYPE capsule_claim_resource gauge
capsule_claim_resource{name="compute",resource="limits.cpu",target_namespace="solar-prod"} 0.375
capsule_claim_resource{name="compute",resource="limits.memory",target_namespace="solar-prod"} 4.02653184e+08
capsule_claim_resource{name="compute",resource="requests.cpu",target_namespace="solar-prod"} 0.375
capsule_claim_resource{name="compute",resource="requests.memory",target_namespace="solar-prod"} 4.02653184e+08
capsule_claim_resource{name="compute-10",resource="limits.memory",target_namespace="solar-prod"} 1.073741824e+10
capsule_claim_resource{name="compute-2",resource="limits.cpu",target_namespace="solar-prod"} 0.5
capsule_claim_resource{name="compute-2",resource="limits.memory",target_namespace="solar-prod"} 5.36870912e+08
capsule_claim_resource{name="compute-2",resource="requests.cpu",target_namespace="solar-prod"} 0.5
capsule_claim_resource{name="compute-2",resource="requests.memory",target_namespace="solar-prod"} 5.36870912e+08
capsule_claim_resource{name="compute-3",resource="requests.cpu",target_namespace="solar-prod"} 0.5
capsule_claim_resource{name="compute-4",resource="requests.cpu",target_namespace="solar-test"} 0.5
capsule_claim_resource{name="compute-5",resource="requests.cpu",target_namespace="solar-test"} 0.5
capsule_claim_resource{name="compute-6",resource="requests.cpu",target_namespace="solar-test"} 5
capsule_claim_resource{name="pods",resource="pods",target_namespace="solar-test"} 3

# HELP capsule_pool_available Current resource availability for a given resource in a resource pool
# TYPE capsule_pool_available gauge
capsule_pool_available{pool="solar-compute",resource="limits.cpu"} 1.125
capsule_pool_available{pool="solar-compute",resource="limits.memory"} 1.207959552e+09
capsule_pool_available{pool="solar-compute",resource="requests.cpu"} 0.125
capsule_pool_available{pool="solar-compute",resource="requests.memory"} 1.207959552e+09
capsule_pool_available{pool="solar-size",resource="pods"} 4

# HELP capsule_pool_exhaustion Resources become exhausted, when there's not enough available for all claims and the claims get queued
# TYPE capsule_pool_exhaustion gauge
capsule_pool_exhaustion{pool="solar-compute",resource="limits.memory"} 1.073741824e+10
capsule_pool_exhaustion{pool="solar-compute",resource="requests.cpu"} 5.5

# HELP capsule_pool_exhaustion_percentage Resources become exhausted, when there's not enough available for all claims and the claims get queued (Percentage)
# TYPE capsule_pool_exhaustion_percentage gauge
capsule_pool_exhaustion_percentage{pool="solar-compute",resource="limits.memory"} 788.8888888888889
capsule_pool_exhaustion_percentage{pool="solar-compute",resource="requests.cpu"} 4300

# HELP capsule_pool_limit Current resource limit for a given resource in a resource pool
# TYPE capsule_pool_limit gauge
capsule_pool_limit{pool="solar-compute",resource="limits.cpu"} 2
capsule_pool_limit{pool="solar-compute",resource="limits.memory"} 2.147483648e+09
capsule_pool_limit{pool="solar-compute",resource="requests.cpu"} 2
capsule_pool_limit{pool="solar-compute",resource="requests.memory"} 2.147483648e+09
capsule_pool_limit{pool="solar-size",resource="pods"} 7

# HELP capsule_pool_namespace_usage Current resources claimed on namespace basis for a given resource in a resource pool for a specific namespace
# TYPE capsule_pool_namespace_usage gauge
capsule_pool_namespace_usage{pool="solar-compute",resource="limits.cpu",target_namespace="solar-prod"} 0.875
capsule_pool_namespace_usage{pool="solar-compute",resource="limits.memory",target_namespace="solar-prod"} 9.39524096e+08
capsule_pool_namespace_usage{pool="solar-compute",resource="requests.cpu",target_namespace="solar-prod"} 1.375
capsule_pool_namespace_usage{pool="solar-compute",resource="requests.cpu",target_namespace="solar-test"} 0.5
capsule_pool_namespace_usage{pool="solar-compute",resource="requests.memory",target_namespace="solar-prod"} 9.39524096e+08
capsule_pool_namespace_usage{pool="solar-size",resource="pods",target_namespace="solar-test"} 3

# HELP capsule_pool_namespace_usage_percentage Current resources claimed on namespace basis for a given resource in a resource pool for a specific namespace (percentage)
# TYPE capsule_pool_namespace_usage_percentage gauge
capsule_pool_namespace_usage_percentage{pool="solar-compute",resource="limits.cpu",target_namespace="solar-prod"} 43.75
capsule_pool_namespace_usage_percentage{pool="solar-compute",resource="limits.memory",target_namespace="solar-prod"} 43.75
capsule_pool_namespace_usage_percentage{pool="solar-compute",resource="requests.cpu",target_namespace="solar-prod"} 68.75
capsule_pool_namespace_usage_percentage{pool="solar-compute",resource="requests.cpu",target_namespace="solar-test"} 25
capsule_pool_namespace_usage_percentage{pool="solar-compute",resource="requests.memory",target_namespace="solar-prod"} 43.75
capsule_pool_namespace_usage_percentage{pool="solar-size",resource="pods",target_namespace="solar-test"} 42.857142857142854

# HELP capsule_pool_resource Type of resource being used in a resource pool
# TYPE capsule_pool_resource gauge
capsule_pool_resource{pool="solar-compute",resource="limits.cpu"} 1
capsule_pool_resource{pool="solar-compute",resource="limits.memory"} 1
capsule_pool_resource{pool="solar-compute",resource="requests.cpu"} 1
capsule_pool_resource{pool="solar-compute",resource="requests.memory"} 1
capsule_pool_resource{pool="solar-size",resource="pods"} 1

# HELP capsule_pool_usage Current resource usage for a given resource in a resource pool
# TYPE capsule_pool_usage gauge
capsule_pool_usage{pool="solar-compute",resource="limits.cpu"} 0.875
capsule_pool_usage{pool="solar-compute",resource="limits.memory"} 9.39524096e+08
capsule_pool_usage{pool="solar-compute",resource="requests.cpu"} 1.875
capsule_pool_usage{pool="solar-compute",resource="requests.memory"} 9.39524096e+08
capsule_pool_usage{pool="solar-size",resource="pods"} 3

# HELP capsule_pool_usage_percentage Current resource usage for a given resource in a resource pool (percentage)
# TYPE capsule_pool_usage_percentage gauge
capsule_pool_usage_percentage{pool="solar-compute",resource="limits.cpu"} 43.75
capsule_pool_usage_percentage{pool="solar-compute",resource="limits.memory"} 43.75
capsule_pool_usage_percentage{pool="solar-compute",resource="requests.cpu"} 93.75
capsule_pool_usage_percentage{pool="solar-compute",resource="requests.memory"} 43.75
capsule_pool_usage_percentage{pool="solar-size",resource="pods"} 42.857142857142854
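The sample values above are consistent with the percentage metrics being usage divided by limit, times 100. For example, requests.cpu in solar-compute uses 1.875 of a limit of 2, which gives 93.75. The sketch below reproduces that arithmetic and maps the result to the thresholds of the example alert rules. The function names are hypothetical, and the formula is inferred from the sample output rather than taken from the Capsule source.

```shell
# Percentage of a pool limit that is in use: usage / limit * 100
usage_percentage() {
  awk -v usage="$1" -v limit="$2" 'BEGIN { printf "%.2f\n", usage / limit * 100 }'
}

# Map a percentage to the severity used by the example alerts
# (> 95 critical, > 90 warning, otherwise none).
alert_severity() {
  awk -v pct="$1" 'BEGIN {
    if (pct > 95) print "critical"
    else if (pct > 90) print "warning"
    else print "none"
  }'
}

# Example usage with the requests.cpu values of the solar-compute pool:
# alert_severity "$(usage_percentage 1.875 2)"
```

With these sample values, only the warning rule would match for requests.cpu, and only if the value stays above 90 for the configured 10 minutes.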

Quotas

Instrumentation for Quotas.

Metrics

The following Metrics are exposed and can be used for monitoring:

# HELP capsule_tenant_namespace_count Total number of namespaces currently owned by the tenant
# TYPE capsule_tenant_namespace_count gauge
capsule_tenant_namespace_count{tenant="solar"} 6

# HELP capsule_tenant_namespace_relationship Mapping metric showing namespace to tenant relationships
# TYPE capsule_tenant_namespace_relationship gauge
capsule_tenant_namespace_relationship{namespace="earth",tenant="solar"} 1
capsule_tenant_namespace_relationship{namespace="wind",tenant="solar"} 1
capsule_tenant_namespace_relationship{namespace="fire",tenant="solar"} 1

# HELP capsule_tenant_status Tenant cordon state indicating if tenant operations are restricted (1) or allowed (0) for resource creation and modification
# TYPE capsule_tenant_status gauge
capsule_tenant_status{tenant="limiting-resources"} 0

# HELP capsule_tenant_resource_limit Current resource limit for a given resource in a tenant
# TYPE capsule_tenant_resource_limit gauge
capsule_tenant_resource_limit{resource="limits.cpu",resourcequotaindex="0",tenant="solar"} 2
capsule_tenant_resource_limit{resource="limits.memory",resourcequotaindex="0",tenant="solar"} 2.147483648e+09
capsule_tenant_resource_limit{resource="pods",resourcequotaindex="1",tenant="solar"} 7
capsule_tenant_resource_limit{resource="requests.cpu",resourcequotaindex="0",tenant="solar"} 2
capsule_tenant_resource_limit{resource="requests.memory",resourcequotaindex="0",tenant="solar"} 2.147483648e+09

# HELP capsule_tenant_resource_usage Current resource usage for a given resource in a tenant
# TYPE capsule_tenant_resource_usage gauge
capsule_tenant_resource_usage{resource="limits.cpu",resourcequotaindex="0",tenant="solar"} 0
capsule_tenant_resource_usage{resource="limits.memory",resourcequotaindex="0",tenant="solar"} 0
capsule_tenant_resource_usage{resource="namespaces",resourcequotaindex="",tenant="solar"} 2
capsule_tenant_resource_usage{resource="pods",resourcequotaindex="1",tenant="solar"} 0
capsule_tenant_resource_usage{resource="requests.cpu",resourcequotaindex="0",tenant="solar"} 0
capsule_tenant_resource_usage{resource="requests.memory",resourcequotaindex="0",tenant="solar"} 0

Custom Metrics

You can gather more information based on the status of the tenants. This information can be scraped via the Kube-State-Metrics CustomResourceState metrics, which let you create custom metrics based on the status of the tenants.

For example, with the kube-prometheus-stack chart, set the following values:

kube-state-metrics:
  rbac:
    extraRules:
      - apiGroups: [ "capsule.clastix.io" ]
        resources: ["tenants"]
        verbs: [ "list", "watch" ]
  customResourceState:
    enabled: true
    config:
      spec:
        resources:
          - groupVersionKind:
              group: capsule.clastix.io
              kind: "Tenant"
              version: "v1beta2"
            labelsFromPath:
              name: [metadata, name]
            metrics:
              - name: "tenant_size"
                help: "Count of namespaces in the tenant"
                each:
                  type: Gauge
                  gauge:
                    path: [status, size]
                commonLabels:
                  custom_metric: "yes"
                labelsFromPath:
                  capsule_tenant: [metadata, name]
                  kind: [ kind ]
              - name: "tenant_state"
                help: "The operational state of the Tenant"
                each:
                  type: StateSet
                  stateSet:
                    labelName: state
                    path: [status, state]
                    list: [Active, Cordoned]
                commonLabels:
                  custom_metric: "yes"
                labelsFromPath:
                  capsule_tenant: [metadata, name]
                  kind: [ kind ]
              - name: "tenant_namespaces_info"
                help: "Namespaces of a Tenant"
                each:
                  type: Info
                  info:
                    path: [status, namespaces]
                    labelsFromPath:
                      tenant_namespace: []
                commonLabels:
                  custom_metric: "yes"
                labelsFromPath:
                  capsule_tenant: [metadata, name]
                  kind: [ kind ]

This example creates three custom metrics:

  • tenant_size is a gauge that counts the number of namespaces in the tenant.
  • tenant_state is a state set that shows the operational state of the tenant.
  • tenant_namespaces_info is an info metric that shows the namespaces of the tenant.

5 - Backup & Restore

Run Backups and Restores of Tenants

Velero is a backup and restore solution that provides data protection and disaster recovery, and can migrate Kubernetes clusters from on-premises to the cloud or between different clouds.

When it comes to backup and restore in Kubernetes, there are two main requirements:

  • Configurations backup
  • Data backup

The first requirement aims to back up all the resources stored in the etcd database, for example namespaces, pods, services, and deployments. The second is about backing up stateful application data stored in volumes.

The main limitation of Velero is multi-tenancy. Currently, Velero does not support multi-tenancy, meaning it can only be used by admin users and cannot be provided “as a service” to tenant users. This means that the cluster admin needs to take care of the users’ backups.

Assuming you have multiple tenants managed by Capsule, for example solar, oil, and gas, as cluster admin you have to take care of scheduling backups for:

  • Tenant cluster resources
  • Namespaces belonging to each tenant

Create backup of a tenant

Create a backup of the tenant solar. It consists of two different backups:

  • backup of the tenant resource
  • backup of all the resources belonging to the tenant

To back up the solar tenant selectively, label the tenant as:

kubectl label tenant solar capsule.clastix.io/tenant=solar

and create the backup

velero create backup solar-tenant \
    --include-cluster-resources=true \
    --include-resources=tenants.capsule.clastix.io \
    --selector capsule.clastix.io/tenant=solar

resulting in the following Velero object:

apiVersion: velero.io/v1
kind: Backup
metadata:
  name: solar-tenant
spec:
  defaultVolumesToRestic: false
  hooks: {}
  includeClusterResources: true
  includedNamespaces:
  - '*'
  includedResources:
  - tenants.capsule.clastix.io
  labelSelector:
    matchLabels:
      capsule.clastix.io/tenant: solar
  metadata: {}
  storageLocation: default
  ttl: 720h0m0s

Create a backup of all the resources belonging to the solar tenant namespaces:

velero create backup solar-namespaces \
    --include-cluster-resources=false \
    --include-namespaces solar-production,solar-development,solar-marketing

resulting in the following Velero object:

apiVersion: velero.io/v1
kind: Backup
metadata:
  name: solar-namespaces
spec:
  defaultVolumesToRestic: false
  hooks: {}
  includeClusterResources: false
  includedNamespaces:
  - solar-production
  - solar-development
  - solar-marketing
  metadata: {}
  storageLocation: default
  ttl: 720h0m0s

Velero requires an object storage backend to store backups. Take care of this requirement before using Velero.
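Instead of typing the namespace list for --include-namespaces by hand, it can be derived from the capsule.clastix.io/tenant label that the NetworkPolicy examples in this guide also select on. The following is a minimal sketch: the function name join_csv is hypothetical, and it assumes that every namespace of the tenant carries that label.

```shell
# Hypothetical helper: join newline-separated names from stdin into a
# comma-separated list, as expected by --include-namespaces.
join_csv() {
  paste -sd ',' -
}

# Example usage for the solar tenant:
# namespaces=$(kubectl get namespaces -l capsule.clastix.io/tenant=solar \
#   -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}' | join_csv)
# velero create backup solar-namespaces \
#   --include-cluster-resources=false \
#   --include-namespaces "${namespaces}"
```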

Restore a tenant from the backup

To recover the tenant after a disaster, or to migrate it to another cluster, create a restore from the previous backups:

velero create restore --from-backup solar-tenant
velero create restore --from-backup solar-namespaces

Using Velero to restore a Capsule tenant can lead to an incomplete recovery, because the namespaces restored with Velero do not have the OwnerReference field used to bind them to the tenant. For this reason, none of the restored namespaces are bound to the tenant:

kubectl get tnt
NAME    STATE    NAMESPACE QUOTA   NAMESPACE COUNT   NODE SELECTOR      AGE
gas     active   9                 5                 {"pool":"gas"}     34m
oil     active   9                 8                 {"pool":"oil"}     33m
solar   active   9                 0 # <<<           {"pool":"solar"}   54m

To avoid this problem you can use the script velero-restore.sh located under the hack/ folder:

./velero-restore.sh --kubeconfig /path/to/your/kubeconfig --tenant "solar" restore

Running this command patches the manifests of the tenant’s namespaces that currently lack ownerReferences. Once the command has finished, the tenant is back:

kubectl get tnt
NAME    STATE    NAMESPACE QUOTA   NAMESPACE COUNT   NODE SELECTOR      AGE
gas     active   9                 5                 {"pool":"gas"}     44m
oil     active   9                 8                 {"pool":"oil"}     43m
solar   active   9                 3 # <<<           {"pool":"solar"}   12s
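To illustrate the kind of patch involved, the sketch below builds a JSON merge patch that adds a Tenant ownerReference to a namespace. It is not the implementation of velero-restore.sh: the function name owner_reference_patch is hypothetical, and the controller and blockOwnerDeletion values are assumptions. The UID must be that of the restored Tenant object, which differs from the UID in the original cluster.

```shell
# Hypothetical helper: print a JSON merge patch that sets a Tenant
# ownerReference on a namespace.
# $1 = tenant name, $2 = metadata.uid of the restored Tenant object
owner_reference_patch() {
  printf '{"metadata":{"ownerReferences":[{"apiVersion":"capsule.clastix.io/v1beta2","kind":"Tenant","name":"%s","uid":"%s","controller":true,"blockOwnerDeletion":true}]}}\n' "$1" "$2"
}

# Example usage for one namespace of the solar tenant:
# uid=$(kubectl get tenant solar -o jsonpath='{.metadata.uid}')
# kubectl patch namespace solar-production --type=merge \
#   -p "$(owner_reference_patch solar "${uid}")"
```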

6 - Troubleshooting

Different topics when you encounter problems with Capsule