1 - Overview

Understand the problem Capsule is attempting to solve and how it works

Capsule implements a multi-tenant and policy-based environment in your Kubernetes cluster. It is designed as a micro-services-based ecosystem with the minimalist approach, leveraging only on upstream Kubernetes.

What’s the problem with the current status?

Kubernetes introduces the Namespace object type to create logical partitions of the cluster as isolated slices. However, implementing advanced multi-tenancy scenarios, it soon becomes complicated because of the flat structure of Kubernetes namespaces and the impossibility to share resources among namespaces belonging to the same tenant. To overcome this, cluster admins tend to provision a dedicated cluster for each groups of users, teams, or departments. As an organization grows, the number of clusters to manage and keep aligned becomes an operational nightmare, described as the well known phenomena of the clusters sprawl.

Entering Capsule

Capsule takes a different approach. In a single cluster, the Capsule Controller aggregates multiple namespaces in a lightweight abstraction called Tenant, basically a grouping of Kubernetes Namespaces. Within each tenant, users are free to create their namespaces and share all the assigned resources.

On the other side, the Capsule Policy Engine keeps the different tenants isolated from each other. Network and Security Policies, Resource Quota, Limit Ranges, RBAC, and other policies defined at the tenant level are automatically inherited by all the namespaces in the tenant. Then users are free to operate their tenants in autonomy, without the intervention of the cluster administrator.


capsule-operator

What problems are out of scope

Capsule does not aim to solve the following problems:

  • Handling of Custom Resource Definition management. Capsule does not aim to manage the control of Custom Resource Definition. Users have to implement their own solution.

1.1 - Benchmark

Multi-Tenancy Benchmark

The Multi-Tenancy Benchmark is a WG (Working Group) committed to achieving multi-tenancy in Kubernetes.

The Benchmarks are guidelines that validate if a Kubernetes cluster is properly configured for multi-tenancy.

Capsule is an open source multi-tenancy operator, we decided to meet the requirements of MTB. although at the time of writing, it’s in development and not ready for usage. Strictly speaking, we do not claim official conformance to MTB, but just to adhere to the multi-tenancy requirements and best practices promoted by MTB.

MTB BenchmarkMTB ProfileCapsule VersionConformanceNotes
Block access to cluster resourcesL1v0.1.0
Block access to multitenant resourcesL1v0.1.0
Block access to other tenant resourcesL1v0.1.0MTB draft
Block add capabilitiesL1v0.1.0
Require always imagePullPolicyL1v0.1.0
Require run as non-root userL1v0.1.0
Block privileged containersL1v0.1.0
Block privilege escalationL1v0.1.0
Configure namespace resource quotasL1v0.1.0
Block modification of resource quotasL1v0.1.0
Configure namespace object limitsL1v0.1.0
Block use of host path volumesL1v0.1.0
Block use of host networking and portsL1v0.1.0
Block use of host PIDL1v0.1.0
Block use of host IPCL1v0.1.0
Block use of NodePort servicesL1v0.1.0
Require PersistentVolumeClaim for storageL1v0.1.0MTB draft
Require PV reclaim policy of deleteL1v0.1.0MTB draft
Block use of existing PVsL1v0.1.0MTB draft
Block network access across tenant namespacesL1v0.1.0MTB draft
Allow self-service management of Network PoliciesL2v0.1.0
Allow self-service management of RolesL2v0.1.0MTB draft
Allow self-service management of Role BindingsL2v0.1.0MTB draft

Allow self-service management of Network Policies

Profile Applicability: L2

Type: Behavioral

Category: Self-Service Operations

Description: Tenants should be able to perform self-service operations by creating their own network policies in their namespaces.

Rationale: Enables self-service management of network-policies.

Audit:

As cluster admin, create a tenant

kubectl create -f - <<EOF
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: oil
spec:
  owners:
  - kind: User
    name: alice
  networkPolicies:
    items:
    - ingress:
      - from:
        - namespaceSelector:
            matchLabels:
              capsule.clastix.io/tenant: oil
      podSelector: {}
      policyTypes:
      - Egress
      - Ingress
EOF

./create-user.sh alice oil

As tenant owner, run the following command to create a namespace in the given tenant

kubectl --kubeconfig alice create ns oil-production
kubectl --kubeconfig alice config set-context --current --namespace oil-production

As tenant owner, retrieve the networkpolicies resources in the tenant namespace

kubectl --kubeconfig alice get networkpolicies 
NAME            POD-SELECTOR   AGE
capsule-oil-0   <none>         7m5s

As a tenant, checks for permissions to manage networkpolicy for each verb

kubectl --kubeconfig alice auth can-i get networkpolicies
kubectl --kubeconfig alice auth can-i create networkpolicies
kubectl --kubeconfig alice auth can-i update networkpolicies
kubectl --kubeconfig alice auth can-i patch networkpolicies
kubectl --kubeconfig alice auth can-i delete networkpolicies
kubectl --kubeconfig alice auth can-i deletecollection networkpolicies

Each command must return ‘yes’

Cleanup: As cluster admin, delete all the created resources

kubectl --kubeconfig cluster-admin delete tenant oil

Allow self-service management of Role Bindings

Profile Applicability: L2

Type: Behavioral

Category: Self-Service Operations

Description: Tenants should be able to perform self-service operations by creating their rolebindings in their namespaces.

Rationale: Enables self-service management of roles.

Audit:

As cluster admin, create a tenant

kubectl create -f - <<EOF
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: oil
spec:
  owners:
  - kind: User
    name: alice
EOF

./create-user.sh alice oil

As tenant owner, run the following command to create a namespace in the given tenant

kubectl --kubeconfig alice create ns oil-production
kubectl --kubeconfig alice config set-context --current --namespace oil-production

As tenant owner check for permissions to manage rolebindings for each verb

kubectl --kubeconfig alice auth can-i get rolebindings
kubectl --kubeconfig alice auth can-i create rolebindings
kubectl --kubeconfig alice auth can-i update rolebindings
kubectl --kubeconfig alice auth can-i patch rolebindings
kubectl --kubeconfig alice auth can-i delete rolebindings
kubectl --kubeconfig alice auth can-i deletecollection rolebindings

Each command must return ‘yes’

Cleanup: As cluster admin, delete all the created resources

kubectl --kubeconfig cluster-admin delete tenant oil

Allow self-service management of Roles

Profile Applicability: L2

Type: Behavioral

Category: Self-Service Operations

Description: Tenants should be able to perform self-service operations by creating their own roles in their namespaces.

Rationale: Enables self-service management of roles.

Audit:

As cluster admin, create a tenant

kubectl create -f - <<EOF
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: oil
spec:
  owners:
  - kind: User
    name: alice
EOF

./create-user.sh alice oil

As tenant owner, run the following command to create a namespace in the given tenant

kubectl --kubeconfig alice create ns oil-production
kubectl --kubeconfig alice config set-context --current --namespace oil-production

As tenant owner, check for permissions to manage roles for each verb

kubectl --kubeconfig alice auth can-i get roles
kubectl --kubeconfig alice auth can-i create roles
kubectl --kubeconfig alice auth can-i update roles
kubectl --kubeconfig alice auth can-i patch roles
kubectl --kubeconfig alice auth can-i delete roles
kubectl --kubeconfig alice auth can-i deletecollection roles

Each command must return ‘yes’

Cleanup: As cluster admin, delete all the created resources

kubectl --kubeconfig cluster-admin delete tenant oil

Block access to cluster resources

Profile Applicability: L1

Type: Configuration Check

Category: Control Plane Isolation

Description: Tenants should not be able to view, edit, create or delete cluster (non-namespaced) resources such Node, ClusterRole, ClusterRoleBinding, etc.

Rationale: Access controls should be configured for tenants so that a tenant cannot list, create, modify or delete cluster resources

Audit:

As cluster admin, create a tenant

kubectl create -f - <<EOF
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: oil
spec:
  owners:
  - kind: User
    name: alice
EOF

./create-user.sh alice oil

As cluster admin, run the following command to retrieve the list of non-namespaced resources

kubectl --kubeconfig cluster-admin api-resources --namespaced=false

For all non-namespaced resources, and each verb (get, list, create, update, patch, watch, delete, and deletecollection) issue the following command:

kubectl --kubeconfig alice auth can-i <verb> <resource>

Each command must return no

Exception:

It should, but it does not:

kubectl --kubeconfig alice auth can-i create selfsubjectaccessreviews
yes
kubectl --kubeconfig alice auth can-i create selfsubjectrulesreviews
yes
kubectl --kubeconfig alice auth can-i create namespaces
yes

Any kubernetes user can create SelfSubjectAccessReview and SelfSubjectRulesReviews to checks whether he/she can act. First, two exceptions are not an issue.

kubectl --anyuser auth can-i --list
Resources                                       Non-Resource URLs   Resource Names   Verbs
selfsubjectaccessreviews.authorization.k8s.io   []                  []               [create]
selfsubjectrulesreviews.authorization.k8s.io    []                  []               [create]
                                                [/api/*]            []               [get]
                                                [/api]              []               [get]
                                                [/apis/*]           []               [get]
                                                [/apis]             []               [get]
                                                [/healthz]          []               [get]
                                                [/healthz]          []               [get]
                                                [/livez]            []               [get]
                                                [/livez]            []               [get]
                                                [/openapi/*]        []               [get]
                                                [/openapi]          []               [get]
                                                [/readyz]           []               [get]
                                                [/readyz]           []               [get]
                                                [/version/]         []               [get]
                                                [/version/]         []               [get]
                                                [/version]          []               [get]
                                                [/version]          []               [get]

To enable namespace self-service provisioning, Capsule intentionally gives permissions to create namespaces to all users belonging to the Capsule group:

kubectl describe clusterrolebindings capsule-namespace-provisioner
Name:         capsule-namespace-provisioner
Labels:       <none>
Annotations:  <none>
Role:
  Kind:  ClusterRole
  Name:  capsule-namespace-provisioner
Subjects:
  Kind   Name                Namespace
  ----   ----                ---------
  Group  capsule.clastix.io  

kubectl describe clusterrole capsule-namespace-provisioner
Name:         capsule-namespace-provisioner
Labels:       <none>
Annotations:  <none>
PolicyRule:
  Resources   Non-Resource URLs  Resource Names  Verbs
  ---------   -----------------  --------------  -----
  namespaces  []                 []              [create]

Capsule controls self-service namespace creation by limiting the number of namespaces the user can create by the tenant.spec.namespaceQuota option.

Cleanup: As cluster admin, delete all the created resources

kubectl --kubeconfig cluster-admin delete tenant oil

Block access to multitenant resources

Profile Applicability: L1

Type: Behavioral

Category: Tenant Isolation

Description: Each tenant namespace may contain resources set up by the cluster administrator for multi-tenancy, such as role bindings, and network policies. Tenants should not be allowed to modify the namespaced resources created by the cluster administrator for multi-tenancy. However, for some resources such as network policies, tenants can configure additional instances of the resource for their workloads.

Rationale: Tenants can escalate privileges and impact other tenants if they can delete or modify required multi-tenancy resources such as namespace resource quotas or default network policy.

Audit:

As cluster admin, create a tenant

kubectl create -f - <<EOF
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: oil
spec:
  owners:
  - kind: User
    name: alice
  networkPolicies:
    items:
    - podSelector: {}
      policyTypes:
      - Ingress
      - Egress
    - egress:
      - to:
        - namespaceSelector:
            matchLabels:
              capsule.clastix.io/tenant: oil
      ingress:
      - from:
        - namespaceSelector:
            matchLabels:
              capsule.clastix.io/tenant: oil
      podSelector: {}
      policyTypes:
      - Egress
      - Ingress
EOF

./create-user.sh alice oil

As tenant owner, run the following command to create a namespace in the given tenant

kubectl --kubeconfig alice create ns oil-production
kubectl --kubeconfig alice config set-context --current --namespace oil-production

As tenant owner, retrieve the networkpolicies resources in the tenant namespace

kubectl --kubeconfig alice get networkpolicies 
NAME            POD-SELECTOR   AGE
capsule-oil-0   <none>         7m5s
capsule-oil-1   <none>         7m5s

As tenant owner try to modify or delete one of the networkpolicies

kubectl --kubeconfig alice delete networkpolicies capsule-oil-0

You should receive an error message denying the edit/delete request

Error from server (Forbidden): networkpolicies.networking.k8s.io "capsule-oil-0" is forbidden:
User "oil" cannot delete resource "networkpolicies" in API group "networking.k8s.io" in the namespace "oil-production"

As tenant owner, you can create an additional networkpolicy inside the namespace

kubectl create -f - << EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: hijacking
  namespace: oil-production
spec:
  egress: 
    - to:
      - ipBlock:
          cidr: 0.0.0.0/0
  podSelector: {}
  policyTypes:
  - Egress
EOF

However, due to the additive nature of networkpolicies, the DENY ALL policy set by the cluster admin, prevents hijacking.

As tenant owner list RBAC permissions set by Capsule

kubectl --kubeconfig alice get rolebindings
NAME                                      ROLE                                    AGE
capsule-oil-0-admin                       ClusterRole/admin                       11h
capsule-oil-1-capsule-namespace-deleter   ClusterRole/capsule-namespace-deleter   11h

As tenant owner, try to change/delete the rolebinding to escalate permissions

kubectl --kubeconfig alice edit/delete rolebinding capsule-oil-0-admin

The rolebinding is immediately recreated by Capsule:

kubectl --kubeconfig alice get rolebindings
NAME                                      ROLE                                    AGE
capsule-oil-0-admin                       ClusterRole/admin                       2s
capsule-oil-1-capsule-namespace-deleter   ClusterRole/capsule-namespace-deleter   11h

However, the tenant owner can create and assign permissions inside the namespace she owns

kubectl create -f - << EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
  name: oil-robot:admin
  namespace: oil-production
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: admin
subjects:
- kind: ServiceAccount
  name: default
  namespace: oil-production
EOF

Cleanup: As cluster admin, delete all the created resources

kubectl --kubeconfig cluster-admin delete tenant oil

Block access to other tenant resources

Profile Applicability: L1

Type: Behavioral

Category: Tenant Isolation

Description: Each tenant has its own set of resources, such as namespaces, service accounts, secrets, pods, services, etc. Tenants should not be allowed to access each other’s resources.

Rationale: Tenant’s resources must be not accessible by other tenants.

Audit:

As cluster admin, create a couple of tenants

kubectl create -f - <<EOF
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: oil
spec:
  owners:
  - kind: User
    name: alice
EOF

./create-user.sh alice oil

and

kubectl create -f - <<EOF
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: gas
spec:
  owners:
  - kind: User
    name: joe
EOF

./create-user.sh joe gas

As oil tenant owner, run the following command to create a namespace in the given tenant

kubectl --kubeconfig alice create ns oil-production
kubectl --kubeconfig alice config set-context --current --namespace oil-production

As gas tenant owner, run the following command to create a namespace in the given tenant

kubectl --kubeconfig joe create ns gas-production
kubectl --kubeconfig joe config set-context --current --namespace gas-production

As oil tenant owner, try to retrieve the resources in the gas tenant namespaces

kubectl --kubeconfig alice get serviceaccounts --namespace  gas-production 

You must receive an error message:

Error from server (Forbidden): serviceaccount is forbidden:
User "oil" cannot list resource "serviceaccounts" in API group "" in the namespace "gas-production"

As gas tenant owner, try to retrieve the resources in the oil tenant namespaces

kubectl --kubeconfig joe get serviceaccounts --namespace  oil-production 

You must receive an error message:

Error from server (Forbidden): serviceaccount is forbidden:
User "joe" cannot list resource "serviceaccounts" in API group "" in the namespace "oil-production"

Cleanup: As cluster admin, delete all the created resources

kubectl --kubeconfig cluster-admin delete tenants oil gas

Block add capabilities

Profile Applicability: L1

Type: Behavioral Check

Category: Control Plane Isolation

Description: Control Linux capabilities.

Rationale: Linux allows defining fine-grained permissions using capabilities. With Kubernetes, it is possible to add capabilities for pods that escalate the level of kernel access and allow other potentially dangerous behaviors.

Audit:

As cluster admin, define a PodSecurityPolicy with allowedCapabilities and map the policy to a tenant:

kubectl create -f - << EOF
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: tenant
spec:
  privileged: false
  # Required to prevent escalations to root.
  allowPrivilegeEscalation: false
  # The default set of capabilities are implicitly allowed
  # The empty set means that no additional capabilities may be added beyond the default set
  allowedCapabilities: []
  runAsUser:
    rule: RunAsAny
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
EOF

Note: make sure PodSecurityPolicy Admission Control is enabled on the APIs server: --enable-admission-plugins=PodSecurityPolicy

Then create a ClusterRole using or granting the said item

kubectl create -f - << EOF
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: tenant:psp
rules:
- apiGroups: ['policy']
  resources: ['podsecuritypolicies']
  resourceNames: ['tenant']
  verbs: ['use']
EOF

And assign it to the tenant

kubectl apply -f - << EOF
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: oil
  namespace: oil-production
spec:
  owners:
  - kind: User
    name: alice
  additionalRoleBindings:
  - clusterRoleName: tenant:psp
    subjects:
    - kind: "Group"
      apiGroup: "rbac.authorization.k8s.io"
      name: "system:authenticated"
EOF

./create-user.sh alice oil

As tenant owner, run the following command to create a namespace in the given tenant

kubectl --kubeconfig alice create ns oil-production
kubectl --kubeconfig alice config set-context --current --namespace oil-production

As tenant owner, create a pod and see new capabilities cannot be added in the tenant namespaces

kubectl --kubeconfig alice apply -f - << EOF 
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-settime-cap
  namespace:
  labels:
spec:
  containers:
  - name: busybox
    image: busybox:latest
    command: ["/bin/sleep", "3600"]
    securityContext:
      capabilities:
        add:
        - SYS_TIME
EOF

You must have the pod blocked by PodSecurityPolicy.

Cleanup: As cluster admin, delete all the created resources

kubectl --kubeconfig cluster-admin delete tenant oil
kubectl --kubeconfig cluster-admin delete PodSecurityPolicy tenant
kubectl --kubeconfig cluster-admin delete ClusterRole tenant:psp

Block modification of resource quotas

Profile Applicability: L1

Type: Behavioral Check

Category: Tenant Isolation

Description: Tenants should not be able to modify the resource quotas defined in their namespaces

Rationale: Resource quotas must be configured for isolation and fairness between tenants. Tenants should not be able to modify existing resource quotas as they may exhaust cluster resources and impact other tenants.

Audit:

As cluster admin, create a tenant

kubectl create -f - <<EOF
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: oil
spec:
  owners:
  - kind: User
    name: alice
  resourceQuotas:
    items:
    - hard:
        limits.cpu: "8"
        limits.memory: 16Gi
        requests.cpu: "8"
        requests.memory: 16Gi
    - hard:
        pods: "10"
        services: "50"
    - hard:
        requests.storage: 100Gi
EOF

./create-user.sh alice oil

As tenant owner, run the following command to create a namespace in the given tenant

kubectl --kubeconfig alice create ns oil-production
kubectl --kubeconfig alice config set-context --current --namespace oil-production

As tenant owner, check the permissions to modify/delete the quota in the tenant namespace:

kubectl --kubeconfig alice auth can-i create quota
kubectl --kubeconfig alice auth can-i update quota
kubectl --kubeconfig alice auth can-i patch quota
kubectl --kubeconfig alice auth can-i delete quota
kubectl --kubeconfig alice auth can-i deletecollection quota

Each command must return ’no'

Cleanup: As cluster admin, delete all the created resources

kubectl --kubeconfig cluster-admin delete tenant oil

Block network access across tenant namespaces

Profile Applicability: L1

Type: Behavioral

Category: Tenant Isolation

Description: Block network traffic among namespaces from different tenants.

Rationale: Tenants cannot access services and pods in another tenant’s namespaces.

Audit:

As cluster admin, create a couple of tenants

kubectl create -f - <<EOF
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: oil
spec:
  owners:
  - kind: User
    name: alice
  networkPolicies:
    items:
    - ingress:
      - from:
        - namespaceSelector:
            matchLabels:
              capsule.clastix.io/tenant: oil
      podSelector: {}
      policyTypes:
      - Ingress
EOF

./create-user.sh alice oil

and

kubectl create -f - <<EOF
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: gas
spec:
  owners:
  - kind: User
    name: joe
  networkPolicies:
    items:
    - ingress:
      - from:
        - namespaceSelector:
            matchLabels:
              capsule.clastix.io/tenant: gas
      podSelector: {}
      policyTypes:
      - Ingress
EOF

./create-user.sh joe gas

As oil tenant owner, run the following commands to create a namespace and resources in the given tenant

kubectl --kubeconfig alice create ns oil-production
kubectl --kubeconfig alice config set-context --current --namespace oil-production
kubectl --kubeconfig alice run webserver --image nginx:latest
kubectl --kubeconfig alice expose pod webserver --port 80

As gas tenant owner, run the following commands to create a namespace and resources in the given tenant

kubectl --kubeconfig joe create ns gas-production
kubectl --kubeconfig joe config set-context --current --namespace gas-production
kubectl --kubeconfig joe run webserver --image nginx:latest
kubectl --kubeconfig joe expose pod webserver --port 80

As oil tenant owner, verify you can access the service in oil tenant namespace but not in the gas tenant namespace

kubectl --kubeconfig alice exec webserver -- curl http://webserver.oil-production.svc.cluster.local
kubectl --kubeconfig alice exec webserver -- curl http://webserver.gas-production.svc.cluster.local

Viceversa, as gas tenant owner, verify you can access the service in gas tenant namespace but not in the oil tenant namespace

kubectl --kubeconfig alice exec webserver -- curl http://webserver.oil-production.svc.cluster.local
kubectl --kubeconfig alice exec webserver -- curl http://webserver.gas-production.svc.cluster.local

Cleanup: As cluster admin, delete all the created resources

kubectl --kubeconfig cluster-admin delete tenants oil gas

Block privilege escalation

Profile Applicability: L1

Type: Behavioral Check

Category: Control Plane Isolation

Description: Control container permissions.

Rationale: The security allowPrivilegeEscalation setting allows a process to gain more privileges from its parent process. Processes in tenant containers should not be allowed to gain additional privileges.

Audit:

As cluster admin, define a PodSecurityPolicy that sets allowPrivilegeEscalation=false and map the policy to a tenant:

kubectl create -f - << EOF
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: tenant
spec:
  privileged: false
  # Required to prevent escalations to root.
  allowPrivilegeEscalation: false
  runAsUser:
    rule: RunAsAny
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
EOF

Note: make sure PodSecurityPolicy Admission Control is enabled on the APIs server: --enable-admission-plugins=PodSecurityPolicy

Then create a ClusterRole using or granting the said item

kubectl create -f - << EOF
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: tenant:psp
rules:
- apiGroups: ['policy']
  resources: ['podsecuritypolicies']
  resourceNames: ['tenant']
  verbs: ['use']
EOF

And assign it to the tenant

kubectl apply -f - << EOF
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: oil
spec:
  owners:
  - kind: User
    name: alice
  additionalRoleBindings:
  - clusterRoleName: tenant:psp
    subjects:
    - kind: "Group"
      apiGroup: "rbac.authorization.k8s.io"
      name: "system:authenticated"
EOF

./create-user.sh alice oil

As tenant owner, run the following command to create a namespace in the given tenant

kubectl --kubeconfig alice create ns oil-production
kubectl --kubeconfig alice config set-context --current --namespace oil-production

As tenant owner, create a pod or container that sets allowPrivilegeEscalation=true in its securityContext.

kubectl --kubeconfig alice apply -f - << EOF 
apiVersion: v1
kind: Pod
metadata:
  name: pod-priviliged-mode
  namespace: oil-production
  labels:
spec:
  containers:
  - name: busybox
    image: busybox:latest
    command: ["/bin/sleep", "3600"]
    securityContext:
      allowPrivilegeEscalation: true
EOF

You must have the pod blocked by PodSecurityPolicy.

Cleanup: As cluster admin, delete all the created resources

kubectl --kubeconfig cluster-admin delete tenant oil
kubectl --kubeconfig cluster-admin delete PodSecurityPolicy tenant
kubectl --kubeconfig cluster-admin delete ClusterRole tenant:psp

Block privileged containers

Profile Applicability: L1

Type: Behavioral Check

Category: Control Plane Isolation

Description: Control container permissions.

Rationale: By default a container is not allowed to access any devices on the host, but a “privileged” container can access all devices on the host. A process within a privileged container can also get unrestricted host access. Hence, tenants should not be allowed to run privileged containers.

Audit:

As cluster admin, define a PodSecurityPolicy that sets privileged=false and map the policy to a tenant:

kubectl create -f - << EOF
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: tenant
spec:
  privileged: false
  # Required to prevent escalations to root.
  allowPrivilegeEscalation: false
  runAsUser:
    rule: RunAsAny
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
EOF

Note: make sure PodSecurityPolicy Admission Control is enabled on the APIs server: --enable-admission-plugins=PodSecurityPolicy

Then create a ClusterRole using or granting the said item

kubectl create -f - << EOF
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: tenant:psp
rules:
- apiGroups: ['policy']
  resources: ['podsecuritypolicies']
  resourceNames: ['tenant']
  verbs: ['use']
EOF

And assign it to the tenant

kubectl apply -f - << EOF
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: oil
  namespace: oil-production
spec:
  owners:
  - kind: User
    name: alice
  additionalRoleBindings:
  - clusterRoleName: tenant:psp
    subjects:
    - kind: "Group"
      apiGroup: "rbac.authorization.k8s.io"
      name: "system:authenticated"
EOF

./create-user.sh alice oil

As tenant owner, run the following command to create a namespace in the given tenant

kubectl --kubeconfig alice create ns oil-production
kubectl --kubeconfig alice config set-context --current --namespace oil-production

As tenant owner, create a pod or container that sets privileges in its securityContext.

kubectl --kubeconfig alice apply -f - << EOF 
apiVersion: v1
kind: Pod
metadata:
  name: pod-priviliged-mode
  namespace:
  labels:
spec:
  containers:
  - name: busybox
    image: busybox:latest
    command: ["/bin/sleep", "3600"]
    securityContext:
      privileged: true
EOF

You must have the pod blocked by PodSecurityPolicy.

Cleanup: As cluster admin, delete all the created resources

kubectl --kubeconfig cluster-admin delete tenant oil
kubectl --kubeconfig cluster-admin delete PodSecurityPolicy tenant
kubectl --kubeconfig cluster-admin delete ClusterRole tenant:psp

Block use of existing PVs

Profile Applicability: L1

Type: Configuration Check

Category: Data Isolation

Description: Avoid a tenant to mount existing volumes`.

Rationale: Tenants have to be assured that their Persistent Volumes cannot be reclaimed by other tenants.

Audit:

As cluster admin, create a tenant

kubectl create -f - << EOF
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: oil
spec:
  owners:
  - kind: User
    name: alice
EOF

./create-user.sh alice oil

As tenant owner, check if you can access the persistent volumes

kubectl --kubeconfig alice auth can-i get persistentvolumes
kubectl --kubeconfig alice auth can-i list persistentvolumes
kubectl --kubeconfig alice auth can-i watch persistentvolumes

You must receive for all the requests ’no'.

Block use of host IPC

Profile Applicability: L1

Type: Behavioral Check

Category: Host Isolation

Description: Tenants should not be allowed to share the host’s inter-process communication (IPC) namespace.

Rationale: The hostIPC setting allows pods to share the host’s inter-process communication (IPC) namespace allowing potential access to host processes or processes belonging to other tenants.

Audit:

As cluster admin, define a PodSecurityPolicy that restricts hostIPC usage and map the policy to a tenant:

kubectl create -f - << EOF
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: tenant
spec:
  privileged: false
  # Required to prevent escalations to root.
  allowPrivilegeEscalation: false
  hostIPC: false
  runAsUser:
    rule: RunAsAny
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
EOF

Note: make sure PodSecurityPolicy Admission Control is enabled on the APIs server: --enable-admission-plugins=PodSecurityPolicy

Then create a ClusterRole using or granting the said item

kubectl create -f - << EOF
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: tenant:psp
rules:
- apiGroups: ['policy']
  resources: ['podsecuritypolicies']
  resourceNames: ['tenant']
  verbs: ['use']
EOF

And assign it to the tenant

kubectl apply -f - << EOF
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: oil
  namespace: oil-production
spec:
  owners:
  - kind: User
    name: alice
  additionalRoleBindings:
  - clusterRoleName: tenant:psp
    subjects:
    - kind: "Group"
      apiGroup: "rbac.authorization.k8s.io"
      name: "system:authenticated"
EOF

./create-user.sh alice oil

As tenant owner, run the following command to create a namespace in the given tenant

kubectl --kubeconfig alice create ns oil-production
kubectl --kubeconfig alice config set-context --current --namespace oil-production

As tenant owner, create a pod mounting the host IPC namespace.

kubectl --kubeconfig alice apply -f - << EOF 
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-host-ipc
  namespace: oil-production
spec:
  hostIPC: true
  containers:
  - name: busybox
    image: busybox:latest
    command: ["/bin/sleep", "3600"]
EOF

You must have the pod blocked by PodSecurityPolicy.

Cleanup: As cluster admin, delete all the created resources

kubectl --kubeconfig cluster-admin delete tenant oil
kubectl --kubeconfig cluster-admin delete PodSecurityPolicy tenant
kubectl --kubeconfig cluster-admin delete ClusterRole tenant:psp

Block use of host networking and ports

Profile Applicability: L1

Type: Behavioral Check

Category: Host Isolation

Description: Tenants should not be allowed to use host networking and host ports for their workloads.

Rationale: Using hostPort and hostNetwork allows tenants workloads to share the host networking stack allowing potential snooping of network traffic across application pods.

Audit:

As cluster admin, define a PodSecurityPolicy that restricts hostPort and hostNetwork and map the policy to a tenant:

kubectl create -f - << EOF
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: tenant
spec:
  privileged: false
  # Required to prevent escalations to root.
  allowPrivilegeEscalation: false
  hostNetwork: false
  hostPorts: [] # empty means no allowed host ports
  runAsUser:
    rule: RunAsAny
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
EOF

Note: make sure PodSecurityPolicy Admission Control is enabled on the APIs server: --enable-admission-plugins=PodSecurityPolicy

Then create a ClusterRole using or granting the said item

kubectl create -f - << EOF
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: tenant:psp
rules:
- apiGroups: ['policy']
  resources: ['podsecuritypolicies']
  resourceNames: ['tenant']
  verbs: ['use']
EOF

And assign it to the tenant

kubectl apply -f - << EOF
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: oil
  namespace: oil-production
spec:
  owners:
  - kind: User
    name: alice
  additionalRoleBindings:
  - clusterRoleName: tenant:psp
    subjects:
    - kind: "Group"
      apiGroup: "rbac.authorization.k8s.io"
      name: "system:authenticated"
EOF

./create-user.sh alice oil

As tenant owner, run the following command to create a namespace in the given tenant

kubectl --kubeconfig alice create ns oil-production
kubectl --kubeconfig alice config set-context --current --namespace oil-production

As tenant owner, create a pod using hostNetwork

kubectl --kubeconfig alice apply -f - << EOF 
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-hostnetwork
  namespace: oil-production
spec:
  hostNetwork: true
  containers:
  - name: nginx
    image: nginx:latest
    ports:
    - containerPort: 80
EOF

As tenant owner, create a pod defining a container using hostPort

kubectl --kubeconfig alice apply -f - << EOF 
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-hostport
  namespace: oil-production
spec:
  containers:
  - name: nginx
    image: nginx:latest
    ports:
    - containerPort: 80
      hostPort: 9090
EOF

In both the cases above, you must have the pod blocked by PodSecurityPolicy.

Cleanup: As cluster admin, delete all the created resources

kubectl --kubeconfig cluster-admin delete tenant oil
kubectl --kubeconfig cluster-admin delete PodSecurityPolicy tenant
kubectl --kubeconfig cluster-admin delete ClusterRole tenant:psp

Block use of host path volumes

Profile Applicability: L1

Type: Behavioral Check

Category: Host Protection

Description: Tenants should not be able to mount host volumes and directories.

Rationale: The use of host volumes and directories can be used to access shared data or escalate privileges and also creates a tight coupling between a tenant workload and a host.

Audit:

As cluster admin, define a PodSecurityPolicy that restricts hostPath volumes and map the policy to a tenant:

kubectl create -f - << EOF
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: tenant
spec:
  privileged: false
  # Required to prevent escalations to root.
  allowPrivilegeEscalation: false
  volumes: # hostPath is not permitted
    - 'configMap'
    - 'emptyDir'
    - 'projected'
    - 'secret'
    - 'downwardAPI'
    - 'persistentVolumeClaim'
  runAsUser:
    rule: RunAsAny
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
EOF

Note: make sure PodSecurityPolicy Admission Control is enabled on the APIs server: --enable-admission-plugins=PodSecurityPolicy

Then create a ClusterRole using or granting the said item

kubectl create -f - << EOF
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: tenant:psp
rules:
- apiGroups: ['policy']
  resources: ['podsecuritypolicies']
  resourceNames: ['tenant']
  verbs: ['use']
EOF

And assign it to the tenant

kubectl apply -f - << EOF
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: oil
  namespace: oil-production
spec:
  owners:
  - kind: User
    name: alice
  additionalRoleBindings:
  - clusterRoleName: tenant:psp
    subjects:
    - kind: "Group"
      apiGroup: "rbac.authorization.k8s.io"
      name: "system:authenticated"
EOF

./create-user.sh alice oil

As tenant owner, run the following command to create a namespace in the given tenant

kubectl --kubeconfig alice create ns oil-production
kubectl --kubeconfig alice config set-context --current --namespace oil-production

As tenant owner, create a pod defining a volume of type hostpath.

kubectl --kubeconfig alice apply -f - << EOF 
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-hostpath-volume
  namespace: oil-production
spec:
  containers:
  - name: busybox
    image: busybox:latest
    command: ["/bin/sleep", "3600"]
    volumeMounts:
    - mountPath: /tmp
      name: volume
  volumes:
  - name: volume
    hostPath:
      # directory location on host
      path: /data
      type: Directory
EOF

You must have the pod blocked by PodSecurityPolicy.

Cleanup: As cluster admin, delete all the created resources

kubectl --kubeconfig cluster-admin delete tenant oil
kubectl --kubeconfig cluster-admin delete PodSecurityPolicy tenant
kubectl --kubeconfig cluster-admin delete ClusterRole tenant:psp

Block use of host PID

Profile Applicability: L1

Type: Behavioral Check

Category: Host Isolation

Description: Tenants should not be allowed to share the host process ID (PID) namespace.

Rationale: The hostPID setting allows pods to share the host process ID namespace allowing potential privilege escalation. Tenant pods should not be allowed to share the host PID namespace.

Audit:

As cluster admin, define a PodSecurityPolicy that restricts hostPID usage and map the policy to a tenant:

kubectl create -f - << EOF
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: tenant
spec:
  privileged: false
  # Required to prevent escalations to root.
  allowPrivilegeEscalation: false
  hostPID: false
  runAsUser:
    rule: RunAsAny
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
EOF

Note: make sure PodSecurityPolicy Admission Control is enabled on the APIs server: --enable-admission-plugins=PodSecurityPolicy

Then create a ClusterRole using or granting the said item

kubectl create -f - << EOF
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: tenant:psp
rules:
- apiGroups: ['policy']
  resources: ['podsecuritypolicies']
  resourceNames: ['tenant']
  verbs: ['use']
EOF

And assign it to the tenant

kubectl apply -f - << EOF
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: oil
  namespace: oil-production
spec:
  owners:
  - kind: User
    name: alice
  additionalRoleBindings:
  - clusterRoleName: tenant:psp
    subjects:
    - kind: "Group"
      apiGroup: "rbac.authorization.k8s.io"
      name: "system:authenticated"
EOF

./create-user.sh alice oil

As tenant owner, run the following command to create a namespace in the given tenant

kubectl --kubeconfig alice create ns oil-production
kubectl --kubeconfig alice config set-context --current --namespace oil-production

As tenant owner, create a pod mounting the host PID namespace.

kubectl --kubeconfig alice apply -f - << EOF 
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-host-pid
  namespace: oil-production
spec:
  hostPID: true
  containers:
  - name: busybox
    image: busybox:latest
    command: ["/bin/sleep", "3600"]
EOF

You must have the pod blocked by PodSecurityPolicy.

Cleanup: As cluster admin, delete all the created resources

kubectl --kubeconfig cluster-admin delete tenant oil
kubectl --kubeconfig cluster-admin delete PodSecurityPolicy tenant
kubectl --kubeconfig cluster-admin delete ClusterRole tenant:psp

Block use of NodePort services

Profile Applicability: L1

Type: Behavioral Check

Category: Host Isolation

Description: Tenants should not be able to create services of type NodePort.

Rationale: the service type NodePorts configures host ports that cannot be secured using Kubernetes network policies and require upstream firewalls. Also, multiple tenants cannot use the same host port numbers.

Audit:

As cluster admin, create a tenant

kubectl create -f - << EOF
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: oil
spec:
  serviceOptions:
    allowedServices:
      nodePort: false
  owners:
  - kind: User
    name: alice
EOF

./create-user.sh alice oil

As tenant owner, run the following command to create a namespace in the given tenant

kubectl --kubeconfig alice create ns oil-production
kubectl --kubeconfig alice config set-context --current --namespace oil-production

As tenant owner, creates a service in the tenant namespace having service type of NodePort

kubectl --kubeconfig alice apply -f - << EOF
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
  namespace: oil-production
spec:
  ports:
  - protocol: TCP
    port: 8080
    targetPort: 80
  selector:
    run: nginx
  type: NodePort
EOF

You must receive an error message denying the request:

Error from server
Error from server (NodePort service types are forbidden for the tenant:
error when creating "STDIN": admission webhook "services.capsule.clastix.io" denied the request:
NodePort service types are forbidden for the tenant: please, reach out to the system administrators

Cleanup: As cluster admin, delete all the created resources

kubectl --kubeconfig cluster-admin delete tenant oil

Configure namespace object limits

Profile Applicability: L1

Type: Configuration

Category: Fairness

Description: Namespace resource quotas should be used to allocate, track and limit the number of objects, of a particular type, that can be created within a namespace.

Rationale: Resource quotas must be configured for each tenant namespace, to guarantee isolation and fairness across tenants.

Audit:

As cluster admin, create a tenant

kubectl create -f - <<EOF
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: oil
spec:
  owners:
  - kind: User
    name: alice
  resourceQuotas:
    items:
    - hard:
        pods: 100
        services: 50
        services.loadbalancers: 3
        services.nodeports: 20
        persistentvolumeclaims: 100
EOF

./create-user.sh alice oil

As tenant owner, run the following command to create a namespace in the given tenant

kubectl --kubeconfig alice create ns oil-production
kubectl --kubeconfig alice config set-context --current --namespace oil-production

As tenant owner, retrieve the configured quotas in the tenant namespace:

kubectl --kubeconfig alice get quota
NAME            AGE   REQUEST                 LIMIT
capsule-oil-0   23s   persistentvolumeclaims: 0/100,
                      pods: 0/100, services: 0/50,
                      services.loadbalancers: 0/3,
                      services.nodeports: 0/20  

Make sure that a quota is configured for API objects: PersistentVolumeClaim, LoadBalancer, NodePort, Pods, etc

Cleanup: As cluster admin, delete all the created resources

kubectl --kubeconfig cluster-admin delete tenant oil

Configure namespace resource quotas

Profile Applicability: L1

Type: Configuration

Category: Fairness

Description: Namespace resource quotas should be used to allocate, track, and limit a tenant’s use of shared resources.

Rationale: Resource quotas must be configured for each tenant namespace, to guarantee isolation and fairness across tenants.

Audit:

As cluster admin, create a tenant

kubectl create -f - <<EOF
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: oil
spec:
  owners:
  - kind: User
    name: alice
  resourceQuotas:
    items:
    - hard:
        limits.cpu: "8"
        limits.memory: 16Gi
        requests.cpu: "8"
        requests.memory: 16Gi
    - hard:
        requests.storage: 100Gi
EOF

./create-user.sh alice oil

As tenant owner, run the following command to create a namespace in the given tenant

kubectl --kubeconfig alice create ns oil-production
kubectl --kubeconfig alice config set-context --current --namespace oil-production

As tenant owner, retrieve the configured quotas in the tenant namespace:

kubectl --kubeconfig alice get quota
NAME            AGE   REQUEST                                      LIMIT
capsule-oil-0   24s   requests.cpu: 0/8, requests.memory: 0/16Gi   limits.cpu: 0/8, limits.memory: 0/16Gi                 
capsule-oil-1   24s   requests.storage: 0/10Gi                     

Make sure that a quota is configured for CPU, memory, and storage resources.

Cleanup: As cluster admin, delete all the created resources

kubectl --kubeconfig cluster-admin delete tenant oil

Require always imagePullPolicy

Profile Applicability: L1

Type: Configuration Check

Category: Data Isolation

Description: Set the image pull policy to Always for tenant workloads.

Rationale: Tenants have to be assured that their private images can only be used by those who have the credentials to pull them.

Audit:

As cluster admin, create a tenant

kubectl create -f - << EOF
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: oil
spec:
  imagePullPolicies:
  - Always
  owners:
  - kind: User
    name: alice
EOF

./create-user.sh alice oil

As tenant owner, run the following command to create a namespace in the given tenant

kubectl --kubeconfig alice create ns oil-production
kubectl --kubeconfig alice config set-context --current --namespace oil-production

As tenant owner, creates a pod in the tenant namespace having imagePullPolicies=IfNotPresent

kubectl --kubeconfig alice apply -f - << EOF
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  namespace: oil-production
spec:
  containers:
  - name: nginx
    image: nginx:latest
    imagePullPolicy: IfNotPresent
EOF

You must receive an error message denying the request:

Error from server
(ImagePullPolicy IfNotPresent for container nginx is forbidden, use one of the followings: Always): error when creating "STDIN": admission webhook "pods.capsule.clastix.io" denied the request:
ImagePullPolicy IfNotPresent for container nginx is forbidden, use one of the followings: Always

Cleanup: As cluster admin, delete all the created resources

kubectl --kubeconfig cluster-admin delete tenant oil

Require PersistentVolumeClaim for storage

Profile Applicability: L1

Type: Behavioral Check

Category: na

Description: Tenants should not be able to use all volume types except PersistentVolumeClaims.

Rationale: In some scenarios, it would be required to disallow usage of any core volume types except PVCs.

Audit:

As cluster admin, define a PodSecurityPolicy allowing only PersistentVolumeClaim volumes and map the policy to a tenant:

kubectl create -f - << EOF
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: tenant
spec:
  privileged: false
  # Required to prevent escalations to root.
  allowPrivilegeEscalation: false
  volumes: 
    - 'persistentVolumeClaim'
  runAsUser:
    rule: RunAsAny
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
EOF

Note: make sure PodSecurityPolicy Admission Control is enabled on the APIs server: --enable-admission-plugins=PodSecurityPolicy

Then create a ClusterRole using or granting the said item

kubectl create -f - << EOF
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: tenant:psp
rules:
- apiGroups: ['policy']
  resources: ['podsecuritypolicies']
  resourceNames: ['tenant']
  verbs: ['use']
EOF

And assign it to the tenant

kubectl apply -f - << EOF
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: oil
  namespace: oil-production
spec:
  owners:
  - kind: User
    name: alice
  additionalRoleBindings:
  - clusterRoleName: tenant:psp
    subjects:
    - kind: "Group"
      apiGroup: "rbac.authorization.k8s.io"
      name: "system:authenticated"
EOF

./create-user.sh alice oil

As tenant owner, run the following command to create a namespace in the given tenant

kubectl --kubeconfig alice create ns oil-production
kubectl --kubeconfig alice config set-context --current --namespace oil-production

As tenant owner, create a pod defining a volume of any of the core type except PersistentVolumeClaim. For example:

kubectl --kubeconfig alice apply -f - << EOF 
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-hostpath-volume
  namespace: oil-production
spec:
  containers:
  - name: busybox
    image: busybox:latest
    command: ["/bin/sleep", "3600"]
    volumeMounts:
    - mountPath: /tmp
      name: volume
  volumes:
  - name: volume
    hostPath:
      # directory location on host
      path: /data
      type: Directory
EOF

You must have the pod blocked by PodSecurityPolicy.

Cleanup: As cluster admin, delete all the created resources

kubectl --kubeconfig cluster-admin delete tenant oil
kubectl --kubeconfig cluster-admin delete PodSecurityPolicy tenant
kubectl --kubeconfig cluster-admin delete ClusterRole tenant:psp

Require PV reclaim policy of delete

Profile Applicability: L1

Type: Configuration Check

Category: Data Isolation

Description: Force a tenant to use a Storage Class with reclaimPolicy=Delete.

Rationale: Tenants have to be assured that their Persistent Volumes cannot be reclaimed by other tenants.

Audit:

As cluster admin, create a Storage Class with reclaimPolicy=Delete

kubectl create -f - << EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: delete-policy
reclaimPolicy: Delete
provisioner: clastix.io/nfs
EOF

As cluster admin, create a tenant and assign the above Storage Class

kubectl create -f - << EOF
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: oil
spec:
  owners:
  - kind: User
    name: alice
  storageClasses:
    allowed:
    - delete-policy
EOF

./create-user.sh alice oil

As tenant owner, run the following command to create a namespace in the given tenant

kubectl --kubeconfig alice create ns oil-production
kubectl --kubeconfig alice config set-context --current --namespace oil-production

As tenant owner, creates a Persistent Volume Claim in the tenant namespace missing the Storage Class or using any other Storage Class:

kubectl --kubeconfig alice apply -f - << EOF
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc
  namespace: oil-production
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 12Gi
EOF

You must receive an error message denying the request:

Error from server (A valid Storage Class must be used, one of the following (delete-policy)):
error when creating "STDIN": admission webhook "pvc.capsule.clastix.io" denied the request:
A valid Storage Class must be used, one of the following (delete-policy)

Cleanup: As cluster admin, delete all the created resources

kubectl --kubeconfig cluster-admin delete tenant oil
kubectl --kubeconfig cluster-admin delete storageclass delete-policy

Require run as non-root user

Profile Applicability: L1

Type: Behavioral Check

Category: Control Plane Isolation

Description: Control container permissions.

Rationale: Processes in containers run as the root user (uid 0), by default. To prevent potential compromise of container hosts, specify a least-privileged user ID when building the container image and require that application containers run as non-root users.

Audit:

As cluster admin, define a PodSecurityPolicy with runAsUser=MustRunAsNonRoot and map the policy to a tenant:

kubectl create -f - << EOF
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: tenant
spec:
  privileged: false
  # Required to prevent escalations to root.
  allowPrivilegeEscalation: false
  runAsUser:
    # Require the container to run without root privileges.
    rule: MustRunAsNonRoot
  supplementalGroups:
    rule: MustRunAs
    ranges:
      # Forbid adding the root group.
      - min: 1
        max: 65535
  fsGroup:
    rule: MustRunAs
    ranges:
      # Forbid adding the root group.
      - min: 1
        max: 65535
EOF

Note: make sure PodSecurityPolicy Admission Control is enabled on the APIs server: --enable-admission-plugins=PodSecurityPolicy

Then create a ClusterRole using or granting the said item

kubectl create -f - << EOF
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: tenant:psp
rules:
- apiGroups: ['policy']
  resources: ['podsecuritypolicies']
  resourceNames: ['tenant']
  verbs: ['use']
EOF

And assign it to the tenant

kubectl apply -f - << EOF
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: oil
spec:
  owners:
  - kind: User
    name: alice
  additionalRoleBindings:
  - clusterRoleName: tenant:psp
    subjects:
    - kind: "Group"
      apiGroup: "rbac.authorization.k8s.io"
      name: "system:authenticated"
EOF

./create-user.sh alice oil

As tenant owner, run the following command to create a namespace in the given tenant

kubectl --kubeconfig alice create ns oil-production
kubectl --kubeconfig alice config set-context --current --namespace oil-production

As tenant owner, create a pod or container that does not set runAsNonRoot to true in its securityContext, and runAsUser must not be set to 0.

kubectl --kubeconfig alice apply -f - << EOF 
apiVersion: v1
kind: Pod
metadata:
  name: pod-run-as-root
  namespace: oil-production
spec:
  containers:
  - name: busybox
    image: busybox:latest
    command: ["/bin/sleep", "3600"]
EOF

You must have the pod blocked by PodSecurityPolicy.

Cleanup: As cluster admin, delete all the created resources

kubectl --kubeconfig cluster-admin delete tenant oil
kubectl --kubeconfig cluster-admin delete PodSecurityPolicy tenant
kubectl --kubeconfig cluster-admin delete ClusterRole tenant:psp

2 - Operating

Get started with using tenancy on Kubernetes

Installation

Make sure you have access to a Kubernetes cluster as administrator. See the Artifacthub Page for a complete list of available versions and installation instructions.

$ helm repo add projectcapsule https://projectcapsule.github.io/charts
$ helm install capsule projectcapsule/capsule -n capsule-system --create-namespace

Create your first Tenant

In Capsule, a Tenant is an abstraction to group multiple namespaces in a single entity within a set of boundaries defined by the Cluster Administrator. The tenant is then assigned to a user or group of users who is called Tenant Owner.

Capsule defines a Tenant as Custom Resource with cluster scope.

Create the tenant as cluster admin:

kubectl create -f - << EOF
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: oil
spec:
  owners:
  - name: alice
    kind: User
EOF

You can check the tenant just created

$ kubectl get tenants
NAME   STATE    NAMESPACE QUOTA   NAMESPACE COUNT   NODE SELECTOR   AGE
solar    Active                     0                                 10s

Login as Tenant Owner

Each tenant comes with a delegated user or group of users acting as the tenant admin. In the Capsule jargon, this is called the Tenant Owner. Other users can operate inside a tenant with different levels of permissions and authorizations assigned directly by the Tenant Owner.

Capsule does not care about the authentication strategy used in the cluster and all the Kubernetes methods of authentication are supported. The only requirement to use Capsule is to assign tenant users to the group defined by –capsule-user-group option, which defaults to capsule.clastix.io.

Assignment to a group depends on the authentication strategy in your cluster.

For example, if you are using capsule.clastix.io, users authenticated through a X.509 certificate must have capsule.clastix.io as Organization: -subj "/CN=${USER}/O=capsule.clastix.io"

Users authenticated through an OIDC token must have in their token:

...
"users_groups": [
  "capsule.clastix.io",
  "other_group"
]

The hack/create-user.sh can help you set up a dummy kubeconfig for the alice user acting as owner of a tenant called solar.

./hack/create-user.sh alice solar
...
certificatesigningrequest.certificates.k8s.io/alice-solar created
certificatesigningrequest.certificates.k8s.io/alice-solar approved
kubeconfig file is: alice-solar.kubeconfig
to use it as alice export KUBECONFIG=alice-solar.kubeconfig

Login as tenant owner

$ export KUBECONFIG=alice-solar.kubeconfig

Impersonation

You can simulate this behavior by using impersonation:

kubectl --as alice --as-group capsule.clastix.io ...

Create namespaces

As tenant owner, you can create namespaces:

$ kubectl create namespace solar-production
$ kubectl create namespace solar-development

or

$ kubectl --as alice --as-group capsule.clastix.io create namespace solar-production
$ kubectl --as alice --as-group capsule.clastix.io create namespace solar-development

And operate with fully admin permissions:

$ kubectl -n solar-development run nginx --image=docker.io/nginx 
$ kubectl -n solar-development get pods

Limiting access

Tenant Owners have full administrative permissions limited to only the namespaces in the assigned tenant. They can create any namespaced resource in their namespaces but they do not have access to cluster resources or resources belonging to other tenants they do not own:

$ kubectl -n kube-system get pods
Error from server (Forbidden): pods is forbidden:
User "alice" cannot list resource "pods" in API group "" in the namespace "kube-system"

See the concepts for getting more cool things you can do with Capsule.

2.1 - Installation

Installing Capsule

List of Tenant API changes:

  • Capsule v0.1.0 bump to v1beta1 from v1alpha1.
  • Capsule v0.2.0 bump to v1beta2 from v1beta1, deprecating v1alpha1.
  • Capsule v0.3.0 missing enums required by Capsule Proxy.

This document aims to provide support and a guide on how to perform a clean upgrade to the latest API version in order to avoid service disruption and data loss.

As an installation method, Helm is given for granted. If you are not using Helm, you might experience problems during the upgrade process.

Considerations

We strongly suggest performing a full backup of your Kubernetes cluster, such as storage and etcd. Use your favorite tool according to your needs.

Upgrading from v0.2.x to v0.3.x

A minor bump has been requested due to some missing enums in the Tenant resource.

Scale down the Capsule controller

Using the kubectl or Helm, scale down the Capsule controller manager: this is required to avoid the old Capsule version from processing objects that aren’t yet installed as a CRD.

helm upgrade -n capsule-system capsule --set "replicaCount=0" 

or

kubectl scale deploy capsule-controller-manager --replicas=0 -n capsule-system 

2.2 - Upgrading

Upgrading Capsule

List of Tenant API changes:

  • Capsule v0.1.0 bump to v1beta1 from v1alpha1.
  • Capsule v0.2.0 bump to v1beta2 from v1beta1, deprecating v1alpha1.
  • Capsule v0.3.0 missing enums required by Capsule Proxy.

This document aims to provide support and a guide on how to perform a clean upgrade to the latest API version in order to avoid service disruption and data loss.

As an installation method, Helm is given for granted. If you are not using Helm, you might experience problems during the upgrade process.

Considerations

We strongly suggest performing a full backup of your Kubernetes cluster, such as storage and etcd. Use your favorite tool according to your needs.

Upgrading from v0.2.x to v0.3.x

A minor bump has been requested due to some missing enums in the Tenant resource.

Scale down the Capsule controller

Using the kubectl or Helm, scale down the Capsule controller manager: this is required to avoid the old Capsule version from processing objects that aren’t yet installed as a CRD.

helm upgrade -n capsule-system capsule --set "replicaCount=0" 

or

kubectl scale deploy capsule-controller-manager --replicas=0 -n capsule-system 

2.3 - Authentication

Capsule does not care about the authentication strategy used in the cluster and all the Kubernetes methods of authentication are supported. The only requirement to use Capsule is to assign tenant users to the group defined by userGroups option in the CapsuleConfiguration, which defaults to capsule.clastix.io.

OIDC

In the following guide, we’ll use Keycloak an Open Source Identity and Access Management server capable to authenticate users via OIDC and release JWT tokens as proof of authentication.

Configuring OIDC Server

Configure Keycloak as OIDC server:

  • Add a realm called caas, or use any existing realm instead
  • Add a group capsule.clastix.io
  • Add a user alice assigned to group capsule.clastix.io
  • Add an OIDC client called kubernetes

For the kubernetes client, create protocol mappers called groups and audience If everything is done correctly, now you should be able to authenticate in Keycloak and see user groups in JWT tokens. Use the following snippet to authenticate in Keycloak as alice user:

$ KEYCLOAK=sso.clastix.io
$ REALM=caas
$ OIDC_ISSUER=${KEYCLOAK}/realms/${REALM}

$ curl -k -s https://${OIDC_ISSUER}/protocol/openid-connect/token \
     -d grant_type=password \
     -d response_type=id_token \
     -d scope=openid \
     -d client_id=${OIDC_CLIENT_ID} \
     -d client_secret=${OIDC_CLIENT_SECRET} \
     -d username=${USERNAME} \
     -d password=${PASSWORD} | jq

The result will include an ACCESS_TOKEN, a REFRESH_TOKEN, and an ID_TOKEN. The access-token can generally be disregarded for Kubernetes. It would be used if the identity provider was managing roles and permissions for the users but that is done in Kubernetes itself with RBAC. The id-token is short lived while the refresh-token has longer expiration. The refresh-token is used to fetch a new id-token when the id-token expires.

{  
   "access_token":"ACCESS_TOKEN",
   "refresh_token":"REFRESH_TOKEN",
   "id_token": "ID_TOKEN",
   "token_type":"bearer",
   "scope": "openid groups profile email"
}

To introspect the ID_TOKEN token run:

$ curl -k -s https://${OIDC_ISSUER}/protocol/openid-connect/introspect \
     -d token=${ID_TOKEN} \
     --user ${OIDC_CLIENT_ID}:${OIDC_CLIENT_SECRET} | jq

The result will be like the following:

{
  "exp": 1601323086,
  "iat": 1601322186,
  "aud": "kubernetes",
  "typ": "ID",
  "azp": "kubernetes",
  "preferred_username": "alice",
  "email_verified": false,
  "acr": "1",
  "groups": [
    "capsule.clastix.io"
  ],
  "client_id": "kubernetes",
  "username": "alice",
  "active": true
}

Configuring Kubernetes API Server

Configuring Kubernetes for OIDC Authentication requires adding several parameters to the API Server. Please, refer to the documentation for details and examples. Most likely, your kube-apiserver.yaml manifest will looks like the following:

spec:
  containers:
  - command:
    - kube-apiserver
    ...
    - --oidc-issuer-url=https://${OIDC_ISSUER}
    - --oidc-ca-file=/etc/kubernetes/oidc/ca.crt
    - --oidc-client-id=${OIDC_CLIENT_ID}
    - --oidc-username-claim=preferred_username
    - --oidc-groups-claim=groups
    - --oidc-username-prefix=-

KinD

As reference, here is an example of a KinD configuration for OIDC Authentication, which can be useful for local testing:

apiVersion: kind.x-k8s.io/v1alpha4
kind: Cluster
nodes:
  - role: control-plane
    kubeadmConfigPatches:
     - |
       kind: ClusterConfiguration
       apiServer:
           extraArgs:
             oidc-issuer-url: https://${OIDC_ISSUER}
             oidc-username-claim: preferred_username
             oidc-client-id: ${OIDC_CLIENT_ID}
             oidc-username-prefix: "keycloak:"
             oidc-groups-claim: groups
             oidc-groups-prefix: "keycloak:"
             enable-admission-plugins: PodNodeSelector       

Configuring kubectl

There are two options to use kubectl with OIDC:

  • OIDC Authenticator
  • Use the –token option

Plugin

One way to use OIDC authentication is the use of a kubectl plugin. The Kubelogin Plugin for kubectl simplifies the process of obtaining an OIDC token and configuring kubectl to use it. Follow the link to obtain installation instructions.

kubectl oidc-login setup \
	--oidc-issuer-url=https://${OIDC_ISSUER} \
	--oidc-client-id=${OIDC_CLIENT_ID} \
	--oidc-client-secret=${OIDC_CLIENT_SECRET}

Manual

To use the OIDC Authenticator, add an oidc user entry to your kubeconfig file:

$ kubectl config set-credentials oidc \
    --auth-provider=oidc \
    --auth-provider-arg=idp-issuer-url=https://${OIDC_ISSUER} \
    --auth-provider-arg=idp-certificate-authority=/path/to/ca.crt \
    --auth-provider-arg=client-id=${OIDC_CLIENT_ID} \
    --auth-provider-arg=client-secret=${OIDC_CLIENT_SECRET} \
    --auth-provider-arg=refresh-token=${REFRESH_TOKEN} \
    --auth-provider-arg=id-token=${ID_TOKEN} \
    --auth-provider-arg=extra-scopes=groups

To use the --token option:

$ kubectl config set-credentials oidc --token=${ID_TOKEN}

Point the kubectl to the URL where the Kubernetes APIs Server is reachable:

$ kubectl config set-cluster mycluster \
    --server=https://kube.projectcapulse.io:6443 \
    --certificate-authority=~/.kube/ca.crt

If your APIs Server is reachable through the capsule-proxy, make sure to use the URL of the capsule-proxy.

Create a new context for the OIDC authenticated users:

$ kubectl config set-context alice-oidc@mycluster \
    --cluster=mycluster \
    --user=oidc

As user alice, you should be able to use kubectl to create some namespaces:

$ kubectl --context alice-oidc@mycluster create namespace oil-production
$ kubectl --context alice-oidc@mycluster create namespace oil-development
$ kubectl --context alice-oidc@mycluster create namespace gas-marketing

Warning: once your ID_TOKEN expires, the kubectl OIDC Authenticator will attempt to refresh automatically your ID_TOKEN using the REFRESH_TOKEN. In case the OIDC uses a self signed CA certificate, make sure to specify it with the idp-certificate-authority option in your kubeconfig file, otherwise you’ll not able to refresh the tokens.

2.4 - Monitoring

Monitoring Capsule Controller and Tenants

The Capsule dashboard allows you to track the health and performance of Capsule manager and tenants, with particular attention to resources saturation, server responses, and latencies. Prometheus and Grafana are requirements for monitoring Capsule.

Quickstart

Metrics

Controller

Proxy

Custom

You can gather more information based on the status of the tenants. These can be scrapped via Kube-State-Metrics CustomResourcesState Metrics. With these you have the possibility to create custom metrics based on the status of the tenants.

Here as an example with the kube-prometheus-stack chart, set the following values:

kube-state-metrics:
  rbac:
    extraRules:
      - apiGroups: [ "capsule.clastix.io" ]
        resources: ["tenants"]
        verbs: [ "list", "watch" ]
  customResourceState:
    enabled: true
    config:
      spec:
        resources:
          - groupVersionKind:
              group: capsule.clastix.io
              kind: "Tenant"
              version: "v1beta2"
            labelsFromPath:
              name: [metadata, name]
            metrics:
              - name: "tenant_size"
                help: "Count of namespaces in the tenant"
                each:
                  type: Gauge
                  gauge:
                    path: [status, size]
                commonLabels:
                  custom_metric: "yes"
                labelsFromPath:
                  capsule_tenant: [metadata, name]
                  kind: [ kind ]
              - name: "tenant_state"
                help: "The operational state of the Tenant"
                each:
                  type: StateSet
                  stateSet:
                    labelName: state
                    path: [status, state]
                    list: [Active, Cordoned]
                commonLabels:
                  custom_metric: "yes"
                labelsFromPath:
                  capsule_tenant: [metadata, name]
                  kind: [ kind ]
              - name: "tenant_namespaces_info"
                help: "Namespaces of a Tenant"
                each:
                  type: Info
                  info:
                    path: [status, namespaces]
                    labelsFromPath:
                      tenant_namespace: []
                commonLabels:
                  custom_metric: "yes"
                labelsFromPath:
                  capsule_tenant: [metadata, name]
                  kind: [ kind ]

This example creates three custom metrics:

  • tenant_size is a gauge that counts the number of namespaces in the tenant.
  • tenant_state is a state set that shows the operational state of the tenant.
  • tenant_namespaces_info is an info metric that shows the namespaces of the tenant.

2.5 - Backup & Restore

Velero is a backup and restore solution that performs data protection, disaster recovery and migrates Kubernetes cluster from on-premises to the Cloud or between different Clouds.

When coming to backup and restore in Kubernetes, we have two main requirements:

  • Configurations backup
  • Data backup

The first requirement aims to backup all the resources stored into etcd database, for example: namespaces, pods, services, deployments, etc. The second is about how to backup stateful application data as volumes.

The main limitation of Velero is the multi tenancy. Currently, Velero does not support multi tenancy meaning it can be only used from admin users and so it cannot provided “as a service” to the users. This means that the cluster admin needs to take care of users’ backup.

Assuming you have multiple tenants managed by Capsule, for example oil and gas, as cluster admin, you can to take care of scheduling backups for:

  • Tenant cluster resources
  • Namespaces belonging to each tenant

Create backup of a tenant

Create a backup of the tenant solar. It consists in two different backups:

  • backup of the tenant resource
  • backup of all the resources belonging to the tenant

To backup the oil tenant selectively, label the tenant as:

kubectl label tenant oil capsule.clastix.io/tenant=solar

and create the backup

velero create backup solar-tenant \
    --include-cluster-resources=true \
    --include-resources=tenants.capsule.clastix.io \
    --selector capsule.clastix.io/tenant=solar

resulting in the following Velero object:

apiVersion: velero.io/v1
kind: Backup
metadata:
  name: solar-tenant
spec:
  defaultVolumesToRestic: false
  hooks: {}
  includeClusterResources: true
  includedNamespaces:
  - '*'
  includedResources:
  - tenants.capsule.clastix.io
  labelSelector:
    matchLabels:
      capsule.clastix.io/tenant: solar
  metadata: {}
  storageLocation: default
  ttl: 720h0m0s

Create a backup of all the resources belonging to the oil tenant namespaces:

velero create backup solar-namespaces \
    --include-cluster-resources=false \
    --include-namespaces solar-production,solar-development,solar-marketing

resulting to the following Velero object:

apiVersion: velero.io/v1
kind: Backup
metadata:
  name: solar-namespaces
spec:
  defaultVolumesToRestic: false
  hooks: {}
  includeClusterResources: false
  includedNamespaces:
  - solar-production
  - solar-development
  - solar-marketing
  metadata: {}
  storageLocation: default
  ttl: 720h0m0s

Velero requires an Object Storage backend where to store backups, you should take care of this requirement before to use Velero.

Restore a tenant from the backup

To recover the tenant after a disaster, or to migrate it to another cluster, create a restore from the previous backups:

velero create restore --from-backup solar-tenant
velero create restore --from-backup solar-namespaces

Using Velero to restore a Capsule tenant can lead to an incomplete recovery of tenant because the namespaces restored with Velero do not have the OwnerReference field used to bind the namespaces to the tenant. For this reason, all restored namespaces are not bound to the tenant:

kubectl get tnt
NAME   STATE    NAMESPACE QUOTA   NAMESPACE COUNT   NODE SELECTOR     AGE
gas    active   9                 5                 {"pool":"gas"}    34m
oil  active   9                 8                 {"pool":"oil"}  33m
solar    active   9                 0 # <<<           {"pool":"solar"}    54m

To avoid this problem you can use the script velero-restore.sh located under the hack/ folder:

./velero-restore.sh --kubeconfing /path/to/your/kubeconfig --tenant "oil" restore

Running this command, we are going to patch the tenant’s namespaces manifests that are actually ownerReferences-less. Once the command has finished its run, you got the tenant back.

kubectl get tnt
NAME   STATE    NAMESPACE QUOTA   NAMESPACE COUNT   NODE SELECTOR     AGE
gas    active   9                 5                 {"pool":"gas"}    44m
solar  active   9                 8                 {"pool":"oil"}  43m
oil    active   9                 3 # <<<           {"pool":"solar"}    12s

2.6 - Pod Security

Control the security of the pods running in the tenant namespaces

In Kubernetes, by default, workloads run with administrative access, which might be acceptable if there is only a single application running in the cluster or a single user accessing it. This is seldom required and you’ll consequently suffer a noisy neighbour effect along with large security blast radiuses.

Many of these concerns were addressed initially by PodSecurityPolicies which have been present in the Kubernetes APIs since the very early days.

The Pod Security Policies are deprecated in Kubernetes 1.21 and removed entirely in 1.25. As replacement, the Pod Security Standards and Pod Security Admission has been introduced. Capsule support the new standard for tenants under its control as well as the oldest approach.

Pod Security Standards

One of the issues with Pod Security Policies is that it is difficult to apply restrictive permissions on a granular level, increasing security risk. Also the Pod Security Policies get applied when the request is submitted and there is no way of applying them to pods that are already running. For these, and other reasons, the Kubernetes community decided to deprecate the Pod Security Policies.

As the Pod Security Policies get deprecated and removed, the Pod Security Standards is used in place. It defines three different policies to broadly cover the security spectrum. These policies are cumulative and range from highly-permissive to highly-restrictive:

  • Privileged: unrestricted policy, providing the widest possible level of permissions.
  • Baseline: minimally restrictive policy which prevents known privilege escalations.
  • Restricted: heavily restricted policy, following current Pod hardening best practices.

Kubernetes provides a built-in Admission Controller to enforce the Pod Security Standards at either:

  1. cluster level which applies a standard configuration to all namespaces in a cluster
  2. namespace level, one namespace at a time

For the first case, the cluster admin has to configure the Admission Controller and pass the configuration to the kube-apiserver by mean of the --admission-control-config-file extra argument, for example:

apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
- name: PodSecurity
  configuration:
    apiVersion: pod-security.admission.config.k8s.io/v1beta1
    kind: PodSecurityConfiguration
    defaults:
      enforce: "baseline"
      enforce-version: "latest"
      warn: "restricted"
      warn-version: "latest"
      audit: "restricted"
      audit-version: "latest"
    exemptions:
      usernames: []
      runtimeClasses: []
      namespaces: [kube-system]

For the second case, he can just assign labels to the specific namespace he wants enforce the policy since the Pod Security Admission Controller is enabled by default starting from Kubernetes 1.23+:

apiVersion: v1
kind: Namespace
metadata:
  labels:
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted
  name: development

Capsule

According to the regular Kubernetes segregation model, the cluster admin has to operate either at cluster level or at namespace level. Since Capsule introduces a further segregation level (the Tenant abstraction), the cluster admin can implement Pod Security Standards at tenant level by simply forcing specific labels on all the namespaces created in the tenant.

As cluster admin, create a tenant with additional labels:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  namespaceOptions:
    additionalMetadata:
      labels:
        pod-security.kubernetes.io/enforce: baseline
        pod-security.kubernetes.io/audit: restricted
        pod-security.kubernetes.io/warn: restricted
  owners:
  - kind: User
    name: alice

All namespaces created by the tenant owner, will inherit the Pod Security labels:

apiVersion: v1
kind: Namespace
metadata:
  labels:
    capsule.clastix.io/tenant: solar
    kubernetes.io/metadata.name: solar-development
    name: solar-development
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted
  name: solar-development
  ownerReferences:
  - apiVersion: capsule.clastix.io/v1beta2
    blockOwnerDeletion: true
    controller: true
    kind: Tenant
    name: solar

and the regular Pod Security Admission Controller does the magic:

kubectl --kubeconfig alice-oil.kubeconfig apply -f - << EOF
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  namespace: solar-production
spec:
  containers:
  - image: nginx
    name: nginx
    ports:
    - containerPort: 80
    securityContext:
      privileged: true
EOF

The request gets denied:

Error from server (Forbidden): error when creating "STDIN":
pods "nginx" is forbidden: violates PodSecurity "baseline:latest": privileged
(container "nginx" must not set securityContext.privileged=true)

If the tenant owner tries to change o delete the above labels, Capsule will reconcile them to the original tenant manifest set by the cluster admin.

As additional security measure, the cluster admin can also prevent the tenant owner to make an improper usage of the above labels:

kubectl annotate tenant solar \
  capsule.clastix.io/forbidden-namespace-labels-regexp="pod-security.kubernetes.io\/(enforce|warn|audit)"

In that case, the tenant owner gets denied if she tries to use the labels:

kubectl --kubeconfig alice-solar.kubeconfig label ns solar-production \
    pod-security.kubernetes.io/enforce=restricted \
    --overwrite

Error from server (Label pod-security.kubernetes.io/audit is forbidden for namespaces in the current Tenant ...

Pod Security Policies

As stated in the documentation, “PodSecurityPolicies enable fine-grained authorization of pod creation and updates. A Pod Security Policy is a cluster-level resource that controls security sensitive aspects of the pod specification. The PodSecurityPolicy objects define a set of conditions that a pod must run with in order to be accepted into the system, as well as defaults for the related fields.”

Using the Pod Security Policies, the cluster admin can impose limits on pod creation, for example the types of volume that can be consumed, the linux user that the process runs as in order to avoid running things as root, and more. From multi-tenancy point of view, the cluster admin has to control how users run pods in their tenants with a different level of permission on tenant basis.

Assume the Kubernetes cluster has been configured with Pod Security Policy Admission Controller enabled in the APIs server: --enable-admission-plugins=PodSecurityPolicy

The cluster admin creates a PodSecurityPolicy:

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: psp:restricted
spec:
  privileged: false
  # Required to prevent escalations to root.
  allowPrivilegeEscalation: false

Then create a ClusterRole using or granting the said item

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: psp:restricted
rules:
- apiGroups: ['policy']
  resources: ['podsecuritypolicies']
  resourceNames: ['psp:restricted']
  verbs: ['use']

He can assign this role to all namespaces in a tenant by setting the tenant manifest:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: alice
    kind: User
  additionalRoleBindings:
  - clusterRoleName: psp:privileged
    subjects:
    - kind: "Group"
      apiGroup: "rbac.authorization.k8s.io"
      name: "system:authenticated"

With the given specification, Capsule will ensure that all tenant namespaces will contain a RoleBinding for the specified Cluster Role:

kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: 'capsule-solar-psp:privileged'
  namespace: solar-production
  labels:
    capsule.clastix.io/tenant: solar
subjects:
  - kind: Group
    apiGroup: rbac.authorization.k8s.io
    name: 'system:authenticated'
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: 'psp:privileged'

Capsule admission controller forbids the tenant owner to run privileged pods in solar-production namespace and perform privilege escalation as declared by the above Cluster Role psp:privileged.

As tenant owner, creates a namespace:

kubectl --kubeconfig alice-solar.kubeconfig create ns solar-production

and create a pod with privileged permissions:

kubectl --kubeconfig alice-solar.kubeconfig apply -f - << EOF
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  namespace: solar-production
spec:
  containers:
  - image: nginx
    name: nginx
    ports:
    - containerPort: 80
    securityContext:
      privileged: true
EOF

Since the assigned PodSecurityPolicy explicitly disallows privileged containers, the tenant owner will see her request to be rejected by the Pod Security Policy Admission Controller.

2.7 - Controller Options

Understand the Capsule configuration options and how to use them.

The configuration for the capsule controller is done via it’s dedicated configration Custom Resource. You can explain the configuration options and how to use them:

CapsuleConfiguration

The configuration for Capsule is done via it’s dedicated configration Custom Resource. You can explain the configuration options and how to use them:

kubectl explain capsuleConfiguration.spec

enableTLSReconciler

Toggles the TLS reconciler, the controller that is able to generate CA and certificates for the webhooks when not using an already provided CA and certificate, or when these are managed externally with Vault, or cert-manager.

forceTenantPrefix

Enforces the Tenant owner, during Namespace creation, to name it using the selected Tenant name as prefix, separated by a dash. This is useful to avoid Namespace name collision in a public CaaS environment.

nodeMetadata

Allows to set the forbidden metadata for the worker nodes that could be patched by a Tenant. This applies only if the Tenant has an active NodeSelector, and the Owner have right to patch their nodes.

overrides

Allows to set different name rather than the canonical one for the Capsule configuration objects, such as webhook secret or configurations.

protectedNamespaceRegex

Disallow creation of namespaces, whose name matches this regexp

userGroups

Names of the groups for Capsule users. Users must have this group to be considered for the Capsule tenancy. If a user does not have any group mentioned here, they are not recognized as a Capsule user.

Controller Options

Depending on the version of the Capsule Controller, the configuration options may vary. You can view the options for the latest version of the Capsule Controller here or by executing the controller locally:

$ docker run ghcr.io/projectcapsule/capsule:v0.6.0-rc0 -h
2024/02/25 13:21:21 maxprocs: Leaving GOMAXPROCS=4: CPU quota undefined
Usage of /ko-app/capsule:
      --configuration-name string         The CapsuleConfiguration resource name to use (default "default")
      --enable-leader-election            Enable leader election for controller manager. Enabling this will ensure there is only one active controller manager.
      --metrics-addr string               The address the metric endpoint binds to. (default ":8080")
      --version                           Print the Capsule version and exit
      --webhook-port int                  The port the webhook server binds to. (default 9443)
      --zap-devel                         Development Mode defaults(encoder=consoleEncoder,logLevel=Debug,stackTraceLevel=Warn). Production Mode defaults(encoder=jsonEncoder,logLevel=Info,stackTraceLevel=Error)
      --zap-encoder encoder               Zap log encoding (one of 'json' or 'console')
      --zap-log-level level               Zap Level to configure the verbosity of logging. Can be one of 'debug', 'info', 'error', or any integer value > 0 which corresponds to custom debug levels of increasing verbosity
      --zap-stacktrace-level level        Zap Level at and above which stacktraces are captured (one of 'info', 'error', 'panic').
      --zap-time-encoding time-encoding   Zap time encoding (one of 'epoch', 'millis', 'nano', 'iso8601', 'rfc3339' or 'rfc3339nano'). Defaults to 'epoch'.

3 - Tenants

Understand principles and concepts of Capsule Tenants

Capsule is a framework to implement multi-tenant and policy-driven scenarios in Kubernetes. In this tutorial, we’ll focus on a hypothetical case covering the main features of the Capsule Operator. This documentation is styled in a tutorial format, and it’s designed to be read in sequence. We’ll start with the basics and then move to more advanced topics.

Acme Corp, our sample organization, is building a Container as a Service platform (CaaS) to serve multiple lines of business, or departments, e.g. Oil, Gas, Solar, Wind, Water. Each department has its team of engineers that are responsible for the development, deployment, and operating of their digital products. We’ll work with the following actors:

  • Bill: the cluster administrator from the operations department of Acme Corp.
  • Alice: the project leader in the Solar & Green departments. She is responsible for a team made of different job responsibilities: e.g. developers, administrators, SRE engineers, etc.
  • Joe: works as a lead developer of a distributed team in Alice’s organization.
  • Bob: is the head of engineering for the Water department, the main and historical line of business at Acme Corp.

This scenario will guide you through the following topics.

3.1 - Permissions

Grant permissions for tenants

Ownership

Capsule introduces the principal, that tenants must have owners. The owner of a tenant is a user or a group of users that have the right to create, delete, and manage the tenant’s namespaces and other tenant resources. However an owner does not have the permissions to manage the tenants they are owner of. This is still done by cluster-administrators.

Group Scope

Capsule selects users, which are eligable to be considered for tenancy by their group. The define the group of users that can be considered for tenancy, you can use the userGroups option in the CapsuleConfiguration.

Another commonly used example if you want to promote serviceaccount to tenant-owners, their group must be present:

apiVersion: capsule.clastix.io/v1beta2
kind: CapsuleConfiguration
metadata:
  name: default
spec:
  userGroups:
  - solar-users
  - system:serviceaccounts:tenant-system

All serviceAccounts in the tenant-system namespace will be considered for tenancy and can be promoted to tenant owners.

Assignment

Learn how to assign ownership to users, groups and serviceaccounts.

Assigning Ownership to Users

Bill, the cluster admin, receives a new request from Acme Corp’s CTO asking for a new tenant to be onboarded and Alice user will be the tenant owner. Bill then assigns Alice’s identity of alice in the Acme Corp. identity management system. Since Alice is a tenant owner, Bill needs to assign alice the Capsule group defined by –capsule-user-group option, which defaults to projectcapsule.dev.

To keep things simple, we assume that Bill just creates a client certificate for authentication using X.509 Certificate Signing Request, so Alice’s certificate has "/CN=alice/O=projectcapsule.dev".

Bill creates a new tenant oil in the CaaS management portal according to the tenant’s profile:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: alice
    kind: User

Bill checks if the new tenant is created and operational:

kubectl get tenant solar
NAME   STATE    NAMESPACE QUOTA   NAMESPACE COUNT   NODE SELECTOR   AGE
solar    Active                     0                                 33m

Note that namespaces are not yet assigned to the new tenant. The tenant owners are free to create their namespaces in a self-service fashion and without any intervention from Bill.

Once the new tenant solar is in place, Bill sends the login credentials to Alice. Alice can log in using her credentials and check if she can create a namespace

kubectl auth can-i create namespaces
yes

or even delete the namespace

kubectl auth can-i delete ns -n solar-production
yes

However, cluster resources are not accessible to Alice

kubectl auth can-i get namespaces
no

kubectl auth can-i get nodes
no

kubectl auth can-i get persistentvolumes
no

including the Tenant resources

kubectl auth can-i get tenants
no

Group of subjects as tenant owner

In the example above, Bill assigned the ownership of solar tenant to alice user. If another user, e.g. Bob needs to administer the solar tenant, Bill can assign the ownership of solar tenant to such user too:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: alice
    kind: User
  - name: bob
    kind: User

However, it’s more likely that Bill assigns the ownership of the solar tenant to a group of users instead of a single one, especially if you use OIDC AUthentication. Bill creates a new group account solar-users in the Acme Corp. identity management system and then he assigns Alice and Bob identities to the solar-users group.

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: solar-users
    kind: Group

With the configuration above, any user belonging to the solar-users group will be the owner of the oil tenant with the same permissions of Alice. For example, Bob can log in with his credentials and issue

kubectl auth can-i create namespaces
yes

All the groups you want to promot to Tenant Owners must be part of the Group Scope. You have to add solar-users to the CapsuleConfiguration Group Scope to make it work.

ServiceAccounts

You can use the Group subject to grant serviceaccounts the ownership of a tenant. For example, you can create a group of serviceaccounts and assign it to the tenant:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: system:serviceaccount:tenant-system:robot
    kind: ServiceAccount

Bill can create a Service Account called robot, for example, in the tenant-system namespace and leave it to act as Tenant Owner of the oil tenant

kubectl --as system:serviceaccount:tenant-system:robot --as-group projectcapsule.dev auth can-i create namespaces
yes

since each service account in a namespace is a member of following group:

system:serviceaccounts:{service-account-namespace}

You have to add system:serviceaccounts:{service-account-namespace} to the CapsuleConfiguration Group Scope to make it work.

Owner Roles

By default, all Tenant Owners will be granted with two ClusterRole resources using the RoleBinding API:

  1. admin: the Kubernetes default one, admin, that grants most of the namespace scoped resources
  2. capsule-namespace-deleter: a custom clusterrole, created by Capsule, allowing to delete the created namespaces

You can observe this behavior when you get the tenant solar:

$ kubectl get tnt solar -o yaml
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  labels:
    kubernetes.io/metadata.name: solar
  name: solar
spec:
  ingressOptions:
    hostnameCollisionScope: Disabled
  limitRanges: {}
  networkPolicies: {}
  owners:
  # -- HERE -- #
  - clusterRoles:
    - admin
    - capsule-namespace-deleter
    kind: User
    name: alice
  resourceQuotas:
    scope: Tenant
status:
  namespaces:
  - solar-production
  - solar-system
  size: 2
  state: Active

In the example below, assuming the tenant owner creates a namespace solar-production in Tenant solar, you’ll see the Role Bindings giving the tenant owner full permissions on the tenant namespaces:

$ kubectl get rolebinding -n solar-production
NAME                                        ROLE                                    AGE
capsule-solar-0-admin                       ClusterRole/admin                       111m
capsule-solar-1-capsule-namespace-deleter   ClusterRole/capsule-namespace-deleter   111m

When Alice creates the namespaces, the Capsule controller assigns to Alice the following permissions, so that Alice can act as the admin of all the tenant namespaces:

$ kubectl get rolebinding -n solar-production -o yaml
apiVersion: v1
items:
- apiVersion: rbac.authorization.k8s.io/v1
  kind: RoleBinding
  metadata:
    creationTimestamp: "2024-02-25T14:02:36Z"
    labels:
      capsule.clastix.io/role-binding: 8fb969aaa7a67b71
      capsule.clastix.io/tenant: solar
    name: capsule-solar-0-admin
    namespace: solar-production
    ownerReferences:
    - apiVersion: capsule.clastix.io/v1beta2
      blockOwnerDeletion: true
      controller: true
      kind: Tenant
      name: solar
      uid: 1e6f11b9-960b-4fdd-82ee-7cd91a2db052
    resourceVersion: "2980"
    uid: 939da5ae-7fec-4300-8db2-223d3049b43f
  roleRef:
    apiGroup: rbac.authorization.k8s.io
    kind: ClusterRole
    name: admin
  subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: User
    name: alice
- apiVersion: rbac.authorization.k8s.io/v1
  kind: RoleBinding
  metadata:
    creationTimestamp: "2024-02-25T14:02:36Z"
    labels:
      capsule.clastix.io/role-binding: b8822dde20953fb1
      capsule.clastix.io/tenant: solar
    name: capsule-solar-1-capsule-namespace-deleter
    namespace: solar-production
    ownerReferences:
    - apiVersion: capsule.clastix.io/v1beta2
      blockOwnerDeletion: true
      controller: true
      kind: Tenant
      name: solar
      uid: 1e6f11b9-960b-4fdd-82ee-7cd91a2db052
    resourceVersion: "2982"
    uid: bbb4cd79-ce0d-41b0-a52d-dbed71a9b48a
  roleRef:
    apiGroup: rbac.authorization.k8s.io
    kind: ClusterRole
    name: capsule-namespace-deleter
  subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: User
    name: alice
kind: List
metadata:
  resourceVersion: ""

In some cases, the cluster admin needs to narrow the range of permissions assigned to tenant owners by assigning a Cluster Role with less permissions than above. Capsule supports the dynamic assignment of any ClusterRole resources for each Tenant Owner.

For example, assign user Joe the tenant ownership with only view permissions on tenant namespaces:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: alice
    kind: User
  - name: joe
    kind: User
    clusterRoles:
      - view

you’ll see the new Role Bindings assigned to Joe:

$ kubectl get rolebinding -n solar-production
NAME                                        ROLE                                    AGE
capsule-solar-0-admin                       ClusterRole/admin                       114m
capsule-solar-1-capsule-namespace-deleter   ClusterRole/capsule-namespace-deleter   114m
capsule-solar-2-view                        ClusterRole/view                        1s

so that Joe can only view resources in the tenant namespaces:

kubectl --as joe --as-group projectcapsule.dev auth can-i delete pods -n solar-production
no

Please, note that, despite created with more restricted permissions, a tenant owner can still create namespaces in the tenant because he belongs to the projectcapsule.dev group. If you want a user not acting as tenant owner, but still operating in the tenant, you can assign additional RoleBindings without assigning him the tenant ownership.

Custom ClusterRoles are also supported. Assuming the cluster admin creates:

kubectl apply -f - << EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: tenant-resources
rules:
- apiGroups: ["capsule.clastix.io"]
  resources: ["tenantresources"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
EOF

These permissions can be granted to Joe

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: alice
    kind: User
  - name: joe
    kind: User
    clusterRoles:
      - view
      - tenant-resources

For the given configuration, the resulting RoleBinding resources are the following ones:

$ kubectl -n solar-production get rolebindings
NAME                                              ROLE                                            AGE
capsule-solar-0-admin                               ClusterRole/admin                               90s
capsule-solar-1-capsule-namespace-deleter           ClusterRole/capsule-namespace-deleter           90s
capsule-solar-2-view                                ClusterRole/view                                90s
capsule-solar-3-tenant-resources                    ClusterRole/prometheus-servicemonitors-viewer   25s

Role Aggregation

Sometimes the admin role is missing certain permissions. You can aggregate the admin role with a custom role, for example, gateway-resources:

kubectl apply -f - << EOF
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: gateway-resources
  labels:
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
rules:
- apiGroups: ["gateway.networking.k8s.io"]
  resources: ["gateways"]
  verbs: ["*"]
EOF

Proxy Owner Authorization

This feature will be deprecated in a future release of Capsule. Instead use ProxySettings

When you are using the Capsule Proxy, the tenant owner can list the cluster-scoped resources. You can control the permissions to cluster scoped resources by defining proxySettings for a tenant owner.

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: joe
    kind: User
    clusterRoles:
      - view
      - tenant-resources

Additional Rolebindings

With tenant rolebindings you can distribute namespaced rolebindings to all namespaces which are assigned to a namespace. Essentially it is then ensured the defined rolebindings are present and reconciled in all namespaces of the tenant. This is useful if users should have more insights on tenant basis. Let’s look at an example.

Assuming a cluster-administrator creates the following clusterRole:

kubectl apply -f - << EOF
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: prometheus-servicemonitors-viewer
rules:
- apiGroups: ["monitoring.coreos.com"]
  resources: ["servicemonitors"]
  verbs: ["get", "list", "watch"]
EOF

Now the cluster-administrator creates wants to bind this clusterRole in each namespace of the solar tenant. He creates a tenantRoleBinding:

kubectl apply -f - << EOF
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: alice
    kind: User
  additionalRoleBindings:
  - clusterRoleName: 'prometheus-servicemonitors-viewer'
    subjects:
    - apiGroup: rbac.authorization.k8s.io
      kind: User
      name: joe
EOF

As you can see the subjects is a classic rolebinding subject. This way you grant permissions to the subject user Joe, who only can list and watch servicemonitors in the solar tenant namespaces, but has no other permissions.

Custom Resources

Capsule grants admin permissions to the tenant owners but is only limited to their namespaces. To achieve that, it assigns the ClusterRole admin to the tenant owner. This ClusterRole does not permit the installation of custom resources in the namespaces.

In order to leave the tenant owner to create Custom Resources in their namespaces, the cluster admin defines a proper Cluster Role. For example:

kubectl apply -f - << EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: argoproj-provisioner
rules:
- apiGroups:
  - argoproj.io
  resources:
  - applications
  - appprojects
  verbs:
  - create
  - get
  - list
  - watch
  - update
  - patch
  - delete
EOF

Bill can assign this role to any namespace in the Alice’s tenant by setting it in the tenant manifest:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: alice
    kind: User
  - name: joe
    kind: User
  additionalRoleBindings:
    - clusterRoleName: 'argoproj-provisioner'
      subjects:
        - apiGroup: rbac.authorization.k8s.io
          kind: User
          name: alice
        - apiGroup: rbac.authorization.k8s.io
          kind: User
          name: joe

With the given specification, Capsule will ensure that all Alice’s namespaces will contain a RoleBinding for the specified Cluster Role. For example, in the solar-production namespace, Alice will see:

kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: capsule-solar-argoproj-provisioner
  namespace: solar-production
subjects:
  - kind: User
    apiGroup: rbac.authorization.k8s.io
    name: alice
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: argoproj-provisioner

With the above example, Capsule is leaving the tenant owner to create namespaced custom resources.

Take Note: a tenant owner having the admin scope on its namespaces only, does not have the permission to create Custom Resources Definitions (CRDs) because this requires a cluster admin permission level. Only Bill, the cluster admin, can create CRDs. This is a known limitation of any multi-tenancy environment based on a single shared control plane.

3.2 - Namespaces

Assign Namespace to tenants

Alice, once logged with her credentials, can create a new namespace in her tenant, as simply issuing:

kubectl create ns solar-production

Alice started the name of the namespace prepended by the name of the tenant: this is not a strict requirement but it is highly suggested because it is likely that many different tenants would like to call their namespaces production, test, or demo, etc.

The enforcement of this naming convention is optional and can be controlled by the cluster administrator with forceTenantPrefix option.

Alice can deploy any resource in any of the namespaces

kubectl -n solar-development run nginx --image=docker.io/nginx 
kubectl -n solar-development get pods

Multiple Tenants

A single team is likely responsible for multiple lines of business. For example, in our sample organization Acme Corp., Alice is responsible for both the Solar and Green lines of business. It’s more likely that Alice requires two different tenants, for example, solar and green to keep things isolated.

By design, the Capsule operator does not permit a hierarchy of tenants, since all tenants are at the same levels. However, we can assign the ownership of multiple tenants to the same user or group of users.

Bill, the cluster admin, creates multiple tenants having alice as owner:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: alice
    kind: User

and

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: green
spec:
  owners:
  - name: alice
    kind: User

Alternatively, the ownership can be assigned to a group called solar-and-green for both tenants:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: solar-and-green
    kind: Group

See Ownership for more details on how to assign ownership to a group of users.

The two tenants remain isolated from each other in terms of resources assignments, e.g. ResourceQuotas, Nodes, StorageClasses and IngressClasses, and in terms of governance, e.g. NetworkPolicies, PodSecurityPolicies, Trusted Registries, etc.

When Alice logs in, she has access to all namespaces belonging to both the solar and green tenants.

Tenant Prefix

We recommend to use the forceTenantPrefix for production environments.

If the forceTenantPrefix option is enabled, which is not the case by default, the namespaces are automatically assigned to the right tenant by Capsule because the operator does a lookup on the tenant names.

For example, Alice creates a namespace called solar-production and green-production:

kubectl create ns solar-production
kubectl create ns green-production

And they are assigned to the tenant based on their prefix:

$ kubectl get tnt
NAME    STATE    NAMESPACE QUOTA   NAMESPACE COUNT   NODE SELECTOR   AGE
green   Active                     1                                 3m26s
solar   Active                     1                                 3m26s

However alice can create any namespace, which does not have a prefix of any of the tenants she owns, for example production:

$ kubectl create ns production
Error from server (Forbidden): admission webhook "owner.namespace.capsule.clastix.io" denied the request: The Namespace prefix used doesn't match any available Tenant

Label

The default behavior, if the forceTenantPrefix option is not enabled, Alice needs to specify the tenant name as a label capsule.clastix.io/tenant=<desired_tenant> in the namespace manifest:

kind: Namespace
apiVersion: v1
metadata:
  name: solar-production
  labels:
    capsule.clastix.io/tenant: solar

If not specified, Capsule will deny with the following message: Unable to assign namespace to tenant:

$ kubectl create ns solar-production
Error from server (Forbidden): admission webhook "owner.namespace.capsule.clastix.io" denied the request: Please use capsule.clastix.io/tenant label when creating a namespace

3.3 - Quotas

Strategies on granting quotas on tenant-basis

With help of Capsule, Bill, the cluster admin, can set and enforce resources quota and limits for Alice’s tenant.

Set resources quota for each namespace in the Alice’s tenant by defining them in the tenant spec:

Resource Quota

With help of Capsule, Bill, the cluster admin, can set and enforce resources quota and limits for Alice’s tenant.

Set resources quota for each namespace in the Alice’s tenant by defining them in the tenant spec:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: alice
    kind: User
  namespaceOptions:
    quota: 3
  resourceQuotas:
    scope: Tenant
    items:
    - hard:
        limits.cpu: "8"
        limits.memory: 16Gi
        requests.cpu: "8"
        requests.memory: 16Gi
    - hard:
        pods: "10"

The resource quotas above will be inherited by all the namespaces created by Alice. In our case, when Alice creates the namespace solar-production, Capsule creates the following resource quotas:

kind: ResourceQuota
apiVersion: v1
metadata:
  name: capsule-solar-0
  namespace: solar-production
  labels:
    tenant: solar
spec:
  hard:
    limits.cpu: "8"
    limits.memory: 16Gi
    requests.cpu: "8"
    requests.memory: 16Gi
---
kind: ResourceQuota
apiVersion: v1
metadata:
  name: capsule-oil-1
  namespace: solar-production
  labels:
    tenant: solar
spec:
  hard:
    pods : "10"

Alice can create any resource according to the assigned quotas:

kubectl -n solar-production create deployment nginx --image nginx:latest --replicas 4

At namespace solar-production level, Alice can see the used resources by inspecting the status in ResourceQuota:

kubectl -n solar-production get resourcequota capsule-solar-1 -o yaml
...
status:
  hard:
    pods: "10"
    services: "50"
  used:
    pods: "4"

When defining ResourceQuotas you might want to consider distributing LimitRanges via Tenant Replications:

apiVersion: capsule.clastix.io/v1beta2
kind: TenantResource
metadata:
  name: solar-limitranges
  namespace: solar-system
spec:
  resyncPeriod: 60s
  resources:
    - namespaceSelector:
        matchLabels:
          capsule.clastix.io/tenant: solar
      rawItems:
        - apiVersion: v1
          kind: LimitRange
          metadata:
            name: cpu-resource-constraint
          spec:
            limits:
            - default: # this section defines default limits
                cpu: 500m
              defaultRequest: # this section defines default requests
                cpu: 500m
              max: # max and min define the limit range
                cpu: "1"
              min:
                cpu: 100m
              type: Container

Tenant Scope

This approach might lead to resource over consumption. Currently we don’t have a way to consistently assure the resource quota at tenant level. See issues issue/49

By setting enforcement at tenant level, i.e. spec.resourceQuotas.scope=Tenant, Capsule aggregates resources usage for all namespaces in the tenant and adjusts all the ResourceQuota usage as aggregate. In such case, Alice can check the used resources at the tenant level by inspecting the annotations in ResourceQuota object of any namespace in the tenant:

kubectl -n solar-production get resourcequotas capsule-solar-1 -o yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  annotations:
    quota.capsule.clastix.io/used-pods: "4"
    quota.capsule.clastix.io/hard-pods: "10"
...

or

kubectl -n solar-development get resourcequotas capsule-solar-1 -o yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  annotations:
    quota.capsule.clastix.io/used-pods: "4"
    quota.capsule.clastix.io/hard-pods: "10"
...

When the aggregate usage for all namespaces crosses the hard quota, then the native ResourceQuota Admission Controller in Kubernetes denies Alice’s request to create resources exceeding the quota:

kubectl -n solar-development create deployment nginx --image nginx:latest --replicas 10

Alice cannot schedule more pods than the admitted at tenant aggregate level.

kubectl -n solar-development get pods
NAME                     READY   STATUS    RESTARTS   AGE
nginx-55649fd747-6fzcx   1/1     Running   0          12s
nginx-55649fd747-7q6x6   1/1     Running   0          12s
nginx-55649fd747-86wr5   1/1     Running   0          12s
nginx-55649fd747-h6kbs   1/1     Running   0          12s
nginx-55649fd747-mlhlq   1/1     Running   0          12s
nginx-55649fd747-t48s5   1/1     Running   0          7s

and

kubectl -n solar-production get pods
NAME                     READY   STATUS    RESTARTS   AGE
nginx-55649fd747-52fsq   1/1     Running   0          22m
nginx-55649fd747-9q8n5   1/1     Running   0          22m
nginx-55649fd747-r8vzr   1/1     Running   0          22m
nginx-55649fd747-tkv7m   1/1     Running   0          22m

Namespace Scope

By setting enforcement at the namespace level, i.e. spec.resourceQuotas.scope=Namespace, Capsule does not aggregate the resources usage and all enforcement is done at the namespace level.

Namespace Quotas

The cluster admin, can control how many namespaces Alice, creates by setting a quota in the tenant manifest spec.namespaceOptions.quota:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: alice
    kind: User
  namespaceOptions:
    quota: 3

Alice can create additional namespaces according to the quota:

kubectl create ns solar-development
kubectl create ns solar-test

While Alice creates namespaces, the Capsule controller updates the status of the tenant so Bill, the cluster admin, can check the status:

$ kubectl describe tenant solar
...
status:
  Namespaces:
    solar-development
    solar-production
    solar-test
  Size:   3 # current namespace count
  State:  Active
...

Once the namespace quota assigned to the tenant has been reached, Alice cannot create further namespaces:

$ kubectl create ns solar-training
Error from server (Cannot exceed Namespace quota: please, reach out to the system administrators):
admission webhook "namespace.capsule.clastix.io" denied the request.

The enforcement on the maximum number of namespaces per Tenant is the responsibility of the Capsule controller via its Dynamic Admission Webhook capability.

Custom Resources

This feature is still in an alpha stage and requires a high amount of computing resources due to the dynamic client requests.

Kubernetes offers by default ResourceQuota resources, aimed to limit the number of basic primitives in a Namespace.

Capsule already provides the sharing of these constraints across the Tenant Namespaces, however, limiting the amount of namespaced Custom Resources instances is not upstream-supported.

Starting from Capsule v0.1.1, this can be done using a special annotation in the Tenant manifest.

Imagine the case where a Custom Resource named mysqls in the API group databases.acme.corp/v1 usage must be limited in the Tenant solar: this can be done as follows.

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
  annotations:
    quota.resources.capsule.clastix.io/mysqls.databases.acme.corp_v1: "3"
spec:
  additionalRoleBindings:
  - clusterRoleName: mysql-namespace-admin
    subjects:
      - kind: User
        name: alice
  owners:
  - name: alice
    kind: User

The Additional Role Binding referring to the Cluster Role mysql-namespace-admin is required to let Alice manage their Custom Resource instances.

The pattern for the quota.resources.capsule.clastix.io annotation is the following:

  • quota.resources.capsule.clastix.io/${PLURAL_NAME}.${API_GROUP}_${API_VERSION}

You can figure out the required fields using kubectl api-resources.

When alice will create a MySQL instance in one of their Tenant Namespace, the Cluster Administrator can easily retrieve the overall usage.

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
  annotations:
    quota.resources.capsule.clastix.io/mysqls.databases.acme.corp_v1: "3"
    used.resources.capsule.clastix.io/mysqls.databases.acme.corp_v1: "1"
spec:
  owners:
  - name: alice
    kind: User

Node Pools

Bill, the cluster admin, can dedicate a pool of worker nodes to the oil tenant, to isolate the tenant applications from other noisy neighbors. To achieve this approach use NodeSelectors.

3.4 - Enforcement

Configure policies and restrictions on tenant-basis

Metadata

Namespaces

The cluster admin can “taint” the namespaces created by tenant owners with additional metadata as labels and annotations. There is no specific semantic assigned to these labels and annotations: they will be assigned to the namespaces in the tenant as they are created. This can help the cluster admin to implement specific use cases as, for example, leave only a given tenant to be backed up by a backup service.

Assigns additional labels and annotations to all namespaces created in the solar tenant:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: oil
spec:
  owners:
  - name: alice
    kind: User
  namespaceOptions:
    additionalMetadata:
      annotations:
        storagelocationtype: s3
      labels:
        projectcapsule.dev/backup: "true"

When the tenant owner creates a namespace, it inherits the given label and/or annotation:

apiVersion: v1
kind: Namespace
metadata:
  annotations:
    storagelocationtype: s3
  labels:
    capsule.clastix.io/tenant: solar
    kubernetes.io/metadata.name: solar-production
    name: solar-production
    projectcapsule.dev/backup: "true"
  name: solar-production
  ownerReferences:
  - apiVersion: capsule.clastix.io/v1beta2
    blockOwnerDeletion: true
    controller: true
    kind: Tenant
    name: solar
spec:
  finalizers:
  - kubernetes
status:
  phase: Active

Deny labels and annotations on Namespaces

By default, capsule allows tenant owners to add and modify any label or annotation on their namespaces.

But there are some scenarios, when tenant owners should not have an ability to add or modify specific labels or annotations (for example, this can be labels used in Kubernetes network policies which are added by cluster administrator).

Bill, the cluster admin, can deny Alice to add specific labels and annotations on namespaces:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  namespaceOptions:
    forbiddenAnnotations:
      denied:
          - foo.acme.net
          - bar.acme.net
      deniedRegex: .*.acme.net 
    forbiddenLabels:
      denied:
          - foo.acme.net
          - bar.acme.net
      deniedRegex: .*.acme.net
  owners:
  - name: alice
    kind: User

Nodes

Due to CVE-2021-25735 this feature is only supported for Kubernetes version older than: v1.18.18, v1.19.10, v1.20.6, v1.21.0

When using capsule together with capsule-proxy, Bill can allow Tenant Owners to modify Nodes.

By default, it will allow tenant owners to add and modify any label or annotation on their nodes.

But there are some scenarios, when tenant owners should not have an ability to add or modify specific labels or annotations (there are some types of labels or annotations, which must be protected from modifications - for example, which are set by cloud-providers or autoscalers).

Bill, the cluster admin, can deny Tenant Owners to add or modify specific labels and annotations on Nodes:

apiVersion: capsule.clastix.io/v1beta2
kind: CapsuleConfiguration
metadata:
  name: default 
spec:
  nodeMetadata:
    forbiddenAnnotations:
      denied:
        - foo.acme.net
        - bar.acme.net
      deniedRegex: .*.acme.net
    forbiddenLabels:
      denied:
        - foo.acme.net
        - bar.acme.net
      deniedRegex: .*.acme.net
  userGroups:
    - projectcapsule.dev
    - system:serviceaccounts:default

Services

The cluster admin can “taint” the services created by the tenant owners with additional metadata as labels and annotations.

Assigns additional labels and annotations to all services created in the solar tenant:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: alice
    kind: User
  serviceOptions:
    additionalMetadata:
      annotations:
        storagelocationtype: s3
      labels:
        projectcapsule.dev/backup: "true"

When the tenant owner creates a service in a tenant namespace, it inherits the given label and/or annotation:

apiVersion: v1
kind: Service
metadata:
  name: nginx
  namespace: solar-production
  labels:
    projectcapsule.dev/backup: "true"
  annotations:
    storagelocationtype: s3
spec:
  ports:
  - protocol: TCP
    port: 80 
    targetPort: 8080 
  selector:
    run: nginx
  type: ClusterIP

Services

The cluster admin can “taint” the pods created by the tenant owners with additional metadata as labels and annotations.

Assigns additional labels and annotations to all services created in the solar tenant:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: alice
    kind: User
  podOptions:
    additionalMetadata:
      annotations:
        storagelocationtype: s3
      labels:
        projectcapsule.dev/backup: "true"

When the tenant owner creates a service in a tenant namespace, it inherits the given label and/or annotation:

apiVersion: v1
kind: Pod
metadata:
  name: nginx
  namespace: solar-production
  labels:
    projectcapsule.dev/backup: "true"
  annotations:
    storagelocationtype: s3
...

Scheduling

LimitRanges

This feature will be deprecated in a future release of Capsule. Instead use TenantReplications

Bill, the cluster admin, can also set Limit Ranges for each namespace in Alice’s tenant by defining limits for pods and containers in the tenant spec:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
...
  limitRanges:
    items:
      - limits:
          - type: Pod
            min:
              cpu: "50m"
              memory: "5Mi"
            max:
              cpu: "1"
              memory: "1Gi"
      - limits:
          - type: Container
            defaultRequest:
              cpu: "100m"
              memory: "10Mi"
            default:
              cpu: "200m"
              memory: "100Mi"
            min:
              cpu: "50m"
              memory: "5Mi"
            max:
              cpu: "1"
              memory: "1Gi"
      - limits:
          - type: PersistentVolumeClaim
            min:
              storage: "1Gi"
            max:
              storage: "10Gi"

Limits will be inherited by all the namespaces created by Alice. In our case, when Alice creates the namespace solar-production, Capsule creates the following:

apiVersion: v1
kind: LimitRange
metadata:
  name: capsule-solar-0
  namespace: solar-production
spec:
  limits:
    - max:
        cpu: "1"
        memory: 1Gi
      min:
        cpu: 50m
        memory: 5Mi
      type: Pod
---
apiVersion: v1
kind: LimitRange
metadata:
  name: capsule-solar-1
  namespace: solar-production
spec:
  limits:
    - default:
        cpu: 200m
        memory: 100Mi
      defaultRequest:
        cpu: 100m
        memory: 10Mi
      max:
        cpu: "1"
        memory: 1Gi
      min:
        cpu: 50m
        memory: 5Mi
      type: Container
---
apiVersion: v1
kind: LimitRange
metadata:
  name: capsule-solar-2
  namespace: solar-production
spec:
  limits:
    - max:
        storage: 10Gi
      min:
        storage: 1Gi
      type: PersistentVolumeClaim

Note: being the limit range specific of single resources, there is no aggregate to count.

Alice doesn’t have permission to change or delete the resources according to the assigned RBAC profile.

kubectl -n solar-production auth can-i patch resourcequota
no
kubectl -n solar-production auth can-i delete resourcequota
no
kubectl -n solar-production auth can-i patch limitranges
no
kubectl -n solar-production auth can-i delete limitranges
no

LimitRange Distribution with TenantReplications

In the future Cluster-Administrators must distribute LimitRanges via TenantReplications. This is a more flexible and powerful way to distribute LimitRanges, as it allows to distribute any kind of resource, not only LimitRanges. Here’s an example of how to distribute a LimitRange to all the namespaces of a tenant:

apiVersion: capsule.clastix.io/v1beta2
kind: TenantResource
metadata:
  name: solar-limitranges
  namespace: solar-system
spec:
  resyncPeriod: 60s
  resources:
    - namespaceSelector:
        matchLabels:
          capsule.clastix.io/tenant: solar
      rawItems:
        - apiVersion: v1
          kind: LimitRange
          metadata:
            name: cpu-resource-constraint
          spec:
            limits:
            - default: # this section defines default limits
                cpu: 500m
              defaultRequest: # this section defines default requests
                cpu: 500m
              max: # max and min define the limit range
                cpu: "1"
              min:
                cpu: 100m
              type: Container

PriorityClasses

Pods can have priority. Priority indicates the importance of a Pod relative to other Pods. If a Pod cannot be scheduled, the scheduler tries to preempt (evict) lower priority Pods to make scheduling of the pending Pod possible. See Kubernetes documentation.

In a multi-tenant cluster, not all users can be trusted, as a tenant owner could create Pods at the highest possible priorities, causing other Pods to be evicted/not get scheduled.

To prevent misuses of Pod Priority Class, Bill, the cluster admin, can enforce the allowed Pod Priority Class at tenant level:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: oil
spec:
  owners:
  - name: alice
    kind: User
  priorityClasses:
    matchLabels:
      env: "production"

With the said Tenant specification, Alice can create a Pod resource if spec.priorityClassName equals to:

  • Any PriorityClass which has the label env with the value production

If a Pod is going to use a non-allowed Priority Class, it will be rejected by the Validation Webhook enforcing it.

Assign Pod Priority Class as tenant default

Note: This feature supports type PriorityClass only on API version scheduling.k8s.io/v1

This feature allows specifying a custom default value on a Tenant basis, bypassing the global cluster default (globalDefault=true) that acts only at the cluster level.

It’s possible to assign each tenant a PriorityClass which will be used, if no PriorityClass is set on pod basis:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: alice
    kind: User
  priorityClasses:
    default: "tenant-default"
    matchLabels:
      env: "production"

Let’s create a PriorityClass which is used as the default:

kubectl apply -f - << EOF
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: tenant-default
  labels:
    env: "production"
value: 1313
preemptionPolicy: Never
globalDefault: false
description: "This is the default PriorityClass for the solar-tenant"
EOF

Note the globalDefault: false which is important to avoid the PriorityClass to be used as the default for all the tenants. If a Pod has no value for spec.priorityClassName, the default value for PriorityClass (tenant-default) will be used.

RuntimeClasses

Pods can be assigned different runtime classes. With the assigned runtime you can control Container Runtime Interface (CRI) is used for each pod. See Kubernetes documentation for more information.

To prevent misuses of Pod Runtime Classes, Bill, the cluster admin, can enforce the allowed Pod Runtime Class at tenant level:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: alice
    kind: User
  runtimeClasses:
    matchLabels:
      env: "production"

With the said Tenant specification, Alice can create a Pod resource if spec.runtimeClassName equals to:

  • any RuntimeClass which has the labelenv with the value production

If a Pod is going to use a non-allowed Runtime Class, it will be rejected by the Validation Webhook enforcing it.

NodeSelector

Bill, the cluster admin, can dedicate a pool of worker nodes to the oil tenant, to isolate the tenant applications from other noisy neighbors.

These nodes are labeled by Bill as pool=renewable

kubectl get nodes --show-labels

NAME                      STATUS   ROLES             AGE   VERSION   LABELS
...
worker06.acme.com         Ready    worker            8d    v1.25.2 pool=renewable
worker07.acme.com         Ready    worker            8d    v1.25.2   pool=renewable
worker08.acme.com         Ready    worker            8d    v1.25.2   pool=renewable

PodNodeSelector

This approach requires PodNodeSelector Admission Controller plugin to be active. If the plugin is not active, the pods will be scheduled to any node. If your distribution does not support this feature, you can use Expression Node Selectors.

The label pool=renewable is defined as .spec.nodeSelector in the tenant manifest:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: alice
    kind: User
  nodeSelector:
    pool: renewable
    kubernetes.io/os: linux

The Capsule controller makes sure that any namespace created in the tenant has the annotation: scheduler.alpha.kubernetes.io/node-selector: pool=renewable. This annotation tells the scheduler of Kubernetes to assign the node selector pool=renewable to all the pods deployed in the tenant. The effect is that all the pods deployed by Alice are placed only on the designated pool of nodes.

Multiple node selector labels can be defined as in the following snippet:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: alice
    kind: User
  nodeSelector:
    pool: renewable
    kubernetes.io/os: linux
    kubernetes.io/arch: amd64
    hardware: gpu

Any attempt of Alice to change the selector on the pods will result in an error from the PodNodeSelector Admission Controller plugin.

kubectl auth can-i edit ns -n solar-production
no

Node Selector Expressions

Feature TBD

Connectivity

Services

Deny Service Types

Bill, the cluster admin, can prevent the creation of services with specific service types.

NodePort

When dealing with a shared multi-tenant scenario, multiple NodePort services can start becoming cumbersome to manage. The reason behind this could be related to the overlapping needs by the Tenant owners, since a NodePort is going to be open on all nodes and, when using hostNetwork=true, accessible to any Pod although any specific NetworkPolicy.

Bill, the cluster admin, can block the creation of services with NodePort service type for a given tenant

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: alice
    kind: User
  serviceOptions:
    allowedServices:
      nodePort: false

With the above configuration, any attempt of Alice to create a Service of type NodePort is denied by the Validation Webhook enforcing it. Default value is true.

ExternalName

Service with the type of ExternalName has been found subject to many security issues. To prevent tenant owners to create services with the type of ExternalName, the cluster admin can prevent a tenant to create them:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: alice
    kind: User
  serviceOptions:
    allowedServices:
      externalName: false

With the above configuration, any attempt of Alice to create a Service of type externalName is denied by the Validation Webhook enforcing it. Default value is true.

LoadBalancer

Same as previously, the Service of type of LoadBalancer could be blocked for various reasons. To prevent tenant owners to create these kinds of services, the cluster admin can prevent a tenant to create them:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: alice
    kind: User
  serviceOptions:
    allowedServices:
      loadBalancer: false

With the above configuration, any attempt of Alice to create a Service of type LoadBalancer is denied by the Validation Webhook enforcing it. Default value is true.

GatewayClasses

Note: This feature is offered only by API type GatewayClass in group gateway.networking.k8s.io version v1.

GatewayClass is cluster-scoped resource defined by the infrastructure provider. This resource represents a class of Gateways that can be instantiated. Read More

Bill can assign a set of dedicated GatewayClasses to the solar tenant to force the applications in the solar tenant to be published only by the assigned Gateway Controller:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: alice
    kind: User
  gatewayOptions:
    allowedClasses:
      matchLabels:
        env: "production"

With the said Tenant specification, Alice can create a Gateway resource if spec.gatewayClassName equals to:

  • Any GatewayClass which has the label env with the value production

If an Gateway is going to use a non-allowed GatewayClass, it will be rejected by the Validation Webhook enforcing it.

Alice can create an Gateway using only an allowed GatewayClass:

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: example-gateway
  namespace: solar-production
spec:
  gatewayClassName: customer-class
  listeners:
  - name: http
    protocol: HTTP
    port: 80

Any attempt of Alice to use a non-valid GatewayClass, or missing it, is denied by the Validation Webhook enforcing it.

Assign GatewayClass as tenant default

Note: The Default GatewayClass must have a label which is allowed within the tenant. This behavior is only implemented this way for the GatewayClass default.

This feature allows specifying a custom default value on a Tenant basis. Currently there is no global default feature for a GatewayClass. Each Gateway must have a spec.gatewayClassName set.

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: alice
    kind: User
  gatewayOptions:
    allowedClasses:
      default: "tenant-default"
      matchLabels:
        env: "production"

Here’s how the Tenant default GatewayClass could look like:

kubectl apply -f - << EOF
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: tenant-default
  labels:
    env: "production"
spec:
  controllerName: example.com/gateway-controller
EOF

If a Gateway has no value for spec.gatewayClassName, the tenant-default GatewayClass is automatically applied to the Gateway resource.

Ingresses

Assign Ingress Hostnames

Bill can control ingress hostnames in the solar tenant to force the applications to be published only using the given hostname or set of hostnames:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: alice
    kind: User
  ingressOptions:
    allowedHostnames:
      allowed:
        - solar.acmecorp.com
      allowedRegex: ^.*acmecorp.com$

The Capsule controller assures that all Ingresses created in the tenant can use only one of the valid hostnames. Alice can create an Ingress using any allowed hostname:

kubectl apply -f - << EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nginx
  namespace: solar-production
spec:
  ingressClassName: solar
  rules:
  - host: web.solar.acmecorp.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: nginx
            port:
              number: 80
EOF

Any attempt of Alice to use a non-valid hostname is denied by the Validation Webhook enforcing it.

Control Hostname collision in Ingresses

In a multi-tenant environment, as more and more ingresses are defined, there is a chance of collision on the hostname leading to unpredictable behavior of the Ingress Controller. Bill, the cluster admin, can enforce hostname collision detection at different scope levels:

  • Cluster
  • Tenant
  • Namespace
  • Disabled (default)
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: alice
    kind: User
  - name: joe
    kind: User
  ingressOptions:
    hostnameCollisionScope: Tenant

When a tenant owner creates an Ingress resource, Capsule will check the collision of hostname in the current ingress with all the hostnames already used, depending on the defined scope.

For example, Alice, one of the tenant owners, creates an Ingress:

kubectl apply -f - << EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nginx
  namespace: solar-production
spec:
  rules:
  - host: web.solar.acmecorp.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: nginx
            port:
              number: 80
EOF

Another user, Joe creates an Ingress having the same hostname:

kubectl apply -f - << EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nginx
  namespace: solar-development
spec:
  rules:
  - host: web.solar.acmecorp.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: nginx
            port:
              number: 80
EOF

When a collision is detected at scope defined by spec.ingressOptions.hostnameCollisionScope, the creation of the Ingress resource will be rejected by the Validation Webhook enforcing it. When spec.ingressOptions.hostnameCollisionScope=Disabled (default), no collision detection is made at all.

Deny Wildcard Hostname in Ingresses

Bill, the cluster admin, can deny the use of wildcard hostname in Ingresses. Let’s assume that Acme Corp. uses the domain acme.com.

As a tenant owner of solar, Alice creates an Ingress with the host like - host: "*.acme.com". That can lead problems for the water tenant because Alice can deliberately create ingress with host: water.acme.com.

To avoid this kind of problems, Bill can deny the use of wildcard hostnames in the following way:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
    - name: alice
      kind: User
  ingressOptions:
    allowWildcardHostnames: false

Doing this, Alice will not be able to use *.water.acme.com, being the tenant owner of solar and green only.

IngressClasses

An Ingress Controller is used in Kubernetes to publish services and applications outside of the cluster. An Ingress Controller can be provisioned to accept only Ingresses with a given Ingress Class.

Bill can assign a set of dedicated Ingress Classes to the solar tenant to force the applications in the solar tenant to be published only by the assigned Ingress Controller:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: alice
    kind: User
  ingressOptions:
    allowedClasses:
      matchLabels:
        env: "production"

With the said Tenant specification, Alice can create a Ingress resource if spec.ingressClassName or metadata.annotations."kubernetes.io/ingress.class" equals to:

  • Any IngressClass which has the label env with the value production

If an Ingress is going to use a non-allowed IngressClass, it will be rejected by the Validation Webhook enforcing it.

Alice can create an Ingress using only an allowed Ingress Class:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nginx
  namespace: solar-production
spec:
  ingressClassName: legacy
  rules:
  - host: oil.acmecorp.com
    http:
      paths:
      - backend:
          service:
            name: nginx
            port:
              number: 80
        path: /
        pathType: ImplementationSpecific

Any attempt of Alice to use a non-valid Ingress Class, or missing it, is denied by the Validation Webhook enforcing it.

Assign Ingress Class as tenant default

Note: This feature is offered only by API type IngressClass in group networking.k8s.io version v1. However, resource Ingress is supported in networking.k8s.io/v1 and networking.k8s.io/v1beta1

This feature allows specifying a custom default value on a Tenant basis, bypassing the global cluster default (with the annotation metadata.annotations.ingressclass.kubernetes.io/is-default-class=true) that acts only at the cluster level. More information: Default IngressClass

It’s possible to assign each tenant an Ingress Class which will be used, if a class is not set on ingress basis:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: alice
    kind: User
  ingressOptions:
    allowedClasses:
      default: "tenant-default"
      matchLabels:
        env: "production"

Here’s how the Tenant default IngressClass could look like:

kubectl apply -f - << EOF
apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  labels:
    env: "production"
    app.kubernetes.io/component: controller
  name: tenant-default
  annotations:
    ingressclass.kubernetes.io/is-default-class: "false"
spec:
  controller: k8s.io/customer-nginx
EOF

If an Ingress has no value for spec.ingressClassName or metadata.annotations."kubernetes.io/ingress.class", the tenant-default IngressClass is automatically applied to the Ingress resource.

NetworkPolicies

This feature will be deprecated in a future release of Capsule. Instead use TenantReplications. This is also true if you would like other NetworkPolicy implementation like Cilium.

Kubernetes network policies control network traffic between namespaces and between pods in the same namespace. Bill, the cluster admin, can enforce network traffic isolation between different tenants while leaving to Alice, the tenant owner, the freedom to set isolation between namespaces in the same tenant or even between pods in the same namespace.

To meet this requirement, Bill needs to define network policies that deny pods belonging to Alice’s namespaces to access pods in namespaces belonging to other tenants, e.g. Bob’s tenant water, or in system namespaces, e.g. kube-system.

Keep in mind, that because of how the NetworkPolicies API works, the users can still add a policy which contradicts what the Tenant has set, resulting in users being able to circumvent the initial limitation set by the tenant admin. Two options can be put in place to mitigate this potential privilege escalation: 1. providing a restricted role rather than the default admin one 2. using Calico’s GlobalNetworkPolicy, or Cilium’s CiliumClusterwideNetworkPolicy which are defined at the cluster-level, thus creating an order of packet filtering.

Also, Bill can make sure pods belonging to a tenant namespace cannot access other network infrastructures like cluster nodes, load balancers, and virtual machines running other services.

Bill can set network policies in the tenant manifest, according to the requirements:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: alice
    kind: User
  networkPolicies:
    items:
    - policyTypes:
      - Ingress
      - Egress
      egress:
      - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 192.168.0.0/16 
      ingress:
      - from:
        - namespaceSelector:
            matchLabels:
              capsule.clastix.io/tenant: oil
        - podSelector: {}
        - ipBlock:
            cidr: 192.168.0.0/16
      podSelector: {}

The Capsule controller, watching for namespace creation, creates the Network Policies for each namespace in the tenant.

Alice has access to network policies:

kubectl -n solar-production get networkpolicies
NAME              POD-SELECTOR   AGE
capsule-solar-0   <none>         42h

Alice can create, patch, and delete additional network policies within her namespaces:

kubectl -n solar-production auth can-i get networkpolicies
yes

kubectl -n solar-production auth can-i delete networkpolicies
yes

kubectl -n solar-production auth can-i patch networkpolicies
yes

For example, she can create:

kubectl apply -f - << EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  labels:
  name: production-network-policy
  namespace: solar-production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
EOF

Check all the network policies

kubectl -n solar-production get networkpolicies
NAME                          POD-SELECTOR   AGE
capsule-solar-0               <none>         42h
production-network-policy     <none>         3m

And delete the namespace network policies:

kubectl -n solar-production delete networkpolicy production-network-policy

Any attempt of Alice to delete the tenant network policy defined in the tenant manifest is denied by the Validation Webhook enforcing it. Any deletion by a cluster-administrator will cause the network policy to be recreated by the Capsule controller.

NetworkPolicy Distribution with TenantReplications

In the future Cluster-Administrators must distribute NetworkPolicies via TenantReplications. This is a more flexible and powerful way to distribute NetworkPolicies, as it allows to distribute any kind of resource. Here’s an example of how to distribute a CiliumNetworkPolicy to all the namespaces of a tenant:

apiVersion: capsule.clastix.io/v1beta2
kind: TenantResource
metadata:
  name: solar-limitranges
  namespace: solar-system
spec:
  resyncPeriod: 60s
  resources:
    - namespaceSelector:
        matchLabels:
          capsule.clastix.io/tenant: solar
      rawItems:
        - apiVersion: "cilium.io/v2"
          kind: CiliumNetworkPolicy
          metadata:
            name: "l3-rule"
          spec:
            endpointSelector:
              matchLabels:
                role: backend
            ingress:
            - fromEndpoints:
              - matchLabels:
                  role: frontend

Storage

PersistentVolumes

Any Tenant owner is able to create a PersistentVolumeClaim that, backed by a given StorageClass, will provide volumes for their applications.

In most cases, once a PersistentVolumeClaim is deleted, the bounded PersistentVolume will be recycled due.

However, in some scenarios, the StorageClass or the provisioned PersistentVolume itself could change the retention policy of the volume, keeping it available for recycling and being consumable for another Pod.

In such a scenario, Capsule enforces the Volume mount only to the Namespaces belonging to the Tenant on which it’s been consumed, by adding a label to the Volume as follows.

apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/provisioned-by: rancher.io/local-path
  creationTimestamp: "2022-12-22T09:54:46Z"
  finalizers:
  - kubernetes.io/pv-protection
  labels:
    capsule.clastix.io/tenant: solar
  name: pvc-1b3aa814-3b0c-4912-9bd9-112820da38fe
  resourceVersion: "2743059"
  uid: 9836ae3e-4adb-41d2-a416-0c45c2da41ff
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 10Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: melange
    namespace: caladan
    resourceVersion: "2743014"
    uid: 1b3aa814-3b0c-4912-9bd9-112820da38fe

Once the PeristentVolume become available again, it can be referenced by any PersistentVolumeClaim in the solar Tenant Namespace resources.

If another Tenant, like green, tries to use it, it will get an error:

$ kubectl describe pv pvc-9788f5e4-1114-419b-a830-74e7f9a33f5d
Name:              pvc-9788f5e4-1114-419b-a830-74e7f9a33f5d
Labels:            capsule.clastix.io/tenant=solar
Annotations:       pv.kubernetes.io/provisioned-by: rancher.io/local-path
Finalizers:        [kubernetes.io/pv-protection]
StorageClass:      standard
Status:            Available
...

$ cat /tmp/pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: melange
  namespace:  green-energy
spec:
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
  volumeName: pvc-9788f5e4-1114-419b-a830-74e7f9a33f5d

$ kubectl apply -f /tmp/pvc.yaml
Error from server: error when creating "/tmp/pvc.yaml": admission webhook "pvc.capsule.clastix.io" denied the request: PeristentVolume pvc-9788f5e4-1114-419b-a830-74e7f9a33f5d cannot be used by the following Tenant, preventing a cross-tenant mount

StorageClasses

Persistent storage infrastructure is provided to tenants. Different types of storage requirements, with different levels of QoS, eg. SSD versus HDD, are available for different tenants according to the tenant’s profile. To meet these different requirements, Bill, the cluster admin can provision different Storage Classes and assign them to the tenant:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: alice
    kind: User
  storageClasses:
    matchLabels:
      env: "production"

With the said Tenant specification, Alice can create a Persistent Volume Claims if spec.storageClassName equals to:

  • Any StorageClass which has the label env with the value production

Capsule assures that all Persistent Volume Claims created by Alice will use only one of the valid storage classes. Assume the StorageClass ceph-rbd has the label env: production:

kubectl apply -f - << EOF
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc
  namespace: solar-production
spec:
  storageClassName: ceph-rbd
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 12Gi
EOF

If a Persistent Volume Claim is going to use a non-allowed Storage Class, it will be rejected by the Validation Webhook enforcing it.

Assign Storage Class as tenant default

Note: This feature supports type StorageClass only on API version storage.k8s.io/v1

This feature allows specifying a custom default value on a Tenant basis, bypassing the global cluster default (.metadata.annotations.storageclass.kubernetes.io/is-default-class=true) that acts only at the cluster level. See the Default Storage Class section on Kubernetes documentation.

It’s possible to assign each tenant a StorageClass which will be used, if no value is set on Persistent Volume Claim basis:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: oil
spec:
  owners:
  - name: alice
    kind: User
  storageClasses:
    default: "tenant-default"
    matchLabels:
      env: "production"

Here’s how the new Storage Class could look like:

kubectl apply -f - << EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: tenant-default
  labels:
    env: production
  annotations:
    storageclass.kubernetes.io/is-default-class: "false"
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
EOF

If a Persistent Volume Claim has no value for spec.storageClassName the tenant-default value will be used on new Persistent Volume Claim resources.

Images

PullPolicy

Bill is a cluster admin providing a Container as a Service platform using shared nodes.

Alice, a Tenant Owner, can start container images using private images: according to the Kubernetes architecture, the kubelet will download the layers on its cache.

Bob, an attacker, could try to schedule a Pod on the same node where Alice is running her Pods backed by private images: they could start new Pods using ImagePullPolicy=IfNotPresent and be able to start them, even without required authentication since the image is cached on the node.

To avoid this kind of attack, Bill, the cluster admin, can force Alice, the tenant owner, to start her Pods using only the allowed values for ImagePullPolicy, enforcing the kubelet to check the authorization first.

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: alice
    kind: User
  imagePullPolicies:
  - Always

Allowed values are: Always, IfNotPresent, Never. As defined by the Kubernetes API

Any attempt of Alice to use a disallowed imagePullPolicies value is denied by the Validation Webhook enforcing it.

Images Registries

Bill, the cluster admin, can set a strict policy on the applications running into Alice’s tenant: he’d like to allow running just images hosted on a list of specific container registries.

The spec.containerRegistries addresses this task and can provide a combination with hard enforcement using a list of allowed values.

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: alice
    kind: User
  containerRegistries:
    allowed:
    - docker.io
    - quay.io
    allowedRegex: 'internal.registry.\\w.tld'

In case of Pod running non-FQCI (non fully qualified container image) containers, the container registry enforcement will disallow the execution. If you would like to run a bbusybox:latest container that is commonly hosted on Docker Hub, the Tenant Owner has to specify its name explicitly, like docker.io/library/busybox:latest.

A Pod running internal.registry.foo.tld/capsule:latest as registry will be allowed, as well internal.registry.bar.tld since these are matching the regular expression.

A catch-all regex entry as .* allows every kind of registry, which would be the same result of unsetting .spec.containerRegistries at all.

Any attempt of Alice to use a not allowed .spec.containerRegistries value is denied by the Validation Webhook enforcing it.

Administration

Cordoning

Bill needs to cordon a Tenant and its Namespaces for several reasons:

  • Avoid accidental resource modification(s) including deletion during a Production Freeze Window
  • During the Kubernetes upgrade, to prevent any workload updates
  • During incidents or outages
  • During planned maintenance of a dedicated nodes pool in a BYOD scenario

With this said, the Tenant Owner and the related Service Account living into managed Namespaces, cannot proceed to any update, create or delete action.

This is possible by just toggling the specific Tenant specification:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  cordoned: true
  owners:
  - kind: User
    name: alice

Any operation performed by Alice, the Tenant Owner, will be rejected by the Admission controller.

Uncordoning can be done by removing the said specification key:

$ cat <<EOF | kubectl apply -f -
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  cordoned: false
  owners:
  - kind: User
    name: alice
EOF

$ kubectl --as alice --as-group projectcapsule.dev -n solar-dev create deployment nginx --image nginx
deployment.apps/nginx created

Status of cordoning is also reported in the state of the tenant:

kubectl get tenants
NAME     STATE    NAMESPACE QUOTA   NAMESPACE COUNT   NODE SELECTOR    AGE
bronze   Active                     2                                  3d13h
gold     Active                     2                                  3d13h
solar    Cordoned                   4                                  2d11h
silver   Active                     2                                  3d13h

Deletion Protection

Sometimes it is important to protect business critical tenants from accidental deletion. This can be achieved by toggling preventDeletion specification key on the tenant:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: alice
    kind: User
  preventDeletion: true

3.5 - Replications

Replicate resources across tenants or namespaces

When developing an Internal Developer Platform the Platform Administrator could want to propagate a set of resources. These could be Secret, ConfigMap, or other kinds of resources that the tenants would require to use the platform. We provide dedicated Custom Resource Definitions to achieve this goal. Either on tenant basis or tenant-wide.

GlobalTenantResource

When developing an Internal Developer Platform the Platform Administrator could want to propagate a set of resources. These could be Secret, ConfigMap, or other kinds of resources that the tenants would require to use the platform.

A generic example could be the container registry secrets, especially in the context where the Tenants can just use a specific registry.

Starting from Capsule v0.2.0, a new set of Custom Resource Definitions have been introduced, such as the GlobalTenantResource, let’s start with a potential use-case using the personas described at the beginning of this document.

Bill created the Tenants for Alice using the Tenant CRD, and labels these resources using the following command:

$ kubectl label tnt/solar energy=renewable
tenant solar labeled

$ kubectl label tnt/green energy=renewable
tenant green labeled

In the said scenario, these Tenants must use container images from a trusted registry, and that would require the usage of specific credentials for the image pull.

The said container registry is deployed in the cluster in the namespace harbor-system, and this Namespace contains all image pull secret for each Tenant, e.g.: a secret named harbor-system/fossil-pull-secret as follows.

$ kubectl -n harbor-system get secret --show-labels
NAME                    TYPE     DATA   AGE   LABELS
renewable-pull-secret   Opaque   1      28s   tenant=renewable

These credentials would be distributed to the Tenant owners manually, or vice-versa, the owners would require those. Such a scenario would be against the concept of the self-service solution offered by Capsule, and Bill can solve this by creating the GlobalTenantResource as follows.

apiVersion: capsule.clastix.io/v1beta2
kind: GlobalTenantResource
metadata:
  name: renewable-pull-secrets
spec:
  tenantSelector:
    matchLabels:
      energy: renewable
  resyncPeriod: 60s
  resources:
    - namespacedItems:
        - apiVersion: v1
          kind: Secret
          namespace: harbor-system
          selector:
            matchLabels:
              tenant: renewable

The GlobalTenantResource is a cluster-scoped resource, thus it has been designed for cluster administrators and cannot be used by Tenant owners: for that purpose, the TenantResource one can help.

Capsule will select all the Tenant resources according to the key tenantSelector. Each object defined in the namespacedItems and matching the provided selector will be replicated into each Namespace bounded to the selected Tenants. Capsule will check every 60 seconds if the resources are replicated and in sync, as defined in the key resyncPeriod.

TenantResource

Although Capsule is supporting a few amounts of personas, it can be used to allow building an Internal Developer Platform used barely by Tenant owners, or users created by these thanks to Service Account.

In a such scenario, a Tenant Owner would like to distribute resources across all the Namespace of their Tenant, without the need to establish a manual procedure, or the need for writing a custom automation.

The Namespaced-scope API TenantResource allows to replicate resources across the Tenant’s Namespace.

The Tenant owners must have proper RBAC configured in order to create, get, update, and delete their TenantResource CRD instances. This can be achieved using the Tenant key additionalRoleBindings or a custom Tenant owner role, compared to the default one (admin). You can for example create this clusterrole, which will aggregate to the admin role, to allow the Tenant Owner to create TenantResource objects. This allows all users with the rolebinding to admin to create TenantResource objects.

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: allow-tenant-resources
  labels: 
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
rules:
- apiGroups: ["capsule.clastix.io"]
  resources: ["tenantresources"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

For our example, Alice, the project lead for the solar tenant, wants to provision automatically a DataBase resource for each Namespace of their Tenant: these are the Namespace list.

$ kubectl get namespaces -l capsule.clastix.io/tenant=solar --show-labels
NAME           STATUS   AGE   LABELS
solar-1        Active   59s   capsule.clastix.io/tenant=solar,environment=production,kubernetes.io/metadata.name=solar-1,name=solar-1
solar-2        Active   58s   capsule.clastix.io/tenant=solar,environment=production,kubernetes.io/metadata.name=solar-2,name=solar-2
solar-system   Active   62s   capsule.clastix.io/tenant=solar,kubernetes.io/metadata.name=solar-system,name=solar-system

Alice creates a TenantResource in the Tenant namespace solar-system as follows.

apiVersion: capsule.clastix.io/v1beta2
kind: TenantResource
metadata:
  name: solar-db
  namespace: solar-system
spec:
  resyncPeriod: 60s
  resources:
    - additionalMetadata:
        "replicated-by": "capsule" 
      namespaceSelector:
        matchLabels:
          environment: production
      rawItems:
        - apiVersion: postgresql.cnpg.io/v1
          kind: Cluster
          metadata:
            name: postgresql
          spec:
            description: PostgreSQL cluster for the Solar project
            instances: 3
            postgresql:
              pg_hba:
                - hostssl app all all cert
            primaryUpdateStrategy: unsupervised
            storage:
              size: 1Gi

The expected result will be the object Cluster for the API version postgresql.cnpg.io/v1 to get created in all the Solar tenant namespaces matching the label selector declared by the key namespaceSelector.

$ kubectl get clusters.postgresql.cnpg.io -A
NAMESPACE   NAME         AGE   INSTANCES   READY   STATUS                     PRIMARY
solar-1     postgresql   80s   3           3       Cluster in healthy state   postgresql-1
solar-2     postgresql   80s   3           3       Cluster in healthy state   postgresql-1

The TenantResource object has been created in the namespace solar-system that doesn’t satisfy the Namespace selector. Furthermore, Capsule will automatically inject the required labels to avoid a TenantResource could start polluting other Namespaces.

Eventually, using the key namespacedItem, it is possible to reference existing objects to get propagated across the other Tenant namespaces: in this case, a Tenant Owner can just refer to objects in their Namespaces, preventing a possible escalation referring to non owned objects.

4 - Guides

Guides for using the Capsule and Capsule Proxy

Capsule does not care about the authentication strategy used in the cluster and all the Kubernetes methods of authentication are supported. The only requirement to use Capsule is to assign tenant users to the group defined by userGroups option in the CapsuleConfiguration, which defaults to capsule.clastix.io.

In the following guide, we’ll use Keycloak an Open Source Identity and Access Management server capable to authenticate users via OIDC and release JWT tokens as proof of authentication.

4.1 - Namespace Migration Across Tenants

A Step-by-Step Guide to Namespace Migration

Capsule relays on two components to associate given namespace with tenant.

  • Namespace’s OwnerReference.name pointing to the Tenant defintion
  • Namespace’s OwnerReference.uid pointing to the Tenant defintion

If a cluster administrator changes the Namespace by matching the other Tenant with the proper UID and name, the Namespace can be easily transferred.

kubectl get tenants
NAME    STATE    NAMESPACE QUOTA   NAMESPACE COUNT   NODE SELECTOR   AGE
solar   Active                     1                                 46s
wind    Active                     1                                 39s

Get tenant’s metadata.uid.

kubectl get tnt wind -o jsonpath='{.metadata.uid}'
0df8e9ee-5f6f-40a4-897d-b80d349ca36f%

While altering ownerReferences name is sufficient on its own, it’s highly recommended to edit the UID to match the output of the previous commands.

kubectl edit ns ns-foo 

If everything is set correctly, the namespace will be correctly recognized as part of the new tenant.

kubectl get tenants
NAME    STATE    NAMESPACE QUOTA   NAMESPACE COUNT   NODE SELECTOR   AGE
solar   Active                     0                                 2m22s
wind    Active                     2                                 2m15s

5 - API Reference

API Reference

Packages:

capsule.clastix.io/v1beta1

Resource Types:

ProxySetting

ProxySetting is the Schema for the proxysettings API.

NameTypeDescriptionRequired
apiVersionstringcapsule.clastix.io/v1beta1true
kindstringProxySettingtrue
metadataobjectRefer to the Kubernetes API documentation for the fields of the metadata field.true
specobjectProxySettingSpec defines the additional Capsule Proxy settings for additional users of the Tenant. Resource is Namespace-scoped and applies the settings to the belonged Tenant.false

ProxySetting.spec

ProxySettingSpec defines the additional Capsule Proxy settings for additional users of the Tenant. Resource is Namespace-scoped and applies the settings to the belonged Tenant.

NameTypeDescriptionRequired
subjects[]objectSubjects that should receive additional permissions.true

ProxySetting.spec.subjects[index]

NameTypeDescriptionRequired
kindenumKind of tenant owner. Possible values are “User”, “Group”, and “ServiceAccount”
Enum: User, Group, ServiceAccount
true
namestringName of tenant owner.true
clusterResources[]objectCluster Resources for tenant Owner.false
proxySettings[]objectProxy settings for tenant owner.false

ProxySetting.spec.subjects[index].clusterResources[index]

NameTypeDescriptionRequired
apiGroups[]stringAPIGroups is the name of the APIGroup that contains the resources. If multiple API groups are specified, any action requested against any resource listed will be allowed. ‘*’ represents all resources. Empty string represents v1 api resources.true
operations[]enumOperations which can be executed on the selected resources.
Default: [List]
true
resources[]stringResources is a list of resources this rule applies to. ‘*’ represents all resources.true
selectorobjectSelect all cluster scoped resources with the given label selector.true

ProxySetting.spec.subjects[index].clusterResources[index].selector

Select all cluster scoped resources with the given label selector.

NameTypeDescriptionRequired
matchExpressions[]objectmatchExpressions is a list of label selector requirements. The requirements are ANDed.false
matchLabelsmap[string]stringmatchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is “key”, the operator is “In”, and the values array contains only “value”. The requirements are ANDed.false

ProxySetting.spec.subjects[index].clusterResources[index].selector.matchExpressions[index]

A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.

NameTypeDescriptionRequired
keystringkey is the label key that the selector applies to.true
operatorstringoperator represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.true
values[]stringvalues is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.false

ProxySetting.spec.subjects[index].proxySettings[index]

NameTypeDescriptionRequired
kindenum
Enum: Nodes, StorageClasses, IngressClasses, PriorityClasses, RuntimeClasses, PersistentVolumes
true
operations[]enumtrue

6 - API Reference

API Reference

Packages:

capsule.clastix.io/v1beta2

Resource Types:

CapsuleConfiguration

CapsuleConfiguration is the Schema for the Capsule configuration API.

NameTypeDescriptionRequired
apiVersionstringcapsule.clastix.io/v1beta2true
kindstringCapsuleConfigurationtrue
metadataobjectRefer to the Kubernetes API documentation for the fields of the metadata field.true
specobjectCapsuleConfigurationSpec defines the Capsule configuration.false

CapsuleConfiguration.spec

CapsuleConfigurationSpec defines the Capsule configuration.

NameTypeDescriptionRequired
enableTLSReconcilerbooleanToggles the TLS reconciler, the controller that is able to generate CA and certificates for the webhooks
when not using an already provided CA and certificate, or when these are managed externally with Vault, or cert-manager.
Default: true
true
forceTenantPrefixbooleanEnforces the Tenant owner, during Namespace creation, to name it using the selected Tenant name as prefix,
separated by a dash. This is useful to avoid Namespace name collision in a public CaaS environment.
Default: false
false
nodeMetadataobjectAllows to set the forbidden metadata for the worker nodes that could be patched by a Tenant.
This applies only if the Tenant has an active NodeSelector, and the Owner have right to patch their nodes.false
overridesobjectAllows to set different name rather than the canonical one for the Capsule configuration objects,
such as webhook secret or configurations.
Default: map[TLSSecretName:capsule-tls mutatingWebhookConfigurationName:capsule-mutating-webhook-configuration validatingWebhookConfigurationName:capsule-validating-webhook-configuration]
false
protectedNamespaceRegexstringDisallow creation of namespaces, whose name matches this regexpfalse
userGroups[]stringNames of the groups for Capsule users.
Default: [capsule.clastix.io]
false

CapsuleConfiguration.spec.nodeMetadata

Allows to set the forbidden metadata for the worker nodes that could be patched by a Tenant. This applies only if the Tenant has an active NodeSelector, and the Owner have right to patch their nodes.

NameTypeDescriptionRequired
forbiddenAnnotationsobjectDefine the annotations that a Tenant Owner cannot set for their nodes.true
forbiddenLabelsobjectDefine the labels that a Tenant Owner cannot set for their nodes.true

CapsuleConfiguration.spec.nodeMetadata.forbiddenAnnotations

Define the annotations that a Tenant Owner cannot set for their nodes.

NameTypeDescriptionRequired
denied[]stringfalse
deniedRegexstringfalse

CapsuleConfiguration.spec.nodeMetadata.forbiddenLabels

Define the labels that a Tenant Owner cannot set for their nodes.

NameTypeDescriptionRequired
denied[]stringfalse
deniedRegexstringfalse

CapsuleConfiguration.spec.overrides

Allows to set different name rather than the canonical one for the Capsule configuration objects, such as webhook secret or configurations.

NameTypeDescriptionRequired
TLSSecretNamestringDefines the Secret name used for the webhook server.
Must be in the same Namespace where the Capsule Deployment is deployed.
Default: capsule-tls
true
mutatingWebhookConfigurationNamestringName of the MutatingWebhookConfiguration which contains the dynamic admission controller paths and resources.
Default: capsule-mutating-webhook-configuration
true
validatingWebhookConfigurationNamestringName of the ValidatingWebhookConfiguration which contains the dynamic admission controller paths and resources.
Default: capsule-validating-webhook-configuration
true