This is the multi-page printable view of this section. Click here to print.

Return to the regular view of this page.

Operating

Get started with using tenancy on Kubernetes

1: Installation

1.1: Quickstart
1.2: Upgrading
1.3: Managed Kubernetes
1.4: Controller Options

2: Authentication
3: Monitoring
4: Backup & Restore
5: Pod Security
6: Troubleshooting

1 - Installation

Installing Capsule

Make sure you have access to a Kubernetes cluster as administrator. See the Artifacthub Page for a complete list of available versions and installation instructions.

https://artifacthub.io/packages/helm/capsule/capsule

1.1 - Quickstart

In Capsule, a Tenant is an abstraction to group multiple namespaces in a single entity within a set of boundaries defined by the Cluster Administrator. The tenant is then assigned to a user or group of users who is called Tenant Owner. Capsule defines a Tenant as Custom Resource with cluster scope. Create the tenant as cluster admin:

kubectl create -f - << EOF
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: oil
spec:
  owners:
  - name: alice
    kind: User
EOF

You can check the tenant just created

$ kubectl get tenants
NAME   STATE    NAMESPACE QUOTA   NAMESPACE COUNT   NODE SELECTOR   AGE
solar    Active                     0                                 10s

Each tenant comes with a delegated user or group of users acting as the tenant admin. In the Capsule jargon, this is called the Tenant Owner. Other users can operate inside a tenant with different levels of permissions and authorizations assigned directly by the Tenant Owner.

Capsule does not care about the authentication strategy used in the cluster and all the Kubernetes methods of authentication are supported. The only requirement to use Capsule is to assign tenant users to the group defined by –capsule-user-group option, which defaults to capsule.clastix.io.

Assignment to a group depends on the authentication strategy in your cluster.

For example, if you are using capsule.clastix.io, users authenticated through a X.509 certificate must have capsule.clastix.io as Organization: -subj "/CN=${USER}/O=capsule.clastix.io"

Users authenticated through an OIDC token must have in their token:

...
"users_groups": [
  "capsule.clastix.io",
  "other_group"
]

The hack/create-user.sh can help you set up a dummy kubeconfig for the alice user acting as owner of a tenant called solar.

./hack/create-user.sh alice solar
...
certificatesigningrequest.certificates.k8s.io/alice-solar created
certificatesigningrequest.certificates.k8s.io/alice-solar approved
kubeconfig file is: alice-solar.kubeconfig
to use it as alice export KUBECONFIG=alice-solar.kubeconfig

$ export KUBECONFIG=alice-solar.kubeconfig

Impersonation

You can simulate this behavior by using impersonation:

kubectl --as alice --as-group capsule.clastix.io ...

Create namespaces

As tenant owner, you can create namespaces:

$ kubectl create namespace solar-production
$ kubectl create namespace solar-development

$ kubectl --as alice --as-group capsule.clastix.io create namespace solar-production
$ kubectl --as alice --as-group capsule.clastix.io create namespace solar-development

And operate with fully admin permissions:

$ kubectl -n solar-development run nginx --image=docker.io/nginx 
$ kubectl -n solar-development get pods

Limiting access

Tenant Owners have full administrative permissions limited to only the namespaces in the assigned tenant. They can create any namespaced resource in their namespaces but they do not have access to cluster resources or resources belonging to other tenants they do not own:

$ kubectl -n kube-system get pods
Error from server (Forbidden): pods is forbidden:
User "alice" cannot list resource "pods" in API group "" in the namespace "kube-system"

See the concepts for getting more cool things you can do with Capsule.

1.2 - Upgrading

Upgrading Capsule

List of Tenant API changes:

Capsule v0.1.0 bump to v1beta1 from v1alpha1.
Capsule v0.2.0 bump to v1beta2 from v1beta1, deprecating v1alpha1.
Capsule v0.3.0 missing enums required by Capsule Proxy.

This document aims to provide support and a guide on how to perform a clean upgrade to the latest API version in order to avoid service disruption and data loss.

As an installation method, Helm is given for granted. If you are not using Helm, you might experience problems during the upgrade process.

Considerations

We strongly suggest performing a full backup of your Kubernetes cluster, such as storage and etcd. Use your favorite tool according to your needs.

Upgrading from v0.2.x to v0.3.x

A minor bump has been requested due to some missing enums in the Tenant resource.

Scale down the Capsule controller

Using the kubectl or Helm, scale down the Capsule controller manager: this is required to avoid the old Capsule version from processing objects that aren’t yet installed as a CRD.

helm upgrade -n capsule-system capsule --set "replicaCount=0"

kubectl scale deploy capsule-controller-manager --replicas=0 -n capsule-system

1.3 - Managed Kubernetes

Capsule on managed Kubernetes offerings

Capsule Operator can be easily installed on a Managed Kubernetes Service. Since you do not have access to the Kubernetes APIs Server, you should check with the provider of the service:

the default cluster-admin ClusterRole is accessible the following Admission Webhooks are enabled on the APIs Server:

PodNodeSelector
LimitRanger
ResourceQuota
MutatingAdmissionWebhook
ValidatingAdmissionWebhook

AWS EKS

This is an example of how to install AWS EKS cluster and one user manged by Capsule. It is based on Using IAM Groups to manage Kubernetes access

Create EKS cluster:

export AWS_DEFAULT_REGION="eu-west-1"
export AWS_ACCESS_KEY_ID="xxxxx"
export AWS_SECRET_ACCESS_KEY="xxxxx"

eksctl create cluster \
--name=test-k8s \
--managed \
--node-type=t3.small \
--node-volume-size=20 \
--kubeconfig=kubeconfig.conf

Create AWS User alice using CloudFormation, create AWS access files and kubeconfig for such user:

cat > cf.yml << EOF
Parameters:
  ClusterName:
    Type: String
Resources:
  UserAlice:
    Type: AWS::IAM::User
    Properties:
      UserName: !Sub "alice-${ClusterName}"
      Policies:
      - PolicyName: !Sub "alice-${ClusterName}-policy"
        PolicyDocument:
          Version: "2012-10-17"
          Statement:
          - Sid: AllowAssumeOrganizationAccountRole
            Effect: Allow
            Action: sts:AssumeRole
            Resource: !GetAtt RoleAlice.Arn
  AccessKeyAlice:
    Type: AWS::IAM::AccessKey
    Properties:
      UserName: !Ref UserAlice
  RoleAlice:
    Type: AWS::IAM::Role
    Properties:
      Description: !Sub "IAM role for the alice-${ClusterName} user"
      RoleName: !Sub "alice-${ClusterName}"
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
        - Effect: Allow
          Principal:
            AWS: !Sub "arn:aws:iam::${AWS::AccountId}:root"
          Action: sts:AssumeRole
Outputs:
  RoleAliceArn:
    Description: The ARN of the Alice IAM Role
    Value: !GetAtt RoleAlice.Arn
    Export:
      Name:
        Fn::Sub: "${AWS::StackName}-RoleAliceArn"
  AccessKeyAlice:
    Description: The AccessKey for Alice user
    Value: !Ref AccessKeyAlice
    Export:
      Name:
        Fn::Sub: "${AWS::StackName}-AccessKeyAlice"
  SecretAccessKeyAlice:
    Description: The SecretAccessKey for Alice user
    Value: !GetAtt AccessKeyAlice.SecretAccessKey
    Export:
      Name:
        Fn::Sub: "${AWS::StackName}-SecretAccessKeyAlice"
EOF

eval aws cloudformation deploy --capabilities CAPABILITY_NAMED_IAM \
  --parameter-overrides "ClusterName=test-k8s" \
  --stack-name "test-k8s-users" --template-file cf.yml

AWS_CLOUDFORMATION_DETAILS=$(aws cloudformation describe-stacks --stack-name "test-k8s-users")
ALICE_ROLE_ARN=$(echo "${AWS_CLOUDFORMATION_DETAILS}" | jq -r ".Stacks[0].Outputs[] | select(.OutputKey==\"RoleAliceArn\") .OutputValue")
ALICE_USER_ACCESSKEY=$(echo "${AWS_CLOUDFORMATION_DETAILS}" | jq -r ".Stacks[0].Outputs[] | select(.OutputKey==\"AccessKeyAlice\") .OutputValue")
ALICE_USER_SECRETACCESSKEY=$(echo "${AWS_CLOUDFORMATION_DETAILS}" | jq -r ".Stacks[0].Outputs[] | select(.OutputKey==\"SecretAccessKeyAlice\") .OutputValue")

eksctl create iamidentitymapping --cluster="test-k8s" --arn="${ALICE_ROLE_ARN}" --username alice --group capsule.clastix.io

cat > aws_config << EOF
[profile alice]
role_arn=${ALICE_ROLE_ARN}
source_profile=alice
EOF

cat > aws_credentials << EOF
[alice]
aws_access_key_id=${ALICE_USER_ACCESSKEY}
aws_secret_access_key=${ALICE_USER_SECRETACCESSKEY}
EOF

eksctl utils write-kubeconfig --cluster=test-k8s --kubeconfig="kubeconfig-alice.conf"
cat >> kubeconfig-alice.conf << EOF
      - name: AWS_PROFILE
        value: alice
      - name: AWS_CONFIG_FILE
        value: aws_config
      - name: AWS_SHARED_CREDENTIALS_FILE
        value: aws_credentials
EOF

Export “admin” kubeconfig to be able to install Capsule:

export KUBECONFIG=kubeconfig.conf

Install Capsule and create a tenant where alice has ownership. Use the default Tenant example:

kubectl apply -f https://raw.githubusercontent.com/clastix/capsule/master/config/samples/capsule_v1beta1_tenant.yaml

Based on the tenant configuration above the user alice should be able to create namespace. Switch to a new terminal and try to create a namespace as user alice:

# Unset AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY if defined
unset AWS_ACCESS_KEY_ID
unset AWS_SECRET_ACCESS_KEY
kubectl create namespace test --kubeconfig="kubeconfig-alice.conf"

Azure AKS

This reference implementation introduces the recommended starting (baseline) infrastructure architecture for implementing a multi-tenancy Azure AKS cluster using Capsule. See CoAKS.

Charmed Kubernetes

Canonical Charmed Kubernetes is a Kubernetes distribution coming with out-of-the-box tools that support deployments and operational management and make microservice development easier. Combined with Capsule, Charmed Kubernetes allows users to further reduce the operational overhead of Kubernetes setup and management.

The Charm package for Capsule is available to Charmed Kubernetes users via Charmhub.io.

1.4 - Controller Options

Understand the Capsule configuration options and how to use them.

The configuration for the capsule controller is done via it’s dedicated configration Custom Resource. You can explain the configuration options and how to use them:

CapsuleConfiguration

The configuration for Capsule is done via it’s dedicated configration Custom Resource. You can explain the configuration options and how to use them:

kubectl explain capsuleConfiguration.spec

enableTLSReconciler

Toggles the TLS reconciler, the controller that is able to generate CA and certificates for the webhooks when not using an already provided CA and certificate, or when these are managed externally with Vault, or cert-manager.

forceTenantPrefix

Enforces the Tenant owner, during Namespace creation, to name it using the selected Tenant name as prefix, separated by a dash. This is useful to avoid Namespace name collision in a public CaaS environment.

nodeMetadata

Allows to set the forbidden metadata for the worker nodes that could be patched by a Tenant. This applies only if the Tenant has an active NodeSelector, and the Owner have right to patch their nodes.

overrides

Allows to set different name rather than the canonical one for the Capsule configuration objects, such as webhook secret or configurations.

protectedNamespaceRegex

Disallow creation of namespaces, whose name matches this regexp

userGroups

Names of the groups for Capsule users. Users must have this group to be considered for the Capsule tenancy. If a user does not have any group mentioned here, they are not recognized as a Capsule user.

Controller Options

Depending on the version of the Capsule Controller, the configuration options may vary. You can view the options for the latest version of the Capsule Controller here or by executing the controller locally:

$ docker run ghcr.io/projectcapsule/capsule:v0.6.0-rc0 -h
2024/02/25 13:21:21 maxprocs: Leaving GOMAXPROCS=4: CPU quota undefined
Usage of /ko-app/capsule:
      --configuration-name string         The CapsuleConfiguration resource name to use (default "default")
      --enable-leader-election            Enable leader election for controller manager. Enabling this will ensure there is only one active controller manager.
      --metrics-addr string               The address the metric endpoint binds to. (default ":8080")
      --version                           Print the Capsule version and exit
      --webhook-port int                  The port the webhook server binds to. (default 9443)
      --zap-devel                         Development Mode defaults(encoder=consoleEncoder,logLevel=Debug,stackTraceLevel=Warn). Production Mode defaults(encoder=jsonEncoder,logLevel=Info,stackTraceLevel=Error)
      --zap-encoder encoder               Zap log encoding (one of 'json' or 'console')
      --zap-log-level level               Zap Level to configure the verbosity of logging. Can be one of 'debug', 'info', 'error', or any integer value > 0 which corresponds to custom debug levels of increasing verbosity
      --zap-stacktrace-level level        Zap Level at and above which stacktraces are captured (one of 'info', 'error', 'panic').
      --zap-time-encoding time-encoding   Zap time encoding (one of 'epoch', 'millis', 'nano', 'iso8601', 'rfc3339' or 'rfc3339nano'). Defaults to 'epoch'.

2 - Authentication

Integrate Capsule with Authentication of your Kubernetes cluster

Capsule does not care about the authentication strategy used in the cluster and all the Kubernetes methods of authentication are supported. The only requirement to use Capsule is to assign tenant users to the group defined by userGroups option in the CapsuleConfiguration, which defaults to capsule.clastix.io.

OIDC

In the following guide, we’ll use Keycloak an Open Source Identity and Access Management server capable to authenticate users via OIDC and release JWT tokens as proof of authentication.

Configuring OIDC Server

Configure Keycloak as OIDC server:

Add a realm called caas, or use any existing realm instead
Add a group capsule.clastix.io
Add a user alice assigned to group capsule.clastix.io
Add an OIDC client called kubernetes

For the kubernetes client, create protocol mappers called groups and audience If everything is done correctly, now you should be able to authenticate in Keycloak and see user groups in JWT tokens. Use the following snippet to authenticate in Keycloak as alice user:

$ KEYCLOAK=sso.clastix.io
$ REALM=caas
$ OIDC_ISSUER=${KEYCLOAK}/realms/${REALM}

$ curl -k -s https://${OIDC_ISSUER}/protocol/openid-connect/token \
     -d grant_type=password \
     -d response_type=id_token \
     -d scope=openid \
     -d client_id=${OIDC_CLIENT_ID} \
     -d client_secret=${OIDC_CLIENT_SECRET} \
     -d username=${USERNAME} \
     -d password=${PASSWORD} | jq

The result will include an ACCESS_TOKEN, a REFRESH_TOKEN, and an ID_TOKEN. The access-token can generally be disregarded for Kubernetes. It would be used if the identity provider was managing roles and permissions for the users but that is done in Kubernetes itself with RBAC. The id-token is short lived while the refresh-token has longer expiration. The refresh-token is used to fetch a new id-token when the id-token expires.

{  
   "access_token":"ACCESS_TOKEN",
   "refresh_token":"REFRESH_TOKEN",
   "id_token": "ID_TOKEN",
   "token_type":"bearer",
   "scope": "openid groups profile email"
}

To introspect the ID_TOKEN token run:

$ curl -k -s https://${OIDC_ISSUER}/protocol/openid-connect/introspect \
     -d token=${ID_TOKEN} \
     --user ${OIDC_CLIENT_ID}:${OIDC_CLIENT_SECRET} | jq

The result will be like the following:

{
  "exp": 1601323086,
  "iat": 1601322186,
  "aud": "kubernetes",
  "typ": "ID",
  "azp": "kubernetes",
  "preferred_username": "alice",
  "email_verified": false,
  "acr": "1",
  "groups": [
    "capsule.clastix.io"
  ],
  "client_id": "kubernetes",
  "username": "alice",
  "active": true
}

Configuring Kubernetes API Server

Configuring Kubernetes for OIDC Authentication requires adding several parameters to the API Server. Please, refer to the documentation for details and examples. Most likely, your kube-apiserver.yaml manifest will looks like the following:

spec:
  containers:
  - command:
    - kube-apiserver
    ...
    - --oidc-issuer-url=https://${OIDC_ISSUER}
    - --oidc-ca-file=/etc/kubernetes/oidc/ca.crt
    - --oidc-client-id=${OIDC_CLIENT_ID}
    - --oidc-username-claim=preferred_username
    - --oidc-groups-claim=groups
    - --oidc-username-prefix=-

KinD

As reference, here is an example of a KinD configuration for OIDC Authentication, which can be useful for local testing:

apiVersion: kind.x-k8s.io/v1alpha4
kind: Cluster
nodes:
  - role: control-plane
    kubeadmConfigPatches:
     - |
       kind: ClusterConfiguration
       apiServer:
           extraArgs:
             oidc-issuer-url: https://${OIDC_ISSUER}
             oidc-username-claim: preferred_username
             oidc-client-id: ${OIDC_CLIENT_ID}
             oidc-username-prefix: "keycloak:"
             oidc-groups-claim: groups
             oidc-groups-prefix: "keycloak:"
             enable-admission-plugins: PodNodeSelector

Configuring kubectl

There are two options to use kubectl with OIDC:

OIDC Authenticator
Use the –token option

Plugin

One way to use OIDC authentication is the use of a kubectl plugin. The Kubelogin Plugin for kubectl simplifies the process of obtaining an OIDC token and configuring kubectl to use it. Follow the link to obtain installation instructions.

kubectl oidc-login setup \
	--oidc-issuer-url=https://${OIDC_ISSUER} \
	--oidc-client-id=${OIDC_CLIENT_ID} \
	--oidc-client-secret=${OIDC_CLIENT_SECRET}

Manual

To use the OIDC Authenticator, add an oidc user entry to your kubeconfig file:

$ kubectl config set-credentials oidc \
    --auth-provider=oidc \
    --auth-provider-arg=idp-issuer-url=https://${OIDC_ISSUER} \
    --auth-provider-arg=idp-certificate-authority=/path/to/ca.crt \
    --auth-provider-arg=client-id=${OIDC_CLIENT_ID} \
    --auth-provider-arg=client-secret=${OIDC_CLIENT_SECRET} \
    --auth-provider-arg=refresh-token=${REFRESH_TOKEN} \
    --auth-provider-arg=id-token=${ID_TOKEN} \
    --auth-provider-arg=extra-scopes=groups

To use the --token option:

$ kubectl config set-credentials oidc --token=${ID_TOKEN}

Point the kubectl to the URL where the Kubernetes APIs Server is reachable:

$ kubectl config set-cluster mycluster \
    --server=https://kube.projectcapulse.io:6443 \
    --certificate-authority=~/.kube/ca.crt

If your APIs Server is reachable through the capsule-proxy, make sure to use the URL of the capsule-proxy.

Create a new context for the OIDC authenticated users:

$ kubectl config set-context alice-oidc@mycluster \
    --cluster=mycluster \
    --user=oidc

As user alice, you should be able to use kubectl to create some namespaces:

$ kubectl --context alice-oidc@mycluster create namespace oil-production
$ kubectl --context alice-oidc@mycluster create namespace oil-development
$ kubectl --context alice-oidc@mycluster create namespace gas-marketing

Warning: once your ID_TOKEN expires, the kubectl OIDC Authenticator will attempt to refresh automatically your ID_TOKEN using the REFRESH_TOKEN. In case the OIDC uses a self signed CA certificate, make sure to specify it with the idp-certificate-authority option in your kubeconfig file, otherwise you’ll not able to refresh the tokens.

3 - Monitoring

Monitoring Capsule Items and Tenants

The Capsule dashboard allows you to track the health and performance of Capsule manager and tenants, with particular attention to resources saturation, server responses, and latencies. Prometheus and Grafana are requirements for monitoring Capsule.

ResourcePools

Instrumentation for ResourcePools.

Dashboards

Dashboards can be deployed via helm-chart, enable the following values:

monitoring:
  dashboards:
    enabled: true

Capsule / ResourcePools

Dashboard which grants a detailed overview over the ResourcePools

Resourcepool Dashboard

Rules

Example rules to give you some idea, what’s possible.

Alert on ResourcePools usage

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: resourcepool-usage-alert
spec:
groups:
  - name: capsule-pool-usage.rules
    rules:
      - alert: CapsulePoolHighUsageWarning
        expr: |
          capsule_pool_usage_percentage > 90
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: High resource usage in Resourcepool
          description: |
            Resource {{ $labels.resource }} in pool {{ $labels.pool }} is at {{ $value }}% usage for the last 10 minutes.

      - alert: CapsulePoolHighUsageCritical
        expr: |
          capsule_pool_usage_percentage > 95
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: Critical resource usage in Resourcepool
          description: |
            Resource {{ $labels.resource }} in pool {{ $labels.pool }} has exceeded 95% usage for the last 10 minutes.

Metrics

The following Metrics are exposed and can be used for monitoring:

# HELP capsule_claim_condition The current condition status of a claim.
# TYPE capsule_claim_condition gauge
capsule_claim_condition{condition="Bound",name="compute",pool="solar-compute",reason="Succeeded",target_namespace="solar-prod"} 1
capsule_claim_condition{condition="Bound",name="compute-10",pool="solar-compute",reason="PoolExhausted",target_namespace="solar-prod"} 0
capsule_claim_condition{condition="Bound",name="compute-2",pool="solar-compute",reason="Succeeded",target_namespace="solar-prod"} 1
capsule_claim_condition{condition="Bound",name="compute-3",pool="solar-compute",reason="Succeeded",target_namespace="solar-prod"} 1
capsule_claim_condition{condition="Bound",name="compute-4",pool="solar-compute",reason="Succeeded",target_namespace="solar-test"} 1
capsule_claim_condition{condition="Bound",name="compute-5",pool="solar-compute",reason="PoolExhausted",target_namespace="solar-test"} 0
capsule_claim_condition{condition="Bound",name="compute-6",pool="solar-compute",reason="PoolExhausted",target_namespace="solar-test"} 0
capsule_claim_condition{condition="Bound",name="pods",pool="solar-size",reason="Succeeded",target_namespace="solar-test"} 1

# HELP capsule_claim_resource The given amount of resources from the claim
# TYPE capsule_claim_resource gauge
capsule_claim_resource{name="compute",resource="limits.cpu",target_namespace="solar-prod"} 0.375
capsule_claim_resource{name="compute",resource="limits.memory",target_namespace="solar-prod"} 4.02653184e+08
capsule_claim_resource{name="compute",resource="requests.cpu",target_namespace="solar-prod"} 0.375
capsule_claim_resource{name="compute",resource="requests.memory",target_namespace="solar-prod"} 4.02653184e+08
capsule_claim_resource{name="compute-10",resource="limits.memory",target_namespace="solar-prod"} 1.073741824e+10
capsule_claim_resource{name="compute-2",resource="limits.cpu",target_namespace="solar-prod"} 0.5
capsule_claim_resource{name="compute-2",resource="limits.memory",target_namespace="solar-prod"} 5.36870912e+08
capsule_claim_resource{name="compute-2",resource="requests.cpu",target_namespace="solar-prod"} 0.5
capsule_claim_resource{name="compute-2",resource="requests.memory",target_namespace="solar-prod"} 5.36870912e+08
capsule_claim_resource{name="compute-3",resource="requests.cpu",target_namespace="solar-prod"} 0.5
capsule_claim_resource{name="compute-4",resource="requests.cpu",target_namespace="solar-test"} 0.5
capsule_claim_resource{name="compute-5",resource="requests.cpu",target_namespace="solar-test"} 0.5
capsule_claim_resource{name="compute-6",resource="requests.cpu",target_namespace="solar-test"} 5
capsule_claim_resource{name="pods",resource="pods",target_namespace="solar-test"} 3

# HELP capsule_pool_available Current resource availability for a given resource in a resource pool
# TYPE capsule_pool_available gauge
capsule_pool_available{pool="solar-compute",resource="limits.cpu"} 1.125
capsule_pool_available{pool="solar-compute",resource="limits.memory"} 1.207959552e+09
capsule_pool_available{pool="solar-compute",resource="requests.cpu"} 0.125
capsule_pool_available{pool="solar-compute",resource="requests.memory"} 1.207959552e+09
capsule_pool_available{pool="solar-size",resource="pods"} 4

# HELP capsule_pool_exhaustion Resources become exhausted, when there's not enough available for all claims and the claims get queued
# TYPE capsule_pool_exhaustion gauge
capsule_pool_exhaustion{pool="solar-compute",resource="limits.memory"} 1.073741824e+10
capsule_pool_exhaustion{pool="solar-compute",resource="requests.cpu"} 5.5

# HELP capsule_pool_exhaustion_percentage Resources become exhausted, when there's not enough available for all claims and the claims get queued (Percentage)
# TYPE capsule_pool_exhaustion_percentage gauge
capsule_pool_exhaustion_percentage{pool="solar-compute",resource="limits.memory"} 788.8888888888889
capsule_pool_exhaustion_percentage{pool="solar-compute",resource="requests.cpu"} 4300

# HELP capsule_pool_limit Current resource limit for a given resource in a resource pool
# TYPE capsule_pool_limit gauge
capsule_pool_limit{pool="solar-compute",resource="limits.cpu"} 2
capsule_pool_limit{pool="solar-compute",resource="limits.memory"} 2.147483648e+09
capsule_pool_limit{pool="solar-compute",resource="requests.cpu"} 2
capsule_pool_limit{pool="solar-compute",resource="requests.memory"} 2.147483648e+09
capsule_pool_limit{pool="solar-size",resource="pods"} 7

# HELP capsule_pool_namespace_usage Current resources claimed on namespace basis for a given resource in a resource pool for a specific namespace
# TYPE capsule_pool_namespace_usage gauge
capsule_pool_namespace_usage{pool="solar-compute",resource="limits.cpu",target_namespace="solar-prod"} 0.875
capsule_pool_namespace_usage{pool="solar-compute",resource="limits.memory",target_namespace="solar-prod"} 9.39524096e+08
capsule_pool_namespace_usage{pool="solar-compute",resource="requests.cpu",target_namespace="solar-prod"} 1.375
capsule_pool_namespace_usage{pool="solar-compute",resource="requests.cpu",target_namespace="solar-test"} 0.5
capsule_pool_namespace_usage{pool="solar-compute",resource="requests.memory",target_namespace="solar-prod"} 9.39524096e+08
capsule_pool_namespace_usage{pool="solar-size",resource="pods",target_namespace="solar-test"} 3

# HELP capsule_pool_namespace_usage_percentage Current resources claimed on namespace basis for a given resource in a resource pool for a specific namespace (percentage)
# TYPE capsule_pool_namespace_usage_percentage gauge
capsule_pool_namespace_usage_percentage{pool="solar-compute",resource="limits.cpu",target_namespace="solar-prod"} 43.75
capsule_pool_namespace_usage_percentage{pool="solar-compute",resource="limits.memory",target_namespace="solar-prod"} 43.75
capsule_pool_namespace_usage_percentage{pool="solar-compute",resource="requests.cpu",target_namespace="solar-prod"} 68.75
capsule_pool_namespace_usage_percentage{pool="solar-compute",resource="requests.cpu",target_namespace="solar-test"} 25
capsule_pool_namespace_usage_percentage{pool="solar-compute",resource="requests.memory",target_namespace="solar-prod"} 43.75
capsule_pool_namespace_usage_percentage{pool="solar-size",resource="pods",target_namespace="solar-test"} 42.857142857142854

# HELP capsule_pool_resource Type of resource being used in a resource pool
# TYPE capsule_pool_resource gauge
capsule_pool_resource{pool="solar-compute",resource="limits.cpu"} 1
capsule_pool_resource{pool="solar-compute",resource="limits.memory"} 1
capsule_pool_resource{pool="solar-compute",resource="requests.cpu"} 1
capsule_pool_resource{pool="solar-compute",resource="requests.memory"} 1
capsule_pool_resource{pool="solar-size",resource="pods"} 1

# HELP capsule_pool_usage Current resource usage for a given resource in a resource pool
# TYPE capsule_pool_usage gauge
capsule_pool_usage{pool="solar-compute",resource="limits.cpu"} 0.875
capsule_pool_usage{pool="solar-compute",resource="limits.memory"} 9.39524096e+08
capsule_pool_usage{pool="solar-compute",resource="requests.cpu"} 1.875
capsule_pool_usage{pool="solar-compute",resource="requests.memory"} 9.39524096e+08
capsule_pool_usage{pool="solar-size",resource="pods"} 3

# HELP capsule_pool_usage_percentage Current resource usage for a given resource in a resource pool (percentage)
# TYPE capsule_pool_usage_percentage gauge
capsule_pool_usage_percentage{pool="solar-compute",resource="limits.cpu"} 43.75
capsule_pool_usage_percentage{pool="solar-compute",resource="limits.memory"} 43.75
capsule_pool_usage_percentage{pool="solar-compute",resource="requests.cpu"} 93.75
capsule_pool_usage_percentage{pool="solar-compute",resource="requests.memory"} 43.75
capsule_pool_usage_percentage{pool="solar-size",resource="pods"} 42.857142857142854

Quotas

Instrumentation for Quotas.

Metrics

The following Metrics are exposed and can be used for monitoring:

# HELP capsule_tenant_resource_limit Current resource limit for a given resource in a tenant
# TYPE capsule_tenant_resource_limit gauge
capsule_tenant_resource_limit{resource="limits.cpu",resourcequotaindex="0",tenant="solar"} 2
capsule_tenant_resource_limit{resource="limits.memory",resourcequotaindex="0",tenant="solar"} 2.147483648e+09
capsule_tenant_resource_limit{resource="pods",resourcequotaindex="1",tenant="solar"} 7
capsule_tenant_resource_limit{resource="requests.cpu",resourcequotaindex="0",tenant="solar"} 2
capsule_tenant_resource_limit{resource="requests.memory",resourcequotaindex="0",tenant="solar"} 2.147483648e+09

# HELP capsule_tenant_resource_usage Current resource usage for a given resource in a tenant
# TYPE capsule_tenant_resource_usage gauge
capsule_tenant_resource_usage{resource="limits.cpu",resourcequotaindex="0",tenant="solar"} 0
capsule_tenant_resource_usage{resource="limits.memory",resourcequotaindex="0",tenant="solar"} 0
capsule_tenant_resource_usage{resource="namespaces",resourcequotaindex="",tenant="solar"} 2
capsule_tenant_resource_usage{resource="pods",resourcequotaindex="1",tenant="solar"} 0
capsule_tenant_resource_usage{resource="requests.cpu",resourcequotaindex="0",tenant="solar"} 0
capsule_tenant_resource_usage{resource="requests.memory",resourcequotaindex="0",tenant="solar"} 0

Custom Metrics

You can gather more information based on the status of the tenants. These can be scrapped via Kube-State-Metrics CustomResourcesState Metrics. With these you have the possibility to create custom metrics based on the status of the tenants.

Here as an example with the kube-prometheus-stack chart, set the following values:

kube-state-metrics:
  rbac:
    extraRules:
      - apiGroups: [ "capsule.clastix.io" ]
        resources: ["tenants"]
        verbs: [ "list", "watch" ]
  customResourceState:
    enabled: true
    config:
      spec:
        resources:
          - groupVersionKind:
              group: capsule.clastix.io
              kind: "Tenant"
              version: "v1beta2"
            labelsFromPath:
              name: [metadata, name]
            metrics:
              - name: "tenant_size"
                help: "Count of namespaces in the tenant"
                each:
                  type: Gauge
                  gauge:
                    path: [status, size]
                commonLabels:
                  custom_metric: "yes"
                labelsFromPath:
                  capsule_tenant: [metadata, name]
                  kind: [ kind ]
              - name: "tenant_state"
                help: "The operational state of the Tenant"
                each:
                  type: StateSet
                  stateSet:
                    labelName: state
                    path: [status, state]
                    list: [Active, Cordoned]
                commonLabels:
                  custom_metric: "yes"
                labelsFromPath:
                  capsule_tenant: [metadata, name]
                  kind: [ kind ]
              - name: "tenant_namespaces_info"
                help: "Namespaces of a Tenant"
                each:
                  type: Info
                  info:
                    path: [status, namespaces]
                    labelsFromPath:
                      tenant_namespace: []
                commonLabels:
                  custom_metric: "yes"
                labelsFromPath:
                  capsule_tenant: [metadata, name]
                  kind: [ kind ]

This example creates three custom metrics:

tenant_size is a gauge that counts the number of namespaces in the tenant.
tenant_state is a state set that shows the operational state of the tenant.
tenant_namespaces_info is an info metric that shows the namespaces of the tenant.

4 - Backup & Restore

Run Backups and Restores of Tenants

Velero is a backup and restore solution that performs data protection, disaster recovery and migrates Kubernetes cluster from on-premises to the Cloud or between different Clouds.

When coming to backup and restore in Kubernetes, we have two main requirements:

Configurations backup
Data backup

The first requirement aims to backup all the resources stored into etcd database, for example: namespaces, pods, services, deployments, etc. The second is about how to backup stateful application data as volumes.

The main limitation of Velero is the multi tenancy. Currently, Velero does not support multi tenancy meaning it can be only used from admin users and so it cannot provided “as a service” to the users. This means that the cluster admin needs to take care of users’ backup.

Assuming you have multiple tenants managed by Capsule, for example oil and gas, as cluster admin, you can to take care of scheduling backups for:

Tenant cluster resources
Namespaces belonging to each tenant

Create backup of a tenant

Create a backup of the tenant solar. It consists in two different backups:

backup of the tenant resource
backup of all the resources belonging to the tenant

To backup the oil tenant selectively, label the tenant as:

kubectl label tenant oil capsule.clastix.io/tenant=solar

and create the backup

velero create backup solar-tenant \
    --include-cluster-resources=true \
    --include-resources=tenants.capsule.clastix.io \
    --selector capsule.clastix.io/tenant=solar

resulting in the following Velero object:

apiVersion: velero.io/v1
kind: Backup
metadata:
  name: solar-tenant
spec:
  defaultVolumesToRestic: false
  hooks: {}
  includeClusterResources: true
  includedNamespaces:
  - '*'
  includedResources:
  - tenants.capsule.clastix.io
  labelSelector:
    matchLabels:
      capsule.clastix.io/tenant: solar
  metadata: {}
  storageLocation: default
  ttl: 720h0m0s

Create a backup of all the resources belonging to the oil tenant namespaces:

velero create backup solar-namespaces \
    --include-cluster-resources=false \
    --include-namespaces solar-production,solar-development,solar-marketing

resulting to the following Velero object:

apiVersion: velero.io/v1
kind: Backup
metadata:
  name: solar-namespaces
spec:
  defaultVolumesToRestic: false
  hooks: {}
  includeClusterResources: false
  includedNamespaces:
  - solar-production
  - solar-development
  - solar-marketing
  metadata: {}
  storageLocation: default
  ttl: 720h0m0s

Velero requires an Object Storage backend where to store backups, you should take care of this requirement before to use Velero.

Restore a tenant from the backup

To recover the tenant after a disaster, or to migrate it to another cluster, create a restore from the previous backups:

velero create restore --from-backup solar-tenant
velero create restore --from-backup solar-namespaces

Using Velero to restore a Capsule tenant can lead to an incomplete recovery of tenant because the namespaces restored with Velero do not have the OwnerReference field used to bind the namespaces to the tenant. For this reason, all restored namespaces are not bound to the tenant:

kubectl get tnt
NAME   STATE    NAMESPACE QUOTA   NAMESPACE COUNT   NODE SELECTOR     AGE
gas    active   9                 5                 {"pool":"gas"}    34m
oil  active   9                 8                 {"pool":"oil"}  33m
solar    active   9                 0 # <<<           {"pool":"solar"}    54m

To avoid this problem you can use the script velero-restore.sh located under the hack/ folder:

./velero-restore.sh --kubeconfing /path/to/your/kubeconfig --tenant "oil" restore

Running this command, we are going to patch the tenant’s namespaces manifests that are actually ownerReferences-less. Once the command has finished its run, you got the tenant back.

kubectl get tnt
NAME   STATE    NAMESPACE QUOTA   NAMESPACE COUNT   NODE SELECTOR     AGE
gas    active   9                 5                 {"pool":"gas"}    44m
solar  active   9                 8                 {"pool":"oil"}  43m
oil    active   9                 3 # <<<           {"pool":"solar"}    12s

5 - Pod Security

Control the security of the pods running in the tenant namespaces

In Kubernetes, by default, workloads run with administrative access, which might be acceptable if there is only a single application running in the cluster or a single user accessing it. This is seldom required and you’ll consequently suffer a noisy neighbour effect along with large security blast radiuses.

Many of these concerns were addressed initially by PodSecurityPolicies which have been present in the Kubernetes APIs since the very early days.

The Pod Security Policies are deprecated in Kubernetes 1.21 and removed entirely in 1.25. As replacement, the Pod Security Standards and Pod Security Admission has been introduced. Capsule support the new standard for tenants under its control as well as the oldest approach.

Pod Security Standards

One of the issues with Pod Security Policies is that it is difficult to apply restrictive permissions on a granular level, increasing security risk. Also the Pod Security Policies get applied when the request is submitted and there is no way of applying them to pods that are already running. For these, and other reasons, the Kubernetes community decided to deprecate the Pod Security Policies.

As the Pod Security Policies get deprecated and removed, the Pod Security Standards is used in place. It defines three different policies to broadly cover the security spectrum. These policies are cumulative and range from highly-permissive to highly-restrictive:

Privileged: unrestricted policy, providing the widest possible level of permissions.
Baseline: minimally restrictive policy which prevents known privilege escalations.
Restricted: heavily restricted policy, following current Pod hardening best practices.

Kubernetes provides a built-in Admission Controller to enforce the Pod Security Standards at either:

cluster level which applies a standard configuration to all namespaces in a cluster
namespace level, one namespace at a time

For the first case, the cluster admin has to configure the Admission Controller and pass the configuration to the kube-apiserver by mean of the --admission-control-config-file extra argument, for example:

apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
- name: PodSecurity
  configuration:
    apiVersion: pod-security.admission.config.k8s.io/v1beta1
    kind: PodSecurityConfiguration
    defaults:
      enforce: "baseline"
      enforce-version: "latest"
      warn: "restricted"
      warn-version: "latest"
      audit: "restricted"
      audit-version: "latest"
    exemptions:
      usernames: []
      runtimeClasses: []
      namespaces: [kube-system]

For the second case, he can just assign labels to the specific namespace he wants enforce the policy since the Pod Security Admission Controller is enabled by default starting from Kubernetes 1.23+:

apiVersion: v1
kind: Namespace
metadata:
  labels:
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted
  name: development

Capsule

According to the regular Kubernetes segregation model, the cluster admin has to operate either at cluster level or at namespace level. Since Capsule introduces a further segregation level (the Tenant abstraction), the cluster admin can implement Pod Security Standards at tenant level by simply forcing specific labels on all the namespaces created in the tenant.

As cluster admin, create a tenant with additional labels:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  namespaceOptions:
    additionalMetadata:
      labels:
        pod-security.kubernetes.io/enforce: baseline
        pod-security.kubernetes.io/audit: restricted
        pod-security.kubernetes.io/warn: restricted
  owners:
  - kind: User
    name: alice

All namespaces created by the tenant owner, will inherit the Pod Security labels:

apiVersion: v1
kind: Namespace
metadata:
  labels:
    capsule.clastix.io/tenant: solar
    kubernetes.io/metadata.name: solar-development
    name: solar-development
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted
  name: solar-development
  ownerReferences:
  - apiVersion: capsule.clastix.io/v1beta2
    blockOwnerDeletion: true
    controller: true
    kind: Tenant
    name: solar

and the regular Pod Security Admission Controller does the magic:

kubectl --kubeconfig alice-oil.kubeconfig apply -f - << EOF
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  namespace: solar-production
spec:
  containers:
  - image: nginx
    name: nginx
    ports:
    - containerPort: 80
    securityContext:
      privileged: true
EOF

The request gets denied:

Error from server (Forbidden): error when creating "STDIN":
pods "nginx" is forbidden: violates PodSecurity "baseline:latest": privileged
(container "nginx" must not set securityContext.privileged=true)

If the tenant owner tries to change o delete the above labels, Capsule will reconcile them to the original tenant manifest set by the cluster admin.

As additional security measure, the cluster admin can also prevent the tenant owner to make an improper usage of the above labels:

kubectl annotate tenant solar \
  capsule.clastix.io/forbidden-namespace-labels-regexp="pod-security.kubernetes.io\/(enforce|warn|audit)"

In that case, the tenant owner gets denied if she tries to use the labels:

kubectl --kubeconfig alice-solar.kubeconfig label ns solar-production \
    pod-security.kubernetes.io/enforce=restricted \
    --overwrite

Error from server (Label pod-security.kubernetes.io/audit is forbidden for namespaces in the current Tenant ...

Pod Security Policies

As stated in the documentation, “PodSecurityPolicies enable fine-grained authorization of pod creation and updates. A Pod Security Policy is a cluster-level resource that controls security sensitive aspects of the pod specification. The PodSecurityPolicy objects define a set of conditions that a pod must run with in order to be accepted into the system, as well as defaults for the related fields.”

Using the Pod Security Policies, the cluster admin can impose limits on pod creation, for example the types of volume that can be consumed, the linux user that the process runs as in order to avoid running things as root, and more. From multi-tenancy point of view, the cluster admin has to control how users run pods in their tenants with a different level of permission on tenant basis.

Assume the Kubernetes cluster has been configured with Pod Security Policy Admission Controller enabled in the APIs server: --enable-admission-plugins=PodSecurityPolicy

The cluster admin creates a PodSecurityPolicy:

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: psp:restricted
spec:
  privileged: false
  # Required to prevent escalations to root.
  allowPrivilegeEscalation: false

Then create a ClusterRole using or granting the said item

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: psp:restricted
rules:
- apiGroups: ['policy']
  resources: ['podsecuritypolicies']
  resourceNames: ['psp:restricted']
  verbs: ['use']

He can assign this role to all namespaces in a tenant by setting the tenant manifest:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: solar
spec:
  owners:
  - name: alice
    kind: User
  additionalRoleBindings:
  - clusterRoleName: psp:privileged
    subjects:
    - kind: "Group"
      apiGroup: "rbac.authorization.k8s.io"
      name: "system:authenticated"

With the given specification, Capsule will ensure that all tenant namespaces will contain a RoleBinding for the specified Cluster Role:

kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: 'capsule-solar-psp:privileged'
  namespace: solar-production
  labels:
    capsule.clastix.io/tenant: solar
subjects:
  - kind: Group
    apiGroup: rbac.authorization.k8s.io
    name: 'system:authenticated'
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: 'psp:privileged'

Capsule admission controller forbids the tenant owner to run privileged pods in solar-production namespace and perform privilege escalation as declared by the above Cluster Role psp:privileged.

As tenant owner, creates a namespace:

kubectl --kubeconfig alice-solar.kubeconfig create ns solar-production

and create a pod with privileged permissions:

kubectl --kubeconfig alice-solar.kubeconfig apply -f - << EOF
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  namespace: solar-production
spec:
  containers:
  - image: nginx
    name: nginx
    ports:
    - containerPort: 80
    securityContext:
      privileged: true
EOF

Since the assigned PodSecurityPolicy explicitly disallows privileged containers, the tenant owner will see her request to be rejected by the Pod Security Policy Admission Controller.

6 - Troubleshooting

Different topics when you encounter problems with Capsule

Operating

1 - Installation

1.1 - Quickstart

Login as Tenant Owner

Impersonation

Create namespaces

Limiting access

1.2 - Upgrading

Upgrading from v0.2.x to v0.3.x

Scale down the Capsule controller

1.3 - Managed Kubernetes

AWS EKS

Azure AKS

Charmed Kubernetes

1.4 - Controller Options

CapsuleConfiguration

enableTLSReconciler

forceTenantPrefix

nodeMetadata

overrides

protectedNamespaceRegex

userGroups

Controller Options

2 - Authentication

OIDC

Configuring OIDC Server

Configuring Kubernetes API Server

KinD

Configuring kubectl

3 - Monitoring

ResourcePools

Dashboards

Capsule / ResourcePools

Rules

Metrics

Quotas

Metrics

Custom Metrics

4 - Backup & Restore

Create backup of a tenant

Restore a tenant from the backup

5 - Pod Security

Pod Security Standards

Capsule

Pod Security Policies

6 - Troubleshooting