Kubernetes Cluster
The Kubernetes Cluster blueprint triggers the required workflows to provide a Kubernetes cluster with TKGi. The Kubernetes version cannot be chosen; it depends on the version of TKGi deployed by Swisscom. The currently installed versions are:
- TKGi: v1.15.6
- Kubernetes: v1.24.14
The Kubernetes cluster deployed by TKGi is a standard Kubernetes cluster and most of the standard concepts are available. Please note that there might be some differences from other Kubernetes offerings, such as the available annotations or the pre-configured Custom Resource Definitions. Documentation regarding TKGi can be found in the VMware documentation. Documentation on Kubernetes is available on its official website.
The cluster is deployed within the chosen Kubernetes environment. Its API is accessible on port 8443 through an IP taken at random from the VIP Pool specified in your environment, for instance with the kubectl CLI tool. You should configure your DNS so that the domain selected at provisioning points to this IP. Alternatively, you may edit your local hosts file or disable host verification in your local Kubernetes CLI configuration.
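As a minimal sketch of the options above, the snippet below builds the hosts-file entry and the API server URL for a cluster. The IP 172.16.200.10, the domain k8s.example.com and the cluster name my-cluster are hypothetical placeholders, not values from your environment. The kubectl commands are shown as comments because they need your own kubeconfig and credentials; skipping TLS verification is one possible way to disable host verification and is assumed here rather than prescribed by the service.

```shell
# Hypothetical values: replace with the VIP assigned to your cluster
# and the domain you selected at provisioning.
API_IP="172.16.200.10"
API_DOMAIN="k8s.example.com"

# Line to append to the local hosts file when no DNS record exists yet.
hosts_entry() {
  printf '%s %s\n' "$1" "$2"
}

# URL of the Kubernetes API, which listens on port 8443.
api_url() {
  printf 'https://%s:8443\n' "$1"
}

hosts_entry "$API_IP" "$API_DOMAIN"
api_url "$API_DOMAIN"

# With DNS or the hosts entry in place:
#   kubectl config set-cluster my-cluster --server="$(api_url "$API_DOMAIN")"
# Without them, verification can be skipped (not advisable outside of tests):
#   kubectl config set-cluster my-cluster --server="$(api_url "$API_IP")" \
#     --insecure-skip-tls-verify=true
```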
The plans are described within the service description. Depending on the plan selected, the cluster will be deployed with 1 or 3 Master nodes, on which no containers can be scheduled, and at least 1 Worker node. A Load Balancer is provisioned as well. By default, it is configured to handle the API service on one IP, with another IP reserved for Ingress traffic, as TKGi comes with a default Ingress Controller provided by VMware NCP.
In this version of the service, privileged containers are enabled.
Plans
Flex cluster plans
You can customize the compute instances (CPU, Memory, Storage) for your clusters via Worker Node Pools.
During cluster creation, you need to choose between these two plans:
Plan | Control Plane | Nodes |
---|---|---|
basic | 1 Master w/ 2 vCPUs, 8GB RAM, 64GB Storage | Customizable via worker node pools |
advanced | 3 Masters w/ 4 vCPUs, 16GB RAM, 128GB Storage | Customizable via worker node pools |
Legacy Plans
Before flex cluster plans were introduced (01.06.2023), the service used T-shirt size plans.
Plan | Master | Worker | Worker Count | Worker Storage |
---|---|---|---|---|
2c8r.dev | 1 Master w/ 2 CPU, 8GB RAM | 2 CPU, 8GB RAM | 1 - 10 | 30GB |
2c8r.std | 3 Masters w/ 2 CPU, 8GB RAM | 2 CPU, 8GB RAM | 1 - 10 | 30GB |
4c16r.std | 3 Masters w/ 4 CPU, 16GB RAM | 4 CPU, 16GB RAM | 1 - 30 | 30GB |
4c32r.mem | 3 Masters w/ 4 CPU, 16GB RAM | 4 CPU, 32GB RAM | 1 - 50 | 80GB |
8c32r.std | 3 Masters w/ 4 CPU, 16GB RAM | 8 CPU, 32GB RAM | 1 - 40 | 80GB |
8c64r.mem | 3 Masters w/ 4 CPU, 16GB RAM | 8 CPU, 64GB RAM | 1 - 50 | 150GB |
16c64r.std | 3 Masters w/ 4 CPU, 16GB RAM | 16 CPU, 64GB RAM | 1 - 60 | 150GB |
16c128r.mem | 3 Masters w/ 4 CPU, 16GB RAM | 16 CPU, 128GB RAM | 1 - 50 | 200GB |
Default Storage
If you created, updated or upgraded your cluster after 02.03.2023, it has been migrated to the VMware CSI storage driver. This means that you should use the caas-persistent-storage storage class. Please read more here.
If you are still running on the legacy VCP storage driver, then use the pks-default-thick-storage storage class.
Creating a persistent volume claim will trigger the creation of a ReadWriteOnce Persistent Volume, represented as a vmdk volume in the underlying vSphere Datastore associated with your cluster. Each cluster comes with an allowance of 1TB. This storage can be extended as a day-2 action.
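A claim of this kind could look like the sketch below. The function name pvc_manifest, the claim name demo-data and the size 10Gi are illustrative assumptions; only the storage class names and the ReadWriteOnce access mode come from this page.

```shell
# Print a PersistentVolumeClaim manifest.
# $1: claim name, $2: requested size,
# $3: storage class (defaults to the CSI class caas-persistent-storage).
pvc_manifest() {
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: $1
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ${3:-caas-persistent-storage}
  resources:
    requests:
      storage: $2
EOF
}

# "demo-data" and "10Gi" are illustrative values.
pvc_manifest demo-data 10Gi

# To create the claim on a cluster:
#   pvc_manifest demo-data 10Gi | kubectl apply -f -
# On a cluster still running the legacy VCP driver:
#   pvc_manifest demo-data 10Gi pks-default-thick-storage | kubectl apply -f -
```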
Using hostPath as storage is strongly discouraged, since updates from Swisscom might delete your worker node.
You can attach up to 45 PVs per worker node.
Kubernetes Services
Services of type LoadBalancer take an IP from the floating pool of your environment. Your workload will be accessible from this IP address. Please note that you can ignore the first IP (169.254.x.y), which is an internal artifact.
$ kubectl get svc http-lb
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
http-lb LoadBalancer 10.100.200.155 169.254.128.3,172.16.200.27 80:32499/TCP 28s
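To pick out the address your workload is reachable on, the internal 169.254.x.y entry can be filtered from the EXTERNAL-IP value. The helper below is a small sketch; the function name usable_ips is hypothetical.

```shell
# Print the usable external IPs of a service, one per line, dropping the
# internal 169.254.x.y entry. Input: the comma-separated EXTERNAL-IP value.
usable_ips() {
  printf '%s\n' "$1" | tr ',' '\n' | grep -v '^169\.254\.'
}

usable_ips "169.254.128.3,172.16.200.27"

# Against a live cluster, the EXTERNAL-IP column could be fed in directly:
#   usable_ips "$(kubectl get svc http-lb --no-headers | awk '{print $4}')"
```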
The service type NodePort is not supported by TKGi.
Upgrade procedure
Upgrading a Swisscom-provisioned (and TKGi-based) Kubernetes cluster updates both the Tanzu Kubernetes Grid Integrated Edition version (TKGi system processes and pods) and the Kubernetes version of the cluster.
The upgrade procedure includes the following steps:
- The cluster is placed in an upgrade queue (this allows upgrades to be triggered simultaneously on many clusters). The cluster may stay in the backend queue for days before the upgrade is processed.
- One extra node is added to compensate for the rolling upgrade, during which the nodes of the cluster are drained and updated one after another. Important: the extra node is not billed.
- The upgrade of the cluster is started:
  - The masters are updated one by one for availability. This includes TKGi system processes and Kubernetes processes (e.g. kube-apiserver, kube-controller-manager).
  - The nodes are updated one by one. A drain might get stuck because of customer-defined PodDisruptionBudgets. In such cases, the drain is forced after 30 minutes.
  - The node update includes TKGi system processes and Kubernetes processes (e.g. kubelet).
  - Once the node is updated, it is marked as schedulable again.
- The extra node is removed.
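The node drain in the steps above waits on PodDisruptionBudgets until the 30-minute limit is reached. As a sketch of a budget that still leaves room for a rolling drain, the manifest below lets one pod of a hypothetical application labelled app=web be evicted at a time. The function name pdb_manifest, the budget name web-pdb and the label are illustrative assumptions.

```shell
# Print a PodDisruptionBudget that allows a drain to evict pods one at a time.
# $1: budget name, $2: value of the "app" label selecting the pods.
pdb_manifest() {
  cat <<EOF
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: $1
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: $2
EOF
}

pdb_manifest web-pdb web

# To apply it on a cluster:
#   pdb_manifest web-pdb web | kubectl apply -f -
```

By contrast, a budget with maxUnavailable: 0, or with minAvailable equal to the replica count, never permits an eviction, so a node hosting such pods would only be released by the forced drain.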
Upgrading a cluster can be useful even when the cluster is already on the latest version:
- It propagates any platform changes done by Swisscom.
- It can be used as a trigger for one-time migrations (e.g. storage driver migration, container runtime migration). In such cases, additional information is advertised.
- It can be used as a trigger for regular maintenance operations (e.g. certificate rotation).
- It corrects any configuration drift on the cluster: manual changes to system processes will be reverted.
Important: Skipping MINOR versions when upgrading is unsupported. For example, upgrading directly from TKGi 1.11.x to 1.13.x is unsupported and strongly discouraged.
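The rule can be expressed as a small check. This is a simplified reading of the rule and not Swisscom's upgrade logic: it assumes versions of the form MAJOR.MINOR.PATCH and treats only a move of at most one MINOR version within the same MAJOR version as a supported path. The function name is_supported_upgrade is hypothetical.

```shell
# Succeed (exit 0) when upgrading from version $1 to version $2 stays within
# the same MAJOR version and moves at most one MINOR version forward.
# Versions are expected as MAJOR.MINOR.PATCH, e.g. 1.11.x or 1.15.6.
is_supported_upgrade() {
  from_major=${1%%.*}; from_rest=${1#*.}; from_minor=${from_rest%%.*}
  to_major=${2%%.*};   to_rest=${2#*.};   to_minor=${to_rest%%.*}
  [ "$from_major" -eq "$to_major" ] || return 1
  step=$((to_minor - from_minor))
  [ "$step" -ge 0 ] && [ "$step" -le 1 ]
}

for target in 1.12.x 1.13.x; do
  if is_supported_upgrade 1.11.x "$target"; then
    echo "1.11.x -> $target: supported path"
  else
    echo "1.11.x -> $target: skips a MINOR version"
  fi
done
```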
Migration to containerd CRI
A container runtime is software that can execute the containers that make up a Kubernetes pod. The kubelet process uses the container runtime interface (CRI) as an abstraction so that you can use any compatible container runtime.
In its earliest releases, Kubernetes offered compatibility with one container runtime: Docker. Later in the Kubernetes project's history, cluster operators wanted to adopt additional container runtimes. The CRI was designed to allow this kind of flexibility - and the kubelet began supporting CRI. However, because Docker existed before the CRI specification was invented, the Kubernetes project created an adapter component, dockershim. The dockershim adapter allows the kubelet to interact with Docker as if Docker were a CRI compatible runtime.
Kubernetes v1.24 removes support for Docker as a container runtime by removing the built-in dockershim component. Upgrading your cluster to Kubernetes v1.24 will force the migration to containerd.
However, you can migrate to containerd before Kubernetes v1.24 is released (TKGi 1.14).
The migration happens automatically if you update or upgrade a cluster after 02.03.2023.
You can continue using the same Docker images after the migration.
You can find more information about the migration process on the official Kubernetes website.
Before Kubernetes v1.24 is released (TKGi 1.14), we are able to roll back a cluster to Docker. If you need this, please open a support ticket.
Migration to VMware CSI storage driver
The Container Storage Interface (CSI) was designed to help Kubernetes replace its existing, in-tree storage driver mechanisms - especially vendor specific plugins. Support for using CSI drivers was introduced to make it easier to add and maintain new integrations between Kubernetes and storage backend technologies.
VMware's in-tree VCP storage driver is being deprecated and should be replaced with VMware's CSI storage driver.
The migration of the existing legacy PVCs to CSI happens seamlessly once triggered.
The migration happens automatically if you update or upgrade a cluster after 02.03.2023.
However, after the cluster is migrated to use CSI, you should start using the new caas-persistent-storage storage class when requesting new PVCs. Using the old pks-default-thick-storage storage class will result in an error.
If you use additional persistent storage, you do not need to define your custom storage class anymore. This is explained here.
Once all clusters are migrated to CSI, the new "caas-persistent-storage" storage class will become the default one.
The CSI driver includes many improvements over the legacy VCP driver, including support for features such as volume expansion. Please read more on the official VMware web site.
Volume snapshots with VMware's CSI driver will be available with TKGi 1.16.
ETCD encryption
All Kubernetes clusters created after 02.03.2023 have their ETCD database encrypted. The encryption key is managed on the provider side.
Foundation & Failure domains
Once your cluster is created, you will find two fields called Foundation and Failure Domains. Named after Swiss rivers such as Limmat, Aare and Sihl, the Foundation field represents the production stack where your Kubernetes clusters are running. All clusters under the same foundation will be subject to the same maintenance operations, such as new Kubernetes versions and general maintenance. This field will be used in our communication regarding maintenance operations or any other relevant information. Please specify it when creating a support ticket.
The Failure Domains field indicates the 3 geographic locations of the datacenters used to provide the HA setup. Master and Worker nodes are distributed across the 3 locations, so you will have compute resources (workers) in every location and about one third of your workload will run in each of them. You can influence the pod placement strategy by using the Kubernetes topology field in your scheduling strategy.
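One way to use the topology field is a topology spread constraint. The sketch below prints a Deployment whose pods are spread as evenly as possible across failure domains. It assumes that the worker nodes carry the well-known topology.kubernetes.io/zone label, which this page does not confirm. The function name spread_deployment, the application name web and the image nginx:1.25 are illustrative.

```shell
# Print a Deployment whose pods are spread evenly across failure domains.
# $1: application name, $2: replica count, $3: container image.
# Assumption: worker nodes carry the well-known topology.kubernetes.io/zone
# label; check with "kubectl get nodes --show-labels" before relying on it.
spread_deployment() {
  cat <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: $1
spec:
  replicas: $2
  selector:
    matchLabels:
      app: $1
  template:
    metadata:
      labels:
        app: $1
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app: $1
      containers:
        - name: $1
          image: $3
EOF
}

spread_deployment web 3 nginx:1.25

# To apply it on a cluster:
#   spread_deployment web 3 nginx:1.25 | kubectl apply -f -
```

With whenUnsatisfiable: ScheduleAnyway the spread is a preference, so pods are still scheduled when one location has no free capacity; DoNotSchedule would turn it into a hard requirement.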