Version: 1.22.X

v1.22.0 to 1.22.1 Upgrade Guide

This guide describes the steps to follow to upgrade the Kubernetes Fury Distribution (KFD) from v1.22.0 to v1.22.1.

If you are running a custom set of modules, or different versions than the ones included with each release of KFD, please refer to each module's release notes.

Note that this guide does not cover changes related to the cloud provider, ingresses or pod placement; it covers only changes related to KFD and its modules.

ℹī¸ INFO starting from 1.22.1, 1.23.3 and 1.24.0, due to the size of some resources, you will need to use the --server-side flag when performing kubectl apply. Server side apply behaves slighly different than client-side, please read the official documentation first.

⛔ī¸ IMPORTANT we strongly recommend reading the whole guide before starting the upgrade process to identify possible blockers.

⚠ī¸ WARNING the upgrade process involves downtime of some components.

Upgrade procedure​

As a high-level overview, the upgrade procedure consists of:

  1. Upgrading KFD (all the core modules).
  2. Upgrading the Kubernetes cluster itself.

1. Upgrade KFD​

The suggested approach to upgrade the distribution is to do it one module at a time, to reduce the risk of errors and to make the process more manageable.

Networking module upgrade​

To upgrade the Networking module to the new version, update the version on the Furyfile.yml file to the new version:

networking: v1.10.0

Then, download the new modules with furyctl with the following command:

furyctl vendor -H

Apply your Kustomize project that uses Networking module packages as bases with:

⛔ī¸ IMPORTANT you may want to limit the scope of the command to only the networking module, otherwise, the first time you apply with --server-side other pods may also be restarted.

The same applies to the rest of the modules, we will not include this warning in every step for simplicity.

kustomize build <your-project-path/networking-base> | kubectl apply -f - --server-side --force-conflicts

Wait until all Calico pods are restarted and running. You can check Calico's Grafana dashboard "General / Felix Dashboard (Calico)" and the "Networking / *" dashboards to make sure everything is working as expected.
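
If you prefer to wait from the command line rather than watching the dashboards, a rollout check along these lines may help. This is a sketch only: the DaemonSet name calico-node and the kube-system namespace are assumptions about your installation, and wait_for_calico is a hypothetical helper name.

```shell
# Hypothetical helper: block until the Calico DaemonSet has finished
# rolling out, or fail after 5 minutes.
# Assumptions: the DaemonSet is named "calico-node" and lives in
# "kube-system"; pass different values if your installation differs.
wait_for_calico() {
  daemonset="${1:-calico-node}"
  namespace="${2:-kube-system}"
  kubectl rollout status "daemonset/${daemonset}" -n "$namespace" --timeout=5m
}
```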

Monitoring module upgrade​

⚠ī¸ WARNING downtime for the Monitoring stack is expected during this process.

To upgrade the Monitoring module to the new version, update the version on the Furyfile.yml file to the new version:

monitoring: v2.0.1

Then, download the new modules with furyctl with the following command:

furyctl vendor -H

This time, before applying the project, you need to do some manual steps on the existing resources:

Since the new release ships changes to some immutable fields, the upgrade process will involve the deletion and recreation of some resources.

# Prometheus Operator
kubectl delete deployments.apps prometheus-operator -n monitoring

# Prometheus Operated
kubectl delete poddisruptionbudgets.policy prometheus-k8s -n monitoring
kubectl delete prometheus-k8s-scrape
kubectl delete prometheus-k8s-scrape
kubectl delete prometheus-k8s-rules -n monitoring

# Alertmanager Operated
kubectl delete poddisruptionbudget.policy alertmanager-main -n monitoring

# Remove Goldpinger (deprecated)
kubectl delete goldpinger -n monitoring
kubectl delete service goldpinger -n monitoring
kubectl delete daemonset.apps goldpinger -n monitoring
kubectl delete goldpinger
kubectl delete serviceaccount goldpinger -n monitoring
kubectl delete goldpinger:cluster:view -n monitoring
kubectl delete -n monitoring configmaps goldpinger-grafana-dashboard

# Grafana
kubectl delete deployments.apps grafana -n monitoring

# Kube State Metrics
kubectl delete deployments.apps kube-state-metrics -n monitoring

# Remove Metrics Server (deprecated)
kubectl delete
kubectl delete service metrics-server -n kube-system
kubectl delete deployment.apps metrics-server -n kube-system
kubectl delete metrics-server:system:auth-delegator
kubectl delete system:metrics-server
kubectl delete system:aggregated-metrics-reader
kubectl delete system:metrics-server
kubectl delete metrics-server-auth-reader -n kube-system
kubectl delete serviceaccount metrics-server -n kube-system
kubectl delete metrics-server-tls -n kube-system
kubectl delete metrics-server-ca -n kube-system
kubectl delete metrics-server-ca -n kube-system
kubectl delete metrics-server-selfsign -n kube-system
kubectl delete secret metrics-server-ca metrics-server-tls -n kube-system

# Node Exporter
kubectl delete daemonsets.apps node-exporter -n monitoring

# x509 Exporter
kubectl delete serviceaccount x509-certificate-exporter-node -n monitoring
kubectl delete x509-certificate-exporter-node
kubectl delete x509-certificate-exporter-node
kubectl delete daemonset.apps x509-certificate-exporter-nodes -n monitoring
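
If the process is interrupted and you need to run the deletions again, commands targeting resources that are already gone will return an error. One way to make the list safe to re-run is a small wrapper that passes --ignore-not-found to kubectl delete; the delete_if_present name is hypothetical and the sketch does not change which resources are deleted:

```shell
# Hypothetical wrapper around "kubectl delete" that does not fail when the
# resource has already been removed, so the list can be re-run safely.
delete_if_present() {
  kubectl delete --ignore-not-found "$@"
}

# Usage, with two commands taken from the list above:
#   delete_if_present deployments.apps prometheus-operator -n monitoring
#   delete_if_present daemonsets.apps node-exporter -n monitoring
```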

Replace metrics-server with prometheus-adapter package as a base in your project, to replace the functionalities provided by metrics-server.

Delete goldpinger from your Kustomize resources.

Add blackbox-exporter to your Kustomize base.

Alertmanager's configuration now expects 3 new secrets in the monitoring namespace: infra-slack-webhook, k8s-slack-webhook and healthchecks-webhook. Each secret must contain, in its url key, the endpoint where the alerts should be sent. We recommend adding them to your Kustomize base.

Example commands to create the secrets:

$ kubectl create secret generic infra-slack-webhook -n monitoring --from-literal url="<your endpoint URL, don't forget to include http(s):// >"
secret/infra-slack-webhook created

$ kubectl create secret generic healthchecks-webhook -n monitoring --from-literal url="<your endpoint URL, don't forget to include http(s):// >"
secret/healthchecks-webhook created

$ kubectl create secret generic k8s-slack-webhook -n monitoring --from-literal url="<your endpoint URL, don't forget to include http(s):// >"
secret/k8s-slack-webhook created
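
Before applying, you may want to confirm that the three secrets are in place. The following sketch checks that each one exists and has a non-empty url key; check_webhook_secrets is a hypothetical helper, and it does not validate the URL itself:

```shell
# Hypothetical check: confirm that each webhook secret exists in the
# "monitoring" namespace and has a non-empty "url" key.
check_webhook_secrets() {
  missing=0
  for name in infra-slack-webhook k8s-slack-webhook healthchecks-webhook; do
    # .data.url holds the base64-encoded value; empty output means the
    # secret or the key is missing.
    value="$(kubectl get secret "$name" -n monitoring \
      -o jsonpath='{.data.url}' 2>/dev/null || true)"
    if [ -z "$value" ]; then
      echo "missing or empty: $name" >&2
      missing=1
    else
      echo "ok: $name"
    fi
  done
  return "$missing"
}
```

The function returns a non-zero status if any of the secrets is missing, so it can be used as a guard before the apply step.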

Then apply your Kustomize project that uses Monitoring module packages as bases with:

kustomize build <your-project-path> | kubectl apply -f - --server-side --force-conflicts

Wait a minute, then check that you can see both old and new metrics in Grafana, that all Prometheus targets are up, and that Alertmanager is working as expected.
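
As a quick complement to the checks in Grafana and Prometheus, you can list the pods that have not reached the Running phase. This is only a sketch: pods_not_running is a hypothetical helper, and a pod in the Running phase is not necessarily ready.

```shell
# Hypothetical helper: list pods in a namespace that are not in the Running
# phase. "No resources found" means every pod is Running.
# Note: pods of completed Jobs are in the Succeeded phase and are listed too.
pods_not_running() {
  namespace="${1:-monitoring}"
  kubectl get pods -n "$namespace" --field-selector='status.phase!=Running'
}
```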

Logging module upgrade​

ℹī¸ INFO the Logging module has gone under a big refactor, the ElasticSearch stack has been replaced with OpenSearch. Read carefully the instructions.

⚠ī¸ WARNING downtime of the Logging stack is expected during this process.

To upgrade the Logging module to the new version, update the version on the Furyfile.yml file to the new version:

logging: v3.0.1

Then, download the new modules with furyctl with the following command:

furyctl vendor -H

Since this upgrade changes the major version, there are some manual steps involving breaking changes that you need to do before applying the project:

Remove the old fluentd and fluentbit stack:

kubectl delete ds fluentbit -n logging
kubectl delete sts fluentd -n logging

Remove fluentd, elasticsearch-single (or elasticsearch-triple), kibana and curator from your Kustomize project and replace them with the following bases: logging-operator, logging-operated, opensearch-single (or opensearch-triple), opensearch-dashboards and configs.

Apply your Kustomize project that uses Logging module packages as bases with:

kustomize build <your-project-path> | kubectl apply -f - --server-side --force-conflicts

💡 TIP you may need to apply two or three times because the new CRDs need some time to become available.
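
Instead of re-running the command by hand, the apply can be wrapped in a small retry loop. The sketch below is one possible approach, not part of KFD: apply_with_retry is a hypothetical helper, and the defaults of 3 attempts and 10 seconds between attempts are arbitrary choices.

```shell
# Hypothetical helper: retry the apply a few times, pausing between
# attempts, to give newly created CRDs time to become available.
# $1: path to your Kustomize project
# $2: maximum number of attempts (default 3, an arbitrary choice)
# $3: seconds to wait between attempts (default 10, an arbitrary choice)
apply_with_retry() {
  project_path="${1:?usage: apply_with_retry <your-project-path> [attempts] [delay]}"
  attempts="${2:-3}"
  delay="${3:-10}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    if kustomize build "$project_path" \
      | kubectl apply -f - --server-side --force-conflicts; then
      return 0
    fi
    echo "apply attempt $i of $attempts failed" >&2
    i=$((i + 1))
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}
```

The function stops at the first successful apply and returns a non-zero status if every attempt fails, in which case the kubectl error output should be reviewed before trying again.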

ℹī¸ INFO index patterns may take a while to be created in OpenSearch Dashboards. There's a cronjob that runs every hour that creates them.

All the logs will now flow to the new OpenSearch stack.

💡 TIP don't forget to create the ingress for OpenSearch Dashboards (Kibana replacement).

By default the service is called opensearch-dashboards in the logging namespace, and the web interface listens on port 5601.
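
Until the ingress is in place, you can reach the web interface with a port-forward. A minimal sketch, assuming the default service name and port mentioned above; forward_opensearch_dashboards is a hypothetical helper and the namespace is passed explicitly so you can match your own deployment:

```shell
# Hypothetical helper: forward local port 5601 to the OpenSearch Dashboards
# service, so the UI is reachable at http://localhost:5601.
# $1: namespace where the opensearch-dashboards service runs in your cluster.
forward_opensearch_dashboards() {
  namespace="${1:?usage: forward_opensearch_dashboards <namespace>}"
  kubectl port-forward service/opensearch-dashboards 5601:5601 -n "$namespace"
}
```

The command keeps running in the foreground; stop it with Ctrl+C when you are done.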

You can leave the old Elasticsearch/Kibana stack running and remove it after you've verified that everything is working as expected and you don't need the data stored in ElasticSearch anymore. To do so, run the following commands:

kubectl delete statefulset elasticsearch -n logging
kubectl delete service elasticsearch -n logging
kubectl delete prometheusrule es-rules -n logging
kubectl delete servicemonitor elasticsearch -n logging
kubectl delete deployment kibana -n logging
kubectl delete service kibana -n logging
kubectl delete cronjob curator -n logging

ℹī¸ INFO you may need to delete additional resources created in your Kustomize base, Ingress objects for example.

💡 TIP we recommend leaving the ElasticSearch/Kibana stack up for a brief period (for example, 30 days) and then deleting it.

Beware that you'll need enough resources to run both solutions simultaneously, though.

Ingress module upgrade​

⚠ī¸ WARNING some downtime of the NGINX Ingress Controller is expected during the upgrade process.

To upgrade the Ingress module to the new version, update the version on the Furyfile.yml file to the new version:

ingress: v1.13.1

💡 TIP1 external-dns is now part of the Ingress module. If you were already using external-dns, you may want to switch to the one included in the module.

💡 TIP2 if you are on AWS, we have added 2 new modules to the Ingress module to manage IAM permissions for cert-manager and external-dns.

Then, download the new modules with furyctl with the following command:

furyctl vendor -H

⛔ī¸ IMPORTANT if you are using the dual NGINX Ingress Controller package, make sure that all your ingresses have the .spec.ingressClass field set and that they don't have the annotation before proceeding.

cert-manager has been bumped several versions; please check the upgrade guides in the official documentation. In particular, the update from v1.7 to v1.8 includes some changes to the spec.privateKey.rotationPolicy field. Read the guide carefully if you were using this field or if you had the --feature-gates=ServerSideApply=true flag set in the cert-manager controller.

Here you can find the relevant upgrade docs:

Apply your Kustomize project that uses Ingress module packages as bases with:

# For NGINX Ingress Controller SINGLE
kubectl delete ingressclass nginx -n ingress-nginx
# For NGINX Ingress Controller DUAL
kubectl delete ingressclass external internal -n ingress-nginx
# Delete cert-manager deployments to update labels
kubectl delete -n cert-manager deployments.apps cert-manager cert-manager-webhook cert-manager-cainjector
# finally
kustomize build <your-project-path> | kubectl apply -f - --server-side --force-conflicts

ℹī¸ INFO you may need to apply twice or thrice because a new Validating webhook is added with this release and it needs some time to come up.

Disaster Recovery module upgrade​

To upgrade the Disaster Recovery module to the new version, update the version on the Furyfile.yml file to the new version:

dr: v1.10.0

Then, download the new modules with furyctl with the following command:

furyctl vendor -H

Apply your Kustomize project that uses Disaster Recovery module packages as bases with:

kustomize build <your-project-path> | kubectl apply -f - --server-side --force-conflicts

ℹī¸ INFO velero-eks has been deprecated, please use the new aws-velero terraform module instead in case you haven't migrated yet.

Check that all Velero pods are up and running. You may want to manually trigger a backup to test that everything is working as expected, for example:

# create a backup
velero backup create --from-schedule manifests test-upgrade -n kube-system
# ... wait a moment
# check that Phase is completed
velero get backup -n kube-system test-upgrade
# you may want to see some details
velero backup describe test-upgrade -n kube-system
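
The "wait a moment" step can also be scripted by polling the phase of the Backup resource. The following is a sketch, assuming the backup name and namespace used in the commands above; wait_for_backup is a hypothetical helper, and the 5-second interval and 60-poll limit are arbitrary choices:

```shell
# Hypothetical helper: poll a Velero backup until it completes or fails.
# Assumptions: the backup is named "test-upgrade" and lives in
# "kube-system", as in the commands above.
wait_for_backup() {
  backup="${1:-test-upgrade}"
  namespace="${2:-kube-system}"
  tries=0
  while [ "$tries" -lt 60 ]; do
    # Read the phase from the status of the Backup custom resource.
    phase="$(kubectl get backups.velero.io "$backup" -n "$namespace" \
      -o jsonpath='{.status.phase}' || true)"
    case "$phase" in
      Completed)
        echo "backup $backup completed"
        return 0
        ;;
      Failed|PartiallyFailed|FailedValidation)
        echo "backup $backup ended in phase: $phase" >&2
        return 1
        ;;
      *)
        # Any other phase (or no phase yet) is treated as still in progress.
        sleep 5
        ;;
    esac
    tries=$((tries + 1))
  done
  echo "timed out waiting for backup $backup" >&2
  return 1
}
```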

💡 TIP you can port-forward to MinIO's UI and log in to check that the backups are there.

OPA module upgrade​

To upgrade the OPA module to the new version, update the version on the Furyfile.yml file to the new version:

⚠ī¸ WARNING the http.send OPA built-in is disabled. Check if there are custom rules using the built-in before proceeding. Read here for more details.

opa: v1.7.3
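
To check whether any rule currently deployed in the cluster uses http.send, you can search the Rego source of the Gatekeeper ConstraintTemplates. This is a sketch: find_http_send_usage is a hypothetical helper, it only inspects templates already in the cluster, and rules that exist only in your repository need to be searched separately.

```shell
# Hypothetical check: print the Gatekeeper ConstraintTemplates whose
# definition mentions http.send. Prints nothing when none are found.
find_http_send_usage() {
  for tpl in $(kubectl get constrainttemplates -o name); do
    if kubectl get "$tpl" -o yaml | grep -q 'http\.send'; then
      echo "$tpl"
    fi
  done
}
```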

Then, download the new modules with furyctl with the following command:

furyctl vendor -H

Apply your Kustomize project that uses OPA module packages as bases with:

kustomize build <your-project-path> | kubectl apply -f - --server-side --force-conflicts

You can try to deploy a pod that is not compliant with the rules deployed in the cluster and also check in Gatekeeper Policy Manager for new violations of the constraints.
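
One way to try a non-compliant pod without leaving anything behind is a server-side dry run, which goes through admission but does not persist the object. The sketch below assumes that a rule forbidding privileged containers is enforced in your cluster and that the target namespace is not exempted from Gatekeeper; try_privileged_pod and the pod name are hypothetical.

```shell
# Hypothetical smoke test: ask the API server to admit a privileged pod
# without persisting it (server-side dry run).
# Assumptions: a rule forbidding privileged containers is enforced, and the
# namespace (default: "default") is not exempted from Gatekeeper.
try_privileged_pod() {
  namespace="${1:-default}"
  if kubectl run gatekeeper-smoke-test --image=nginx --privileged \
    --dry-run=server -n "$namespace"; then
    echo "request was admitted: check your constraints" >&2
    return 1
  fi
  # kubectl prints the rejection reason on stderr; make sure it comes from
  # Gatekeeper and not from an unrelated error such as a connection problem.
  echo "request was rejected"
}
```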

ℹī¸ INFO seeing errors like http: TLS handshake error from EOF in Gatekeeper Controller logs is normal. The error is considered harmless. See Gatekeeper's issue #2142 for reference.

Auth module upgrade​

The Auth module is a new addition to KFD, so there is no previous version to upgrade from. However, you may have been using Pomerium (previously included in the Ingress module) or Dex and Gangway (previously included in the on-premises module).

ℹī¸ INFO Pomerium's version has not changed and Dex has been updated for compatibility with Kubernetes 1.24.x, there are no breaking changes.

If you were using these components, adjust your Kustomize project to use the new auth module as a base:

auth: v0.0.2

Then, download the new modules with furyctl with the following command:

furyctl vendor -H

💡 TIP be sure to enable the customHTMLTemplatesDir: /custom-templates config option in Gangway's configuration to use the Fury branded templates. See the example configuration file.

Apply your Kustomize project that uses Auth module packages as bases with:

kustomize build <your-project-path> | kubectl apply -f - --server-side --force-conflicts

🎉 CONGRATULATIONS you have now successfully updated all the core modules to KFD 1.22.1

2. Upgrade Kubernetes​

Since the underlying Kubernetes cluster could have been created in several different ways, the upgrade of Kubernetes itself is considered out of the scope of this guide.

Please refer to the corresponding documentation for upgrade instructions.

For clusters created with Furyctl:

For clusters created with Fury on-premises: