#+TITLE: OpenShift Advanced Cluster Management Observability
#+AUTHOR: James Blair
#+DATE: <2024-01-09 Tue 08:00>

* Introduction

This document captures the environment setup steps for a ~30 minute live demo of the [[https://www.redhat.com/en/technologies/management/advanced-cluster-management][Red Hat Advanced Cluster Management]] observability feature for [[https://www.redhat.com/en/technologies/cloud-computing/openshift][OpenShift]].

* Pre-requisites

This guide assumes you:

- Have access to an Amazon Web Services account with permissions to create resources including ~s3~ buckets and ~ec2~ instances. In my case I have an AWS Blank Open Environment provisioned through the Red Hat [[https://demo.redhat.com][demo system]].
- Already have the ~aws~ and ~oc~ cli utilities installed.
- Have registered for a Red Hat account (required for obtaining an OpenShift install image pull secret).

* 1 - Logging into aws locally

Our first step is to log in to our aws account locally via the ~aws~ cli, which will prompt for four values:

#+begin_src tmux
aws configure
#+end_src

* 2 - Creating s3 bucket

After logging in to aws let's confirm our permissions are working by creating the ~s3~ bucket we will need later on:

#+begin_src tmux
aws s3 mb "s3://open-cluster-management-observability" --region "$(aws configure get region)"
#+end_src

* 3 - Install openshift clusters

With our aws credentials working let's move on to deploying the hub and single node openshift clusters required for the live demo.

** 3.1 Download installer tools

Our first step will be to ensure we have an up-to-date version of the ~openshift-install~ cli tool. We can download it as follows:

#+begin_src tmux
# Download the installer
wget "https://mirror.openshift.com/pub/openshift-v4/$(uname -m)/clients/ocp/stable/openshift-install-linux.tar.gz"

# Extract the archive
tar xf openshift-install-linux.tar.gz openshift-install && rm openshift-install-linux.tar.gz*
#+end_src

** 3.2 Obtain install pull secret

Next we have a manual step: log in to the Red Hat Hybrid Cloud Console and obtain our *Pull Secret*, which will be required for our installation configuration.

Open the [[https://console.redhat.com/openshift/create/local][Console]] and click *Download pull secret*. This will download a file called ~pull-secret.txt~ which will be used later on.

Once the file downloads ensure it is copied or moved to the directory you will be running the remaining commands in this guide from.

** 3.3 Create ssh key

For access to our soon-to-be-created cluster nodes we need ssh keys, so let's generate those now via ~ssh-keygen~:

#+begin_src tmux
ssh-keygen -t rsa -b 4096 -f ~/.ssh/hubkey -q -N "" <<< y
ssh-keygen -t rsa -b 4096 -f ~/.ssh/snokey -q -N "" <<< y
#+end_src

** 3.4 Initiate the hub cluster install

Once our install tooling is available let's kick off the installation of our hub cluster by creating a configuration file and then running ~openshift-install~.
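Because the configuration files below are written to ~hub/install-config.yaml~ and later ~sno/install-config.yaml~, let's first make sure both install directories exist:

#+begin_src tmux
# Create the directories the install configurations will be written into
mkdir --parents hub sno
#+end_src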
#+begin_src tmux
cat << EOF > hub/install-config.yaml
additionalTrustBundlePolicy: Proxyonly
apiVersion: v1
baseDomain: $(aws route53 list-hosted-zones | jq '.HostedZones[0].Name' -r | sed 's/.$//')
compute:
- architecture: amd64
  hyperthreading: Enabled
  name: worker
  platform: {}
  replicas: 0
controlPlane:
  architecture: amd64
  hyperthreading: Enabled
  name: master
  platform: {}
  replicas: 3
metadata:
  creationTimestamp: null
  name: hub
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 10.0.0.0/16
  networkType: OVNKubernetes
  serviceNetwork:
  - 172.30.0.0/16
platform:
  aws:
    region: $(aws configure get region)
publish: External
pullSecret: |
  $(cat pull-secret.txt)
EOF
#+end_src

Once the configuration file is created we can kick off the install with ~openshift-install~ as follows. The install process will generally take about half an hour.

#+begin_src tmux
./openshift-install create cluster --dir hub --log-level info
#+end_src

** 3.5 Initiate the sno cluster install

We can run our single node openshift cluster install at the same time in a separate terminal to speed things up. The process is the same: we first create an ~install-config.yaml~ file, then run ~openshift-install~.

#+begin_src tmux
cat << EOF > sno/install-config.yaml
additionalTrustBundlePolicy: Proxyonly
apiVersion: v1
baseDomain: $(aws route53 list-hosted-zones | jq '.HostedZones[0].Name' -r | sed 's/.$//')
compute:
- architecture: amd64
  hyperthreading: Enabled
  name: worker
  platform: {}
  replicas: 0
controlPlane:
  architecture: amd64
  hyperthreading: Enabled
  name: master
  platform: {}
  replicas: 1
metadata:
  creationTimestamp: null
  name: sno
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 10.0.0.0/16
  networkType: OVNKubernetes
  serviceNetwork:
  - 172.30.0.0/16
platform:
  aws:
    region: $(aws configure get region)
publish: External
pullSecret: |
  $(cat pull-secret.txt)
EOF
#+end_src

Once the configuration file is created we can kick off the install with ~openshift-install~ as follows. The install process will generally take about half an hour.

#+begin_src tmux
./openshift-install create cluster --dir sno --log-level info
#+end_src

* 4 - Install advanced cluster management

To make use of the Red Hat Advanced Cluster Management Observability feature we first need to install [[https://www.redhat.com/en/technologies/management/advanced-cluster-management][Advanced Cluster Management]] on our hub cluster via the acm operator.

Let's get started by creating an ~OperatorGroup~ and ~Subscription~ which will install the operator:

#+begin_src tmux
oc create namespace open-cluster-management

cat << EOF | oc apply --filename -
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: acm-operator-group
  namespace: open-cluster-management
spec:
  targetNamespaces:
  - open-cluster-management
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: acm-operator-subscription
  namespace: open-cluster-management
spec:
  sourceNamespace: openshift-marketplace
  source: redhat-operators
  channel: release-2.9
  installPlanApproval: Automatic
  name: advanced-cluster-management
EOF
#+end_src

Once the operator is installed we can create the ~MultiClusterHub~ resource to install Advanced Cluster Management.

Note: It can take up to ten minutes for this to complete.
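Optionally, before creating the ~MultiClusterHub~ we can confirm the operator has finished installing by checking that its ~ClusterServiceVersion~ reports a ~Succeeded~ phase:

#+begin_src tmux
# Check the advanced-cluster-management csv reports phase Succeeded before continuing
oc get csv --namespace open-cluster-management
#+end_src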
#+begin_src tmux
cat << EOF | oc apply --filename -
apiVersion: operator.open-cluster-management.io/v1
kind: MultiClusterHub
metadata:
  name: multiclusterhub
  namespace: open-cluster-management
spec: {}
EOF
#+end_src

* 5 - Enable acm observability

Now, with our clusters deployed and acm installed, we can enable the observability service by creating a ~MultiClusterObservability~ custom resource instance on the ~hub~ cluster.

Our first step towards this is to create two secrets: the operator pull secret and the ~thanos~ object storage secret.

#+begin_src tmux
oc create namespace open-cluster-management-observability

DOCKER_CONFIG_JSON=`oc extract secret/pull-secret -n openshift-config --to=-`

oc create secret generic multiclusterhub-operator-pull-secret \
  -n open-cluster-management-observability \
  --from-literal=.dockerconfigjson="$DOCKER_CONFIG_JSON" \
  --type=kubernetes.io/dockerconfigjson

cat << EOF | oc apply --filename -
apiVersion: v1
kind: Secret
metadata:
  name: thanos-object-storage
  namespace: open-cluster-management-observability
type: Opaque
stringData:
  thanos.yaml: |
    type: s3
    config:
      bucket: open-cluster-management-observability
      endpoint: s3.$(aws configure get region).amazonaws.com
      insecure: true
      access_key: $(aws configure get aws_access_key_id)
      secret_key: $(aws configure get aws_secret_access_key)
EOF
#+end_src

Once the two required secrets exist we can create the ~MultiClusterObservability~ resource as follows:

#+begin_src tmux
cat << EOF | oc apply --filename -
apiVersion: observability.open-cluster-management.io/v1beta2
kind: MultiClusterObservability
metadata:
  name: observability
spec:
  observabilityAddonSpec: {}
  storageConfig:
    metricObjectStorage:
      name: thanos-object-storage
      key: thanos.yaml
EOF
#+end_src

After creating the resource and waiting briefly we can access the grafana console via the ~Route~ to confirm everything is running:

#+begin_src tmux
echo "https://$(oc get route -n open-cluster-management-observability grafana -o jsonpath={.spec.host})"
#+end_src

* 6 - Import the single node openshift cluster into acm

With observability enabled we can now import the single node openshift cluster into acm. Start by creating a namespace for the cluster on the hub and labelling it:

#+begin_src tmux
oc new-project sno

oc label namespace sno cluster.open-cluster-management.io/managedCluster=sno
#+end_src

Then create the ~ManagedCluster~ and ~KlusterletAddonConfig~ resources:

#+begin_src tmux
cat << EOF | oc apply --filename -
apiVersion: cluster.open-cluster-management.io/v1
kind: ManagedCluster
metadata:
  name: sno
spec:
  hubAcceptsClient: true
---
apiVersion: agent.open-cluster-management.io/v1
kind: KlusterletAddonConfig
metadata:
  name: sno
  namespace: sno
spec:
  clusterName: sno
  clusterNamespace: sno
  applicationManager:
    enabled: true
  certPolicyController:
    enabled: true
  clusterLabels:
    cloud: auto-detect
    vendor: auto-detect
  iamPolicyController:
    enabled: true
  policyController:
    enabled: true
  searchCollector:
    enabled: true
  version: 2.0.0
EOF
#+end_src

The ManagedCluster-Import-Controller will generate a secret named ~sno-import~. The ~sno-import~ secret contains the ~crds.yaml~ and ~import.yaml~ that we apply to the managed cluster to install the ~klusterlet~.
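The import secret can take a short while to appear, so optionally confirm it exists before extracting it:

#+begin_src tmux
# Verify the import secret has been generated in the sno namespace
oc get secret sno-import --namespace sno
#+end_src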
#+begin_src tmux
oc get secret sno-import -n sno -o jsonpath={.data.crds\\.yaml} | base64 --decode > klusterlet-crd.yaml
oc get secret sno-import -n sno -o jsonpath={.data.import\\.yaml} | base64 --decode > import.yaml

oc --kubeconfig sno/auth/kubeconfig apply --filename klusterlet-crd.yaml
oc --kubeconfig sno/auth/kubeconfig apply --filename import.yaml
#+end_src

If everything works you should see the sno cluster reporting ~JOINED~ and ~AVAILABLE~ from within your hub cluster:

#+begin_src tmux
❯ kubectl get managedcluster -n sno
NAME            HUB ACCEPTED   MANAGED CLUSTER URLS                  JOINED   AVAILABLE   AGE
local-cluster   true           https://api.hub..com:6443             True     True        5h12m
sno             true           https://api.cluster-vzmvz..com:6443   True     True        31m
#+end_src

* 7 - Creating the edge workload on SNO

For edge scenarios metrics are only sent to the hub cluster if certain thresholds are hit for a certain period of time, here cpu usage above 70% for more than 2 minutes. You can see this configuration in the ~open-cluster-management-addon-observability~ namespace, in the ~observability-metrics-allowlist~ ConfigMap under the ~collect_rules~ section ~SNOHighCPUUsage~.

In order to hit that trigger we now deploy a cpu-heavy workload so that sno cluster metrics are sent to the acm hub cluster.

Let's get started by creating a new project on the sno cluster:

#+begin_src tmux
# Run this against the sno cluster, for example with: export KUBECONFIG=sno/auth/kubeconfig
oc new-project cpu-load-test
#+end_src

Then deploy the cpu-load-container workload in a busybox container:

#+begin_src tmux
cat << EOF | oc apply --filename -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cpu-load-test
spec:
  replicas: 5
  selector:
    matchLabels:
      app: cpu-load-test
  template:
    metadata:
      labels:
        app: cpu-load-test
    spec:
      containers:
      - name: cpu-load-container
        image: busybox
        command: ["/bin/sh", "-c"]
        args:
        - while true; do echo "Performing CPU load..."; dd if=/dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 1000 | head -n 1000000 > /dev/null; done
EOF
#+end_src
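To confirm the load generator is running we can check that all five replicas are up on the sno cluster, then watch the hub grafana console for the sno cluster's metrics to start arriving:

#+begin_src tmux
# Confirm the cpu-load-test pods are running on the sno cluster
oc --kubeconfig sno/auth/kubeconfig get pods --namespace cpu-load-test
#+end_src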