#+TITLE: OpenShift Advanced Cluster Management Observability
#+AUTHOR: James Blair
#+DATE: <2024-01-09 Tue 08:00>

* Introduction

This document captures the environment setup steps for a ~30 minute live demo of the [[https://www.redhat.com/en/technologies/management/advanced-cluster-management][Red Hat Advanced Cluster Management]] observability feature for [[https://www.redhat.com/en/technologies/cloud-computing/openshift][OpenShift]].

* Pre-requisites

This guide assumes you:

- Have access to an Amazon Web Services account with permissions to create resources including ~s3~ buckets and ~ec2~ instances. In my case I have an AWS Blank Open Environment provisioned through the Red Hat [[https://demo.redhat.com][demo system]].
- Already have the ~aws~ and ~oc~ cli utilities installed.
- Have registered for a Red Hat account (required for obtaining an OpenShift install image pull secret).

* 1 - Logging into aws locally

Our first step is to log in to our AWS account locally via the ~aws~ cli, which will prompt for four values:

#+begin_src tmux
aws configure
#+end_src

* 2 - Creating s3 bucket

After logging in to AWS, let's confirm our permissions are working by creating the ~s3~ bucket we will need later on.

#+begin_src tmux
aws s3 mb "s3://open-cluster-management-observability" --region "$(aws configure get region)"
#+end_src

* 3 - Install openshift clusters

With our AWS credentials working, let's move on to deploying the hub and single node OpenShift clusters required for the live demo.

** 3.1 Download installer tools

Our first step will be to ensure we have the ~openshift-install~ cli tool. We can download it as follows:

#+begin_src tmux
# Download the installer
wget "https://mirror.openshift.com/pub/openshift-v4/$(uname -m)/clients/ocp/stable/openshift-install-linux.tar.gz"

# Extract the archive
tar xf openshift-install-linux.tar.gz
#+end_src

** 3.2 Obtain install pull secret

Next we have a manual step: log in to the Red Hat Hybrid Cloud Console and obtain our *Pull Secret*, which will be required for our installation configuration.

Open the [[https://console.redhat.com/openshift/create/local][Console]] and click *Download pull secret*. This will download a file called ~pull-secret.txt~ which will be used later on.

** 3.3 Create ssh keys

For access to our soon to be created clusters we need ssh keys, so let's generate those now via ~ssh-keygen~.

#+begin_src tmux
ssh-keygen -t rsa -b 4096 -f ~/.ssh/hubkey -q -N ""
ssh-keygen -t rsa -b 4096 -f ~/.ssh/snokey -q -N ""
#+end_src

** 3.4 Initiate the hub cluster install

With our install tooling available, let's kick off the installation of our hub cluster by creating a configuration file and then running ~openshift-install~.
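The heredocs in the next steps write ~install-config.yaml~ into ~hub~ and ~sno~ directories. Assuming those directories do not already exist, a minimal sketch to create them first:

#+begin_src tmux
# Create the asset directories the install configs will be written into
mkdir --parents hub sno
#+end_src

With the directories in place, the hub ~install-config.yaml~ can be created as follows: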
#+begin_src tmux
cat << EOF > hub/install-config.yaml
additionalTrustBundlePolicy: Proxyonly
apiVersion: v1
baseDomain: $(aws route53 list-hosted-zones | jq '.HostedZones[].Name' -r | sed 's/.$//')
compute:
- architecture: amd64
  hyperthreading: Enabled
  name: worker
  platform: {}
  replicas: 3
controlPlane:
  architecture: amd64
  hyperthreading: Enabled
  name: master
  platform: {}
  replicas: 3
metadata:
  creationTimestamp: null
  name: hub
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 10.0.0.0/16
  networkType: OVNKubernetes
  serviceNetwork:
  - 172.30.0.0/16
platform:
  aws:
    region: $(aws configure get region)
publish: External
pullSecret: |
  $(cat pull-secret.txt)
sshKey: |
  $(cat ~/.ssh/hubkey.pub)
EOF
#+end_src

Once the configuration file is created we can kick off the install with ~openshift-install~ as follows:

#+begin_src tmux
./openshift-install create cluster --dir hub --log-level info
#+end_src

** 3.5 Initiate the sno cluster install

We can run our single node OpenShift cluster install at the same time in a separate terminal to speed things up. The process is the same: first create an ~install-config.yaml~ file, then run ~openshift-install~.

#+begin_src tmux
cat << EOF > sno/install-config.yaml
additionalTrustBundlePolicy: Proxyonly
apiVersion: v1
baseDomain: $(aws route53 list-hosted-zones | jq '.HostedZones[].Name' -r | sed 's/.$//')
compute:
- architecture: amd64
  hyperthreading: Enabled
  name: worker
  platform: {}
  replicas: 0
controlPlane:
  architecture: amd64
  hyperthreading: Enabled
  name: master
  platform: {}
  replicas: 1
metadata:
  creationTimestamp: null
  name: sno
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 10.0.0.0/16
  networkType: OVNKubernetes
  serviceNetwork:
  - 172.30.0.0/16
platform:
  aws:
    region: $(aws configure get region)
publish: External
pullSecret: |
  $(cat pull-secret.txt)
sshKey: |
  $(cat ~/.ssh/snokey.pub)
EOF
#+end_src

Once the configuration file is created we can kick off the install with ~openshift-install~ as follows:

#+begin_src tmux
./openshift-install create cluster --dir sno --log-level info
#+end_src

* 4 - Install advanced cluster management

To make use of the Red Hat Advanced Cluster Management Observability feature we first need to install Advanced Cluster Management on our hub cluster via the acm operator. Let's get started by creating an ~OperatorGroup~ and ~Subscription~ which will install the operator.

#+begin_src tmux
oc create namespace open-cluster-management

cat << EOF | oc apply --filename -
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: acm-operator-group
  namespace: open-cluster-management
spec:
  targetNamespaces:
  - open-cluster-management
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: acm-operator-subscription
  namespace: open-cluster-management
spec:
  sourceNamespace: openshift-marketplace
  source: redhat-operators
  channel: release-2.9
  installPlanApproval: Automatic
  name: advanced-cluster-management
EOF
#+end_src

Once the operator is installed we can create the ~MultiClusterHub~ resource to install Advanced Cluster Management. Note: It can take up to ten minutes for this to complete.
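Before creating the ~MultiClusterHub~ it is worth confirming the operator has finished installing. A minimal sketch, checking that the ClusterServiceVersion reports ~Succeeded~ (the exact csv name will vary by version):

#+begin_src tmux
oc get csv --namespace open-cluster-management
#+end_src

Once the csv reports ~Succeeded~ we can create the ~MultiClusterHub~: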
#+begin_src tmux
cat << EOF | oc apply --filename -
apiVersion: operator.open-cluster-management.io/v1
kind: MultiClusterHub
metadata:
  name: multiclusterhub
  namespace: open-cluster-management
spec: {}
EOF
#+end_src

* 5 - Enable acm observability

Now, with our clusters deployed and acm installed, we can enable the observability service by creating a ~MultiClusterObservability~ custom resource instance on the ~hub~ cluster. Our first step towards this is to create two secrets.

#+begin_src tmux
oc create namespace open-cluster-management-observability

DOCKER_CONFIG_JSON=`oc extract secret/pull-secret -n openshift-config --to=-`

oc create secret generic multiclusterhub-operator-pull-secret \
  -n open-cluster-management-observability \
  --from-literal=.dockerconfigjson="$DOCKER_CONFIG_JSON" \
  --type=kubernetes.io/dockerconfigjson

cat << EOF | oc apply --filename -
apiVersion: v1
kind: Secret
metadata:
  name: thanos-object-storage
  namespace: open-cluster-management-observability
type: Opaque
stringData:
  thanos.yaml: |
    type: s3
    config:
      bucket: open-cluster-management-observability
      endpoint: s3.$(aws configure get region).amazonaws.com
      insecure: true
      access_key: $(aws configure get aws_access_key_id)
      secret_key: $(aws configure get aws_secret_access_key)
EOF
#+end_src

Once the two required secrets exist we can create the ~MultiClusterObservability~ resource as follows:

#+begin_src tmux
cat << EOF | oc apply --filename -
apiVersion: observability.open-cluster-management.io/v1beta2
kind: MultiClusterObservability
metadata:
  name: observability
spec:
  observabilityAddonSpec: {}
  storageConfig:
    metricObjectStorage:
      name: thanos-object-storage
      key: thanos.yaml
EOF
#+end_src

After creating the resource and waiting briefly we can access the grafana console via its ~Route~ to confirm everything is running:

#+begin_src tmux
echo "https://$(oc get route -n open-cluster-management-observability grafana -o jsonpath={.spec.host})"
#+end_src

* 6 - Import the single node openshift cluster into acm

With observability enabled on the hub, let's import the single node OpenShift cluster as a managed cluster. First create and label a namespace for it on the hub:

#+begin_src tmux
oc new-project sno

oc label namespace sno cluster.open-cluster-management.io/managedCluster=sno
#+end_src

Then create the ~ManagedCluster~ and ~KlusterletAddonConfig~ resources:

#+begin_src tmux
cat << EOF | oc apply --filename -
apiVersion: cluster.open-cluster-management.io/v1
kind: ManagedCluster
metadata:
  name: sno
spec:
  hubAcceptsClient: true
---
apiVersion: agent.open-cluster-management.io/v1
kind: KlusterletAddonConfig
metadata:
  name: sno
  namespace: sno
spec:
  clusterName: sno
  clusterNamespace: sno
  applicationManager:
    enabled: true
  certPolicyController:
    enabled: true
  clusterLabels:
    cloud: auto-detect
    vendor: auto-detect
  iamPolicyController:
    enabled: true
  policyController:
    enabled: true
  searchCollector:
    enabled: true
  version: 2.0.0
EOF
#+end_src

The ManagedCluster-Import-Controller will generate a secret named ~sno-import~. The ~sno-import~ secret contains the ~crds.yaml~ and ~import.yaml~ that we apply to the managed cluster to install the ~klusterlet~.
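It can take a few moments for the controller to create that secret; a quick check before extracting it (a minimal sketch):

#+begin_src tmux
oc get secret sno-import --namespace sno
#+end_src

Once the secret is present we can extract its contents and apply them to the sno cluster: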
#+begin_src tmux
oc get secret sno-import -n sno -o jsonpath={.data.crds\\.yaml} | base64 --decode > klusterlet-crd.yaml
oc get secret sno-import -n sno -o jsonpath={.data.import\\.yaml} | base64 --decode > import.yaml

oc --kubeconfig sno/auth/kubeconfig apply --filename klusterlet-crd.yaml
oc --kubeconfig sno/auth/kubeconfig apply --filename import.yaml
#+end_src

If everything works you should see the sno cluster reported as JOINED and AVAILABLE from within your hub cluster:

#+begin_src tmux
❯ kubectl get managedcluster -n sno
NAME            HUB ACCEPTED   MANAGED CLUSTER URLS                  JOINED   AVAILABLE   AGE
local-cluster   true           https://api.hub..com:6443             True     True        5h12m
sno             true           https://api.cluster-vzmvz..com:6443   True     True        31m
#+end_src

* 7 - Creating the edge workload on SNO

For edge scenarios we only send metrics to the hub cluster if certain thresholds are hit for a certain period of time, in this case 70% cpu usage for more than 2 minutes. You can see this configuration in the ~open-cluster-management-addon-observability~ namespace, in the ~observability-metrics-allowlist~ ConfigMap, under the ~collect_rules~ section rule ~SNOHighCPUUsage~.

In order to hit that trigger we will now deploy a cpu-heavy workload so that the sno cluster metrics are sent to the ACM hub cluster.

Let's get started by creating a new project on the sno cluster:

#+begin_src tmux
oc new-project cpu-load-test
#+end_src

Then deploy the cpu-load-container workload based on a busybox container:

#+begin_src tmux
cat << EOF | oc apply --filename -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cpu-load-test
spec:
  replicas: 5
  selector:
    matchLabels:
      app: cpu-load-test
  template:
    metadata:
      labels:
        app: cpu-load-test
    spec:
      containers:
      - name: cpu-load-container
        image: busybox
        command: ["/bin/sh", "-c"]
        args:
        - while true; do echo "Performing CPU load..."; dd if=/dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 1000 | head -n 1000000 > /dev/null; done
EOF
#+end_src
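To confirm the workload is running and actually pushing cpu usage towards the 70% threshold, a quick check against the sno cluster (a minimal sketch, assuming the ~sno/auth/kubeconfig~ file from the install is still available locally):

#+begin_src tmux
# Confirm the load generator pods are running
oc --kubeconfig sno/auth/kubeconfig get pods --namespace cpu-load-test

# Check overall node cpu utilisation on the single node cluster
oc --kubeconfig sno/auth/kubeconfig adm top nodes
#+end_src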