#+TITLE: OpenShift Advanced Cluster Management Observability
#+AUTHOR: James Blair
#+DATE: <2024-01-09 Tue 08:00>


* Introduction

This document captures the environment setup steps for a ~30 minute live demo of the [[https://www.redhat.com/en/technologies/management/advanced-cluster-management][Red Hat Advanced Cluster Management]] observability feature for [[https://www.redhat.com/en/technologies/cloud-computing/openshift][OpenShift]].


* Pre-requisites

This guide assumes you:

- Have access to an Amazon Web Services account with permissions to create resources including ~s3~ buckets and ~ec2~ instances. In my case I have an AWS Blank Open Environment provisioned through the Red Hat [[https://demo.redhat.com][demo system]].

- Already have the ~aws~ and ~oc~ cli utilities installed.

- Have registered for a Red Hat account (required for obtaining an OpenShift install image pull secret).

* 1 - Logging into aws locally

Our first step is to log in to our aws account locally via the ~aws~ cli, which will prompt for four values:

#+begin_src tmux
aws configure
#+end_src

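Optionally, we can confirm the credentials are working before moving on. This is just a quick sanity check and not part of the demo itself:

#+begin_src tmux
# Print the caller identity to confirm the cli is authenticated
aws sts get-caller-identity
#+end_src
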
* 2 - Creating s3 bucket

After logging into aws, let's confirm our permissions are working by creating the ~s3~ bucket we will need later on.

#+begin_src tmux
aws s3 mb "s3://open-cluster-management-observability" --region "$(aws configure get region)"
#+end_src

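If the command succeeds we can optionally list our buckets to confirm the new one exists:

#+begin_src tmux
# List buckets and filter for the one we just created
aws s3 ls | grep open-cluster-management-observability
#+end_src
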
* 3 - Install openshift clusters

With our aws credentials working, let's move on to deploying the hub and single node openshift clusters required for the live demo.

** 3.1 Download installer tools

Our first step will be to ensure we have an up to date version of the ~openshift-install~ cli tool. We can download it as follows:

#+begin_src tmux
# Download the installer
wget "https://mirror.openshift.com/pub/openshift-v4/$(uname -m)/clients/ocp/stable/openshift-install-linux.tar.gz"

# Extract the archive
tar xf openshift-install-linux.tar.gz openshift-install && rm openshift-install-linux.tar.gz*
#+end_src

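We can quickly confirm the extracted binary runs as expected:

#+begin_src tmux
# Print the installer version to confirm the download and extraction worked
./openshift-install version
#+end_src
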
** 3.2 Obtain install pull secret

Next we have a manual step to log in to the Red Hat Hybrid Cloud Console and obtain our *Pull Secret*, which will be required for our installation configuration.

Open the [[https://console.redhat.com/openshift/create/local][Console]] and click *Download pull secret*. This will download a file called ~pull-secret.txt~ which will be used later on.

Once the file has downloaded, ensure it is copied or moved to the directory you will be running the remaining commands in this guide from.

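Because the pull secret gets embedded into the install configuration below, it's worth confirming the file is present and contains valid JSON. This assumes ~jq~ is installed, which the rest of this guide already relies on:

#+begin_src tmux
# Confirm the pull secret file parses as valid JSON
jq empty pull-secret.txt && echo "pull-secret.txt looks valid"
#+end_src
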
** 3.3 Initiate the hub cluster install

Once our install tooling is available, let's kick off the installation of our hub cluster by creating a configuration file and then running ~openshift-install~.

#+begin_src tmux
# Ensure the cluster asset directory exists before writing the config
mkdir --parents hub

cat << EOF > hub/install-config.yaml
additionalTrustBundlePolicy: Proxyonly
apiVersion: v1
baseDomain: $(aws route53 list-hosted-zones | jq '.HostedZones[0].Name' -r | sed 's/.$//')
compute:
- architecture: amd64
  hyperthreading: Enabled
  name: worker
  platform: {}
  replicas: 0
controlPlane:
  architecture: amd64
  hyperthreading: Enabled
  name: master
  platform: {}
  replicas: 3
metadata:
  creationTimestamp: null
  name: hub
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 10.0.0.0/16
  networkType: OVNKubernetes
  serviceNetwork:
  - 172.30.0.0/16
platform:
  aws:
    region: $(aws configure get region)
publish: External
pullSecret: |
  $(cat pull-secret.txt)
EOF
#+end_src

Once the configuration file is created we can kick off the install with ~openshift-install~ as follows. The install process will generally take about half an hour.

#+begin_src tmux
./openshift-install create cluster --dir hub --log-level info
#+end_src

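The installer writes cluster credentials under ~hub/auth/~. Once it completes, a simple check like the following (using the kubeconfig the installer generates) confirms API access:

#+begin_src tmux
# Confirm we can reach the new hub cluster api
oc --kubeconfig hub/auth/kubeconfig get nodes
#+end_src
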
** 3.4 Initiate the sno cluster install

We can run our single node openshift cluster install at the same time in a separate terminal to speed things up. The process is the same: we will first create an ~install-config.yaml~ file, then run ~openshift-install~.

#+begin_src tmux
# Ensure the cluster asset directory exists before writing the config
mkdir --parents sno

cat << EOF > sno/install-config.yaml
additionalTrustBundlePolicy: Proxyonly
apiVersion: v1
baseDomain: $(aws route53 list-hosted-zones | jq '.HostedZones[0].Name' -r | sed 's/.$//')
compute:
- architecture: amd64
  hyperthreading: Enabled
  name: worker
  platform: {}
  replicas: 0
controlPlane:
  architecture: amd64
  hyperthreading: Enabled
  name: master
  platform: {}
  replicas: 1
metadata:
  creationTimestamp: null
  name: sno
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 10.0.0.0/16
  networkType: OVNKubernetes
  serviceNetwork:
  - 172.30.0.0/16
platform:
  aws:
    region: $(aws configure get region)
publish: External
pullSecret: |
  $(cat pull-secret.txt)
EOF
#+end_src

Once the configuration file is created we can kick off the install with ~openshift-install~ as follows. The install process will generally take about half an hour.

#+begin_src tmux
./openshift-install create cluster --dir sno --log-level info
#+end_src

* 4 - Install advanced cluster management

To make use of the Red Hat Advanced Cluster Management Observability feature we need to first install [[https://www.redhat.com/en/technologies/management/advanced-cluster-management][Advanced Cluster Management]] on our hub cluster via the acm operator.

Let's get started by creating an ~OperatorGroup~ and ~Subscription~ which will install the operator.

#+begin_src tmux
oc --kubeconfig hub/auth/kubeconfig create namespace open-cluster-management

cat << EOF | oc --kubeconfig hub/auth/kubeconfig apply --filename -
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: acm-operator-group
  namespace: open-cluster-management
spec:
  targetNamespaces:
    - open-cluster-management

---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: acm-operator-subscription
  namespace: open-cluster-management
spec:
  sourceNamespace: openshift-marketplace
  source: redhat-operators
  channel: release-2.9
  installPlanApproval: Automatic
  name: advanced-cluster-management
EOF
#+end_src

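One way to keep an eye on the operator install is to watch the ~ClusterServiceVersion~ in the namespace until it reports ~Succeeded~:

#+begin_src tmux
# Watch the acm operator csv until its phase shows succeeded
oc --kubeconfig hub/auth/kubeconfig get csv -n open-cluster-management --watch
#+end_src
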
Once the operator is installed we can create the ~MultiClusterHub~ resource to install Advanced Cluster Management.

Note: It can take up to ten minutes for this to complete.

#+begin_src tmux
cat << EOF | oc --kubeconfig hub/auth/kubeconfig apply --filename -
apiVersion: operator.open-cluster-management.io/v1
kind: MultiClusterHub
metadata:
  name: multiclusterhub
  namespace: open-cluster-management
spec: {}
EOF
#+end_src

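While the hub components roll out we can watch the resource status until the install reports complete:

#+begin_src tmux
# Watch the multiclusterhub status while the install progresses
oc --kubeconfig hub/auth/kubeconfig get multiclusterhub -n open-cluster-management --watch
#+end_src
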
* 5 - Enable acm observability

Now, with our clusters deployed and acm installed, we can enable the observability service by creating a ~MultiClusterObservability~ custom resource instance on the ~hub~ cluster.

Our first step towards this is to create two secrets.

#+begin_src tmux
oc --kubeconfig hub/auth/kubeconfig create namespace open-cluster-management-observability

DOCKER_CONFIG_JSON=`oc --kubeconfig hub/auth/kubeconfig extract secret/pull-secret -n openshift-config --to=-`

oc --kubeconfig hub/auth/kubeconfig create secret generic multiclusterhub-operator-pull-secret \
    -n open-cluster-management-observability \
    --from-literal=.dockerconfigjson="$DOCKER_CONFIG_JSON" \
    --type=kubernetes.io/dockerconfigjson


cat << EOF | oc --kubeconfig hub/auth/kubeconfig apply --filename -
apiVersion: v1
kind: Secret
metadata:
  name: thanos-object-storage
  namespace: open-cluster-management-observability
type: Opaque
stringData:
  thanos.yaml: |
    type: s3
    config:
      bucket: open-cluster-management-observability
      endpoint: s3.$(aws configure get region).amazonaws.com
      insecure: true
      access_key: $(aws configure get aws_access_key_id)
      secret_key: $(aws configure get aws_secret_access_key)
EOF
#+end_src

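Before moving on we can quickly confirm both secrets are now present in the namespace:

#+begin_src tmux
# Confirm the pull secret and object storage secret exist
oc --kubeconfig hub/auth/kubeconfig get secrets -n open-cluster-management-observability
#+end_src
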
Once the two required secrets exist we can create the ~MultiClusterObservability~ resource as follows:

#+begin_src tmux
cat << EOF | oc --kubeconfig hub/auth/kubeconfig apply --filename -
apiVersion: observability.open-cluster-management.io/v1beta2
kind: MultiClusterObservability
metadata:
  name: observability
spec:
  observabilityAddonSpec: {}
  storageConfig:
    metricObjectStorage:
      name: thanos-object-storage
      key: thanos.yaml
EOF
#+end_src

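The observability components take a few minutes to start. We can watch the pods come up in the ~open-cluster-management-observability~ namespace while we wait:

#+begin_src tmux
# Watch the observability pods start up
oc --kubeconfig hub/auth/kubeconfig get pods -n open-cluster-management-observability --watch
#+end_src
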
After creating the resource and waiting briefly we can access the grafana console via its ~Route~ to confirm everything is running:

#+begin_src tmux
echo "https://$(oc --kubeconfig hub/auth/kubeconfig get route -n open-cluster-management-observability grafana -o jsonpath={.spec.host})"
#+end_src

* 6 - Import the single node openshift cluster into acm

To import our sno cluster we first create a matching project on the hub cluster and label it for cluster management:

#+begin_src tmux
oc --kubeconfig hub/auth/kubeconfig new-project sno

oc --kubeconfig hub/auth/kubeconfig label namespace sno cluster.open-cluster-management.io/managedCluster=sno
#+end_src

#+begin_src tmux
cat << EOF | oc --kubeconfig hub/auth/kubeconfig apply --filename -
apiVersion: cluster.open-cluster-management.io/v1
kind: ManagedCluster
metadata:
  name: sno
spec:
  hubAcceptsClient: true

---
apiVersion: agent.open-cluster-management.io/v1
kind: KlusterletAddonConfig
metadata:
  name: sno
  namespace: sno
spec:
  clusterName: sno
  clusterNamespace: sno
  applicationManager:
    enabled: true
  certPolicyController:
    enabled: true
  clusterLabels:
    cloud: auto-detect
    vendor: auto-detect
  iamPolicyController:
    enabled: true
  policyController:
    enabled: true
  searchCollector:
    enabled: true
  version: 2.0.0
EOF
#+end_src

The managed cluster import controller will generate a secret named ~sno-import~ in the ~sno~ namespace. This secret contains the ~crds.yaml~ and ~import.yaml~ that we apply on the managed cluster to install the ~klusterlet~.

#+begin_src tmux
oc --kubeconfig hub/auth/kubeconfig get secret sno-import -n sno -o jsonpath={.data.crds\\.yaml} | base64 --decode > klusterlet-crd.yaml
oc --kubeconfig hub/auth/kubeconfig get secret sno-import -n sno -o jsonpath={.data.import\\.yaml} | base64 --decode > import.yaml

oc --kubeconfig sno/auth/kubeconfig apply --filename klusterlet-crd.yaml
oc --kubeconfig sno/auth/kubeconfig apply --filename import.yaml
#+end_src

If everything works you should see the ~sno~ cluster reporting ~JOINED~ and ~AVAILABLE~ from within your hub cluster.

#+begin_src tmux
oc --kubeconfig hub/auth/kubeconfig get managedcluster

NAME            HUB ACCEPTED   MANAGED CLUSTER URLS                                           JOINED   AVAILABLE   AGE
local-cluster   true           https://api.hub.<yourdomain>.com:6443                          True     True        5h12m
sno             true           https://api.cluster-vzmvz.<yourdomain>.com:6443                True     True        31m
#+end_src

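With observability already enabled on the hub, the metrics collection addon should also roll out to the newly imported cluster. We can check its pods in the ~open-cluster-management-addon-observability~ namespace on the sno cluster:

#+begin_src tmux
# Confirm the observability addon is running on the managed sno cluster
oc --kubeconfig sno/auth/kubeconfig get pods -n open-cluster-management-addon-observability
#+end_src
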
* 7 - Creating the edge workload

For edge scenarios we only send metrics to the hub cluster if certain thresholds are hit for a certain period of time (here ~70%~ cpu for more than 2 minutes). You can see this configuration on the managed cluster in the ~open-cluster-management-addon-observability~ namespace, in the ~observability-metrics-allowlist~ ConfigMap, under the ~collect_rules~ section rule ~SNOHighCPUUsage~.

To hit that trigger we now deploy a cpu-heavy workload so that sno cluster metrics are sent to the ACM hub cluster.

Let's get started by creating a new project on the sno cluster:

#+begin_src tmux
oc --kubeconfig sno/auth/kubeconfig new-project cpu-load-test
#+end_src

and then deploy the ~cpu-load-container~ workload using a ~busybox~ container:

#+begin_src tmux
cat << EOF | oc --kubeconfig sno/auth/kubeconfig apply --filename -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cpu-load-test
spec:
  replicas: 5
  selector:
    matchLabels:
      app: cpu-load-test
  template:
    metadata:
      labels:
        app: cpu-load-test
    spec:
      containers:
      - name: cpu-load-container
        image: busybox
        command: ["/bin/sh", "-c"]
        args:
          - while true; do
              echo "Performing CPU load...";
              dd if=/dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 1000 | head -n 1000000 > /dev/null;
            done
EOF
#+end_src
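
After a couple of minutes the deployment should be generating enough cpu load to trip the ~SNOHighCPUUsage~ collect rule described above. We can confirm the load pods are running, then look for the ~sno~ cluster metrics appearing in the Grafana console on the hub:

#+begin_src tmux
# Confirm the load generating pods are running on the sno cluster
oc --kubeconfig sno/auth/kubeconfig get pods -n cpu-load-test
#+end_src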