#+TITLE: OpenShift Advanced Cluster Management Observability
#+AUTHOR: James Blair
#+DATE: <2024-01-09 Tue 08:00>
* Introduction
This document captures the environment setup steps for a ~30 minute live demo of the [[https://www.redhat.com/en/technologies/management/advanced-cluster-management][Red Hat Advanced Cluster Management]] observability feature for [[https://www.redhat.com/en/technologies/cloud-computing/openshift][OpenShift]].
* Pre-requisites
This guide assumes you:
- Have access to an Amazon Web Services account with permissions to be able to create resources including ~s3~ buckets and ~ec2~ instances. In my case I have an AWS Blank Open Environment provisioned through the Red Hat [[https://demo.redhat.com][demo system]].
- Already have the ~aws~ and ~oc~ cli utilities installed.
- Have registered for a Red Hat account (required for obtaining an OpenShift install image pull secret).
* 1 - Logging into aws locally
Our first step is to log in to our aws account locally via the ~aws~ cli, which will prompt for four values:
#+begin_src tmux
aws configure
#+end_src
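Before running further commands we can verify the credentials work by asking aws who we are:
#+begin_src tmux
aws sts get-caller-identity
#+end_src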
* 2 - Creating s3 bucket
After logging into aws, let's confirm our permissions are working by creating the ~s3~ bucket we will need later on.
#+begin_src tmux
aws s3 mb "s3://open-cluster-management-observability" --region "$(aws configure get region)"
#+end_src
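If the bucket was created successfully it will show up when we list our buckets:
#+begin_src tmux
aws s3 ls | grep open-cluster-management-observability
#+end_src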
* 3 - Install openshift clusters
With our aws credentials working, let's move on to deploying the hub and single node openshift clusters required for the live demo.
** 3.1 Download installer tools
Our first step will be to ensure we have an up-to-date version of the ~openshift-install~ cli tool. We can download it as follows:
#+begin_src tmux
# Download the installer
wget "https://mirror.openshift.com/pub/openshift-v4/$(uname -m)/clients/ocp/stable/openshift-install-linux.tar.gz"
# Extract the archive
tar xf openshift-install-linux.tar.gz openshift-install && rm openshift-install-linux.tar.gz*
#+end_src
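Before moving on we can confirm the binary extracted correctly by printing its version:
#+begin_src tmux
./openshift-install version
#+end_src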
** 3.2 Obtain install pull secret
Next we have a manual step: log in to the Red Hat Hybrid Cloud Console and obtain our *Pull Secret*, which will be required for our installation configuration.
Open the [[https://console.redhat.com/openshift/create/local][Console]] and click *Download pull secret*. This will download a file called ~pull-secret.txt~ which will be used later on.
Once the file downloads, ensure it is copied or moved to the directory you will be running the remaining commands in this guide from.
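Because the pull secret is a json document we can quickly sanity check it with ~jq~ before using it:
#+begin_src tmux
jq empty pull-secret.txt && echo "Pull secret is valid json"
#+end_src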
** 3.3 Create ssh key
For access to our soon-to-be-created cluster nodes we need ssh keys, one for each cluster. Let's generate those now via ~ssh-keygen~.
#+begin_src tmux
ssh-keygen -t rsa -b 4096 -f ~/.ssh/hubkey -q -N "" <<< y
ssh-keygen -t rsa -b 4096 -f ~/.ssh/snokey -q -N "" <<< y
#+end_src
** 3.4 Initiate the hub cluster install
Once our install tooling is available, let's kick off the installation of our hub cluster by creating a configuration file and then running ~openshift-install~.
#+begin_src tmux
mkdir -p hub
cat << EOF > hub/install-config.yaml
additionalTrustBundlePolicy: Proxyonly
apiVersion: v1
baseDomain: $(aws route53 list-hosted-zones | jq '.HostedZones[0].Name' -r | sed 's/.$//')
compute:
- architecture: amd64
  hyperthreading: Enabled
  name: worker
  platform: {}
  replicas: 0
controlPlane:
  architecture: amd64
  hyperthreading: Enabled
  name: master
  platform: {}
  replicas: 3
metadata:
  creationTimestamp: null
  name: hub
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 10.0.0.0/16
  networkType: OVNKubernetes
  serviceNetwork:
  - 172.30.0.0/16
platform:
  aws:
    region: $(aws configure get region)
publish: External
pullSecret: |
  $(cat pull-secret.txt)
sshKey: |
  $(cat ~/.ssh/hubkey.pub)
EOF
#+end_src
Once the configuration file is created we can kick off the install with ~openshift-install~ as follows. The install process will generally take about half an hour.
#+begin_src tmux
./openshift-install create cluster --dir hub --log-level info
#+end_src
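While the install runs we can follow the detailed installer log from another terminal if desired:
#+begin_src tmux
tail -f hub/.openshift_install.log
#+end_src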
** 3.5 Initiate the sno cluster install
We can run our single node openshift cluster install at the same time in a separate terminal to speed things up. The process is the same: we first create an ~install-config.yaml~ file, then run ~openshift-install~.
#+begin_src tmux
mkdir -p sno
cat << EOF > sno/install-config.yaml
additionalTrustBundlePolicy: Proxyonly
apiVersion: v1
baseDomain: $(aws route53 list-hosted-zones | jq '.HostedZones[0].Name' -r | sed 's/.$//')
compute:
- architecture: amd64
  hyperthreading: Enabled
  name: worker
  platform: {}
  replicas: 0
controlPlane:
  architecture: amd64
  hyperthreading: Enabled
  name: master
  platform: {}
  replicas: 1
metadata:
  creationTimestamp: null
  name: sno
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 10.0.0.0/16
  networkType: OVNKubernetes
  serviceNetwork:
  - 172.30.0.0/16
platform:
  aws:
    region: $(aws configure get region)
publish: External
pullSecret: |
  $(cat pull-secret.txt)
sshKey: |
  $(cat ~/.ssh/snokey.pub)
EOF
#+end_src
Once the configuration file is created we can kick off the install with ~openshift-install~ as follows. The install process will generally take about half an hour.
#+begin_src tmux
./openshift-install create cluster --dir sno --log-level info
#+end_src
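Once each install completes, the cluster admin credentials will be written to the respective ~auth~ directory. For the remaining hub-side steps in this guide we can point ~oc~ at the hub cluster:
#+begin_src tmux
export KUBECONFIG="$(pwd)/hub/auth/kubeconfig"
oc get nodes
#+end_src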
* 4 - Install advanced cluster management
To make use of the Red Hat Advanced Cluster Management Observability feature we need to first install [[https://www.redhat.com/en/technologies/management/advanced-cluster-management][Advanced Cluster Management]] on our hub cluster via the acm operator.
Let's get started by creating an ~OperatorGroup~ and ~Subscription~ which will install the operator.
#+begin_src tmux
oc create namespace open-cluster-management
cat << EOF | oc apply --filename -
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: acm-operator-group
  namespace: open-cluster-management
spec:
  targetNamespaces:
  - open-cluster-management
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: acm-operator-subscription
  namespace: open-cluster-management
spec:
  sourceNamespace: openshift-marketplace
  source: redhat-operators
  channel: release-2.9
  installPlanApproval: Automatic
  name: advanced-cluster-management
EOF
#+end_src
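We can watch the operator install progress via its ~ClusterServiceVersion~, which should eventually report ~Succeeded~:
#+begin_src tmux
oc get csv -n open-cluster-management
#+end_src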
Once the operator is installed we can create the ~MultiClusterHub~ resource to install Advanced Cluster Management.
Note: It can take up to ten minutes for this to complete.
#+begin_src tmux
cat << EOF | oc apply --filename -
apiVersion: operator.open-cluster-management.io/v1
kind: MultiClusterHub
metadata:
  name: multiclusterhub
  namespace: open-cluster-management
spec: {}
EOF
#+end_src
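To know when the hub is ready we can poll the ~MultiClusterHub~ status, which should eventually report a ~Running~ phase:
#+begin_src tmux
oc get multiclusterhub -n open-cluster-management -o jsonpath='{.items[0].status.phase}'
#+end_src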
* 5 - Enable acm observability
Now, with our clusters deployed and acm installed we can enable the observability service by creating a ~MultiClusterObservability~ custom resource instance on the ~hub~ cluster.
Our first step towards this is to create two secrets.
#+begin_src tmux
oc create namespace open-cluster-management-observability
DOCKER_CONFIG_JSON=$(oc extract secret/pull-secret -n openshift-config --to=-)
oc create secret generic multiclusterhub-operator-pull-secret \
  -n open-cluster-management-observability \
  --from-literal=.dockerconfigjson="$DOCKER_CONFIG_JSON" \
  --type=kubernetes.io/dockerconfigjson
cat << EOF | oc apply --filename -
apiVersion: v1
kind: Secret
metadata:
  name: thanos-object-storage
  namespace: open-cluster-management-observability
type: Opaque
stringData:
  thanos.yaml: |
    type: s3
    config:
      bucket: open-cluster-management-observability
      endpoint: s3.$(aws configure get region).amazonaws.com
      insecure: true
      access_key: $(aws configure get aws_access_key_id)
      secret_key: $(aws configure get aws_secret_access_key)
EOF
#+end_src
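Before moving on we can list the namespace secrets to check both were created:
#+begin_src tmux
oc get secrets -n open-cluster-management-observability
#+end_src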
Once the two required secrets exist we can create the ~MultiClusterObservability~ resource as follows:
#+begin_src tmux
cat << EOF | oc apply --filename -
apiVersion: observability.open-cluster-management.io/v1beta2
kind: MultiClusterObservability
metadata:
  name: observability
spec:
  observabilityAddonSpec: {}
  storageConfig:
    metricObjectStorage:
      name: thanos-object-storage
      key: thanos.yaml
EOF
#+end_src
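The observability stack takes a few minutes to roll out, so we can watch the pods come up in the meantime:
#+begin_src tmux
oc get pods -n open-cluster-management-observability --watch
#+end_src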
After creating the resource and waiting briefyl we can access the grafana console via the ~Route~ to confirm everything is running:
#+begin_src tmux
echo "https://$(oc get route -n open-cluster-management-observability grafana -o jsonpath={.spec.host})"
#+end_src
* 6 - Import the single node openshift cluster into acm
With observability enabled, let's import our sno cluster into acm as a managed cluster. First, on the hub cluster, create and label a namespace to hold the import resources:
#+begin_src tmux
oc new-project sno
oc label namespace sno cluster.open-cluster-management.io/managedCluster=sno
#+end_src
Next create the ~ManagedCluster~ and ~KlusterletAddonConfig~ resources:
#+begin_src tmux
cat << EOF | oc apply --filename -
apiVersion: cluster.open-cluster-management.io/v1
kind: ManagedCluster
metadata:
  name: sno
spec:
  hubAcceptsClient: true
---
apiVersion: agent.open-cluster-management.io/v1
kind: KlusterletAddonConfig
metadata:
  name: sno
  namespace: sno
spec:
  clusterName: sno
  clusterNamespace: sno
  applicationManager:
    enabled: true
  certPolicyController:
    enabled: true
  clusterLabels:
    cloud: auto-detect
    vendor: auto-detect
  iamPolicyController:
    enabled: true
  policyController:
    enabled: true
  searchCollector:
    enabled: true
  version: 2.0.0
EOF
#+end_src
The ManagedCluster-Import-Controller will generate a secret named ~sno-import~. This secret contains the ~crds.yaml~ and ~import.yaml~ that we apply on the managed cluster to install the ~klusterlet~:
#+begin_src tmux
oc get secret sno-import -n sno -o jsonpath={.data.crds\\.yaml} | base64 --decode > klusterlet-crd.yaml
oc get secret sno-import -n sno -o jsonpath={.data.import\\.yaml} | base64 --decode > import.yaml
oc --kubeconfig sno/auth/kubeconfig apply --filename klusterlet-crd.yaml
oc --kubeconfig sno/auth/kubeconfig apply --filename import.yaml
#+end_src
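We can check the ~klusterlet~ agent came up on the sno cluster by inspecting its agent namespace:
#+begin_src tmux
oc --kubeconfig sno/auth/kubeconfig get pods -n open-cluster-management-agent
#+end_src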
If everything worked you should see the sno cluster reporting as JOINED and AVAILABLE from within your hub cluster:
#+begin_src tmux
oc get managedcluster

NAME            HUB ACCEPTED   MANAGED CLUSTER URLS                              JOINED   AVAILABLE   AGE
local-cluster   true           https://api.hub.<yourdomain>.com:6443             True     True        5h12m
sno             true           https://api.cluster-vzmvz.<yourdomain>.com:6443   True     True        31m
#+end_src
* 7 - Creating the edge workload on sno
For edge scenarios metrics are only sent to the hub cluster when certain thresholds are breached for a sustained period, in this case 70% CPU usage for more than 2 minutes. This is configured by the ~SNOHighCPUUsage~ entry in the ~collect_rules~ section of the ~observability-metrics-allowlist~ ConfigMap in the ~open-cluster-management-addon-observability~ namespace, which we can inspect as shown below.
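For reference, here is one way to view that rule on the managed cluster:
#+begin_src tmux
oc --kubeconfig sno/auth/kubeconfig get configmap observability-metrics-allowlist \
  -n open-cluster-management-addon-observability -o yaml | grep -A 10 SNOHighCPUUsage
#+end_src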
In order to hit that trigger we will now deploy a cpu-heavy workload so that sno cluster metrics are sent to the acm hub cluster.
Let's get started by creating a new project on the sno cluster:
#+begin_src tmux
oc --kubeconfig sno/auth/kubeconfig new-project cpu-load-test
#+end_src
and then deploy the cpu-load-test workload, which runs a cpu-intensive loop in ~busybox~ containers:
#+begin_src tmux
cat << EOF | oc --kubeconfig sno/auth/kubeconfig apply --filename -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cpu-load-test
  namespace: cpu-load-test
spec:
  replicas: 5
  selector:
    matchLabels:
      app: cpu-load-test
  template:
    metadata:
      labels:
        app: cpu-load-test
    spec:
      containers:
      - name: cpu-load-container
        image: busybox
        command: ["/bin/sh", "-c"]
        args:
        - while true; do
            echo "Performing CPU load...";
            dd if=/dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 1000 | head -n 1000000 > /dev/null;
          done
EOF
#+end_src
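Once the deployment is running we can confirm the load is being generated on the sno cluster, then watch the grafana console from section 5 for the ~SNOHighCPUUsage~ metrics to arrive:
#+begin_src tmux
oc --kubeconfig sno/auth/kubeconfig get pods -n cpu-load-test
oc --kubeconfig sno/auth/kubeconfig adm top nodes
#+end_src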