#+TITLE: OpenShift Advanced Cluster Management Observability
#+AUTHOR: James Blair
#+DATE: <2024-01-09 Tue 08:00>

* Introduction

This document captures the environment setup steps for a ~30 minute live demo of the [[https://www.redhat.com/en/technologies/management/advanced-cluster-management][Red Hat Advanced Cluster Management]] observability feature for [[https://www.redhat.com/en/technologies/cloud-computing/openshift][OpenShift]].

* Pre-requisites

This guide assumes you:

- Have access to an Amazon Web Services account with permissions to create resources, including ~s3~ buckets and ~ec2~ instances. In my case I have an AWS Blank Open Environment provisioned through the Red Hat [[https://demo.redhat.com][demo system]].

- Already have the ~aws~ and ~oc~ cli utilities installed.

- Have registered for a Red Hat account (required for obtaining an OpenShift install image pull secret).

* 1 - Logging into aws locally

Our first step is to log in to our aws account locally via the ~aws~ cli, which will prompt for four values: an access key id, a secret access key, a default region, and a default output format.

#+begin_src tmux
aws configure
#+end_src
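
Before moving on it can be worth a quick sanity check that the stored credentials actually work, for example by asking ~sts~ who we are:

#+begin_src tmux
# Confirm the configured credentials are valid
aws sts get-caller-identity
#+end_src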

* 2 - Creating s3 bucket

After logging into aws, let's confirm our permissions are working by creating the ~s3~ bucket we will need later on. Note that ~s3~ bucket names are globally unique, so if this name is already taken you will need to pick another and use it again when creating the ~thanos-object-storage~ secret later in this guide.

#+begin_src tmux
aws s3 mb "s3://open-cluster-management-observability" --region "$(aws configure get region)"
#+end_src
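
To double check the bucket was created successfully we can list our buckets and filter for it:

#+begin_src tmux
aws s3 ls | grep open-cluster-management-observability
#+end_src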

* 3 - Install openshift clusters

With our aws credentials working let's move on to deploying the hub and single node openshift clusters required for the live demo.

** 3.1 Download installer tools

Our first step will be to ensure we have an up-to-date version of the ~openshift-install~ cli tool. We can download it as follows:

#+begin_src tmux
# Download the installer
wget "https://mirror.openshift.com/pub/openshift-v4/$(uname -m)/clients/ocp/stable/openshift-install-linux.tar.gz"

# Extract the archive
tar xf openshift-install-linux.tar.gz openshift-install && rm openshift-install-linux.tar.gz*
#+end_src
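
Before using the binary it's worth confirming it extracted correctly and noting which version we pulled:

#+begin_src tmux
./openshift-install version
#+end_src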

** 3.2 Obtain install pull secret

Next we have a manual step to log in to the Red Hat Hybrid Cloud Console and obtain our *Pull Secret*, which will be required for our installation configuration.

Open the [[https://console.redhat.com/openshift/create/local][Console]] and click *Download pull secret*. This will download a file called ~pull-secret.txt~ which will be used later on.

Once the file downloads, ensure it is copied or moved to the directory you will be running the remaining commands in this guide from.
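
A malformed pull secret is a common cause of install failures, so a quick optional check is to confirm the file parses as json:

#+begin_src tmux
jq empty pull-secret.txt && echo "pull secret parses as valid json"
#+end_src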

** 3.3 Create ssh keys

For access to our soon to be created cluster nodes we need ssh keys, so let's generate them now via ~ssh-keygen~.

#+begin_src tmux
ssh-keygen -t rsa -b 4096 -f ~/.ssh/hubkey -q -N "" <<< y

ssh-keygen -t rsa -b 4096 -f ~/.ssh/snokey -q -N "" <<< y
#+end_src

** 3.4 Initiate the hub cluster install

Once our install tooling is available let's kick off the installation of our hub cluster by creating a configuration file and then running ~openshift-install~. The configuration below derives the ~baseDomain~ from the first route53 hosted zone in the account and embeds the pull secret and ssh key we prepared earlier.

#+begin_src tmux
# Create the directory that will hold the cluster configuration and state
mkdir --parents hub

cat << EOF > hub/install-config.yaml
additionalTrustBundlePolicy: Proxyonly
apiVersion: v1
baseDomain: $(aws route53 list-hosted-zones | jq '.HostedZones[0].Name' -r | sed 's/.$//')
compute:
- architecture: amd64
  hyperthreading: Enabled
  name: worker
  platform: {}
  replicas: 0
controlPlane:
  architecture: amd64
  hyperthreading: Enabled
  name: master
  platform: {}
  replicas: 3
metadata:
  creationTimestamp: null
  name: hub
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 10.0.0.0/16
  networkType: OVNKubernetes
  serviceNetwork:
  - 172.30.0.0/16
platform:
  aws:
    region: $(aws configure get region)
publish: External
pullSecret: |
  $(cat pull-secret.txt)
sshKey: |
  $(cat ~/.ssh/hubkey.pub)
EOF
#+end_src

Once the configuration file is created we can kick off the install with ~openshift-install~ as follows. The install process will generally take about half an hour.

#+begin_src tmux
./openshift-install create cluster --dir hub --log-level info
#+end_src
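
The installer writes the cluster credentials under the ~hub/auth~ directory, so once the install completes we can verify access to the new hub cluster (this also points our ~oc~ commands at the hub for the rest of the guide):

#+begin_src tmux
export KUBECONFIG="$(pwd)/hub/auth/kubeconfig"

oc get nodes
#+end_src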

** 3.5 Initiate the sno cluster install

We can run our single node openshift cluster install at the same time in a separate terminal to speed things up. The process is the same: we first create an ~install-config.yaml~ file, then run ~openshift-install~.

#+begin_src tmux
# Create the directory that will hold the cluster configuration and state
mkdir --parents sno

cat << EOF > sno/install-config.yaml
additionalTrustBundlePolicy: Proxyonly
apiVersion: v1
baseDomain: $(aws route53 list-hosted-zones | jq '.HostedZones[0].Name' -r | sed 's/.$//')
compute:
- architecture: amd64
  hyperthreading: Enabled
  name: worker
  platform: {}
  replicas: 0
controlPlane:
  architecture: amd64
  hyperthreading: Enabled
  name: master
  platform: {}
  replicas: 1
metadata:
  creationTimestamp: null
  name: sno
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 10.0.0.0/16
  networkType: OVNKubernetes
  serviceNetwork:
  - 172.30.0.0/16
platform:
  aws:
    region: $(aws configure get region)
publish: External
pullSecret: |
  $(cat pull-secret.txt)
sshKey: |
  $(cat ~/.ssh/snokey.pub)
EOF
#+end_src

Once the configuration file is created we can kick off the install with ~openshift-install~ as follows. The install process will generally take about half an hour.

#+begin_src tmux
./openshift-install create cluster --dir sno --log-level info
#+end_src
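
Similarly, once the sno install completes its credentials will be sitting under ~sno/auth~, and we can verify access without disturbing our hub kubeconfig by passing ~--kubeconfig~ explicitly:

#+begin_src tmux
oc --kubeconfig sno/auth/kubeconfig get nodes
#+end_src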

* 4 - Install advanced cluster management

To make use of the Red Hat Advanced Cluster Management Observability feature we first need to install [[https://www.redhat.com/en/technologies/management/advanced-cluster-management][Advanced Cluster Management]] on our hub cluster via the acm operator.

Let's get started by creating an ~OperatorGroup~ and ~Subscription~ which will install the operator.

#+begin_src tmux
oc create namespace open-cluster-management

cat << EOF | oc apply --filename -
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: acm-operator-group
  namespace: open-cluster-management
spec:
  targetNamespaces:
  - open-cluster-management
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: acm-operator-subscription
  namespace: open-cluster-management
spec:
  sourceNamespace: openshift-marketplace
  source: redhat-operators
  channel: release-2.9
  installPlanApproval: Automatic
  name: advanced-cluster-management
EOF
#+end_src
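
Before moving to the next step it pays to wait for the operator install to finish; the ~ClusterServiceVersion~ phase should eventually report ~Succeeded~:

#+begin_src tmux
oc get csv -n open-cluster-management --watch
#+end_src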

Once the operator is installed we can create the ~MultiClusterHub~ resource to install Advanced Cluster Management.

Note: It can take up to ten minutes for this to complete.

#+begin_src tmux
cat << EOF | oc apply --filename -
apiVersion: operator.open-cluster-management.io/v1
kind: MultiClusterHub
metadata:
  name: multiclusterhub
  namespace: open-cluster-management
spec: {}
EOF
#+end_src
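
We can keep an eye on progress by watching the resource, which should eventually report a ~Running~ status once the install has finished:

#+begin_src tmux
oc get multiclusterhub multiclusterhub -n open-cluster-management --watch
#+end_src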

* 5 - Enable acm observability

Now, with our clusters deployed and acm installed, we can enable the observability service by creating a ~MultiClusterObservability~ custom resource instance on the ~hub~ cluster.

Our first step towards this is to create two secrets: a copy of the cluster pull secret for the new observability namespace, and a ~thanos-object-storage~ secret pointing at the ~s3~ bucket we created earlier.

#+begin_src tmux
oc create namespace open-cluster-management-observability

DOCKER_CONFIG_JSON="$(oc extract secret/pull-secret -n openshift-config --to=-)"

oc create secret generic multiclusterhub-operator-pull-secret \
    -n open-cluster-management-observability \
    --from-literal=.dockerconfigjson="$DOCKER_CONFIG_JSON" \
    --type=kubernetes.io/dockerconfigjson

cat << EOF | oc apply --filename -
apiVersion: v1
kind: Secret
metadata:
  name: thanos-object-storage
  namespace: open-cluster-management-observability
type: Opaque
stringData:
  thanos.yaml: |
    type: s3
    config:
      bucket: open-cluster-management-observability
      endpoint: s3.$(aws configure get region).amazonaws.com
      insecure: true
      access_key: $(aws configure get aws_access_key_id)
      secret_key: $(aws configure get aws_secret_access_key)
EOF
#+end_src
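
A quick check that both secrets landed in the right namespace before we continue:

#+begin_src tmux
oc get secrets -n open-cluster-management-observability
#+end_src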

Once the two required secrets exist we can create the ~MultiClusterObservability~ resource as follows:

#+begin_src tmux
cat << EOF | oc apply --filename -
apiVersion: observability.open-cluster-management.io/v1beta2
kind: MultiClusterObservability
metadata:
  name: observability
spec:
  observabilityAddonSpec: {}
  storageConfig:
    metricObjectStorage:
      name: thanos-object-storage
      key: thanos.yaml
EOF
#+end_src

After creating the resource and waiting briefly we can access the grafana console via its ~Route~ to confirm everything is running:

#+begin_src tmux
echo "https://$(oc get route -n open-cluster-management-observability grafana -o jsonpath={.spec.host})"
#+end_src
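
If grafana isn't up yet, watching the pods in the observability namespace is an easy way to see how far the rollout has progressed:

#+begin_src tmux
oc get pods -n open-cluster-management-observability --watch
#+end_src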

* 6 - Import the single node openshift cluster into acm

With observability enabled, the last piece of setup is importing our single node openshift cluster into acm as a managed cluster. First we create a namespace on the hub cluster to hold the managed cluster resources:

#+begin_src tmux
oc new-project sno

oc label namespace sno cluster.open-cluster-management.io/managedCluster=sno
#+end_src

Next we create the ~ManagedCluster~ and ~KlusterletAddonConfig~ resources on the hub cluster:

#+begin_src tmux
cat << EOF | oc apply --filename -
apiVersion: cluster.open-cluster-management.io/v1
kind: ManagedCluster
metadata:
  name: sno
spec:
  hubAcceptsClient: true
---
apiVersion: agent.open-cluster-management.io/v1
kind: KlusterletAddonConfig
metadata:
  name: sno
  namespace: sno
spec:
  clusterName: sno
  clusterNamespace: sno
  applicationManager:
    enabled: true
  certPolicyController:
    enabled: true
  clusterLabels:
    cloud: auto-detect
    vendor: auto-detect
  iamPolicyController:
    enabled: true
  policyController:
    enabled: true
  searchCollector:
    enabled: true
  version: 2.0.0
EOF
#+end_src

The managedcluster-import-controller on the hub will generate a secret named ~sno-import~ containing the ~crds.yaml~ and ~import.yaml~ we need to apply to the managed cluster to install the ~klusterlet~ agent:

#+begin_src tmux
oc get secret sno-import -n sno -o jsonpath={.data.crds\\.yaml} | base64 --decode > klusterlet-crd.yaml

oc get secret sno-import -n sno -o jsonpath={.data.import\\.yaml} | base64 --decode > import.yaml

oc --kubeconfig sno/auth/kubeconfig apply --filename klusterlet-crd.yaml

oc --kubeconfig sno/auth/kubeconfig apply --filename import.yaml
#+end_src
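
It can take a minute or two for the agent to start on the managed cluster; we can watch for the ~klusterlet~ pods coming up:

#+begin_src tmux
oc --kubeconfig sno/auth/kubeconfig get pods -n open-cluster-management-agent
#+end_src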

If everything worked you should see the ~sno~ cluster reporting JOINED and AVAILABLE from within your hub cluster:

#+begin_src tmux
❯ oc get managedcluster
NAME            HUB ACCEPTED   MANAGED CLUSTER URLS                              JOINED   AVAILABLE   AGE
local-cluster   true           https://api.hub.<yourdomain>.com:6443             True     True        5h12m
sno             true           https://api.cluster-vzmvz.<yourdomain>.com:6443   True     True        31m
#+end_src

* 7 - Creating the edge workload on SNO

For edge scenarios metrics are only sent to the hub cluster if certain thresholds are breached for a sustained period, in this case cpu usage above 70% for more than two minutes. You can see this configuration in the ~open-cluster-management-addon-observability~ namespace, in the ~observability-metrics-allowlist~ ConfigMap, under ~SNOHighCPUUsage~ in the ~collect_rules~ section.
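
If you want to see the rule for yourself, one way (assuming the observability addon has finished deploying to the managed cluster, where that namespace lives) is to dump the relevant section of the allowlist:

#+begin_src tmux
oc --kubeconfig sno/auth/kubeconfig get configmap observability-metrics-allowlist \
    -n open-cluster-management-addon-observability -o yaml | grep -A 10 SNOHighCPUUsage
#+end_src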

To hit that trigger we will now deploy a cpu heavy workload so that the sno cluster starts sending metrics to the ACM hub cluster.

Let's get started by creating a new project on the sno cluster:

#+begin_src tmux
oc --kubeconfig sno/auth/kubeconfig new-project cpu-load-test
#+end_src

Then deploy the ~cpu-load-container~ workload, a ~busybox~ container that loops generating and discarding random data:

#+begin_src tmux
cat << EOF | oc --kubeconfig sno/auth/kubeconfig apply --filename -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cpu-load-test
spec:
  replicas: 5
  selector:
    matchLabels:
      app: cpu-load-test
  template:
    metadata:
      labels:
        app: cpu-load-test
    spec:
      containers:
      - name: cpu-load-container
        image: busybox
        command: ["/bin/sh", "-c"]
        args:
        - while true; do
            echo "Performing CPU load...";
            dd if=/dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 1000 | head -n 1000000 > /dev/null;
          done
EOF
#+end_src
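
Finally we can confirm the load generators are running. After the cpu usage threshold has been exceeded for a couple of minutes the ~SNOHighCPUUsage~ collect rule should fire and the sno cluster metrics will start appearing in the grafana dashboards on the hub:

#+begin_src tmux
oc --kubeconfig sno/auth/kubeconfig get pods -n cpu-load-test

oc --kubeconfig sno/auth/kubeconfig adm top nodes
#+end_src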