PART 1 — Red Hat OpenStack Platform (OSP) Service Telemetry Framework (STF) with OpenShift Container Platform (OCP4)
INTRODUCTION TO SERVICE TELEMETRY FRAMEWORK
In RHOSP 16, the traditional Telemetry services are disabled by default, and Service Telemetry Framework is the recommended replacement. Monitoring is one of the central pieces of operating a Red Hat OpenStack environment: you can use the centralized information in your monitoring system as a source for alerts and visualization, or as the source of truth for orchestration frameworks.
Service Telemetry Framework (STF) is an application that runs on Red Hat OpenShift Container Platform (OCP). It collects metrics and events from OpenStack infrastructure, transports that data quickly and reliably, and provides built-in data storage and alerting capabilities.
What is Service Telemetry Framework?
The Service Telemetry Framework runs on Red Hat OpenShift Container Platform (OCP). This OCP cluster must be independent of the RHOSP environment being monitored.
The goal is a centralized RHOSP telemetry architecture: a single Service Telemetry Framework can manage multiple cloud environments, and OCP provides a scalable platform for the monitoring workload.
Service Telemetry Framework architecture
The RHOSP and OCP environments and the components to be deployed are as follows. Red Hat supports the data collection components (collectd and Ceilometer) and the transport components (AMQ Interconnect and the Smart Gateway). Prometheus, Elasticsearch, and the visualization component Grafana are community supported. Both collectd and Ceilometer are used for data collection because collectd alone cannot collect OpenStack-specific metrics.
- Red Hat OpenShift Container Platform 4.7
- AMQ Certificate Manager Operator
- Elastic Cloud on Kubernetes (ECK) Operator
- Service Telemetry Operator (Service Telemetry Framework 1.3 or newer)
- Grafana Operator
- Red Hat OpenStack Platform 16.x
- AMQ Interconnect
- Ceilometer
- Collectd
- For clear, entry-level background, be sure to read the reference link below:
Link: https://red.ht/3w4iPc2
To better understand STF, this article walks through a short, minimal demo installation. It should give you a good grasp of a technology that is widely used in enterprise banking, eCommerce, telco, and cloud environments.
We will deploy 2 core deployments:
- Red Hat OpenStack 16.x
- Red Hat OpenShift 4.7 or newer
STF is installed as an OCP application and uses the following components:
- collectd to collect metrics
- Prometheus as time-series data storage
- Elasticsearch as events data storage
- An AMQP 1.x compatible messaging bus to shuttle the metrics to STF for storage in Prometheus
- Smart Gateway to pick up metrics and events from the AMQP 1.x bus, delivering events to Elasticsearch and providing metrics to Prometheus.
This setup uses three instances:
- 1 x Bastion/Jump Host
- 1 x RHOSP standalone 16.x (All-In-One)
- 1 x CodeReady Containers (CRC) instance where the STF workload will be installed. (If you have an OCP cluster in your lab, you can follow the same steps.)
- We will be using the 192.168.47.0/24 private management network. (You can, of course, use a different IP pool.)
- During this article we will stress the system so that CPU and memory alerts are triggered. To make this easier, make sure the “stress-ng” command is installed on the RHOSP 16 host.
***I will briefly demonstrate the stress tests in Part 3 with examples***
**Installing the “stress-ng” tool on the host**
- Open a terminal and connect to the OSP all-in-one host.
- Install stress-ng:
- sudo dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
- sudo dnf install -y stress-ng
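As a preview of the stress tests covered in Part 3, stress-ng can pin CPU workers or allocate memory so that the corresponding alerts fire. The worker counts, sizes, and durations below are illustrative, not tuned values:

```
$ stress-ng --cpu 4 --timeout 300s               # 4 CPU workers for 5 minutes
$ stress-ng --vm 2 --vm-bytes 1G --timeout 300s  # 2 workers allocating 1 GB each
```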
The underlying system is an OSP 16 deployment that collects and sends data via collectd and Ceilometer; that data is transported over the AMQ Interconnect message bus and stored in a backend consisting of Prometheus and Elasticsearch.
The following steps describe the deployment of the client and server components:
1. Infrastructure contents to be deployed, step by step:
- Deploy Service Telemetry Framework in OCP environment
- Create a Service Telemetry object in OCP
- Configure RHOSP to utilize the Service Telemetry Framework
- Verify the deployment.
2. Deploy Service Telemetry Framework in the OCP environment
- First, create a namespace for STF (service-telemetry).
$ oc new-project service-telemetry
- Create an OperatorGroup.
$ oc apply -f - <<EOF
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: service-telemetry-operator-group
  namespace: service-telemetry
spec:
  targetNamespaces:
  - service-telemetry
EOF
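Before continuing, you can confirm the OperatorGroup was created:

```
$ oc get operatorgroup --namespace service-telemetry
```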
- Enable the OperatorHub.io Community Catalog Source to use community operators such as Elastic Cloud on Kubernetes (ECK) and Grafana.
$ oc apply -f - <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: operatorhubio-operators
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/operator-framework/upstream-community-operators:latest
  displayName: OperatorHub.io Operators
  publisher: OperatorHub.io
EOF
- Enable Red Hat STF Operators Catalog Source to take advantage of the Service Telemetry Framework.
$ oc apply -f - <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: redhat-operators-stf
  namespace: openshift-marketplace
spec:
  displayName: Red Hat STF Operators
  image: quay.io/redhat-operators-stf/stf-catalog:v4.7
  publisher: Red Hat
  sourceType: grpc
  updateStrategy:
    registryPoll:
      interval: 30m
EOF
- Deploy the AMQ Certificate Manager Operator.
$ oc apply -f - <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: amq7-cert-manager-operator
  namespace: openshift-operators
spec:
  channel: alpha
  installPlanApproval: Automatic
  name: amq7-cert-manager-operator
  source: redhat-operators-stf
  sourceNamespace: openshift-marketplace
EOF
- Check the ClusterServiceVersion and make sure it is “Succeeded”.
$ oc get csv --namespace openshift-operators
- Deploy the Elastic Cloud on Kubernetes (ECK) Operator.
$ oc apply -f - <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: elastic-cloud-eck
  namespace: service-telemetry
spec:
  channel: stable
  installPlanApproval: Automatic
  name: elastic-cloud-eck
  source: operatorhubio-operators
  sourceNamespace: openshift-marketplace
EOF
- Deploy the Service Telemetry Operator.
$ oc apply -f - <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: service-telemetry-operator
  namespace: service-telemetry
spec:
  channel: stable-1.3
  installPlanApproval: Automatic
  name: service-telemetry-operator
  source: redhat-operators-stf
  sourceNamespace: openshift-marketplace
EOF
- The official documentation does this in a later step, but we will deploy the Grafana Operator now so that we can take advantage of its dashboards.
$ oc apply -f - <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: grafana-operator
  namespace: service-telemetry
spec:
  channel: alpha
  installPlanApproval: Automatic
  name: grafana-operator
  source: operatorhubio-operators
  sourceNamespace: openshift-marketplace
EOF
- Make sure all required components are successfully deployed.
$ oc get csv --namespace service-telemetry
- Open the OpenShift console; the operators are listed under “Installed Operators” in the service-telemetry project. (Please check it out.)
Create a Service Telemetry object in OCP
The main parameters of the ServiceTelemetry object are:
- Alerting
Creates alert rules in Prometheus and sends alerts to Alertmanager.
- Backends
Enables storage and specifies the backends for storing metrics and events. Currently the metrics backend is Prometheus and the events backend is Elasticsearch.
- Clouds
Defines the clouds that connect to STF and specifies the metric and event collectors.
- Graphing
Configures Grafana to visualize the metrics collected by collectd.
- High-Availability
Sets the redundancy of STF components. STF is not yet a fully fault-tolerant system, so the recovery of metrics and events during an outage is not guaranteed.
- Transport
Enables and configures the STF message bus. Currently only AMQ Interconnect is supported.
Create a ServiceTelemetry object, setting enabled: true for each service you want to use, such as alerting or graphing, and specify the clouds parameter for OpenStack metrics and event collection.
apiVersion: infra.watch/v1beta1
kind: ServiceTelemetry
metadata:
  name: default
spec:
  alerting:
    enabled: true
    alertmanager:
      storage:
        strategy: persistent
        persistent:
          storageSelector: {}
          pvcStorageRequest: 30G
  backends:
    metrics:
      prometheus:
        enabled: true
        scrapeInterval: 10s
        storage:
          strategy: persistent
          retention: 24h
          persistent:
            storageSelector: {}
            pvcStorageRequest: 30G
    events:
      elasticsearch:
        enabled: true
        storage:
          strategy: persistent
          persistent:
            pvcStorageRequest: 30Gi
  graphing:
    enabled: true
    grafana:
      ingressEnabled: false
      adminPassword: secret
      adminUser: root
      disableSignoutMenu: false
  transport:
    qdr:
      enabled: true
      web:
        enabled: false
  highAvailability:
    enabled: false
  clouds:
  - name: cloud1
    metrics:
      collectors:
      - collectorType: collectd
        subscriptionAddress: collectd/telemetry
      - collectorType: ceilometer
        subscriptionAddress: anycast/ceilometer/metering.sample
    events:
      collectors:
      - collectorType: collectd
        subscriptionAddress: collectd/notify
      - collectorType: ceilometer
        subscriptionAddress: anycast/ceilometer/event.sample
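The manifest above is not applied automatically; save it to a file (the filename here is arbitrary) and apply it to the service-telemetry project:

```
$ oc apply -f service-telemetry.yaml --namespace service-telemetry
```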
When the ServiceTelemetry object is created, the pods start according to these settings.
$ oc get pods
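Rather than polling oc get pods by eye, oc wait can block until every pod in the project reports Ready (the timeout value here is illustrative):

```
$ oc wait pod --all --for=condition=Ready --namespace service-telemetry --timeout=300s
```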
You can check this not only from the CLI but also in Developer => Topology in the OpenShift console.
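As a sanity check on the storage values used above (scrapeInterval: 10s, retention: 24h, pvcStorageRequest: 30G), here is a rough back-of-envelope estimate of Prometheus disk usage. The active-series count and the bytes-per-sample figure are illustrative assumptions, not measurements from STF:

```shell
# Back-of-envelope Prometheus disk sizing for a 10s scrape interval
# and 24h retention. Series count and compression ratio are assumed.
scrape_interval=10            # seconds (matches scrapeInterval: 10s)
retention=$((24 * 3600))      # seconds (matches retention: 24h)
series=50000                  # assumed number of active time series
bytes_per_sample=2            # Prometheus docs cite roughly 1-2 bytes/sample

samples=$(( (retention / scrape_interval) * series ))
disk_mb=$(( samples * bytes_per_sample / 1000000 ))
echo "${disk_mb} MB"          # prints "864 MB"
```

Even with a generous series count, 24 hours of retention fits comfortably inside the 30G request used in the manifest.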
Note:
You are responsible for any direct implementation of this article in live or test environments. This article only describes STF and provides the most basic installation steps and tests. Some content was created with help from different sources.
References:
https://docs.openshift.com/container-platform/4.7/monitoring/configuring-the-monitoring-stack.html
https://github.com/infrawatch/service-telemetry-operator/blob/master/deploy/alerts/alerts.yaml
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.2
https://developers.redhat.com/products/codeready-containers/overview