Introduction

What is Observability?

In the ever-evolving landscape of distributed system operations, ensuring the reliability, performance, and scalability of complex applications has become increasingly difficult. System Observability has emerged as a critical practice that enables IT organizations to monitor their software systems effectively and gain deep insight into their inner workings. By systematically collecting and analyzing data about applications, infrastructure, and user interactions, observability enables teams to proactively identify, diagnose, and resolve issues, which leads to better user experiences and operational efficiency.

What is OpenTelemetry?

OpenTelemetry is an open-source project that standardizes the collection of telemetry data from software systems, making it easier for organizations to gain holistic visibility into their environments. By seamlessly integrating with various programming languages, frameworks, and cloud platforms, OpenTelemetry simplifies the instrumentation of applications, allowing developers and operators to collect rich, actionable data about their systems' behavior. The adoption of OpenTelemetry by software vendors and Application Performance Monitoring (APM) tools represents a significant shift in the observability landscape. OpenTelemetry has gained substantial traction across the industry due to its open-source, vendor-neutral approach and its ability to standardize telemetry data collection.

Many software vendors have started incorporating OpenTelemetry into their frameworks and libraries. Major cloud service providers like AWS, Azure, and Google Cloud have also embraced OpenTelemetry. In addition, many APM tools have integrated OpenTelemetry into their offerings. This integration allows users of these APM solutions to easily collect and visualize telemetry data from their applications instrumented with OpenTelemetry. It enhances the compatibility and flexibility of APM tools, making them more versatile in heterogeneous technology stacks.

Solution Architecture (Component Description)

[Figure: Solution Architecture]


Key Features (Controller, OMS, Agent, Extensions)


How to Get Started

Introduction

This section provides a minimal set-up to get started with Observability for Universal Automation Center.

The set-up is based on widely used open-source tools.

The set-up is not intended for production use. To use it in a production environment, further security-related configuration has to be applied.

The set-up allows collecting Metrics and Trace data from Universal Automation Center. The collected Metrics data is stored in Prometheus for analysis in Grafana.

The collected Trace data is stored in Elasticsearch for analysis in Jaeger. The Jaeger UI is embedded in the Universal Controller.

Jaeger, Prometheus and Grafana are selected for this Get Started Guide as examples. Any other data store or analysis tool could also be used.

Metrics

Metrics data can be collected from Universal Controller, Universal Agent, OMS and Universal Tasks of type Extension.

Metrics data is collected in two ways: it is pulled through the Prometheus metrics Web Service endpoint (Metrics API), and user-defined Universal Event OpenTelemetry metrics are exported to an OpenTelemetry metrics collector (OTEL Collector).

The collected Metrics data is exported to Prometheus for analysis in Grafana.

To enable OpenTelemetry metrics, an OpenTelemetry (OTEL) collector with a Prometheus exporter needs to be configured.

Trace

Universal Controller manually instruments OpenTelemetry traces for Universal Controller (UC), OMS, Universal Agent (UA), and Universal Task Extension interactions associated with task instance executions, agent registration, and Universal Task of type Extension deployment.

The collected Trace data is stored in Elasticsearch for analysis in Jaeger.

To enable tracing, an OpenTelemetry span exporter must be configured.

[Figure: Observability Architecture]

Prerequisites

The sample set-up is done on a single on-premises Linux server.

Server Requirements

  • Linux Server
    • Memory: 16GB RAM
    • Storage: 70GB net storage
    • CPU: 4 CPUs
    • Distribution: Any major Linux distribution
    • Administrative privileges are required to install and configure the required Observability tools
  • Ports

The following default ports will be used.

Application      Port
Prometheus       http: 9090
Grafana          http: 3000
Jaeger           http: 16686
Elasticsearch    http: 9200
OTEL Collector   4317 (grpc), 4318 (http)

Pre-Installed Software Components

It is assumed that the following components are installed and configured properly:

  • Universal Agent 7.5.0.0 or higher
  • Universal Controller 7.5.0.0 or higher

For further information on how to install Universal Agent and Universal Controller, please refer to the documentation: Installation and Applying Maintenance - Universal Controller 7.4.x - Stonebranch Documentation (atlassian.net) and Universal Agent 7.4.x for UNIX Quick Start Guide - Universal Agent 7.4.x - Stonebranch Documentation (atlassian.net).

Required Software for the Observability

The following open-source software needs to be installed and configured for use with Universal Automation Center.

Note: This Get Started Guide has been tested with the software versions provided in the table below.

...
otelcol-contrib      ...
jaeger               ...
prometheus           ...
grafana-enterprise   ...

Configuration

Open Source Setup

It is important to install the components in the order given here, because the software components depend on each other.

Example:

  • Jaeger needs Elasticsearch to store the trace data.
  • OTEL Collector needs Prometheus to store the metrics data.
  • Grafana needs Prometheus as data source for displaying the dashboards

Set up Elasticsearch

Description:

Elasticsearch is a distributed, RESTful search and analytics engine designed for real-time search and data storage. It is used for log and event data analysis, full-text search, and more.

In this set-up Elasticsearch is used as the storage backend for Jaeger.

Installation Steps:

Follow the official documentation to install Elasticsearch on the Linux server.

Install the version listed under Required Software for the Observability.

Official Documentation: Elasticsearch Installation Guide

Configuration Files:
  • elasticsearch.yml: Main configuration file for Elasticsearch, containing cluster, node, network, memory, and other settings.
  • File locations:
    • ~/elasticsearch/config/elasticsearch.yml
    • ~/elasticsearch/config/jvm.options
    • ~/elasticsearch/config/jvm.options.d/jvm_heap_size.options

Set JVM Memory Options:

Create the file ~/elasticsearch/config/jvm.options.d/jvm_heap_size.options with the heap settings shown below.

Note: Make sure the server has enough memory left for the other Java processes (list them with ps -weaf | grep -i java); otherwise Universal Controller does not work.

~/elasticsearch/config/jvm.options.d/jvm_heap_size.options
cat jvm.options.d/jvm_heap_size.options
-Xms1g
-Xmx1g

Test the Installation:

ss -tuln | grep 9200

curl -XGET "http://127.0.0.1:9200"
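The ss check above covers only the Elasticsearch port. As a rough sketch, the same idea can be extended to all default ports listed under Prerequisites. The labels and the check_port helper are illustrative names, and the script assumes that bash and the coreutils timeout command are available on the server.

```shell
#!/usr/bin/env bash
# Sketch: report which default ports accept TCP connections on this server.
# Labels are illustrative; the port numbers are the defaults from Prerequisites.
PORTS="Prometheus:9090 Grafana:3000 Jaeger:16686 Elasticsearch:9200 OTEL-grpc:4317 OTEL-http:4318"

# check_port HOST PORT: returns 0 if a TCP connection can be opened within 2 seconds.
check_port() {
  timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

for entry in $PORTS; do
  name="${entry%%:*}"   # text before the colon
  port="${entry##*:}"   # text after the colon
  if check_port 127.0.0.1 "$port"; then
    echo "$name ($port): listening"
  else
    echo "$name ($port): not reachable"
  fi
done
```

Before the installation, every port should be reported as not reachable; after each component is set up, its port should switch to listening.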

Set up Jaeger

Description:

Jaeger is an open-source distributed tracing system used for monitoring and troubleshooting microservices-based applications. 

In this set-up, Universal Controller manually instruments OpenTelemetry traces for Universal Controller (UC), OMS, Universal Agent (UA), and Universal Task Extension interactions associated with task instance executions, agent registration, and Universal Task of type Extension deployment.

The collected Trace data is stored in Elasticsearch for analysis in Jaeger.

Installation Steps:
Configuration Files:
  • jaeger-agent-config.yaml: Configuration for the Jaeger agent, responsible for trace collection.
  • jaeger-collector-config.yaml: Configuration for the Jaeger collector, handling trace reception and processing.
  • jaeger-query-config.yaml: Configuration for the Jaeger query service, used for trace querying and visualization.

Official Documentation: Jaeger Installation Guide
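As an illustration of how the Elasticsearch backend can be wired in, the following sketch writes an environment file for Jaeger. It assumes a Jaeger v1 binary, which reads its storage settings from the SPAN_STORAGE_TYPE and ES_SERVER_URLS environment variables, and an Elasticsearch instance listening on 127.0.0.1:9200. The file name jaeger.env is a hypothetical choice and not part of the original set-up.

```shell
# Sketch: environment file that selects Elasticsearch as Jaeger's span storage.
# jaeger.env is an illustrative file name; adjust the URL to your server.
cat > jaeger.env <<'EOF'
SPAN_STORAGE_TYPE=elasticsearch
ES_SERVER_URLS=http://127.0.0.1:9200
EOF

# Show what was written.
cat jaeger.env
```

The variables can then be exported into the shell that starts the Jaeger services, for example with `set -a; . ./jaeger.env; set +a`.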

Test the Installation:

http://ps1.stonebranchdev.cloud:16686/

Set up OTEL Collector

Description:

OpenTelemetry Collector is a vendor-agnostic observability data collector that gathers traces, metrics, and other telemetry data from various sources and sends it to different backends for analysis.

In this set-up OpenTelemetry collects Metrics data from Universal Controller, Universal Agent, OMS and Universal Tasks of type Extension.

Installation Steps:
Configuration Files:
  • config.yaml: Primary configuration file for the OpenTelemetry Collector, defining data sources, exporters, processors, and components.

Official Documentation: OpenTelemetry Collector Installation
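To make the role of config.yaml more concrete, here is a minimal sketch of a collector configuration with an OTLP receiver on the default ports 4317 and 4318 and a Prometheus exporter. The exporter port 8889 and the file name config.sample.yaml are assumptions for illustration, not values taken from this guide.

```shell
# Sketch: write a minimal OTEL Collector configuration for metrics.
# config.sample.yaml is an illustrative name; 8889 is an assumed exporter port.
cat > config.sample.yaml <<'EOF'
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

exporters:
  prometheus:
    endpoint: 0.0.0.0:8889

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus]
EOF
```

The collector can then be started with `otelcol-contrib --config=config.sample.yaml`. Prometheus scrapes the exporter endpoint rather than receiving pushed data, which matches the pull model described in the Metrics section.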

...

Set up Prometheus

Description:

Prometheus is an open-source monitoring and alerting toolkit designed for reliability and scalability. It collects metrics from monitored targets, stores them, and provides powerful querying and alerting capabilities.

In this set-up, Prometheus is used to store the Metrics data retrieved via OpenTelemetry and the Universal Controller Metrics REST API.

Installation Steps:
Configuration Files:
  • prometheus.yml: Main configuration file for Prometheus, defining scrape targets (what to monitor), alerting rules, and other settings.

Official Documentation: Prometheus Installation Guide
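A scrape target for the OTEL Collector could look roughly like the sketch below. It assumes that the collector's Prometheus exporter listens on 127.0.0.1:8889, which is a hypothetical port, and it uses the illustrative file name prometheus.sample.yml. A second job for the Universal Controller Metrics API would be added in the same way, with the path and credentials taken from the Universal Controller documentation.

```shell
# Sketch: write a minimal Prometheus configuration with one scrape job.
# prometheus.sample.yml is an illustrative name; 8889 is an assumed exporter port.
cat > prometheus.sample.yml <<'EOF'
global:
  scrape_interval: 15s

scrape_configs:
  # Metrics exposed by the OTEL Collector's Prometheus exporter.
  - job_name: otel-collector
    static_configs:
      - targets: ['127.0.0.1:8889']
EOF
```

The file can be checked with `promtool check config prometheus.sample.yml` before Prometheus is started with it.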

Test the Installation:

http://ps1.stonebranchdev.cloud:9090/

Set up Grafana

Description:

Grafana is an open-source platform for monitoring and observability that allows you to create, explore, and share dynamic dashboards and visualizations for various data sources, including time series databases.

In this set-up, Grafana is used to visualize and analyze the Metrics data stored in the Prometheus data source (time series database).

Installation Steps:
Configuration Files:
  • grafana.ini: Grafana's main configuration file, including database connections, server settings, and global configurations.
  • datasources.yaml: Configuration for data sources (e.g., Prometheus) that Grafana connects to.
  • dashboards: Grafana dashboards are often defined as JSON files that can be imported into Grafana.

Official Documentation: Grafana Installation Guide
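The datasources.yaml file mentioned above can be sketched as follows for a Prometheus instance on the default port 9090. The data source name and the file name datasources.sample.yaml are illustrative assumptions.

```shell
# Sketch: write a Grafana data source provisioning file for Prometheus.
# datasources.sample.yaml is an illustrative name; adjust the URL to your server.
cat > datasources.sample.yaml <<'EOF'
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://127.0.0.1:9090
    isDefault: true
EOF
```

Grafana loads such files from the provisioning/datasources directory of its configuration (for package installations typically /etc/grafana/provisioning/datasources) when it starts.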

Test the Installation:

http://ps1.stonebranchdev.cloud:3000/

Universal Controller 

Description:

...

- open telemetry visualization URL: http://ps1.stonebranchdev.cloud:16686/trace/${traceId}?uiFind=${spanId}&uiEmbed=v0
- open telemetry visualization Iframe : True
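The visualization URL above is a template with the placeholders ${traceId} and ${spanId}. The following sketch shows how such a template expands into a concrete Jaeger link, which can help when testing the URL by hand. The build_trace_url helper and the sample IDs are hypothetical; how Universal Controller performs the substitution internally is not described here.

```shell
# Sketch: expand the ${traceId} and ${spanId} placeholders of the visualization URL.
TEMPLATE='http://ps1.stonebranchdev.cloud:16686/trace/${traceId}?uiFind=${spanId}&uiEmbed=v0'

# build_trace_url TRACE_ID SPAN_ID: prints the URL with both placeholders filled in.
build_trace_url() {
  printf '%s\n' "$TEMPLATE" | sed -e "s/[$]{traceId}/$1/" -e "s/[$]{spanId}/$2/"
}

# Illustrative IDs only; real values come from a recorded trace.
build_trace_url 4bf92f3577b34da6a3ce929d0e0e4736 00f067aa0ba902b7
```

Opening the printed URL in a browser should show the same embedded view that Universal Controller displays in its iframe.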

...