UAC Utility: System Monitor
- 1 Disclaimer
- 2 Version Information
- 3 Overview
- 3.1 Key Features
- 4 Requirements
- 5 Input Fields
- 6 Supported Actions
- 7 Action Output
- 8 Exit Codes
- 9 Observability
- 10 How To
- 11 Document References
- 12 Changelog
Disclaimer
Your use of this download is governed by Stonebranch’s Terms of Use.
Version Information
Template Name | Extension Name | Version | Status |
|---|---|---|---|
System Monitor | ue-system-monitor | 1 (Current 1.0.0) | Fixes and new Features are introduced. |
Refer to Changelog for version history information.
Overview
The System Monitor integration lets users track system metrics, such as CPU usage, memory consumption, disk activity, and network performance, from both Linux and Windows hosts. By leveraging OpenTelemetry, these metrics can be published to observability platforms, enabling real-time infrastructure monitoring. This integration not only facilitates the detection of performance bottlenecks and potential system failures but also allows for proactive management of resources. By making infrastructure metrics observable, System Monitor enhances the ability to correlate system behavior with application performance, leading to better overall visibility and system reliability.
Key Features
Feature | Description |
|---|---|
Observe and Publish Metrics | Observe and Publish Metrics from Linux and Windows Hosts. The following categories are supported.
|
Filtering | Filtering abilities for Disk/Filesystem and Network Interface metrics |
Other Configuration Options |
|
Requirements
This integration requires a Universal Agent and a Python runtime to execute the Universal Task.
Area | Details |
|---|---|
Python Version | Requires Python 3.11 |
Universal Agent Compatibility |
|
Universal Controller Compatibility | Universal Controller Version >= 7.6.0.0. |
OpenTelemetry | Universal Agent should be configured to send OpenTelemetry data. |
There should never be two task instances running simultaneously on the same system, as this can lead to inconsistent metric values and unreliable data. Although a warning appears on the default Grafana dashboard provided, our software does not automatically prevent multiple task instances from running simultaneously on the same system, so this must be managed operationally.
The provided Grafana dashboard makes use of metric attributes that are attached by Universal Agent using the Agent default configuration. If any of these options are changed, such as otel_uip_service_name, which can be configured inside of the uags.conf file, appropriate changes must be made to the queries used on the dashboard.
Input Fields
Name | Type | Description | Version Information |
|---|---|---|---|
Action | Choice | Possible values are
| Introduced in 1.0.0 |
Provide Configuration As | Choice | Specifies how System Monitor configuration is provided. Available options are:
Available if Action is “System Monitor” | Introduced in 1.0.0 |
Collection Interval (sec) | Int | How often metrics are retrieved. Default value is 15 seconds. Note: The Collection Interval determines how often metrics are collected and therefore how often they are sent to the OTEL Collector. This data can be pulled (scraped) by the intended time-series database (e.g. Prometheus) at configurable intervals. To optimize resource utilization and ensure granular metrics retrieval, it is recommended to align the Collection Interval with the scrape interval. | Introduced in 1.0.0 |
Configuration | Large Text | System Monitor configuration as text. Default value:
metrics:
  system:
    cpu: # Enable this Metric.
    memory:
    load_average:
    paging:
    disk:
    filesystem:
    network:
    processes:
For more information on the System Monitor configuration options, see YAML Configuration Options | Introduced in 1.0.0 |
Configuration | Script | System Monitor configuration as UC Script. This allows the configuration to be shared across multiple task definitions. For more information on the System Monitor configuration options, see YAML Configuration Options | Introduced in 1.0.0 |
Supported Actions
Action: System Monitor
Configuration examples
Provide configuration as YAML text. Collection interval is set to 15 seconds, and the default YAML configuration is used, activating all metrics and applying no filters. | Provide configuration using the "System Monitor - Full Configuration" UAC Script, setting a collection interval of 10 seconds. |
System Monitor Configuration Options
The configuration, provided as either plain text or a UC Script, defines the System Monitor's behavior by specifying which metrics are published and any desired filtering options. Configurations are written in YAML and must adhere to a defined hierarchical structure.
The metrics and system settings must always be present in configuration files. The activation or filtering of any other metrics is optional.
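As a rough illustration of the rule above, the following Python sketch checks an already-parsed configuration (a plain dict, as a YAML loader would typically return) for the two required keys. The function name `check_required_keys` and the set of optional category names, taken from the default configuration, are assumptions made for illustration; this is not the extension's actual validation logic.

```python
# Hypothetical helper; not part of the ue-system-monitor extension.
OPTIONAL_CATEGORIES = {
    "cpu", "memory", "load_average", "paging",
    "disk", "filesystem", "network", "processes",
}


def check_required_keys(config):
    """Return a list of problems found in a parsed configuration dict."""
    metrics = config.get("metrics") if isinstance(config, dict) else None
    if not isinstance(metrics, dict):
        return ["missing required top-level key: metrics"]
    if "system" not in metrics:
        return ["missing required key: metrics.system"]
    # A bare "system:" key parses to None; treat it as "no optional metrics".
    system = metrics["system"] or {}
    if not isinstance(system, dict):
        return ["metrics.system must be a mapping"]
    return [
        f"unknown metric category: {name}"
        for name in system
        if name not in OPTIONAL_CATEGORIES
    ]
```

An empty list means both required keys are present and every nested key is one of the categories listed in the default configuration.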
The configuration allows you to:
Enable or disable metric categories: Choose which system metrics categories to collect.
Filter specific resources: Apply include/exclude filters on specific attribute values using strict mode or regex. If any "include" filters are activated for a specific attribute, no "exclude" filters can be activated for the same attribute (they are mutually exclusive).
A configuration example that demonstrates all the applicable options is the following
YAML Field | Description |
|---|---|
metrics | Necessary as top-level key of the YAML configuration. Required |
system | Enables monitoring of host uptime and acts as the root key for any additional metrics provided. All following metrics (such as cpu) are activated by including the relevant key in the configuration file. Required |
cpu | Enables CPU-related metrics. |
load_average | Enables Load Average (1, 5 and 15 minute) metrics. |
memory | Enables Memory metrics. |
paging | Enables Paging/Swap metrics. |
disk | Enables Disk metrics. Include and exclude filtering options are available for devices (for example exclude_devices). The include and exclude options are mutually exclusive (both should not be set). |
filesystem | Enables Filesystem metrics. Include and exclude filtering options are available for devices, types, and mountpoints (for example exclude_devices, include_types, exclude_mountpoints). For each of these attributes, the include and exclude options are mutually exclusive (both should not be set). If filtering options for devices/types and mountpoints are used at the same time, a logical AND is applied. |
network | Enables Network metrics. Include and exclude filtering options are available for devices (for example include_devices). The include and exclude options are mutually exclusive (both should not be set). |
processes | Enables Process count metric. |
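The filtering rules described in the table can be sketched in Python as follows. This is a minimal illustration, not the extension's implementation: the function names are hypothetical, and the assumption that a regex pattern is anchored at the start of the value (`re.match`) is mine rather than something the documentation states.

```python
import re


def matches(value, patterns, match_type):
    """Return True if value matches any pattern under the given match type."""
    if match_type == "strict":
        return value in patterns
    if match_type == "regex":
        # Assumption: patterns are anchored at the start of the value.
        return any(re.match(p, value) for p in patterns)
    raise ValueError(f"unsupported match_type: {match_type}")


def attribute_allowed(value, include=None, exclude=None):
    """Apply one include or exclude filter to a single attribute value.

    Each filter is a (patterns, match_type) tuple.
    """
    if include is not None and exclude is not None:
        raise ValueError("include and exclude filters are mutually exclusive")
    if include is not None:
        return matches(value, *include)
    if exclude is not None:
        return not matches(value, *exclude)
    return True  # no filter configured for this attribute


def filesystem_reported(device, mountpoint, device_filter, mountpoint_filter):
    """Combine the per-attribute results with a logical AND."""
    return (attribute_allowed(device, **device_filter)
            and attribute_allowed(mountpoint, **mountpoint_filter))
```

For example, `attribute_allowed("lo", exclude=(["lo"], "strict"))` evaluates to `False`, so a loopback interface filtered this way would not be reported.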
System Monitor Configuration Examples
# | Configuration | Description |
|---|---|---|
1 | Default Configuration
metrics:
  system:
    cpu:
    memory:
    load_average:
    paging:
    disk:
    filesystem:
    network:
    processes: | Default configuration that enables all available metrics without applying any filters. |
2 | Configuration Excluding specific Disks
metrics:
  system:
    cpu:
    memory:
    load_average:
    paging:
    disk:
      exclude_devices:
        devices: ["sda", "sdb"]
        match_type: strict
    filesystem:
    network:
    processes: | This configuration filters disk metrics so that the disks named 'sda' and 'sdb' are not reported. |
3 | Configuration Excluding a set of Filesystems
metrics:
  system:
    cpu:
    memory:
    load_average:
    paging:
    disk:
    filesystem:
      exclude_devices:
        devices: ["dev/loop.*"]
        match_type: regex
    network:
    processes: | This configuration filters filesystem metrics so as to exclude reporting for any filesystems whose device names start with 'dev/loop'. |
4 | Configuration Including only specific Network Interfaces
metrics:
  system:
    cpu:
    memory:
    load_average:
    paging:
    disk:
    filesystem:
    network:
      include_devices:
        devices: ["Ethernet", "Wireless"]
        match_type: strict
    processes: | This configuration filters network metrics to include reports originating only from the 'Ethernet' and 'Wireless' network interfaces. |
5 | Configuration Including numerous filters
metrics:
  system:
    cpu:
    memory:
    load_average:
    paging:
    disk:
      exclude_devices:
        devices: ["loop.*"]
        match_type: regex
    filesystem:
      exclude_devices:
        devices: ["/dev/loop.*"]
        match_type: regex
      include_types:
        types: ["xfs"]
        match_type: strict
      exclude_mountpoints:
        mountpoints: ["/var/lib/snapd/.*"]
        match_type: regex
    network:
      exclude_devices:
        devices: ["lo"]
        match_type: strict
    processes: | This configuration applies several filters to tailor the collected metrics as follows: disk metrics exclude devices matching 'loop.*'; filesystem metrics exclude devices matching '/dev/loop.*', include only the 'xfs' type, and exclude mountpoints matching '/var/lib/snapd/.*'; network metrics exclude the 'lo' interface. |
Action Output
Output Type | Description | Examples |
|---|---|---|
EXTENSION | The extension output provides the following information:
| Successful scenario
{
  "exit_code": 0,
  "status_description": "Task cancelled successfully",
  "invocation": {
    "extension": "ue-system-monitor",
    "version": "1.0.0",
    "fields": { ... }
  }
}
Failing scenario
{
  "exit_code": 20,
  "status_description": "Data Validation Error: Duplicate key detected in configuration file",
  "invocation": {
    "extension": "ue-system-monitor",
    "version": "1.0.0",
    "fields": { ... }
  },
  "result": {
    "errors": [
      "Data Validation Error: Duplicate key detected in configuration file"
    ]
  }
} |
STDERR | Universal Extension Task log information |
|
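A downstream script that consumes the EXTENSION output could interpret it along the following lines. This is a hedged sketch based only on the two example payloads above: the function name `summarize_extension_output` is hypothetical, and it assumes that `result.errors` is present only when the task fails.

```python
import json


def summarize_extension_output(raw):
    """Return (succeeded, messages) for an EXTENSION output JSON string."""
    output = json.loads(raw)
    succeeded = output.get("exit_code") == 0
    # "result.errors" appears only in the failing example; fall back to the
    # status description when it is absent.
    errors = output.get("result", {}).get("errors", [])
    messages = errors or [output.get("status_description", "")]
    return succeeded, messages


# Trimmed copy of the failing scenario shown above.
example = """
{
  "exit_code": 20,
  "status_description": "Data Validation Error: Duplicate key detected in configuration file",
  "result": {
    "errors": ["Data Validation Error: Duplicate key detected in configuration file"]
  }
}
"""
print(summarize_extension_output(example))
```

Checking `exit_code` rather than parsing the status text keeps the logic aligned with the exit codes the extension documents.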
Exit Codes
Exit Code | Status | Status Description | Meaning |
|---|---|---|---|
0 | Success | “Success: <<Task cancelled successfully.>>” | Successful execution and subsequent cancellation. |
1 | Failure | “Execution Failed: <<Error Description>>” | Raised in case of an unexpected error during execution. |
20 | Failure | “Data Validation Error: <<Error Description>>” | Validation error related to input fields or the YAML Configuration provided. * See STDERR for more detailed error descriptions. |
Observability
System CPU metrics
Metric: system.cpu.time
Name | Instrument Type | Unit (UCUM) | Attributes | Description |
|---|---|---|---|---|
system.cpu.time | Counter | s | As defined on Metric Attributes List | Observes the CPU time spent on the system. |
Metric Attributes List:
Attribute Name | Description |
|---|---|
| The CPU mode on which time was spent. Possible values are:
Platform-specific fields:
Note: Not all attributes might be available, as this depends on the platform and operating system version. |
| The logical CPU number |
Metric: system.cpu.utilization
Name | Instrument Type | Unit (UCUM) | Attributes | Description |
|---|---|---|---|---|
system.cpu.utilization | Gauge | 1 | As defined on Metric Attributes List | Observes the CPU utilization on the system. |
Metric Attributes List:
Attribute Name | Description |
|---|---|
| The CPU mode on which time was spent. Possible values are:
Platform-specific fields:
Note: Not all attributes might be available, as this depends on the platform and operating system version. |
| The logical CPU number |
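To illustrate how a utilization gauge relates to the `system.cpu.time` counter, the sketch below derives per-mode utilization from two successive cumulative CPU-time samples taken one collection interval apart. It follows the common convention of dividing the time delta by the elapsed interval; whether System Monitor computes the value exactly this way is an assumption, and the mode names (`user`, `system`, `idle`) and the function name are illustrative only.

```python
def cpu_utilization(previous, current, elapsed_seconds):
    """Return the fraction of one logical CPU's time spent in each mode.

    previous/current map a CPU mode to cumulative seconds (a counter).
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return {
        mode: (current[mode] - previous.get(mode, 0.0)) / elapsed_seconds
        for mode in current
    }


# Two hypothetical samples for one logical CPU, taken 15 seconds apart.
before = {"user": 100.0, "system": 40.0, "idle": 860.0}
after = {"user": 103.0, "system": 41.5, "idle": 870.5}
print(cpu_utilization(before, after, 15.0))
```

Because the result is a ratio, it is dimensionless, which is consistent with the unit `1` listed for the gauge.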
Metric: system.cpu.physical.count
Name | Instrument Type | Unit (UCUM) | Attributes | Description |
|---|---|---|---|---|