Monitoring with Prometheus

Cadenza exposes metrics for monitoring a Cadenza instance through an OpenMetrics API.

This API is available on <cadenza-base-url>/monitoring/prometheus, e.g. http://cadenza.example.com:8080/cadenza/monitoring/prometheus

Prerequisites

  • The Monitoring_Prometheus plug-in has to be added to the plugins.xml file.

  • The monitoring-publishers-config.xml configuration file is mandatory.

Configuring Monitoring

In the monitoring-publishers-config.xml file, you have to define the credentials of a technical user who is allowed to access the metrics. The endpoint can be password-protected; see Encoding Passwords for Use with Cadenza.
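As a rough illustration of how a client might read the endpoint with the technical user's credentials, the following Python sketch builds the request. It assumes that the endpoint accepts HTTP Basic authentication, which this page does not state explicitly; the user name and password are placeholders, and the function names are hypothetical.

```python
import base64
import urllib.request


def build_scrape_request(base_url: str, user: str, password: str) -> urllib.request.Request:
    """Build a GET request for <cadenza-base-url>/monitoring/prometheus.

    Assumption: the endpoint is protected with HTTP Basic authentication.
    """
    url = base_url.rstrip("/") + "/monitoring/prometheus"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


def scrape(base_url: str, user: str, password: str, timeout: float = 10.0) -> str:
    """Fetch the metrics text; raises urllib.error.URLError on failure."""
    request = build_scrape_request(base_url, user, password)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

In practice, a Prometheus server would usually perform this request itself based on its scrape configuration; the sketch is mainly useful for checking the endpoint manually.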

Monitoring Metrics

Basic Metrics

cadenza_build

gauge Always has the dummy value 1.0 but has the following openmetrics tags with further information:

  • version with the Cadenza version as value

  • commithash with the git commit hash that this release was built from

    For example:

    cadenza_build{version="9.3.150",commithash="abcdefghij123456",} 1.0

Workqueue Metrics

Cadenza employs various queues for asynchronous processing. Each queue is identified by a name, and the metrics listed below are reported for each queue.

Each metric is reported once per workqueue category, identified by the openmetrics tag category.

For example:

cadenza_workqueue_active_workers{category="TimesliderVideoExport",} 0.0

In this case, the number of active workers for the TimesliderVideoExport queue is reported.

The possible categories are described in Categories for Workqueue Metrics.
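To show how a sample line such as the one above breaks down into metric name, tags and value, here is a minimal Python sketch. The function name parse_sample is hypothetical, and the parser is deliberately simplified: it handles single sample lines only, ignores comment lines, and does not unescape tag values.

```python
import re

# name, optional {tags}, then the value (a trailing comma inside the braces is tolerated)
_SAMPLE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
_LABEL = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_sample(line):
    """Split one sample line into (metric name, tag dict, float value)."""
    match = _SAMPLE.match(line.strip())
    if match is None:
        raise ValueError(f"not a sample line: {line!r}")
    labels = dict(_LABEL.findall(match.group("labels") or ""))
    return match.group("name"), labels, float(match.group("value"))


name, labels, value = parse_sample(
    'cadenza_workqueue_active_workers{category="TimesliderVideoExport",} 0.0'
)
print(name, labels["category"], value)
```

For anything beyond a quick check, an existing Prometheus or OpenMetrics client library is likely the better choice than a hand-written parser.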

cadenza_workqueue_active_workers

gauge The number of active workers for a particular work queue. The number is an integer.

cadenza_workqueue_completed_tasks_total

gauge The total number of completed tasks for a particular work queue. The number is an integer.

cadenza_workqueue_execution_time_seconds

summary The execution time of the queue. Consists of two separate metrics which can for example be used to calculate averages over a certain time:

cadenza_workqueue_execution_time_measurements_total

The total number of completed tasks for a particular work queue. The number is an integer.

cadenza_workqueue_execution_time_seconds_total

The cumulated execution time of all tasks of the work queue in seconds. The number is a decimal number.
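The two parts of a summary can be combined into an average over a time range by taking the difference between two scrapes. The following Python sketch shows the arithmetic; the function name and the sample numbers are hypothetical.

```python
def average_per_measurement(sum_before, count_before, sum_after, count_after):
    """Average value per measurement between two scrapes of a summary.

    sum_*   : values of the ..._seconds_total metric
    count_* : values of the ..._measurements_total metric
    Returns None if nothing was measured in between or a total went
    backwards (for example after a restart).
    """
    delta_count = count_after - count_before
    delta_sum = sum_after - sum_before
    if delta_count <= 0 or delta_sum < 0:
        return None
    return delta_sum / delta_count


# Hypothetical scrapes: 12 tasks took 6 seconds in total between them.
print(average_per_measurement(12.0, 40, 18.0, 52))  # 0.5
```

The same calculation applies to the other summary metrics on this page, such as the waiting time of a work queue.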

cadenza_workqueue_max_execution_time_seconds

gauge The longest task execution time for a particular queue in the last minutes. With a standard step interval of 1 minute the last 2-3 minutes are considered (RingBuffer with 3 buckets rotated once per minute).

cadenza_workqueue_max_waiting_time_seconds

gauge The longest waiting time for a task of a particular queue in the last minutes in seconds (decimal number). With a standard step interval of 1 minute the last 2-3 minutes are considered (RingBuffer with 3 buckets rotated once per minute).
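The "last 2-3 minutes" behavior of the two max gauges above follows from the ring buffer: one bucket is still being filled, while the other two hold complete intervals. The Python sketch below illustrates this idea only. It is not Cadenza's implementation; the class name RollingMax and the choice to report the maximum over all buckets are assumptions made for the illustration.

```python
class RollingMax:
    """Illustrative rolling maximum over a small ring of buckets."""

    def __init__(self, buckets=3):
        self._buckets = [0.0] * buckets
        self._current = 0

    def record(self, value):
        """Track a new observation in the bucket of the current interval."""
        self._buckets[self._current] = max(self._buckets[self._current], value)

    def rotate(self):
        """Called once per step interval: reuse the oldest bucket and clear it."""
        self._current = (self._current + 1) % len(self._buckets)
        self._buckets[self._current] = 0.0

    def value(self):
        """Maximum seen in the current and the retained previous intervals."""
        return max(self._buckets)


window = RollingMax()
window.record(5.0)   # slow task in minute 1
window.rotate()
window.record(2.0)   # faster task in minute 2
window.rotate()
print(window.value())  # 5.0, the slow task is still inside the window
window.rotate()
print(window.value())  # 2.0, the slow task has dropped out
```

A practical consequence is that a single slow task keeps the gauge elevated for a few minutes and then disappears, so the value is better suited to alerting on recent spikes than to long-term trend analysis.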

cadenza_workqueue_max_workers

gauge The number of workers assigned to a particular work queue.

cadenza_workqueue_waiting_time_seconds

summary Waiting time until a task of a particular queue was executed. Consists of two separate metrics which can for example be used to calculate averages over a certain time:

cadenza_workqueue_waiting_time_measurements_total

The total count of tasks executed by a particular queue.

cadenza_workqueue_waiting_time_seconds_total

The cumulated waiting time of all tasks executed by a particular work queue.

cadenza_workqueue_waiting_jobs

gauge The number of tasks currently waiting to be executed by a particular work queue.

Basic Web Metrics

cadenza_active_users

gauge The number of active Cadenza sessions

cadenza_max_concurrent_users

gauge The maximum allowed number of active Cadenza sessions set in the configuration; possibly limited by the Cadenza license, too.

Job Metrics

Metrics for jobs executed in the background.

cadenza_job_current_number_of_waiting_threads

gauge Current number of idle Job Executor threads, measured right before the Job Executor looks for jobs to process.

cadenza_job_current_number_of_working_threads

gauge Current number of active Job Executor threads, measured right before the Job Executor looks for jobs to process.

cadenza_job_last_heartbeat

gauge The last time (as milliseconds since Unix epoch[1]) the Job Executor looked for jobs to process.

cadenza_job_last_scheduling_check

gauge The last time (as milliseconds since Unix epoch[1]) the Job Scheduler checked the schedules for jobs to create.
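Because both timestamps are given in milliseconds since the Unix epoch, their age can be derived by comparing them with the current time. The following Python sketch shows one way to do this; the function names and the 60-second threshold in the example are hypothetical and not recommended values.

```python
import time


def age_seconds(last_timestamp_ms, now_ms=None):
    """Seconds elapsed since a millisecond epoch timestamp."""
    if now_ms is None:
        now_ms = time.time() * 1000.0
    return (now_ms - last_timestamp_ms) / 1000.0


def is_stale(last_timestamp_ms, max_age_seconds, now_ms=None):
    """True if the last heartbeat or scheduling check is older than the threshold."""
    return age_seconds(last_timestamp_ms, now_ms) > max_age_seconds


# Hypothetical values: the last heartbeat was 90 seconds before "now".
print(is_stale(1_700_000_000_000, 60, now_ms=1_700_000_090_000))  # True
```

A suitable threshold depends on how often the Job Executor and Job Scheduler run in the given installation.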

cadenza_job_max_number_threads

gauge The maximum number of threads that can be used for executing jobs.

Database Connection Metrics

Metrics for database connections and their associated connection pools.

All connection metrics have the openmetrics tag connectiontype, which describes the general purpose of the connection. The following list shows all possible values for the tag connectiontype:

  • AdhocConnection

  • AuditLog

  • Authentication

  • Authorization

  • ClassicDatasourceConnection

  • Configuration

  • DatabaseRepository

  • DatasourceConnection

  • JobScheduling

  • Permalink

  • Settings

Based on the tag connectiontype, we add additional tags to further specify the connection:

  • DatasourceConnection

    • connection: contains the associated datasource id

    • repository: contains the associated repository name

    • datasourceprintname: contains the associated datasource printname, if present

  • ClassicDatasourceConnection

    • connection: contains the associated datasource id

    • datasourceprintname: contains the associated datasource printname, if present

  • AdhocConnection

    • connection: contains a unique hashcode based on the connection specification

  • DatabaseRepository

    • repositoryschemaname: contains the associated repository schema name

    • repositoryschemaprintname: contains the associated repository schema printname

cadenza_dbpool_connections_active

gauge The number of connections currently borrowed from the connection pool (refreshed once per second).

cadenza_dbpool_configured_connection_idle_time_before_closing_milliseconds

gauge The configured time a connection from the connection pool is allowed to be idle before it is closed.

cadenza_dbpool_configured_max_connections

gauge The configured maximum number of simultaneous open connections possible in the pool.

cadenza_dbpool_configured_connection_max_waiting_time_milliseconds

gauge The configured maximum wait time before throwing an exception if no connection can be borrowed from the pool.

cadenza_dbpool_configured_min_idle_connections

gauge The configured minimum number of idle connections to maintain in the pool.

cadenza_dbpool_configured_connection_network_timeout_seconds

gauge The configured maximum time that should be waited for a query result from this connection before throwing an exception.

cadenza_dbpool_idle_connections

gauge The current number of idle connections in the pool (refreshed once per second).

cadenza_dbpool_connection_waiters

gauge The number of threads currently blocked waiting for a connection from the pool (refreshed once per second).
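The gauges for active connections, the configured maximum and waiting threads can be combined into a simple saturation check for one pool. The Python sketch below is one possible way to do this; the function names are hypothetical, and treating a non-positive maximum as "no limit known" is an assumption.

```python
def pool_utilization(active, configured_max):
    """Share of the pool in use, or None if no positive maximum is configured."""
    if configured_max <= 0:
        return None
    return active / configured_max


def is_pool_exhausted(active, configured_max, waiters):
    """True if all connections are borrowed and threads are waiting for one."""
    return configured_max > 0 and active >= configured_max and waiters > 0


# Hypothetical values for one connectiontype
print(pool_utilization(8, 10))       # 0.8
print(is_pool_exhausted(10, 10, 3))  # True
```

Since these gauges are refreshed once per second, a single scrape is only a snapshot; short bursts between two scrapes are not visible in it.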

cadenza_dbpool_connection_timeouts_total

counter The total number of connection timeouts in the pool (since start of the application).

cadenza_dbpool_connection_acquire_time_seconds

summary Time it takes acquiring connections from the pool. Consists of two separate metrics which can for example be used to calculate averages over a certain time:

cadenza_dbpool_connection_acquire_measurements_total

The total count of connections acquired from the pool.

cadenza_dbpool_connection_acquire_time_seconds_total

The total time spent on acquiring connections from the pool.

cadenza_dbpool_connection_acquire_time_seconds_max

gauge The maximum time spent acquiring a connection from the pool in the last 2-3 minutes (with a standard polling interval of 1 minute).

cadenza_dbpool_connection_creation_time_seconds

summary Time it takes creating new connections in the pool. Consists of two separate metrics which can for example be used to calculate averages over a certain time:

cadenza_dbpool_connection_creation_measurements_total

The total number of connections created by the pool.

cadenza_dbpool_connection_creation_time_seconds_total

The total time spent creating new connections in the pool.

cadenza_dbpool_connection_creation_time_seconds_max

gauge The maximum time spent for creating a new connection in the pool in the last 2-3 minutes (with a standard polling interval of 1 minute)

cadenza_dbpool_connection_usage_time_seconds

summary Time spent using connections from the pool. Consists of two separate metrics which can for example be used to calculate averages over a certain time:

cadenza_dbpool_connection_usage_measurements_total

The total count of connection usages in the pool.

cadenza_dbpool_connection_usage_time_seconds_total

The total time spent using connections from the pool.

cadenza_dbpool_connection_usage_time_seconds_max

gauge The maximum time spent using a connection from the pool in the last 2-3 minutes (with a standard polling interval of 1 minute)

Cache Metrics

Metrics for the Caffeine Caches used in Cadenza.

Cadenza uses various caches to speed up processing. Each cache is identified by a name, and the metrics listed below are reported for each cache.

Each metric is reported once per cache and has the openmetrics tag cache with the cache name as value.

For example:

cadenza_cache_puts_total{cache="RepositoryTreeCache",} 7.0

In this case, the number of times an element has been put into the cache RepositoryTreeCache is reported.

cadenza_cache_eviction_weight_total

counter The sum of weights of evicted entries. This total does not include manual invalidations.

cadenza_cache_evictions_total

counter The number of times the cache was evicted.

cadenza_cache_gets_total

counter The number of times cache lookup methods have returned a cached (result="hit") or uncached (newly loaded or null) value (result="miss").
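The two result values of this counter can be combined into a hit ratio for a cache. The Python sketch below shows the calculation; the function name is hypothetical, and the inputs are expected to be counter increases over the time range of interest (or the raw totals for the ratio since application start).

```python
def cache_hit_ratio(hits, misses):
    """Fraction of lookups answered from the cache, or None without lookups.

    hits   : cadenza_cache_gets_total with result="hit"
    misses : cadenza_cache_gets_total with result="miss"
    """
    lookups = hits + misses
    if lookups <= 0:
        return None
    return hits / lookups


# Hypothetical values for one cache
print(cache_hit_ratio(75.0, 25.0))  # 0.75
```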

cadenza_cache_load_duration_seconds

gauge The total time (in seconds) the cache has spent loading new values

cadenza_cache_load_total

counter The number of times cache lookup methods have successfully loaded a new value (result="success") or failed to load a new value, either because no value was found or an exception was thrown while loading (result="failure").

cadenza_cache_puts_total

counter The number of entries added to the cache

cadenza_cache_size

gauge The number of entries in this cache. This may be an approximation, depending on the type of cache.

JVM Metrics

Metrics about the state of the Java Virtual Machine.

cadenza_jvm_buffer_count_buffers

gauge An estimate of the number of buffers in the pool
Tags:
id=["mapped"|"mapped - 'non-volatile memory'"|"direct"]

cadenza_jvm_buffer_memory_used_bytes

counter An estimate of the memory that the Java virtual machine is using for this buffer pool
Tags:
id=["mapped"|"mapped - 'non-volatile memory'"|"direct"]

cadenza_jvm_buffer_total_capacity_bytes

gauge An estimate of the total capacity of the buffers in this pool
Tags:
id=["mapped"|"mapped - 'non-volatile memory'"|"direct"]

cadenza_jvm_classes_loaded_classes

gauge The number of classes that are currently loaded in the Java virtual machine

cadenza_jvm_classes_unloaded_classes_total

counter The total number of classes unloaded since the Java virtual machine has started execution

cadenza_jvm_compilation_time_ms_total

counter The approximate accumulated elapsed time in ms spent in compilation
Tags:
compiler: e.g. "HotSpot 64-Bit Tiered Compilers"

cadenza_jvm_gc_live_data_size_bytes

gauge Size of long-lived heap memory pool after reclamation

cadenza_jvm_gc_max_data_size_bytes

gauge Max size of long-lived heap memory pool

cadenza_jvm_gc_memory_allocated_bytes_total

counter Incremented for an increase in the size of the (young) heap memory pool after one GC to before the next

cadenza_jvm_gc_memory_promoted_bytes_total

counter Count of positive increases in the size of the old generation memory pool before GC to after GC

cadenza_jvm_gc_overhead

gauge An approximation of the percent of CPU time used by GC activities over the last lookback period or since monitoring began, whichever is shorter, in the range [0..1]

cadenza_jvm_gc_pause_seconds

summary Time spent in GC pause. Consists of two separate metrics which can for example be used to calculate averages over a certain time.
Tags:
action: e.g. "End of minor GC"
cause: e.g. "Metadata GC Threshold", "GCLocker Initiated GC", "G1 Evacuation Pause"
gc: e.g. "G1 Young Generation"

cadenza_jvm_gc_pause_seconds_count

The total count of GC pauses.

cadenza_jvm_gc_pause_seconds_sum

The cumulated time spent in GC pauses.

cadenza_jvm_gc_concurrent_phase_count

The total count of GC concurrent phases.

cadenza_jvm_gc_concurrent_phase_seconds_sum

The cumulated time spent in GC concurrent phases.

cadenza_jvm_memory_committed_bytes

gauge The amount of memory in bytes that is committed for the Java virtual machine to use
Tags:
area=["heap"|"nonheap"]
id: memory area, e.g. "G1 Eden Space", "G1 Old Gen", …​

cadenza_jvm_memory_max_bytes

gauge The maximum amount of memory in bytes that can be used for memory management.
Tags:
area=["heap"|"nonheap"]
id: memory area, e.g. "G1 Eden Space", "G1 Old Gen", …​

cadenza_jvm_memory_usage_after_gc

gauge The percentage of long-lived heap pool used after the last GC event, in the range [0..1]
Tags:
area: e.g. "heap"
pool: e.g. "long-lived"

cadenza_jvm_memory_used_bytes

gauge The amount of used memory in bytes
Tags:
area=["heap"|"nonheap"]
id: memory area, e.g. "G1 Eden Space", "G1 Old Gen", …​

cadenza_jvm_process_cpu_usage

gauge The "recent cpu usage" for the Java Virtual Machine process. The value is a decimal number.

cadenza_jvm_process_start_time_seconds

gauge Start time of the process in seconds since Unix epoch[1].

cadenza_jvm_process_uptime_seconds

gauge The uptime of the Java virtual machine in seconds

cadenza_jvm_system_cpu_count

gauge The number of processors available to the Java virtual machine

cadenza_jvm_system_cpu_usage

gauge The "recent cpu usage" of the system the application is running in. The value is a decimal number.

cadenza_jvm_system_load_average_1m

gauge The sum of the number of runnable entities queued to available processors and the number of runnable entities running on the available processors averaged over the last minute

cadenza_jvm_threads_daemon_threads

gauge The current number of live daemon threads

cadenza_jvm_threads_live_threads

gauge The current number of live threads including both daemon and non-daemon threads

cadenza_jvm_threads_peak_threads

gauge The peak live thread count since the Java virtual machine started or peak was reset

cadenza_jvm_threads_started_threads_total

counter The total number of application threads started in the JVM

cadenza_jvm_threads_states_threads

gauge The current number of threads
Tags:
state=["runnable"|"terminated"|"new"|"timed-waiting"|"blocked"|"waiting"]

Messaging Metrics

Metrics about the state of the messaging cluster.

cadenza_messaging_cluster_state

gauge Either 1.0 (healthy) or 0.0 (not healthy).

Report Metrics

Most metrics about PDF report generation can be obtained from the Workqueue Metrics for the "ReportGeneration" queue. In addition, there are some metrics about the size of the generated PDF files.

cadenza_web_report_pdf_size_bytes_max

gauge The maximum size in bytes of the reports recorded in the last minutes. With the standard step interval of 1 minute the last 2-3 minutes are considered (RingBuffer with 3 buckets rotated once per minute)

cadenza_web_report_pdf_size_bytes

summary Size of the generated reports in bytes. Consists of two separate metrics which can for example be used to calculate averages over a certain time:

cadenza_web_report_pdf_size_measurements_total

The total count of reports generated.

cadenza_web_report_pdf_size_bytes_total

The cumulated size of all generated reports.

Categories for Workqueue Metrics

The categories for the Workqueue Metrics are described below.

{category="UserAction",} Used for an action triggered by the user that is executed asynchronously on the server, e.g. a self-service import.

{category="Cancel",} Used when an action is cancelled by the user or automatically to cancel requests against data sources or similar.

{category="GISQueryExecution",} Used for GIS specific requests.

{category="QueryExecution",} Used for data requests to RDBMS and ElasticSearch data sources. This workqueue uses Java virtual threads. It is therefore unlimited in size, and tasks do not have to wait for a free worker thread before they start. As a result, the metrics cadenza_workqueue_max_workers, cadenza_workqueue_max_waiting_time_seconds, cadenza_workqueue_waiting_time_measurements_total, cadenza_workqueue_waiting_time_seconds_total and cadenza_workqueue_waiting_jobs do not provide any meaningful information for this workqueue.

{category="OutgoingMail",} Used to send mails, e.g. data protection and deletion deadline mails.

{category="Rendering",} Used for server layer rendering.

{category="LayerPersistence",} Is used for asynchronous reading and creation of classic layers.

{category="DefaultWorkQueue",} The default work queue for asynchronous execution. Used when there is no dedicated work queue for a task.

{category="LayerDataLoading",} Is used for asynchronous classic layer data loading.

{category="TimesliderAnimation",} Is used for timeslider animations.

{category="TimesliderVideoExport",} Is used for timeslider animation video export.

{category="ReportGeneration",} Used for report generation in workbooks.

{category="LocationFinder",} Used for location finder queries.

{category="LayerExtentCalculator",} Used to calculate the extent of layers in workbooks.

{category="AuditLogging",} Used for audit logging when asynchronous audit logging is allowed.

{category="DataViewExport",} Used for exporting data views in various formats.

Access Manager Metrics

Metrics about access manager processes:

cadenza_access_manager_request_time

The time until backend responded for a particular class and category.

cadenza_access_manager_request_time_total

The total number of time-measured requests for a particular class and category.

cadenza_access_manager_requests_time_seconds_total

The cumulated waiting time of all requests for a particular class and category.

cadenza_access_manager_response_size

The size of the server response for a particular class and category in bytes.

cadenza_access_manager_response_size_total

The total number of measured response sizes for a particular class and category.

cadenza_access_manager_response_size_bytes_total

The cumulated size of the server responses for a particular class and category in bytes.

Categories for Access Manager Metrics

The categories for the Access Manager Metrics are described below.

{category="authentication",} Used for the authentication process.

{category="groupMapping",} Used for the groupMapping process.

{category="propertyMapping",} Used for the propertyMapping process.

{category="technical",} Used for technical processes like token refreshes, user suggestions or impersonations.

Classes for Access Manager Metrics

The classes for the Access Manager Metrics are described below.

{class="ldap",} Used for LDAP processes.

{class="oauth",} Used for OAuth processes.


1. January 1, 1970, 00:00:00 GMT