# Application Metrics

## Our metrics philosophy

The groundcover platform generates 100% of its metrics from the actual data. There are no sample rates or complex interpolations to make up for partial coverage. Our measurements represent the real, complete flow of data in your environment.

[Stream processing](https://docs.groundcover.com/welcome/readme#stream-data-processing) allows us to construct the majority of the metrics on the very node where the raw transactions are recorded. This means the raw data is turned into numbers as soon as it becomes possible, removing the need to store it or send it elsewhere.

Metrics are stored in groundcover's `victoria-metrics` deployment, ensuring top-notch performance at any scale.

### Golden signals

In a world of excessive data, it's important to have a rule of thumb for where to start looking. For application metrics, we rely on our [golden signals](https://www.groundcover.com/blog/monitor-the-four-golden-signals).

The following metrics are generated for each resource being [aggregated](https://docs.groundcover.com/capabilities/application-performance-monitoring-apm/..#aggregation):

* Requests per second (RPS)
* Error rate
* Latencies (p50 and p95)

The golden signals are then displayed in two important ways: **Workload** and **Resource** aggregations.

{% hint style="info" %}
See [below](#golden-signals-metrics) for the full list of generated workload and resource golden metrics.
{% endhint %}

**Resource** aggregations are highly granular metrics, providing insights into individual APIs.

<figure><img src="https://2771001740-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FUHgqKYgCiRKdOpWQdi52%2Fuploads%2Fgit-blob-53fae49dd1409fc20a8cc919121a33cc5051ee55%2FApplication%20Metrics%201.jpg?alt=media" alt=""><figcaption></figcaption></figure>

**Workload** aggregations are designed to show an overview of each service, enabling a higher-level inspection. These are constructed using all of the resources recorded for each service.

<figure><img src="https://2771001740-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FUHgqKYgCiRKdOpWQdi52%2Fuploads%2Fgit-blob-1182a8be7ad94d552f5fa7f4694922957ba904e1%2FApplication%20Metrics%202.jpg?alt=media" alt=""><figcaption></figcaption></figure>

### Controlling retention

groundcover allows full control over the retention of your metrics. Learn more [here](https://docs.groundcover.com/customization/customize-usage/custom-data-retention).

### List of available metrics

Below you will find the full list of our APM metrics, as well as the labels we export for each. These labels are designed with high granularity in mind for maximal insight depth. All of the metrics listed are available out of the box after installing groundcover, without any further setup.

{% hint style="info" %}
We fully support the ingestion of [custom metrics](https://docs.groundcover.com/integrations/data-sources/prometheus) to further expand the visibility into your environment.

We also allow for building [custom dashboards](https://docs.groundcover.com/use-groundcover/dashboards-and-alerts), enabling full freedom in deciding how to display your metrics - building on groundcover's metrics below plus every custom metric ingested.
{% endhint %}

### Our labels

<table><thead><tr><th width="256">Label name</th><th width="266">Description</th><th>Relevant types</th></tr></thead><tbody><tr><td>clusterId</td><td>Name identifier of the K8s cluster</td><td></td></tr><tr><td>region</td><td>Cloud provider region name</td><td></td></tr><tr><td>namespace</td><td>K8s namespace</td><td></td></tr><tr><td>workload_name</td><td>K8s workload (or service) name</td><td></td></tr><tr><td>pod_name</td><td>K8s pod name</td><td></td></tr><tr><td>container_name</td><td>K8s container name</td><td></td></tr><tr><td>container_image</td><td>K8s container image name</td><td></td></tr><tr><td>remote_namespace</td><td>Remote K8s namespace (other side of the communication)</td><td></td></tr><tr><td>remote_service_name</td><td>Remote K8s service name (other side of the communication)</td><td></td></tr><tr><td>remote_container_name</td><td>Remote K8s container name (other side of the communication)</td><td></td></tr><tr><td>type</td><td>The protocol in use (HTTP, gRPC, Kafka, DNS etc.)</td><td></td></tr><tr><td>role</td><td>Role in the communication (client or server)</td><td></td></tr><tr><td>clustered_path</td><td>HTTP / gRPC aggregated resource path (e.g. /metrics/*)</td><td>http, grpc</td></tr><tr><td>method</td><td>HTTP / gRPC method (e.g. GET)</td><td>http, grpc</td></tr><tr><td>response_status_code</td><td>Return status code of an HTTP / gRPC request (e.g. 200 in HTTP)</td><td>http, grpc</td></tr><tr><td>dialect</td><td>SQL dialect (MySQL or PostgreSQL)</td><td>mysql, postgresql</td></tr><tr><td>response_status</td><td>Return status code of a SQL query (e.g. 42P01 for undefined table)</td><td>mysql, postgresql</td></tr><tr><td>client_type</td><td>Kafka client type (Fetcher / Producer)</td><td>kafka</td></tr><tr><td>topic</td><td>Kafka topic name</td><td>kafka</td></tr><tr><td>partition</td><td>Kafka partition identifier</td><td>kafka</td></tr><tr><td>error_code</td><td>Kafka return status code</td><td>kafka</td></tr><tr><td>query_type</td><td>Type of DNS query (e.g. AAAA)</td><td>dns</td></tr><tr><td>response_return_code</td><td>Return status code of a DNS resolution request (e.g. Name Error)</td><td>dns</td></tr><tr><td>method_name, method_class_name</td><td>Method code for the operation</td><td>amqp</td></tr><tr><td>response_method_name, response_method_class_name</td><td>Method code for the operation's response</td><td>amqp</td></tr><tr><td>exit_code</td><td>K8s container termination exit code</td><td>container_state, container_crash</td></tr><tr><td>state</td><td>K8s container current state (Running, Waiting or Terminated)</td><td>container_state</td></tr><tr><td>state_reason</td><td>K8s container state transition reason (e.g. CrashLoopBackOff or OOMKilled)</td><td>container_state</td></tr><tr><td>crash_reason</td><td>K8s container crash reason (e.g. Error, OOMKilled)</td><td>container_crash</td></tr><tr><td>pvc_name</td><td>K8s PVC name</td><td>storage</td></tr></tbody></table>

{% hint style="info" %}
Summary-based metrics have an additional ***quantile*** label, representing the percentile. Available values: `"0.5"`, `"0.95"`, `"0.99"`.
{% endhint %}

{% hint style="warning" %}
[**groundcover**](https://www.groundcover.com/) uses a set of internal labels which are not relevant in most use-cases. Find them interesting? [Let us know over Slack!](https://www.groundcover.com/join-slack)

**`issue_id`** **`entity_id`** **`resource_id`** **`query_id`** **`aggregation_id`** **`parent_entity_id`** **`perspective_entity_id`** **`perspective_entity_is_external`** **`perspective_entity_issue_id`** **`perspective_entity_name`** **`perspective_entity_namespace`** **`perspective_entity_resource_id`**
{% endhint %}

### Golden Signals Metrics

{% hint style="info" %}
In the lists below, we describe **error** and **issue** counters. Every issue flagged by groundcover is an error, but not every error is flagged as an issue.
{% endhint %}

#### Resource metrics

<table><thead><tr><th width="357.3333333333333">Name</th><th width="258">Description</th><th>Type<select><option value="c476a927c24d4e67b8115feb2e039751" label="Counter" color="blue"></option><option value="655b3932f5bd49b0aacc43978ad78425" label="Gauge" color="blue"></option><option value="c0106f6abc0244adbf70f29447aa3875" label="Summary" color="blue"></option></select></th></tr></thead><tbody><tr><td>groundcover_resource_total_counter</td><td>total amount of resource requests</td><td><span data-option="c476a927c24d4e67b8115feb2e039751">Counter</span></td></tr><tr><td>groundcover_resource_error_counter</td><td>total amount of requests with error status codes</td><td><span data-option="c476a927c24d4e67b8115feb2e039751">Counter</span></td></tr><tr><td>groundcover_resource_issue_counter</td><td>total amount of requests which were flagged as issues</td><td><span data-option="c476a927c24d4e67b8115feb2e039751">Counter</span></td></tr><tr><td>groundcover_resource_success_counter</td><td>total amount of resource requests with OK status codes</td><td><span data-option="c476a927c24d4e67b8115feb2e039751">Counter</span></td></tr><tr><td>groundcover_resource_latency_seconds</td><td>resource latency [sec]</td><td><span data-option="c0106f6abc0244adbf70f29447aa3875">Summary</span></td></tr></tbody></table>

#### Workload metrics

<table><thead><tr><th width="358.3333333333333">Name</th><th width="258">Description</th><th>Type<select><option value="c476a927c24d4e67b8115feb2e039751" label="Counter" color="blue"></option><option value="655b3932f5bd49b0aacc43978ad78425" label="Gauge" color="blue"></option><option value="c0106f6abc0244adbf70f29447aa3875" label="Summary" color="blue"></option></select></th></tr></thead><tbody><tr><td>groundcover_workload_total_counter</td><td>total amount of requests handled by the workload</td><td><span data-option="c476a927c24d4e67b8115feb2e039751">Counter</span></td></tr><tr><td>groundcover_workload_error_counter</td><td>total amount of requests handled by the workload with error status codes</td><td><span data-option="c476a927c24d4e67b8115feb2e039751">Counter</span></td></tr><tr><td>groundcover_workload_issue_counter</td><td>total amount of requests handled by the workload which were flagged as issues</td><td><span data-option="c476a927c24d4e67b8115feb2e039751">Counter</span></td></tr><tr><td>groundcover_workload_success_counter</td><td>total amount of requests handled by the workload with OK status codes</td><td><span data-option="c476a927c24d4e67b8115feb2e039751">Counter</span></td></tr><tr><td>groundcover_workload_latency_seconds</td><td>resource latency across all of the workload APIs [sec]</td><td><span data-option="c0106f6abc0244adbf70f29447aa3875">Summary</span></td></tr></tbody></table>
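
As a rough illustration of how the counters and summaries above map back to the golden signals, the sketch below builds PromQL-style query strings in Python. It assumes the metrics are queried with PromQL-compatible syntax; the helper names, the `5m` default window, and any label values passed in are hypothetical and not part of groundcover. It also assumes label values contain no quotes or backslashes, since no escaping is done.

```python
def label_selector(labels: dict[str, str]) -> str:
    """Render a label selector such as {namespace="prod",type="http"}.

    Labels are sorted by name so the output is deterministic.
    Assumption: values contain no quotes or backslashes (no escaping is done).
    """
    if not labels:
        return ""
    pairs = ",".join(f'{name}="{value}"' for name, value in sorted(labels.items()))
    return "{" + pairs + "}"


def rps_query(scope: str, labels: dict[str, str], window: str = "5m") -> str:
    """Requests per second: per-second rate of the total counter.

    `scope` is either "resource" or "workload", matching the metric names above.
    """
    selector = label_selector(labels)
    return f"sum(rate(groundcover_{scope}_total_counter{selector}[{window}]))"


def error_rate_query(scope: str, labels: dict[str, str], window: str = "5m") -> str:
    """Error rate: rate of the error counter divided by rate of the total counter."""
    selector = label_selector(labels)
    errors = f"sum(rate(groundcover_{scope}_error_counter{selector}[{window}]))"
    total = f"sum(rate(groundcover_{scope}_total_counter{selector}[{window}]))"
    return f"{errors} / {total}"


def latency_query(scope: str, labels: dict[str, str], quantile: str = "0.95") -> str:
    """Latency: select one percentile of the summary via its `quantile` label."""
    selector = label_selector({**labels, "quantile": quantile})
    return f"groundcover_{scope}_latency_seconds{selector}"
```

For example, `latency_query("workload", {"namespace": "prod"}, "0.5")` yields a selector for the p50 latency series of workloads in a hypothetical `prod` namespace. Summary quantiles are reported per series, so this sketch selects them directly rather than aggregating them across series.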

### Storage usage metrics

<table><thead><tr><th width="329">Name</th><th width="285">Description</th><th>Type<select><option value="579160c3a58647ae94a346a3c517890a" label="Counter" color="blue"></option><option value="5a23e1b2296a4288be65fa1bc492c829" label="Gauge" color="blue"></option><option value="26c560ac591747cdb4f95ed12ed73b2c" label="Summary" color="blue"></option></select></th></tr></thead><tbody><tr><td>groundcover_pvc_read_bytes_total</td><td>total amount of bytes read by the workload from the PVC</td><td><span data-option="579160c3a58647ae94a346a3c517890a">Counter</span></td></tr><tr><td>groundcover_pvc_write_bytes_total</td><td>total amount of bytes written by the workload to the PVC</td><td><span data-option="579160c3a58647ae94a346a3c517890a">Counter</span></td></tr><tr><td>groundcover_pvc_reads_total</td><td>total amount of read operations done by the workload from the PVC</td><td><span data-option="579160c3a58647ae94a346a3c517890a">Counter</span></td></tr><tr><td>groundcover_pvc_writes_total</td><td>total amount of write operations done by the workload to the PVC</td><td><span data-option="579160c3a58647ae94a346a3c517890a">Counter</span></td></tr><tr><td>groundcover_pvc_read_latency</td><td>latency of read operation by the workload from the PVC, in microseconds</td><td><span data-option="26c560ac591747cdb4f95ed12ed73b2c">Summary</span></td></tr><tr><td>groundcover_pvc_write_latency</td><td>latency of write operation by the workload to the PVC, in microseconds</td><td><span data-option="26c560ac591747cdb4f95ed12ed73b2c">Summary</span></td></tr></tbody></table>
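
The counters above only ever increase, so useful figures come from the difference between two readings. The following sketch shows that arithmetic in Python for read throughput and mean read size, plus a conversion of the latency values from microseconds to seconds. The `PvcSample` type and function names are hypothetical, and the sketch assumes the counters did not reset between the two samples.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PvcSample:
    """One reading of the PVC read counters (hypothetical container type)."""
    timestamp_s: float        # when the sample was taken, in seconds
    read_bytes_total: float   # value of groundcover_pvc_read_bytes_total
    reads_total: float        # value of groundcover_pvc_reads_total


def read_throughput(prev: PvcSample, curr: PvcSample) -> float:
    """Average bytes read per second between two samples."""
    elapsed = curr.timestamp_s - prev.timestamp_s
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    return (curr.read_bytes_total - prev.read_bytes_total) / elapsed


def mean_read_size(prev: PvcSample, curr: PvcSample) -> Optional[float]:
    """Average bytes per read operation, or None if no reads happened."""
    operations = curr.reads_total - prev.reads_total
    if operations <= 0:
        return None
    return (curr.read_bytes_total - prev.read_bytes_total) / operations


def micros_to_seconds(latency_us: float) -> float:
    """Convert a PVC latency value (microseconds) to seconds."""
    return latency_us / 1_000_000
```

The same arithmetic applies to the write counters. Converting PVC latencies to seconds makes them directly comparable with the `*_latency_seconds` golden signal metrics.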

### Kafka specific metrics

<table><thead><tr><th width="302.3333333333333">Name</th><th width="258">Description</th><th>Type<select><option value="c476a927c24d4e67b8115feb2e039751" label="Counter" color="blue"></option><option value="655b3932f5bd49b0aacc43978ad78425" label="Gauge" color="blue"></option><option value="c0106f6abc0244adbf70f29447aa3875" label="Summary" color="blue"></option></select></th></tr></thead><tbody><tr><td>groundcover_client_offset</td><td>client last message offset (for producer the last offset produced, for consumer the last requested offset)</td><td><span data-option="655b3932f5bd49b0aacc43978ad78425">Gauge</span></td></tr><tr><td>groundcover_workload_client_offset</td><td>client last message offset (for producer the last offset produced, for consumer the last requested offset), aggregated by workload</td><td><span data-option="655b3932f5bd49b0aacc43978ad78425">Gauge</span></td></tr><tr><td>groundcover_calc_lagged_messages</td><td>current lag in messages</td><td><span data-option="655b3932f5bd49b0aacc43978ad78425">Gauge</span></td></tr><tr><td>groundcover_workload_calc_lagged_messages</td><td>current lag in messages, aggregated by workload</td><td><span data-option="655b3932f5bd49b0aacc43978ad78425">Gauge</span></td></tr><tr><td>groundcover_calc_lag_seconds</td><td>current lag in time [sec]</td><td><span data-option="655b3932f5bd49b0aacc43978ad78425">Gauge</span></td></tr><tr><td>groundcover_workload_calc_lag_seconds</td><td>current lag in time, aggregated by workload [sec]</td><td><span data-option="655b3932f5bd49b0aacc43978ad78425">Gauge</span></td></tr></tbody></table>
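
To make the relationship between offsets and message lag concrete, here is a small Python sketch of one common way to reason about it: per topic and partition, lag is the last produced offset minus the last offset requested by the consumer. This is an illustrative assumption, not necessarily how groundcover computes `groundcover_calc_lagged_messages`; the function name and input shape are hypothetical.

```python
# Key: (topic, partition). Value: last known offset for that partition.
Offsets = dict[tuple[str, int], int]


def lag_per_partition(produced: Offsets, consumed: Offsets) -> Offsets:
    """Estimate message lag for every partition seen on both sides.

    Partitions without a consumer offset are skipped, and negative
    differences (consumer ahead of a stale producer reading) clamp to 0.
    """
    lag: Offsets = {}
    for key, produced_offset in produced.items():
        consumed_offset = consumed.get(key)
        if consumed_offset is None:
            continue  # nothing is consuming this partition in our data
        lag[key] = max(produced_offset - consumed_offset, 0)
    return lag
```

Summing the resulting values over all partitions of a workload's topics gives a workload-level view, similar in spirit to the `groundcover_workload_*` variants listed above.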
