Application Metrics
The groundcover platform generates 100% of its metrics from the actual data. There are no sample rates or complex interpolations to make up for partial coverage. Our measurements represent the real, complete flow of data in your environment.
Stream processing allows us to construct the majority of the metrics on the very node where the raw transactions are recorded. This means the raw data is turned into numbers the moment it becomes possible - removing the need for storing or sending it elsewhere.
Metrics are stored in groundcover's victoria-metrics deployment, ensuring top-notch performance at every scale.
In the world of excessive data, it's important to have a rule of thumb for knowing where to start looking. For application metrics, we rely on our golden signals.
The following metrics are generated for each resource being aggregated:
Requests per second (RPS)
Error rate
Latencies (p50 and p95)
The golden signals are then displayed in two important ways: Workload and Resource aggregations.
See below for the full list of generated workload and resource golden metrics.
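
As a rough illustration of how the golden signals map onto the exported metrics, the sketch below builds PromQL-style expressions for a single workload. It assumes the counters behave like standard Prometheus counters and that a 5-minute window suits your setup; the workload name `checkout` and the helper function are hypothetical.

```python
def golden_signal_queries(workload: str, window: str = "5m") -> dict:
    """Build PromQL-style expressions for one workload's golden signals."""
    # Label matcher shared by the counter queries.
    selector = f'{{workload_name="{workload}"}}'
    total = f"sum(rate(groundcover_workload_total_counter{selector}[{window}]))"
    errors = f"sum(rate(groundcover_workload_error_counter{selector}[{window}]))"
    # Latency is a summary, so the percentile is picked with the quantile label.
    p95 = (
        "groundcover_workload_latency_seconds"
        f'{{workload_name="{workload}", quantile="0.95"}}'
    )
    return {
        "rps": total,
        "error_rate": f"{errors} / {total}",
        "latency_p95": p95,
    }


queries = golden_signal_queries("checkout")  # "checkout" is a made-up workload
for name, expression in queries.items():
    print(f"{name}: {expression}")
```

The latency entry selects the precomputed 0.95 quantile directly, because summary quantiles cannot be meaningfully summed or averaged across series.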
Resource aggregations are highly granular metrics, providing insights into individual APIs.
Workload aggregations are designed to show an overview of each service, enabling higher-level inspection. They are constructed from all of the resources recorded for each service.
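
To make the relationship between the two aggregation levels concrete, here is a minimal sketch that rolls per-resource counts up into per-workload totals. It only illustrates the idea and is not groundcover's actual implementation; the sample records and the `roll_up` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical per-resource samples:
# (workload_name, clustered_path, total requests, error requests)
resource_samples = [
    ("checkout", "/cart/*", 120, 3),
    ("checkout", "/pay", 40, 1),
    ("catalog", "/items/*", 300, 0),
]


def roll_up(samples):
    """Sum resource-level counts into one entry per workload."""
    totals = defaultdict(lambda: {"total": 0, "errors": 0})
    for workload, _path, total, errors in samples:
        totals[workload]["total"] += total
        totals[workload]["errors"] += errors
    return dict(totals)


workloads = roll_up(resource_samples)
print(workloads["checkout"])  # {'total': 160, 'errors': 4}
```

Only additive values such as counters can be rolled up this way; latency percentiles need to be computed over the combined data rather than summed.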
groundcover allows full control over the retention of your metrics. Learn more here.
Below you will find the full list of our APM metrics, as well as the labels we export for each. These labels are designed with high granularity in mind for maximal insight depth. All of the metrics listed are available out of the box after installing groundcover, without any further setup.
We fully support the ingestion of custom metrics to further expand the visibility into your environment.
We also allow for building custom dashboards, enabling full freedom in deciding how to display your metrics - building on groundcover's metrics below plus every custom metric ingested.
Summary-based metrics have an additional quantile label, representing the percentile. Available values: ["0.5", "0.95", "0.99"].
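
For example, reading one percentile means filtering on the quantile label. The sketch below validates the quantile and builds an instant-query URL for a Prometheus-compatible `/api/v1/query` endpoint. How your deployment exposes such an endpoint is an assumption here, and the base URL, the workload name and the helper function are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

ALLOWED_QUANTILES = ("0.5", "0.95", "0.99")


def instant_query_url(base_url: str, metric: str, quantile: str, **labels: str) -> str:
    """Return an instant-query URL selecting one quantile of a summary metric."""
    if quantile not in ALLOWED_QUANTILES:
        raise ValueError(f"quantile must be one of {ALLOWED_QUANTILES}")
    matchers = {**labels, "quantile": quantile}
    selector = ", ".join(
        f'{name}="{value}"' for name, value in sorted(matchers.items())
    )
    query = metric + "{" + selector + "}"
    # urlencode percent-escapes the braces and quotes in the query.
    return base_url.rstrip("/") + "/api/v1/query?" + urlencode({"query": query})


# Hypothetical endpoint and workload name, for illustration only.
url = instant_query_url(
    "http://metrics.example.internal:8428",
    "groundcover_workload_latency_seconds",
    "0.95",
    workload_name="checkout",
)
decoded_query = parse_qs(urlsplit(url).query)["query"][0]
print(decoded_query)
```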
groundcover uses a set of internal labels which are not relevant in most use-cases. Find them interesting? Let us know over Slack!
issue_id
entity_id
resource_id
query_id
aggregation_id
parent_entity_id
perspective_entity_id
perspective_entity_is_external
perspective_entity_issue_id
perspective_entity_name
perspective_entity_namespace
perspective_entity_resource_id
In the lists below, we describe error and issue counters. Every issue flagged by groundcover is an error, but not every error is flagged as an issue.
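
Put as arithmetic, the issue count can never exceed the error count, which in turn can never exceed the total. A small sketch, assuming all three values were read for the same scope and time range (the function name and sample numbers are hypothetical):

```python
def issue_share(total: int, errors: int, issues: int) -> float:
    """Fraction of errors that were also flagged as issues."""
    # Issues are a subset of errors, and errors are a subset of all requests.
    if not 0 <= issues <= errors <= total:
        raise ValueError("expected 0 <= issues <= errors <= total")
    return issues / errors if errors else 0.0


share = issue_share(total=1000, errors=40, issues=10)
print(share)  # 0.25
```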
Label name | Description | Relevant types |
---|---|---|
clusterId | Name identifier of the K8s cluster | |
region | Cloud provider region name | |
namespace | K8s namespace | |
workload_name | K8s workload (or service) name | |
pod_name | K8s pod name | |
container_name | K8s container name | |
container_image | K8s container image name | |
remote_namespace | Remote K8s namespace (other side of the communication) | |
remote_service_name | Remote K8s service name (other side of the communication) | |
remote_container_name | Remote K8s container name (other side of the communication) | |
type | The protocol in use (HTTP, gRPC, Kafka, DNS etc.) | |
role | Role in the communication (client or server) | |
clustered_path | HTTP / gRPC aggregated resource path (e.g. /metrics/*) | http, grpc |
method | HTTP / gRPC method (e.g. GET) | http, grpc |
response_status_code | Return status code of an HTTP / gRPC request (e.g. 200 in HTTP) | http, grpc |
dialect | SQL dialect (MySQL or PostgreSQL) | mysql, postgresql |
response_status | Return status code of a SQL query (e.g. 42P01 for undefined table) | mysql, postgresql |
client_type | Kafka client type (Fetcher / Producer) | kafka |
topic | Kafka topic name | kafka |
partition | Kafka partition identifier | kafka |
error_code | Kafka return status code | kafka |
query_type | Type of DNS query (e.g. AAAA) | dns |
response_return_code | Return status code of a DNS resolution request (e.g. Name Error) | dns |
method_name, method_class_name | Method code for the operation | amqp |
response_method_name, response_method_class_name | Method code for the operation's response | amqp |
exit_code | K8s container termination exit code | container_state, container_crash |
state | K8s container current state (Running, Waiting or Terminated) | container_state |
state_reason | K8s container state transition reason (e.g. CrashLoopBackOff or OOMKilled) | container_state |
crash_reason | K8s container crash reason (e.g. Error, OOMKilled) | container_crash |
pvc_name | K8s PVC name | storage |

Name | Description | Type |
---|---|---|
groundcover_resource_total_counter | Total amount of resource requests | |
groundcover_resource_error_counter | Total amount of requests with error status codes | |
groundcover_resource_issue_counter | Total amount of requests which were flagged as issues | |
groundcover_resource_success_counter | Total amount of resource requests with OK status codes | |
groundcover_resource_latency_seconds | Resource latency [sec] | |
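
The resource-level counters combine naturally with the labels described earlier. As a sketch, the helper below builds a PromQL-style expression for the per-path error ratio of one workload's HTTP server traffic. The label values (`http`, `server`, `prod`, `checkout`), the 5-minute window and the helper itself are assumptions for illustration.

```python
def error_ratio_by_path(namespace: str, workload: str, window: str = "5m") -> str:
    """Per-path error ratio for a workload's HTTP server-side traffic."""
    selector = (
        f'{{namespace="{namespace}", workload_name="{workload}", '
        'type="http", role="server"}'
    )

    def per_path(metric: str) -> str:
        # Request rate for one counter, grouped by aggregated resource path.
        return f"sum by (clustered_path) (rate({metric}{selector}[{window}]))"

    errors = per_path("groundcover_resource_error_counter")
    total = per_path("groundcover_resource_total_counter")
    return f"{errors} / {total}"


path_error_query = error_ratio_by_path("prod", "checkout")
print(path_error_query)
```

Both sides are grouped by the same label, so the division matches series one-to-one per path. Paths that have no error series at all are simply absent from the result.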

Name | Description | Type |
---|---|---|
groundcover_workload_total_counter | Total amount of requests handled by the workload | |
groundcover_workload_error_counter | Total amount of requests handled by the workload with error status codes | |
groundcover_workload_issue_counter | Total amount of requests handled by the workload which were flagged as issues | |
groundcover_workload_success_counter | Total amount of requests handled by the workload with OK status codes | |
groundcover_workload_latency_seconds | Resource latency across all of the workload APIs [sec] | |

Name | Description | Type |
---|---|---|
groundcover_pvc_read_bytes_total | Total amount of bytes read by the workload from the PVC | |
groundcover_pvc_write_bytes_total | Total amount of bytes written by the workload to the PVC | |
groundcover_pvc_reads_total | Total amount of read operations done by the workload from the PVC | |
groundcover_pvc_writes_total | Total amount of write operations done by the workload to the PVC | |
groundcover_pvc_read_latency | Latency of read operation by the workload from the PVC, in microseconds | |
groundcover_pvc_write_latency | Latency of write operation by the workload to the PVC, in microseconds | |
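
Because the byte and operation totals only ever grow, useful figures come from the difference between two readings. The sketch below derives read throughput and average read size from two hypothetical snapshots, and converts a microsecond latency to milliseconds. The `PvcSnapshot` type and the sample values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class PvcSnapshot:
    """One reading of the PVC read counters (hypothetical container type)."""
    timestamp_s: float
    read_bytes_total: int  # value of groundcover_pvc_read_bytes_total
    reads_total: int       # value of groundcover_pvc_reads_total


def read_stats(earlier: PvcSnapshot, later: PvcSnapshot) -> dict:
    """Throughput and average read size between two counter readings."""
    elapsed = later.timestamp_s - earlier.timestamp_s
    nbytes = later.read_bytes_total - earlier.read_bytes_total
    ops = later.reads_total - earlier.reads_total
    if elapsed <= 0 or nbytes < 0 or ops < 0:
        raise ValueError("snapshots out of order or counter was reset")
    return {
        "bytes_per_second": nbytes / elapsed,
        "avg_bytes_per_read": nbytes / ops if ops else 0.0,
    }


def micros_to_millis(latency_us: float) -> float:
    """The PVC latency metrics are reported in microseconds."""
    return latency_us / 1000.0


stats = read_stats(
    PvcSnapshot(0.0, 1_000_000, 100),
    PvcSnapshot(60.0, 4_000_000, 400),
)
print(stats)  # {'bytes_per_second': 50000.0, 'avg_bytes_per_read': 10000.0}
print(micros_to_millis(1500))  # 1.5
```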

Name | Description | Type |
---|---|---|
groundcover_client_offset | Client last message offset (for producer the last offset produced, for consumer the last requested offset) | |
groundcover_workload_client_offset | Client last message offset (for producer the last offset produced, for consumer the last requested offset), aggregated by workload | |
groundcover_calc_lagged_messages | Current lag in messages | |
groundcover_workload_calc_lagged_messages | Current lag in messages, aggregated by workload | |
groundcover_calc_lag_seconds | Current lag in time [sec] | |
groundcover_workload_calc_lag_seconds | Current lag in time, aggregated by workload [sec] | |
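
As a point of reference, consumer lag in messages is commonly defined as the gap between the last produced offset and the last consumed offset of each partition. The sketch below applies that common definition to hypothetical offsets. It is not necessarily how groundcover computes the lag metrics above, and summing across partitions for a workload-level figure is likewise an assumption.

```python
def lag_in_messages(produced: dict, consumed: dict) -> dict:
    """Per-partition lag; both inputs are keyed by (topic, partition)."""
    return {
        key: max(0, offset - consumed.get(key, 0))
        for key, offset in produced.items()
    }


# Hypothetical last offsets per (topic, partition).
produced_offsets = {("orders", 0): 120, ("orders", 1): 80}
consumed_offsets = {("orders", 0): 100, ("orders", 1): 80}

lag = lag_in_messages(produced_offsets, consumed_offsets)
print(lag)  # {('orders', 0): 20, ('orders', 1): 0}
print(sum(lag.values()))  # 20
```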