Metrics & Labels
Infrastructure Metrics & Labels
Container CPU and Memory
Labels
type
clusterId
region
namespace
node_name
workload_name
pod_name
container_name
container_image
Metrics
groundcover_container_cpu_usage_rate_millis
CPU usage in mCPU
mCPU
groundcover_container_cpu_request_m_cpu
K8s container CPU request
mCPU
groundcover_container_cpu_limit_m_cpu
K8s container CPU limit
mCPU
groundcover_container_memory_working_set_bytes
current memory working set
Bytes
groundcover_container_memory_rss_bytes
current memory RSS
Bytes
groundcover_container_memory_request_bytes
K8s container memory request
Bytes
groundcover_container_memory_limit_bytes
K8s container memory limit
Bytes
groundcover_container_cpu_delay_seconds
K8s container CPU delay
Seconds
groundcover_container_disk_delay_seconds
K8s container disk delay
Seconds
groundcover_container_cpu_throttled_seconds_total
K8s container total CPU throttling
Seconds
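As an illustration of how the usage and limit metrics can be combined, the sketch below builds a PromQL expression that divides CPU usage by the CPU limit. It assumes the metrics are queried through a Prometheus-compatible API and that both series carry the same labels listed above. The helper names `label_selector` and `cpu_utilization_query` are hypothetical and not part of any groundcover tooling.

```python
def label_selector(labels):
    """Render a PromQL label selector such as {namespace="prod"}."""
    if not labels:
        return ""
    # Sort for deterministic output; values are assumed to contain no quotes.
    pairs = ",".join(f'{name}="{value}"' for name, value in sorted(labels.items()))
    return "{" + pairs + "}"


def cpu_utilization_query(labels=None):
    """Build a PromQL ratio of CPU usage to CPU limit (both in mCPU)."""
    selector = label_selector(labels or {})
    return (
        f"groundcover_container_cpu_usage_rate_millis{selector}"
        f" / groundcover_container_cpu_limit_m_cpu{selector}"
    )


# Example: utilization for one (hypothetical) workload in one namespace.
print(cpu_utilization_query({"namespace": "prod", "workload_name": "checkout"}))
```

A result close to 1 would suggest the container is running near its CPU limit, which is worth reading alongside groundcover_container_cpu_throttled_seconds_total.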
Node CPU, Memory and Disk
Labels
type
clusterId
region
node_name
Metrics
groundcover_node_allocatable_cpum_cpu
amount of allocatable CPU in the current node
mCPU
groundcover_node_allocatable_mem_bytes
amount of allocatable memory in the current node
Bytes
groundcover_node_mem_used_percent
percent of used memory in current node
0-100
groundcover_node_used_disk_space
current used disk space in current node
Bytes
groundcover_node_free_disk_space
amount of free disk space in current node
Bytes
groundcover_node_total_disk_space
amount of total disk space in current node
Bytes
groundcover_node_used_percent_disk_space
percent of used disk space in current node
0-100
Storage Usage
Labels
type
clusterId
region
name
namespace
Metrics
groundcover_pvc_usage_bytes
PVC usage
Bytes
groundcover_pvc_capacity_bytes
PVC capacity
Bytes
groundcover_pvc_available_bytes
PVC available
Bytes
groundcover_pvc_usage_percent
percent of used PVC storage
0-100
groundcover_pvc_read_bytes_total
total amount of bytes read by the workload from the PVC
Bytes
groundcover_pvc_write_bytes_total
total amount of bytes written by the workload to the PVC
Bytes
groundcover_pvc_reads_total
total amount of read operations done by the workload from the PVC
Number
groundcover_pvc_writes_total
total amount of write operations done by the workload to the PVC
Number
groundcover_pvc_read_latency
latency of read operation by the workload from the PVC
Seconds
groundcover_pvc_write_latency
latency of write operation by the workload to the PVC
Seconds
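The byte and operation counters above can be combined to estimate the average I/O size over a time window. The sketch below assumes the `_total` metrics behave as cumulative counters, as the naming suggests, and works on two samples taken at the start and end of the window. The function name `average_io_size` is hypothetical.

```python
def average_io_size(bytes_before, bytes_after, ops_before, ops_after):
    """Average bytes per operation between two samples of cumulative counters.

    For reads, pass samples of groundcover_pvc_read_bytes_total and
    groundcover_pvc_reads_total; for writes, the write equivalents.
    """
    bytes_delta = bytes_after - bytes_before
    ops_delta = ops_after - ops_before
    if ops_delta <= 0 or bytes_delta < 0:
        # No operations in the window, or a counter was reset.
        return None
    return bytes_delta / ops_delta


# Example with made-up sample values: 4096 bytes over 4 reads.
print(average_io_size(1000, 5096, 10, 14))
```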
Network Usage
Labels
clusterId
workload_name
namespace
container_name
remote_service_name
remote_namespace
remote_is_external
availability_zone
region
remote_availability_zone
remote_region
is_cross_az
protocol
role
server_port
encryption
transport_protocol
is_loopback
Notes:
is_loopback and remote_is_external are special labels that indicate the remote service is either the same service as the recording side (loopback) or resides in an external network, e.g. a managed service outside of the cluster (external). In both cases the remote_service_name and remote_namespace labels will be empty.
is_cross_az means the traffic was sent and/or received between two different availability zones. This is a helpful flag to quickly identify this special kind of communication. The actual zones are detailed in the availability_zone and remote_availability_zone labels.
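The notes above can be sketched as a small classifier over a series' label set. This is only an illustration: it assumes the flag labels carry the string values "true" and "false", which is not stated here, and the function name `classify_peer` and its category names are hypothetical.

```python
def classify_peer(labels):
    """Classify the remote side of a network series from its flag labels."""

    def flag(name):
        # Assumption: boolean labels are exported as "true" / "false" strings.
        return labels.get(name, "").lower() == "true"

    if flag("is_loopback"):
        return "loopback"   # remote side is the recording service itself
    if flag("remote_is_external"):
        return "external"   # remote side lives outside the cluster
    if flag("is_cross_az"):
        return "cross-az"   # in-cluster peer in another availability zone
    return "in-cluster"


# Example with a made-up label set.
print(classify_peer({"is_cross_az": "true", "availability_zone": "a",
                     "remote_availability_zone": "b"}))
```

Loopback and external are checked first because, in both cases, remote_service_name and remote_namespace are empty and cannot be used to identify the peer.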
Metrics
groundcover_network_rx_bytes_total
Bytes received by the workload
Bytes
groundcover_network_tx_bytes_total
Bytes sent by the workload
Bytes
groundcover_network_connections_opened_total
Connections opened by the workload
Number
groundcover_network_connections_closed_total
Connections closed by the workload
Number
groundcover_network_connections_opened_failed_total
Connection attempts that failed, per workload (including refused connections)
Number
groundcover_network_connections_refused_failed_total
Connection attempts that were refused, per workload
Number
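Because the failed-connections counter includes refused connections, the two can be compared to see how much of the failure volume is due to refusals. The sketch below works on counter increases over the same time window; the function name `refused_share` is hypothetical.

```python
def refused_share(failed_delta, refused_delta):
    """Fraction of failed connection attempts that were refused.

    failed_delta:  increase of groundcover_network_connections_opened_failed_total
    refused_delta: increase of groundcover_network_connections_refused_failed_total
    """
    if failed_delta <= 0:
        return None  # no failures in the window, so the share is undefined
    # Refused attempts are a subset of failed ones; clamp to absorb scrape skew.
    return min(refused_delta / failed_delta, 1.0)


# Example with made-up values: 4 of 10 failures were refusals.
print(refused_share(10, 4))
```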
Application Metrics & Labels
clusterId
Name identifier of the K8s cluster
All
region
Cloud provider region name
All
namespace
K8s namespace
All
workload_name
K8s workload (or service) name
All
pod_name
K8s pod name
All
container_name
K8s container name
All
container_image
K8s container image name
All
remote_namespace
Remote K8s namespace (other side of the communication)
All
remote_service_name
Remote K8s service name (other side of the communication)
All
remote_container_name
Remote K8s container name (other side of the communication)
All
type
The protocol in use (HTTP, gRPC, Kafka, DNS etc.)
All
sub_type
The subtype of the protocol (GET, POST, etc.)
All
role
Role in the communication (client or server)
All
clustered_resource_name
The clustered name of the resource, depends on the protocol
All
status_code
"ok", "error" or "unset"
All
server
The server workload/name
All
client
The client workload/name
All
server_namespace
The server namespace
All
client_namespace
The client namespace
All
server_is_external
Indicates whether the server is external
All
client_is_external
Indicates whether the client is external
All
is_encrypted
Indicates whether the communication is encrypted
All
is_cross_az
Indicates whether the communication crosses availability zones
All
clustered_path
HTTP / gRPC aggregated resource path (e.g. /metrics/*)
http, grpc
method
HTTP / gRPC method (e.g. GET)
http, grpc
response_status_code
Return status code of an HTTP / gRPC request (e.g. 200 in HTTP)
http, grpc
dialect
SQL dialect (MySQL or PostgreSQL)
mysql, postgresql
response_status
Return status code of a SQL query (e.g. 42P01 for undefined table)
mysql, postgresql
client_type
Kafka client type (Fetcher / Producer)
kafka
topic
Kafka topic name
kafka
partition
Kafka partition identifier
kafka
error_code
Kafka return status code
kafka
query_type
Type of DNS query (e.g. AAAA)
dns
response_return_code
Return status code of a DNS resolution request (e.g. Name Error)
dns
exit_code
K8s container termination exit code
container_state, container_crash
state
K8s container current state (Running, Waiting or Terminated)
container_state
state_reason
K8s container state transition reason (e.g. CrashLoopBackOff or OOMKilled)
container_state
crash_reason
K8s container crash reason (e.g. Error, OOMKilled)
container_crash
pvc_name
K8s PVC name
storage
Summary-based metrics have an additional quantile label, representing the percentile. Available values: ["0.5", "0.95", "0.99"].
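As a sketch of how the quantile label might be used, the helper below appends a quantile selector to a metric name and rejects values outside the available set. Which metrics are exposed as summaries is not stated here; using groundcover_resource_latency_seconds in the example is an assumption, and the helper name `with_quantile` is hypothetical.

```python
# Quantile values listed as available for summary-based metrics.
AVAILABLE_QUANTILES = ("0.5", "0.95", "0.99")


def with_quantile(metric, quantile):
    """Return a PromQL selector for one quantile of a summary-based metric."""
    if quantile not in AVAILABLE_QUANTILES:
        raise ValueError(
            f"quantile must be one of {AVAILABLE_QUANTILES}, got {quantile!r}"
        )
    return f'{metric}{{quantile="{quantile}"}}'


# Example: 95th percentile of a latency metric (assumed to be a summary).
print(with_quantile("groundcover_resource_latency_seconds", "0.95"))
```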
We also use a set of internal labels which are not relevant in most use-cases. Find them interesting? Let us know over Slack!
issue_id
entity_id
resource_id
query_id
aggregation_id
parent_entity_id
perspective_entity_id
perspective_entity_is_external
perspective_entity_issue_id
perspective_entity_name
perspective_entity_namespace
perspective_entity_resource_id
Golden Signals (Errors & Issues)
The lists below describe error and issue counters. Every issue flagged by the platform is an error, but not every error is flagged as an issue.
Resource metrics
groundcover_resource_total_counter
total amount of resource requests
Number
groundcover_resource_error_counter
total amount of requests with error status codes
Number
groundcover_resource_issue_counter
total amount of requests which were flagged as issues
Number
groundcover_resource_success_counter
total amount of resource requests with OK status codes
Number
groundcover_resource_latency_seconds
resource latency
Seconds
Workload metrics
groundcover_workload_total_counter
total amount of requests handled by the workload
Number
groundcover_workload_error_counter
total amount of requests handled by the workload with error status codes
Number
groundcover_workload_issue_counter
total amount of requests handled by the workload which were flagged as issues
Number
groundcover_workload_success_counter
total amount of requests handled by the workload with OK status codes
Number
groundcover_workload_latency_seconds
resource latency across all of the workload APIs
Seconds
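The relationship between the total, error, and issue counters can be sketched as a pair of ratios. The function below takes counter increases over the same time window (for either the resource or the workload counters) and relies on the stated rule that every issue is also an error. The function name `error_and_issue_rates` is hypothetical.

```python
def error_and_issue_rates(total, errors, issues):
    """Return (error_rate, issue_rate) as fractions of all requests.

    total, errors and issues are increases of the *_total_counter,
    *_error_counter and *_issue_counter metrics over one window.
    """
    # Every issue is an error, and every error is a request.
    if not 0 <= issues <= errors <= total:
        raise ValueError("expected 0 <= issues <= errors <= total")
    if total == 0:
        return 0.0, 0.0  # no traffic in the window
    return errors / total, issues / total


# Example with made-up values: 200 requests, 10 errors, 4 flagged as issues.
print(error_and_issue_rates(200, 10, 4))
```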
Kafka specific metrics
groundcover_client_offset
client last message offset (for producer the last offset produced, for consumer the last requested offset)
groundcover_workload_client_offset
client last message offset (for producer the last offset produced, for consumer the last requested offset), aggregated by workload
groundcover_calc_lagged_messages
current lag in messages
Number
groundcover_workload_calc_lagged_messages
current lag in messages, aggregated by workload
Number
groundcover_calc_lag_seconds
current lag in time
Seconds
groundcover_workload_calc_lag_seconds
current lag in time, aggregated by workload
Seconds