Infrastructure Monitoring
Get complete visibility into your cloud infrastructure performance at any scale, access all your metrics in one place, and optimize infrastructure efficiency.
The groundcover platform offers infrastructure monitoring capabilities that were built for cloud-native environments. It enables you to track the health and efficiency of your infrastructure instantly, with an effortless deployment process.
Troubleshoot efficiently - groundcover acts as a centralized hub for all your infrastructure, application and customer metrics, so you can query, correlate and troubleshoot your cloud environments using real-time data and insights across your entire stack.
Store it all, without breaking a sweat - store any volume of metrics without worrying about cardinality or retention limits. Your subscription costs remain unaffected by the granularity of the metrics you store or query.
groundcover's proprietary eBPF sensor collects comprehensive data across your cloud environments without the burden of performance overhead. This data is sourced from various Kubernetes components, including kube-system workloads, cluster information via the Kubernetes API, and the applications' interactions with the Kubernetes infrastructure. Collecting this level of detail at the kernel level provides actionable insights into the health of your Kubernetes clusters, which are indispensable for troubleshooting existing issues and taking proactive steps to future-proof your cloud environments.
You can also define the retention period for your metrics in the VictoriaMetrics database. By default, metrics are retained for 7 days, but you can adjust this period to suit your needs.
Beyond collecting data, groundcover's methodology adds a layer of data enrichment that correlates Kubernetes metrics with application performance indicators. This correlation is crucial for building a clear picture of the Kubernetes ecosystem. It enables a deep understanding of how Kubernetes interacts with applications and helps identify potential points of failure across the interconnected environment. By monitoring Kubernetes not as an isolated platform but as an integral part of the application infrastructure, groundcover ensures that the monitoring strategy aligns with your dynamic and complex cloud operations.
Monitoring a cluster involves tracking the resources that are critical to the performance and stability of the entire system. The following metrics are essential for maintaining a healthy Kubernetes cluster:
CPU consumption: It's essential to track the CPU resources being utilized against the total capacity to prevent workloads from failing due to insufficient CPU availability.
Memory utilization: Keeping an eye on the remaining memory resources ensures that your cluster doesn't encounter disruptions due to memory shortages.
Disk space allocation: For Kubernetes clusters running stateful applications or requiring persistent storage for data, such as etcd databases, tracking the available disk space is crucial to avert potential storage deficiencies.
Network usage: Visualize traffic rates and connections being established and closed on a service-to-service level of granularity, and easily pinpoint cross availability zone communication to investigate misconfigurations and surging costs.
Available Labels
type
clusterId
region
namespace
node_name
workload_name
pod_name
container_name
container_image
Available Metrics
Name | Description | Type |
---|---|---|
groundcover_container_cpu_usage_rate_millis | CPU usage in mCPU | Gauge |
groundcover_container_cpu_request_m_cpu | K8s container CPU request (mCPU) | Gauge |
groundcover_container_cpu_limit_m_cpu | K8s container CPU limit (mCPU) | Gauge |
groundcover_container_memory_working_set_bytes | current memory working set (B) | Gauge |
groundcover_container_memory_rss_bytes | current memory RSS (B) | Gauge |
groundcover_container_memory_request_bytes | K8s container memory request (B) | Gauge |
groundcover_container_memory_limit_bytes | K8s container memory limit (B) | Gauge |
groundcover_container_cpu_delay_seconds | K8s container CPU delay accounting in seconds | Counter |
groundcover_container_disk_delay_seconds | K8s container disk delay accounting in seconds | Counter |
groundcover_container_cpu_throttled_seconds_total | K8s container total CPU throttling in seconds | Counter |
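The usage and limit gauges in this table are typically read together, as a ratio of usage to limit. The following is a minimal Python sketch of that calculation, not a groundcover API: the `utilization` helper and the sample values are hypothetical, and treating a zero limit as "no limit configured" is an assumption.

```python
from typing import Optional


def utilization(used: float, limit: float) -> Optional[float]:
    """Return used/limit as a fraction, or None when no positive limit is known."""
    if limit <= 0:
        # Assumption: a missing or zero limit means "no limit configured".
        return None
    return used / limit


# Hypothetical sample values for a single container, keyed by metric name.
sample = {
    "groundcover_container_cpu_usage_rate_millis": 250.0,             # mCPU
    "groundcover_container_cpu_limit_m_cpu": 1000.0,                  # mCPU
    "groundcover_container_memory_working_set_bytes": 300 * 1024**2,  # B
    "groundcover_container_memory_limit_bytes": 512 * 1024**2,        # B
}

cpu = utilization(
    sample["groundcover_container_cpu_usage_rate_millis"],
    sample["groundcover_container_cpu_limit_m_cpu"],
)
mem = utilization(
    sample["groundcover_container_memory_working_set_bytes"],
    sample["groundcover_container_memory_limit_bytes"],
)
print(f"CPU: {cpu:.0%} of limit, memory: {mem:.0%} of limit")
```

A container that runs close to its CPU limit is a natural candidate for checking groundcover_container_cpu_throttled_seconds_total as well.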
Available Labels
type
clusterId
region
node_name
Available Metrics
Name | Description | Type |
---|---|---|
groundcover_node_allocatable_cpum_cpu | amount of allocatable CPU in the current node (mCPU) | Gauge |
groundcover_node_allocatable_mem_bytes | amount of allocatable memory in the current node (B) | Gauge |
groundcover_node_mem_used_percent | percent of used memory in current node (0-100) | Gauge |
groundcover_node_used_disk_space | current used disk space in current node (B) | Gauge |
groundcover_node_free_disk_space | amount of free disk space in current node (B) | Gauge |
groundcover_node_total_disk_space | amount of total disk space in current node (B) | Gauge |
groundcover_node_used_percent_disk_space | percent of used disk space in current node (0-100) | Gauge |
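The node labels listed above can be used to narrow a query to a specific cluster or node. The sketch below builds a PromQL-style selector string in Python, assuming the metrics are queried with PromQL-compatible syntax (as VictoriaMetrics supports). The `selector` helper, the label values and the 80% threshold are hypothetical.

```python
def selector(metric: str, **labels: str) -> str:
    """Build a PromQL-style selector such as metric{label="value"}."""

    def esc(value: str) -> str:
        # Escape backslashes, double quotes and newlines inside label values.
        return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")

    if not labels:
        return metric
    matchers = ",".join(f'{key}="{esc(val)}"' for key, val in sorted(labels.items()))
    return f"{metric}{{{matchers}}}"


# Hypothetical label values; the threshold of 80 is an illustrative choice.
query = selector(
    "groundcover_node_used_percent_disk_space",
    clusterId="prod",
    node_name="node-a",
) + " > 80"
print(query)
```

Sorting the matchers keeps the generated string stable, which makes queries easier to compare and cache.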
Available Labels
type
clusterId
region
name
namespace
Available Metrics
Name | Description | Type |
---|---|---|
groundcover_pvc_usage_bytes | PVC used bytes (B) | Gauge |
groundcover_pvc_capacity_bytes | PVC capacity bytes (B) | Gauge |
groundcover_pvc_available_bytes | PVC available bytes (B) | Gauge |
groundcover_pvc_usage_percent | percent of used pvc storage (0-100) | Gauge |
Available Labels
clusterId
workload_name
namespace
container_name
remote_service_name
remote_namespace
remote_is_external
availability_zone
region
remote_availability_zone
remote_region
is_cross_az
protocol
role
server_port
encryption
transport_protocol
is_loopback
Notes:
is_loopback and remote_is_external are special labels that indicate the remote service is either the same service as the recording side (loopback) or resides in an external network, e.g. a managed service outside of the cluster (external). In both cases the remote_service_name and remote_namespace labels will be empty.
is_cross_az means the traffic was sent and/or received between two different availability zones. This is a helpful flag to quickly identify this special kind of communication. The actual zones are detailed in the availability_zone and remote_availability_zone labels.
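The label semantics described in these notes can be sketched as a small classifier. This is an illustration only: the `remote_kind` helper is hypothetical, and it assumes the boolean labels are encoded as the strings "true" and "false", which the notes do not specify.

```python
from typing import Mapping


def remote_kind(labels: Mapping[str, str]) -> str:
    """Classify the remote side of a network series from its labels."""
    # Assumption: boolean labels carry the string "true" when set.
    if labels.get("is_loopback") == "true":
        return "loopback"
    if labels.get("remote_is_external") == "true":
        return "external"
    return "in-cluster"


# Hypothetical label set for a series that talks to an external managed service.
series = {
    "workload_name": "checkout",
    "remote_is_external": "true",
    "is_loopback": "false",
    "remote_service_name": "",  # empty for loopback and external traffic
    "remote_namespace": "",
}
print(remote_kind(series))
```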
Available Metrics
Name | Description | Type |
---|---|---|
groundcover_network_rx_bytes_total | Bytes received by the workload (B) | Counter |
groundcover_network_tx_bytes_total | Bytes sent by the workload (B) | Counter |
groundcover_network_connections_opened_total | Connections opened by the workload | Counter |
groundcover_network_connections_closed_total | Connections closed by the workload | Counter |
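Because these metrics are counters, their raw values only grow, and they are usually read as an increase or a rate over a time window. The sketch below shows that calculation in Python. The helper names and sample values are hypothetical, and treating a drop in value as a counter reset follows the Prometheus convention rather than anything stated here.

```python
from typing import Sequence


def counter_increase(samples: Sequence[float]) -> float:
    """Sum the increases between consecutive samples of a counter."""
    total = 0.0
    for prev, cur in zip(samples, samples[1:]):
        # Assumption: a drop means the counter restarted from zero.
        total += cur - prev if cur >= prev else cur
    return total


def per_second(samples: Sequence[float], window_seconds: float) -> float:
    """Average per-second rate of a counter over the sampled window."""
    return counter_increase(samples) / window_seconds


# Hypothetical groundcover_network_rx_bytes_total samples with one reset.
rx_samples = [100.0, 150.0, 20.0, 50.0]
print(counter_increase(rx_samples), per_second(rx_samples, 60.0))
```

Applying the same calculation only to series where is_cross_az is set gives the volume of cross availability zone traffic.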