Monitoring

Resource Usage

Citus Cloud metrics report on your cluster’s health and performance. The “Metrics” tab of the Cloud Console graphs a number of measurements, each viewable per node.

Amazon EBS Volume Metrics

  • Read IOPS. The average number of read operations per second.
    ../_images/metric-iops-read.png
  • Write IOPS. The average number of write operations per second.
    ../_images/metric-iops-write.png
  • Average Queue Length (Count). The number of read and write operation requests waiting to be completed.
    ../_images/metric-queue.png
  • Average Read Latency (Seconds)
    ../_images/metric-latency-read.png
  • Average Write Latency (Seconds)
    ../_images/metric-latency-write.png
  • Bytes Read / Second
    ../_images/metric-bytes-read.png
  • Bytes Written / Second
    ../_images/metric-bytes-write.png

CPU and Network

  • CPU Utilization (Percent)
    ../_images/metric-cpu.png
  • Network - Bytes In / Second
    ../_images/metric-network-in.png
  • Network - Bytes Out / Second
    ../_images/metric-network-out.png

PostgreSQL Write-Ahead Log

  • WAL Bytes Written / Second
    ../_images/metric-wal.png

Formation Events Feed

To let you monitor formation events with outside tools in a standard format, we offer an RSS feed per organization. You can use a feed reader or an RSS Slack integration (e.g. on an #ops channel) to keep up to date.

On the upper right of the “Formations” list in the Cloud Console, follow the “Formation Events” link to the RSS feed.

../_images/cloud-formation-events.png
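If you would rather process the feed with your own tooling than with a feed reader, a minimal sketch along the following lines may help. It assumes the feed is ordinary RSS 2.0 with `<item>` elements carrying `<title>`, `<link>`, `<pubDate>` and `<description>`; the sample document, its field contents, and the `parse_events` helper are hypothetical and only illustrate the approach.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document. The real feed's titles and fields may differ.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Formation Events</title>
    <item>
      <title>Failover Scheduled</title>
      <link>https://example.com/formations/abc123</link>
      <pubDate>Mon, 01 Jan 2018 12:00:00 GMT</pubDate>
      <description>Formation name: example-formation</description>
    </item>
  </channel>
</rss>"""


def parse_events(feed_xml):
    """Return one dict per <item> found in an RSS 2.0 document."""
    root = ET.fromstring(feed_xml)
    events = []
    for item in root.iter("item"):
        events.append({
            # findtext returns the default when the child element is absent.
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "published": item.findtext("pubDate", default=""),
            "description": item.findtext("description", default=""),
        })
    return events


for event in parse_events(SAMPLE_FEED):
    print(event["published"], event["title"], event["link"])
```

In practice you would replace `SAMPLE_FEED` with the body fetched from the URL behind the “Formation Events” link, for example with `urllib.request.urlopen`.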

The feed includes entries for three types of events, each with the following details:

Server Unavailable

This is a notification of connectivity problems such as hardware failure.

  • Formation name
  • Formation URL
  • Server

Failover Scheduled

This event announces an upcoming failover. It appears for planned upgrades, and when a formation operating without high availability experiences a failure.

  • Formation name
  • Formation URL
  • Leader
  • Failover at

For planned failovers, “failover at” will usually match your maintenance window. Note that the failover might happen at that time or shortly thereafter, once a follower is available and has caught up to the primary database.

Failover

Failovers happen to address hardware failure, as mentioned above, and for other reasons such as performing system software upgrades or transferring data to a server with better hardware.

  • Formation name
  • Formation URL
  • Leader
  • Situation
  • Follower
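To route the three event types differently (for instance, paging on “Server Unavailable” but only logging a scheduled failover), a script could classify feed entries by type. The sketch below assumes, without confirmation from the feed itself, that each entry's title contains the event type name as written in this section; `EVENT_TYPES` and `classify` are hypothetical names.

```python
# Event type names as listed in this section. "Failover Scheduled" is checked
# before "Failover" so the more specific name wins when both would match.
EVENT_TYPES = ("Server Unavailable", "Failover Scheduled", "Failover")


def classify(title):
    """Return the event type named in an entry title, or None if unrecognized."""
    lowered = title.lower()
    for event_type in EVENT_TYPES:
        if event_type.lower() in lowered:
            return event_type
    return None


# Assumed example titles; the real feed may word them differently.
for title in ("Failover Scheduled", "Failover", "Server Unavailable"):
    print(title, "->", classify(title))
```

Titles that match none of the names return `None`, so unexpected entries can be surfaced rather than silently dropped.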

Systemic Cloud Status

Any events affecting the Citus Cloud platform as a whole are recorded on status.citusdata.com.