Skip to main content > Citus Data logo Citus
v11.2

Get Started

  • What is Citus?
    • Citus Gives You Postgres At Any Scale
  • When to Use Citus
    • Multi-Tenant SaaS Database
    • Real-Time Analytics
    • Considerations for Use
    • When Citus is Inappropriate
  • Quick Tutorials
    • Multi-tenant Applications
      • Data model and sample data
      • Creating tables
      • Distributing tables and loading data
      • Running queries
    • Real-time Analytics
      • Data model and sample data
      • Creating tables
      • Distributing tables and loading data
      • Running queries

Install

  • Single-Node Citus
    • Docker (Mac or Linux)
    • Ubuntu or Debian
    • Fedora, CentOS, or Red Hat
  • Multi-Node Citus
    • Ubuntu or Debian
      • Steps to be executed on all nodes
      • Steps to be executed on the coordinator node
    • Fedora, CentOS, or Red Hat
      • Steps to be executed on all nodes
      • Steps to be executed on the coordinator node
  • Managed Service

Use-Case Guides

  • Multi-tenant Applications
    • Let’s Make an App – Ad Analytics
    • Scaling the Relational Data Model
    • Preparing Tables and Ingesting Data
      • Try it Yourself
    • Integrating Applications
    • Sharing Data Between Tenants
    • Online Changes to the Schema
    • When Data Differs Across Tenants
    • Scaling Hardware Resources
    • Dealing with Big Tenants
    • Where to Go From Here
  • Real-Time Dashboards
    • Data Model
    • Rollups
    • Expiring Old Data
    • Approximate Distinct Counts
    • Unstructured Data with JSONB
  • Timeseries Data
    • Scaling Timeseries Data on Citus
    • Automating Partition Creation
    • Archiving with Columnar Storage
      • Archiving a Row Partition to Columnar Storage

Architecture

  • Concepts
    • Nodes
      • Coordinator and Workers
    • Distributed Data
      • Table Types
        • Type 1: Distributed Tables
        • Type 2: Reference Tables
        • Type 3: Local Tables
      • Shards
        • Shard Placements
      • Co-Location
      • Parallelism
    • Query Execution

Develop

  • Determining Application Type
    • At a Glance
    • Examples and Characteristics
  • Choosing Distribution Column
    • Multi-Tenant Apps
      • Best Practices
    • Real-Time Apps
      • Best Practices
    • Timeseries Data
      • Best Practices
    • Table Co-Location
      • Data co-location in Citus for hash-distributed tables
      • A practical example of co-location
      • Using Regular PostgreSQL Tables
      • Distributing tables by ID
      • Distributing tables by tenant
      • Co-location means better feature support
      • Query Performance
  • Migrating an Existing App
    • Identify Distribution Strategy
      • Pick distribution key
      • Identify types of tables
    • Prepare Source Tables for Migration
      • Add distribution keys
      • Backfill newly created columns
    • Prepare Application for Citus
      • Set up Development Citus Cluster
        • Include distribution column in keys
      • Add distribution key to queries
        • Ruby on Rails
        • Django
        • ASP.NET
        • Java Hibernate
        • Other (SQL Principles)
      • Enable Secure Connections
      • Check for cross-node traffic
    • Migrate Production Data
      • Small Database Migration
      • Big Database Migration
        • Duplicate schema
        • Enable logical replication
        • Open access for network connection
        • Begin Replication
        • Switch over to Citus and stop all connections to old database
  • SQL Reference
    • Creating and Modifying Distributed Objects (DDL)
      • Creating And Distributing Tables
        • Reference Tables
        • Distributing Coordinator Data
      • Co-Locating Tables
        • Upgrading from Citus 5.x
      • Dropping Tables
      • Modifying Tables
        • Adding/Modifying Columns
        • Adding/Removing Constraints
        • Using NOT VALID Constraints
        • Adding/Removing Indices
      • Types and Functions
      • Manual Modification
    • Ingesting, Modifying Data (DML)
      • Inserting Data
        • “From Select” Clause (Distributed Rollups)
        • COPY Command (Bulk load)
    • Caching Aggregations with Rollups
      • Updates and Deletion
      • Maximizing Write Performance
    • Querying Distributed Tables (SQL)
      • Aggregate Functions
        • Count (Distinct) Aggregates
        • Estimating Top N Items
        • Percentile Calculations
      • Limit Pushdown
      • Views on Distributed Tables
      • Joins
        • Co-located joins
        • Reference table joins
        • Repartition joins
    • Query Processing
      • Distributed Query Planner
      • Distributed Query Executor
        • Subquery/CTE Push-Pull Execution
      • PostgreSQL planner and executor
    • Manual Query Propagation
      • Running on all Workers
      • Running on all Shards
      • Limitations
    • SQL Support and Workarounds
      • Workarounds
        • Work around limitations using CTEs
        • Temp Tables: the Workaround of Last Resort
  • Citus API
    • Citus Utility Functions
      • Table and Shard DDL
        • create_distributed_table
        • truncate_local_data_after_distributing_table
        • undistribute_table
        • alter_distributed_table
        • alter_table_set_access_method
        • remove_local_tables_from_metadata
        • create_reference_table
        • citus_add_local_table_to_metadata
        • mark_tables_colocated
        • update_distributed_table_colocation
        • create_distributed_function
        • alter_columnar_table_set
        • create_time_partitions
        • drop_old_time_partitions
        • alter_old_partitions_set_access_method
      • Metadata / Configuration Information
        • citus_add_node
        • citus_update_node
        • citus_set_node_property
        • citus_add_inactive_node
        • citus_activate_node
        • citus_disable_node
        • citus_add_secondary_node
        • citus_remove_node
        • citus_get_active_worker_nodes
        • citus_backend_gpid
        • citus_check_cluster_node_health
        • citus_set_coordinator_host
        • master_get_table_metadata
        • get_shard_id_for_distribution_column
        • column_to_column_name
        • citus_relation_size
        • citus_table_size
        • citus_total_relation_size
        • citus_stat_statements_reset
      • Cluster Management And Repair Functions
        • citus_move_shard_placement
        • citus_rebalance_start
        • citus_rebalance_status
        • citus_rebalance_stop
        • citus_rebalance_wait
        • rebalance_table_shards
        • get_rebalance_table_shards_plan
        • get_rebalance_progress
        • citus_add_rebalance_strategy
        • citus_set_default_rebalance_strategy
        • citus_remote_connection_stats
        • citus_drain_node
        • isolate_tenant_to_new_shard
        • citus_create_restore_point
    • Citus Tables and Views
      • Coordinator Metadata
        • Partition table
        • Shard table
        • Shard information view
        • Shard placement table
        • Worker node table
        • Distributed object table
        • Citus tables view
        • Time partitions view
        • Co-location group table
        • Rebalancer strategy table
        • Query statistics table
        • Distributed Query Activity
      • Tables on all Nodes
        • Connection Credentials Table
        • Connection Pooling Credentials
    • Configuration Reference
      • General configuration
        • citus.max_worker_nodes_tracked (integer)
        • citus.use_secondary_nodes (enum)
        • citus.cluster_name (text)
        • citus.enable_version_checks (boolean)
        • citus.log_distributed_deadlock_detection (boolean)
        • citus.distributed_deadlock_detection_factor (floating point)
        • citus.node_connection_timeout (integer)
        • citus.node_conninfo (text)
        • citus.local_hostname (text)
        • citus.show_shards_for_app_name_prefixes (text)
      • Query Statistics
        • citus.stat_statements_purge_interval (integer)
        • citus.stat_statements_max (integer)
        • citus.stat_statements_track (enum)
      • Data Loading
        • citus.multi_shard_commit_protocol (enum)
        • citus.shard_count (integer)
        • citus.shard_max_size (integer)
        • citus.replicate_reference_tables_on_activate (boolean)
      • Planner Configuration
        • citus.local_table_join_policy (enum)
        • citus.limit_clause_row_fetch_count (integer)
        • citus.count_distinct_error_rate (floating point)
        • citus.task_assignment_policy (enum)
      • Intermediate Data Transfer
        • citus.binary_worker_copy_format (boolean)
        • citus.max_intermediate_result_size (integer)
      • DDL
        • citus.enable_ddl_propagation (boolean)
        • citus.enable_local_reference_table_foreign_keys (boolean)
      • Executor Configuration
        • General
        • Explain output
  • External Integrations
    • Ingesting Data from Kafka
      • Caveats
    • Ingesting Data from Spark
    • Business Intelligence with Tableau

Administer

  • Cluster Management
    • Choosing Cluster Size
      • Shard Count
        • Multi-Tenant SaaS Use-Case
        • Real-Time Analytics Use-Case
    • Initial Hardware Size
      • Multi-Tenant SaaS Use-Case
      • Real-Time Analytics Use-Case
    • Scaling the cluster
      • Add a worker
      • Rebalance Shards without Downtime
        • How it Works
      • Adding a coordinator
    • Dealing With Node Failures
      • Worker Node Failures
      • Coordinator Node Failures
    • Tenant Isolation
    • Viewing Query Statistics
      • Statistics Expiration
    • Resource Conservation
      • Limiting Long-Running Queries
    • Security
      • Connection Management
      • Setup Certificate Authority signed certificates
      • Increasing Worker Security
      • Row-Level Security
    • PostgreSQL extensions
    • Creating a New Database
  • Table Management
    • Determining Table and Relation Size
    • Vacuuming Distributed Tables
    • Analyzing Distributed Tables
    • Columnar Storage
      • Usage
      • Measuring compression
      • Example
      • Gotchas
      • Limitations
  • Upgrading Citus
    • Upgrading Citus Versions
      • Patch Version Upgrade
      • Major and Minor Version Upgrades
        • Step 1. Update Citus Package
        • Step 2. Apply Update in DB
    • Upgrading PostgreSQL version from 14 to 15
      • For Every Node

Troubleshoot

  • Query Performance Tuning
    • Table Distribution and Shards
    • PostgreSQL tuning
    • Scaling Out Performance
    • Distributed Query Performance Tuning
      • General
        • Subquery/CTE Network Overhead
      • Advanced
        • Connection Management
        • Task Assignment Policy
        • Intermediate Data Transfer Format
        • Binary protocol
    • Scaling Out Data Ingestion
      • Real-time Insert and Updates
        • Insert Throughput
        • Update Throughput
        • Insert and Update: Throughput Checklist
        • Insert and Update: Latency
      • Staging Data Temporarily
      • Bulk Copy (250K - 2M/s)
  • Useful Diagnostic Queries
    • Finding which shard contains data for a specific tenant
    • Finding the distribution column for a table
    • Detecting locks
    • Querying the size of your shards
    • Querying the size of all distributed tables
    • Identifying unused indices
    • Monitoring client connection count
    • Viewing system queries
      • Active queries
      • Why are queries waiting
    • Index hit rate
    • Cache hit rate
  • Common Error Messages
    • Could not receive query results
      • Resolution
    • Canceling the transaction since it was involved in a distributed deadlock
      • Resolution
    • Could not connect to server: Cannot assign requested address
      • Resolution
    • SSL error: certificate verify failed
      • Resolution
    • Could not connect to any active placements
      • Resolution
    • Remaining connection slots are reserved for non-replication superuser connections
      • Resolution
    • PgBouncer cannot connect to server
      • Resolution
    • Relation foo is not distributed
      • Resolution
    • Unsupported clause type
      • Resolution
    • Cannot open new connections after the first modification command within a transaction
      • Resolution
    • Cannot create uniqueness constraint
      • Resolution
    • Function create_distributed_table does not exist
      • Resolution
    • STABLE functions used in UPDATE queries cannot be called with column references
      • Resolution

FAQ

  • Frequently Asked Questions
    • Can I create primary keys on distributed tables?
    • How do I add nodes to an existing Citus cluster?
    • How does Citus handle failure of a worker node?
    • How does Citus handle failover of the coordinator node?
    • Are there any PostgreSQL features not supported by Citus?
    • How do I choose the shard count when I hash-partition my data?
    • How do I change the shard count for a hash partitioned table?
    • How does citus support count(distinct) queries?
    • In which situations are uniqueness constraints supported on distributed tables?
    • How do I create database roles, functions, extensions etc in a Citus cluster?
    • What if a worker node’s address changes?
    • Which shard contains data for a particular tenant?
    • I forgot the distribution column of a table, how do I find it?
    • Can I distribute a table by multiple keys?
    • Why does pg_relation_size report zero bytes for a distributed table?
    • Why am I seeing an error about max_intermediate_result_size?
    • Can I run Citus on Microsoft Azure?
    • Can I shard by schema on Citus for multi-tenant applications?
    • How does cstore_fdw work with Citus?
    • What happened to pg_shard?

Articles

  • Related Articles
    • Efficient Rollup Tables with HyperLogLog in Postgres
      • Rollup tables without HLL—using GitHub events data as an example
      • Without HLL, rollup tables have a few limitations
      • HLL to the rescue
      • HLL and rollup tables in action, together
      • What kinds of queries can HLL answer?
      • A rollup table with HLL is worth a thousand rollup tables without HLL
      • Want to learn more about HLL in Postgres?
    • Distributed Distinct Count with HyperLogLog on Postgres
      • What does HLL do behind the curtains?
        • Hash all elements
        • Observe the data for rare patterns
        • Stochastic Averaging
        • More?
      • HLL in distributed systems
      • Hands on with HLL
        • Setup
        • Examples
        • Conclusion
    • Postgres Parallel Indexing in Citus
    • Real-time Event Aggregation at Scale Using Postgres with Citus
    • How Distributed Outer Joins on PostgreSQL with Citus Work
      • Distributed Outer Joins with Citus
    • Designing your SaaS Database for Scale with Postgres
    • Building a Scalable Postgres Metrics Backend using the Citus Extension
      • Time-Series Metrics
      • Events
    • Sharding a Multi-Tenant App with Postgres
      • Tenancy
      • Multi-tenancy and co-location, a perfect pair
      • In conclusion
    • Sharding Postgres with Semi-Structured Data and Its Performance Implications
      • One large table, without joins
      • Enter Citus
      • The query workload
      • Every distribution has its thorns
    • Scalable Real-time Product Search using PostgreSQL with Citus
Versions
latest
stable
v11.2
v11.1
v11.0
v10.2
v10.1
v10.0
v9.5
v9.4
v9.3
v9.2
v9.1
v9.0
v8.3
v8.2
v8.1
v8.0
v7.5
v7.4
v7.3
v7.2
v7.1
v7.0
v6.2.2
v6.2
v6.1
v6.0
v5.2
v5.1
v5.0
a11y-2
Downloads
pdf
epub
On Read the Docs
Project Home
Builds
Citus
  • »
  • Quick Tutorials
  • Edit on GitHub

Quick Tutorials

  • Multi-tenant Applications
    • Data model and sample data
    • Creating tables
    • Distributing tables and loading data
    • Running queries
  • Real-time Analytics
    • Data model and sample data
    • Creating tables
    • Distributing tables and loading data
    • Running queries
Previous Next

© Copyright 2023, Citus Data, a Microsoft Company. Revision cb0d4914.