Partition Large Tables for Better Performance

Why This Matters

Partitioning splits a large table into smaller, manageable pieces called partitions. Queries that filter on the partition key only scan relevant partitions, dramatically improving performance.

Query performance: Only scans partitions matching the WHERE clause
Maintenance speed: VACUUM, ANALYZE, and index rebuilds run on individual partitions
Data management: Drop old partitions instantly instead of deleting millions of rows
Index efficiency: Smaller indexes per partition are faster to maintain

When to partition: Consider partitioning when tables exceed 10-50GB or contain millions of rows, especially for time-series data like logs, events, or analytics.

The Problem

Incorrect - Single large table

-- Single table with 100M+ rows
CREATE TABLE events (
  id BIGINT PRIMARY KEY,
  user_id INT,
  event_type TEXT,
  created_at TIMESTAMPTZ,
  data JSONB
);

-- Query for last 7 days scans entire table
SELECT * FROM events
WHERE created_at >= NOW() - INTERVAL '7 days';

-- VACUUM takes hours
-- Deleting old data is slow
DELETE FROM events WHERE created_at < NOW() - INTERVAL '1 year';  -- Very slow

The Solution

Correct - Range partitioned by month

-- Create partitioned table
CREATE TABLE events (
  id BIGINT,
  user_id INT,
  event_type TEXT,
  created_at TIMESTAMPTZ,
  data JSONB,
  PRIMARY KEY (id, created_at)
) PARTITION BY RANGE (created_at);

-- Create monthly partitions
CREATE TABLE events_2024_01 PARTITION OF events
  FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');

CREATE TABLE events_2024_02 PARTITION OF events
  FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');

-- Query only scans relevant partitions
SELECT * FROM events
WHERE created_at >= NOW() - INTERVAL '7 days';  -- Fast!

-- Drop old partition instantly
DROP TABLE events_2023_01;  -- Instant, no DELETE needed

Result: Queries filtering on created_at only scan relevant partitions. A query for the last 7 days scans 1 partition instead of 100M rows, reducing query time from 5 seconds to 50ms.

Automatic Partition Management

Use a function to automatically create future partitions:

-- Function to create next month's partition
CREATE OR REPLACE FUNCTION create_monthly_partition(table_name TEXT, start_date DATE)
RETURNS VOID AS $$
DECLARE
  partition_name TEXT;
  end_date DATE;
BEGIN
  partition_name := table_name || '_' || TO_CHAR(start_date, 'YYYY_MM');
  end_date := start_date + INTERVAL '1 month';
  
  EXECUTE FORMAT(
    'CREATE TABLE IF NOT EXISTS %I PARTITION OF %I FOR VALUES FROM (%L) TO (%L)',
    partition_name, table_name, start_date, end_date
  );
END;
$$ LANGUAGE plpgsql;

-- Create partitions for next 12 months
SELECT create_monthly_partition('events', DATE_TRUNC('month', NOW()) + (i || ' months')::INTERVAL)
FROM GENERATE_SERIES(0, 11) i;

Common Mistakes

Partitioning small tables: Partitioning adds overhead. Only partition tables with 10GB+ or millions of rows.
Wrong partition key: The partition key must be in the PRIMARY KEY. Use composite keys: PRIMARY KEY (id, created_at).
Too many partitions: Having 1000+ partitions slows down planning. Keep partitions to hundreds, not thousands.
Not filtering on partition key: Queries without a partition key filter scan all partitions, negating benefits.
Forgetting indexes: Create indexes on each partition or use a default index on the parent table.

Important: The partition key column must be part of the PRIMARY KEY. If your primary key is just id, change it to PRIMARY KEY (id, created_at) when partitioning by created_at.

Check Your Project Automatically

Connect your Supabase project to identify large tables that would benefit from partitioning.

Start Supabase Audit

Why This Matters

The Problem

The Solution

Automatic Partition Management

Common Mistakes

Check Your Project Automatically

Related Practices

Learn More