Medium Impact

Batch INSERT Statements for Bulk Data

10-100x faster inserts for bulk operations

Why This Matters

Inserting data row-by-row creates significant overhead. Each INSERT statement requires a round-trip to the database, transaction logging, and index updates. For bulk operations, this can take hours instead of minutes.

Batch inserts are essential for bulk operations such as data imports, migrations, seeding, and ETL jobs.

Warning: Single-row inserts for 10,000 rows can take 30+ seconds. Batch inserts reduce this to under 1 second.

The Problem

Looping through data and inserting one row at a time is extremely slow:

Incorrect - Single-row inserts in a loop
// Application code: inserting 10,000 rows one at a time
for (const item of items) {
  await db.query(
    'INSERT INTO products (name, price, category_id) VALUES ($1, $2, $3)',
    [item.name, item.price, item.category_id]
  );
}

// Result: 10,000 round-trips, 10,000 transaction commits
// Time: ~30 seconds for 10,000 rows

Each INSERT is a separate statement, requiring network round-trips, parsing, planning, and execution overhead. This scales linearly with the number of rows.
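
A related point: part of the per-row cost is the implicit commit after every statement when the driver runs in autocommit mode. As a rough sketch (psycopg2, with the products table from above and a placeholder connection string and items list), grouping the loop into one explicit transaction removes the per-row commit, but each row still costs a round-trip, so statement batching is where the large gains come from:

# One transaction for the whole loop: psycopg2 opens a transaction implicitly
# and `with conn:` commits it once at the end.
import psycopg2

conn = psycopg2.connect("dbname=shop")  # connection string is a placeholder
with conn, conn.cursor() as cur:
    for item in items:  # `items` is assumed from the surrounding examples
        cur.execute(
            "INSERT INTO products (name, price, category_id) VALUES (%s, %s, %s)",
            (item["name"], item["price"], item["category_id"]),
        )
# Still one network round-trip per row; only batching removes that.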

The Solution

Use multi-row INSERT statements to batch multiple rows into a single statement:

Correct - Multi-row INSERT
-- Insert multiple rows in a single statement
INSERT INTO products (name, price, category_id)
VALUES
  ('Product 1', 29.99, 1),
  ('Product 2', 39.99, 1),
  ('Product 3', 49.99, 2),
  -- ... up to 1000 rows per statement
  ('Product 1000', 19.99, 3);

-- Result: 1 round-trip, 1 transaction commit
-- Time: ~0.3 seconds for 10,000 rows (batched in groups of 1000)

For even better performance with very large datasets, use the COPY command:

-- COPY is the fastest method for bulk inserts
COPY products (name, price, category_id)
FROM '/path/to/data.csv'
WITH (FORMAT csv, HEADER true);

-- Or use COPY with STDIN for programmatic inserts
-- COPY products (name, price, category_id) FROM STDIN;
-- [stream data]
-- \.

-- Result: Fastest possible insert speed
-- Time: ~0.1 seconds for 10,000 rows
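
One caveat: COPY ... FROM '/path/to/data.csv' reads a file on the database server's filesystem, which requires server-side file privileges you generally don't have on managed hosts such as Supabase. Streaming through STDIN works from any client. A minimal sketch of that variant with psycopg2 (the connection string and sample rows are placeholders; the table matches the examples above):

# Stream client-side rows through COPY ... FROM STDIN using psycopg2.
import csv
import io

import psycopg2

rows = [("Product 1", 29.99, 1), ("Product 2", 39.99, 1)]  # placeholder data

buf = io.StringIO()
csv.writer(buf).writerows(rows)  # csv module handles quoting and escaping
buf.seek(0)

conn = psycopg2.connect("dbname=shop")  # connection string is a placeholder
with conn, conn.cursor() as cur:
    cur.copy_expert(
        "COPY products (name, price, category_id) FROM STDIN WITH (FORMAT csv)",
        buf,
    )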

Best Practice: Batch sizes of 500-1000 rows per INSERT statement strike a good balance between performance and memory usage, and keep parameterized statements well under PostgreSQL's 65,535 bind-parameter limit. For datasets larger than 100,000 rows, prefer COPY.

Result: Batch inserts reduce insert time from 30 seconds to 0.3 seconds - a 100x improvement. COPY can achieve even better performance for very large datasets.

Implementation Tips

When implementing batch inserts in your application:

// JavaScript/Node.js example (node-postgres)
const batchSize = 1000;

// Insert rows in batches using parameterized placeholders so values
// containing quotes are handled safely and SQL injection is avoided.
for (let i = 0; i < items.length; i += batchSize) {
  const batch = items.slice(i, i + batchSize);

  // Build ($1, $2, $3), ($4, $5, $6), ... for the 3 columns of each row.
  // 1000 rows x 3 columns = 3000 parameters, well under PostgreSQL's
  // 65,535 bind-parameter limit per statement.
  const placeholders = batch
    .map((_, j) => `($${j * 3 + 1}, $${j * 3 + 2}, $${j * 3 + 3})`)
    .join(', ');
  const params = batch.flatMap(item => [item.name, item.price, item.category_id]);

  await db.query(
    `INSERT INTO products (name, price, category_id) VALUES ${placeholders}`,
    params
  );
}

For Python with psycopg2, use execute_values:

# Python example
from psycopg2.extras import execute_values

execute_values(
    cursor,
    "INSERT INTO products (name, price, category_id) VALUES %s",
    [(item['name'], item['price'], item['category_id']) for item in items],
    page_size=1000
)
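
If you use psycopg 3 (the newer psycopg package) instead, you can stream rows straight into COPY from application code; a brief sketch under the same assumptions (products table, an items list of dicts, placeholder connection string):

# psycopg 3: write rows directly into a COPY ... FROM STDIN stream
import psycopg

with psycopg.connect("dbname=shop") as conn:  # connection string is a placeholder
    with conn.cursor() as cur:
        with cur.copy("COPY products (name, price, category_id) FROM STDIN") as copy:
            for item in items:  # `items` as in the examples above
                copy.write_row((item["name"], item["price"], item["category_id"]))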
