Minimalistic Oracle: Compression for Postgres tables

In PostgreSQL you don’t create a “compressed table” in the Oracle sense. Rather, compression is achieved per-column compression via TOAST, plus (if desirable) a few extension/filesystem tricks.

Postgres compresses large variable-length columns using TOAST:

text, varchar, bytea, jsonb, xml, numeric, etc.

TOAST will be applied when a row is too big to fit in an 8kB page; large values are stored in a separate TOAST table, optionally compressed.

There is no built-in heap/row compression for normal fixed-width columns (e.g. integer, bigint) in vanilla Postgres.

From PostgreSQL 14 onward you can choose compression algorithm per column: pglz (classic) or lz4 (faster, generally preferred).

Here is how I did it in my test environment:

First, ensure your PostgreSQL server was compiled with LZ4 support (use pg_config --configure and look for --with-lz4)

pg_config --configure | grep lz4

It will show you a long list of options that was used when PostgreSQL was built. Look for '--with-lz4'

Set compression globally:

show default_toast_compression;
 default_toast_compression
---------------------------
 pglz
(1 row)

postgres=# ALTER SYSTEM SET default_toast_compression = 'lz4';
ALTER SYSTEM
postgres=# SELECT pg_reload_conf();
 pg_reload_conf
----------------
 t
(1 row)

postgres=# show default_toast_compression;
 default_toast_compression
---------------------------
 lz4

Optional: default to LZ4 for this session

  SET default_toast_compression = 'lz4';

Yet another option is to set LZ4 for a specific database:

 ALTER DATABASE mydb SET default_toast_compression = 'lz4';

Create the table:

CREATE TABLE app_logs (
    log_id      bigserial PRIMARY KEY,
    log_time    timestamptz NOT NULL,
    level       text        NOT NULL,
    message     text        COMPRESSION lz4,
    details     jsonb       COMPRESSION lz4
);

Note:

COMPRESSION lz4 / COMPRESSION pglz is a column option.

Only matters for TOAST-able types; it won’t change anything for integer, date, etc.

Compression only happens when the row gets large enough for TOAST to kick in (roughly when row > ~2kB).

You can switch existing columns to LZ4 (or back to pglz):

 ALTER TABLE app_logs
    ALTER COLUMN message SET COMPRESSION lz4,
    ALTER COLUMN details SET COMPRESSION lz4;

Note that an ALTER TABLE only changes the future TOAST entries. To actually recompress existing rows you need to cause a rewrite. Common options:

-- 1) Table rewrite (heavy, but clean)
ALTER TABLE app_logs SET (toast_tuple_target = 2040);  -- optional tweak
VACUUM FULL app_logs;

-- or 2) Cluster on some index (also rewrites)
CLUSTER app_logs USING app_logs_pkey;
ANALYZE app_logs;

Any bulk rewrite (incl. CREATE TABLE AS ..., INSERT INTO new SELECT ... FROM old) will store new TOAST values using the new compression method.

Check that the table is using column compression for TOAST values:

-- Main table vs TOAST table sizes
SELECT
    relname,
    pg_size_pretty(pg_relation_size(oid))          AS heap_size,
    pg_size_pretty(pg_total_relation_size(oid))    AS total_with_indexes_toast
FROM pg_class
WHERE relname IN ('app_logs2','app_logs3','app_logs4');

-- Look at TOAST table directly
SELECT
    c1.relname       AS main_table,
    c2.relname       AS toast_table,
    pg_size_pretty(pg_total_relation_size(c2.oid)) AS toast_total
FROM pg_class c1
JOIN pg_class c2 ON c1.reltoastrelid = c2.oid
WHERE c1.relname IN ('app_logs2','app_logs3','app_logs4');

In a simple test, I created three tables with three different compression directives and created one long value that would make sure it was TOASTED:

CREATE TABLE app_logs2 (
    log_id      bigserial PRIMARY KEY,
    log_time    timestamptz NOT NULL,
    level       text        NOT NULL,
    message     text,
    details     jsonb
);

CREATE TABLE app_logs3 (
    log_id      bigserial PRIMARY KEY,
    log_time    timestamptz NOT NULL,
    level       text        NOT NULL,
    message     text        COMPRESSION lz4,
    details     jsonb       COMPRESSION lz4
);

CREATE TABLE app_logs4 (
    log_id      bigserial PRIMARY KEY,
    log_time    timestamptz NOT NULL,
    level       text        NOT NULL,
    message     text        COMPRESSION pglz,
    details     jsonb       COMPRESSION pglz
);

INSERT INTO app_logs2 (log_time, level, message, details)
VALUES (
    now(),
    'INFO',
    repeat('x', 100000),                -- make it large enough to be TOASTed
    jsonb_build_object('k', repeat('y', 100000))
);

INSERT INTO app_logs3 (log_time, level, message, details)
VALUES (
    now(),
    'INFO',
    repeat('x', 100000),                -- make it large enough to be TOASTed
    jsonb_build_object('k', repeat('y', 100000))
);

INSERT INTO app_logs4 (log_time, level, message, details)
VALUES (
    now(),
    'INFO',
    repeat('x', 100000),                -- make it large enough to be TOASTed
    jsonb_build_object('k', repeat('y', 100000))
);

As expected, the app_logs2 defaulted to lz4 (set globally):

SELECT
    relname,
    pg_size_pretty(pg_relation_size(oid))          AS heap_size,
    pg_size_pretty(pg_total_relation_size(oid))    AS total_with_indexes_toast
FROM pg_class
WHERE relname IN ('app_logs2','app_logs3','app_logs4');

-- Look at TOAST table directly
SELECT
    c1.relname       AS main_table,
    c2.relname       AS toast_table,
    pg_size_pretty(pg_total_relation_size(c2.oid)) AS toast_total
FROM pg_class c1
JOIN pg_class c2 ON c1.reltoastrelid = c2.oid
WHERE c1.relname IN ('app_logs2','app_logs3','app_logs4');

 relname  | heap_size  | total_with_indexes_toast
-----------+------------+--------------------------
 app_logs2 | 8192 bytes | 32 kB
 app_logs3 | 8192 bytes | 32 kB
 app_logs4 | 8192 bytes | 48 kB
(3 rows)

 main_table |   toast_table    | toast_total
------------+------------------+-------------
 app_logs2  | pg_toast_2510179 | 8192 bytes
 app_logs3  | pg_toast_2510188 | 8192 bytes
 app_logs4  | pg_toast_2510197 | 24 kB
(3 rows)

Remember, per-column compression via default_toast_compression doesn not show up in \d+ unless it was explicitly set in the column definition.

So even if table app_logs2 uses compression for TOASTable columns, it does not reveal this fact when being described:

CREATE TABLE app_logs2 (
    log_id      bigserial PRIMARY KEY,
    log_time    timestamptz NOT NULL,
    level       text        NOT NULL,
    message     text,
    details     jsonb
);

 \d+ app_logs2
                                                                 Table "myschema.app_logs2"
  Column  |           Type           | Collation | Nullable |                  Default                  | Storage  | Compression | Stats target | Description
----------+--------------------------+-----------+----------+-------------------------------------------+----------+-------------+--------------+-------------
 log_id   | bigint                   |           | not null | nextval('app_logs2_log_id_seq'::regclass) | plain    |             |              |
 log_time | timestamp with time zone |           | not null |                                           | plain    |             |              |
 level    | text                     |           | not null |                                           | extended |             |              |
 message  | text                     |           |          |                                           | extended |             |              |
 details  | jsonb                    |           |          |                                           | extended |             |              |

Minimalistic Oracle

Thursday, January 29, 2026

Compression for Postgres tables

No comments:

Post a Comment