Showing posts with label Compression.

Thursday, January 29, 2026

Compression for Postgres tables

In PostgreSQL you don’t create a “compressed table” in the Oracle sense. Rather, compression is applied per column via TOAST, plus (if desirable) a few extension/filesystem tricks.

Postgres compresses large variable-length columns using TOAST:
text, varchar, bytea, jsonb, xml, numeric, etc.
TOAST is applied when a row is too big to fit comfortably in an 8kB page; large values are then stored out of line in a separate TOAST table, optionally compressed.

There is no built-in heap/row compression for normal fixed-width columns (e.g. integer, bigint) in vanilla Postgres.

From PostgreSQL 14 onward you can choose compression algorithm per column: pglz (classic) or lz4 (faster, generally preferred).

Here is how I did it in my test environment:

First, ensure your PostgreSQL server was compiled with LZ4 support (run pg_config --configure and look for --with-lz4):
pg_config --configure | grep lz4
This prints the long list of options that were used when PostgreSQL was built; look for '--with-lz4'.
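You can also check from inside the database: the pg_config system view (PostgreSQL 9.6 and later, readable by superusers by default) exposes the same build options:
SELECT setting FROM pg_config WHERE name = 'CONFIGURE';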

Set compression globally:
show default_toast_compression;
 default_toast_compression
---------------------------
 pglz
(1 row)

postgres=# ALTER SYSTEM SET default_toast_compression = 'lz4';
ALTER SYSTEM
postgres=# SELECT pg_reload_conf();
 pg_reload_conf
----------------
 t
(1 row)

postgres=# show default_toast_compression;
 default_toast_compression
---------------------------
 lz4
Optional: default to LZ4 for this session
  SET default_toast_compression = 'lz4';
Yet another option is to set LZ4 for a specific database:
 ALTER DATABASE mydb SET default_toast_compression = 'lz4';
Create the table:
CREATE TABLE app_logs (
    log_id      bigserial PRIMARY KEY,
    log_time    timestamptz NOT NULL,
    level       text        NOT NULL,
    message     text        COMPRESSION lz4,
    details     jsonb       COMPRESSION lz4
);
Note:
  • COMPRESSION lz4 / COMPRESSION pglz is a column option.
  • Only matters for TOAST-able types; it won’t change anything for integer, date, etc.
  • Compression only happens when the row gets large enough for TOAST to kick in (roughly when the row exceeds ~2kB); see the quick check below.
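
A quick check of that threshold, assuming the app_logs table above: pg_column_compression() (PostgreSQL 14+) returns the method used for a stored value, or NULL if the value was stored uncompressed.
  INSERT INTO app_logs (log_time, level, message)
  VALUES (now(), 'INFO', 'short message'),       -- stays below the TOAST threshold
         (now(), 'INFO', repeat('x', 10000));    -- large enough to be compressed

  SELECT log_id,
         length(message)                AS len,
         pg_column_compression(message) AS compression   -- NULL = not compressed
  FROM app_logs;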

    You can switch existing columns to LZ4 (or back to pglz):
     ALTER TABLE app_logs
        ALTER COLUMN message SET COMPRESSION lz4,
        ALTER COLUMN details SET COMPRESSION lz4;
    
     
    Note that ALTER TABLE ... SET COMPRESSION only affects values TOASTed from then on. To actually recompress existing rows you need to force a rewrite. Common options:
    -- 1) Table rewrite (heavy, but clean)
    ALTER TABLE app_logs SET (toast_tuple_target = 2040);  -- optional tweak
    VACUUM FULL app_logs;
    
    -- or 2) Cluster on some index (also rewrites)
    CLUSTER app_logs USING app_logs_pkey;
    ANALYZE app_logs;
    
    Any bulk rewrite (incl. CREATE TABLE AS ..., INSERT INTO new SELECT ... FROM old) will store new TOAST values using the new compression method.


  • Check that the table is using column compression for TOAST values:

    -- Main table vs TOAST table sizes
    SELECT
        relname,
        pg_size_pretty(pg_relation_size(oid))          AS heap_size,
        pg_size_pretty(pg_total_relation_size(oid))    AS total_with_indexes_toast
    FROM pg_class
    WHERE relname IN ('app_logs2','app_logs3','app_logs4');
    
    -- Look at TOAST table directly
    SELECT
        c1.relname       AS main_table,
        c2.relname       AS toast_table,
        pg_size_pretty(pg_total_relation_size(c2.oid)) AS toast_total
    FROM pg_class c1
    JOIN pg_class c2 ON c1.reltoastrelid = c2.oid
    WHERE c1.relname IN ('app_logs2','app_logs3','app_logs4');
    
    In a simple test, I created three tables with three different compression directives and inserted one long value into each to make sure it would be TOASTed:
    CREATE TABLE app_logs2 (
        log_id      bigserial PRIMARY KEY,
        log_time    timestamptz NOT NULL,
        level       text        NOT NULL,
        message     text,
        details     jsonb
    );
    
    CREATE TABLE app_logs3 (
        log_id      bigserial PRIMARY KEY,
        log_time    timestamptz NOT NULL,
        level       text        NOT NULL,
        message     text        COMPRESSION lz4,
        details     jsonb       COMPRESSION lz4
    );
    
    CREATE TABLE app_logs4 (
        log_id      bigserial PRIMARY KEY,
        log_time    timestamptz NOT NULL,
        level       text        NOT NULL,
        message     text        COMPRESSION pglz,
        details     jsonb       COMPRESSION pglz
    );
    
    INSERT INTO app_logs2 (log_time, level, message, details)
    VALUES (
        now(),
        'INFO',
        repeat('x', 100000),                -- make it large enough to be TOASTed
        jsonb_build_object('k', repeat('y', 100000))
    );
    
    INSERT INTO app_logs3 (log_time, level, message, details)
    VALUES (
        now(),
        'INFO',
        repeat('x', 100000),                -- make it large enough to be TOASTed
        jsonb_build_object('k', repeat('y', 100000))
    );
    
    INSERT INTO app_logs4 (log_time, level, message, details)
    VALUES (
        now(),
        'INFO',
        repeat('x', 100000),                -- make it large enough to be TOASTed
        jsonb_build_object('k', repeat('y', 100000))
    );
    
    
    As expected, app_logs2 defaulted to lz4 (set globally):
    SELECT
        relname,
        pg_size_pretty(pg_relation_size(oid))          AS heap_size,
        pg_size_pretty(pg_total_relation_size(oid))    AS total_with_indexes_toast
    FROM pg_class
    WHERE relname IN ('app_logs2','app_logs3','app_logs4');
    
    -- Look at TOAST table directly
    SELECT
        c1.relname       AS main_table,
        c2.relname       AS toast_table,
        pg_size_pretty(pg_total_relation_size(c2.oid)) AS toast_total
    FROM pg_class c1
    JOIN pg_class c2 ON c1.reltoastrelid = c2.oid
    WHERE c1.relname IN ('app_logs2','app_logs3','app_logs4');
    
      relname  | heap_size  | total_with_indexes_toast
    -----------+------------+--------------------------
     app_logs2 | 8192 bytes | 32 kB
     app_logs3 | 8192 bytes | 32 kB
     app_logs4 | 8192 bytes | 48 kB
    (3 rows)
    
     main_table |   toast_table    | toast_total
    ------------+------------------+-------------
     app_logs2  | pg_toast_2510179 | 8192 bytes
     app_logs3  | pg_toast_2510188 | 8192 bytes
     app_logs4  | pg_toast_2510197 | 24 kB
    (3 rows)
    
    Remember, per-column compression inherited from default_toast_compression does not show up in \d+ unless it was explicitly set in the column definition.

    So even though table app_logs2 uses compression for TOAST-able columns, it does not reveal this fact when described:
    CREATE TABLE app_logs2 (
        log_id      bigserial PRIMARY KEY,
        log_time    timestamptz NOT NULL,
        level       text        NOT NULL,
        message     text,
        details     jsonb
    );
    
     \d+ app_logs2
                                                                     Table "myschema.app_logs2"
      Column  |           Type           | Collation | Nullable |                  Default                  | Storage  | Compression | Stats target | Description
    ----------+--------------------------+-----------+----------+-------------------------------------------+----------+-------------+--------------+-------------
     log_id   | bigint                   |           | not null | nextval('app_logs2_log_id_seq'::regclass) | plain    |             |              |
     log_time | timestamp with time zone |           | not null |                                           | plain    |             |              |
     level    | text                     |           | not null |                                           | extended |             |              |
     message  | text                     |           |          |                                           | extended |             |              |
     details  | jsonb                    |           |          |                                           | extended |             |              |
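
    You can still inspect the method actually used for the stored values with pg_column_compression() (PostgreSQL 14+). A quick check against the three test tables; with the global default set to lz4, app_logs2 and app_logs3 should report lz4, and app_logs4 pglz:
    SELECT 'app_logs2' AS tab,
           pg_column_compression(message) AS message_comp,
           pg_column_compression(details) AS details_comp
    FROM   app_logs2
    UNION ALL
    SELECT 'app_logs3', pg_column_compression(message), pg_column_compression(details)
    FROM   app_logs3
    UNION ALL
    SELECT 'app_logs4', pg_column_compression(message), pg_column_compression(details)
    FROM   app_logs4;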
     

    Monday, June 28, 2021

    How to add an ADO compression policy to a table partition

    ADO (Automatic Data Optimization) is one component of the Oracle 12c ILM (Information Lifecycle Management) solution.
    For ADO to work, access to and modification of data need to be constantly tracked. This is done by enabling the heat map in your database:
    alter system set heat_map=on scope=both;
    
    The database will immediately start sampling information. After a while, the heat map will contain information about how your objects are being used.

    My table contains 211 partitions. I would like to add an ADO policy to one of them to illustrate how ADO policies work.
    Let's list the 3 largest partitions:
    select p.TABLE_NAME, s.partition_name,s.tablespace_name, p.compression,p.num_rows,Round(sum(s.bytes)/1024/1024/1024,1) "GB"
    from dba_segments S join dba_tab_partitions p
    on (s.partition_name = p.partition_name)
    where s.segment_name='ENTITIY_PR'
    and   s.owner='SSB'
    group by p.table_name,s.partition_name,s.tablespace_name,p.compression,p.num_rows
    order by 5 desc
    fetch first 3 rows only;
    

    TABLE_NAME  PARTITION_NAME  TABLESPACE_NAME  COMPRESSION  NUM_ROWS  GB
    ENTITIY_PR  SYS_P5055       SSB              DISABLED     43448193  3,3
    ENTITIY_PR  SYS_P4518       SSB              DISABLED     43447546  3,4
    ENTITIY_PR  SYS_P4709       SSB              DISABLED     43217045  3,3


    Add a policy to have Oracle automatically compress the partition segment after one day without any access (read or write):
      
    alter table SSB.ENTITIY_PR 
    modify partition SYS_P5055 ilm add policy 
    row store compress advanced segment after 1 days of no access;
    
    In addition to no access, you could also use low access, no modification, or creation to express the activity type; see the sketches below.
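
    For example, either of these alternatives (sketches only; the retention periods are arbitrary):
    alter table SSB.ENTITIY_PR
    modify partition SYS_P5055 ilm add policy
    row store compress advanced segment after 30 days of no modification;

    alter table SSB.ENTITIY_PR
    modify partition SYS_P5055 ilm add policy
    row store compress advanced segment after 3 months of creation;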

    Verify that the ADO policy is in place:
    SELECT policy_name,
           object_owner,
           object_name,
           subobject_name,
           object_type,
           inherited_from,
           enabled,
           deleted
    FROM   dba_ilmobjects
    WHERE  object_owner='SSB'
    AND    object_name='ENTITIY_PR'
    AND    subobject_name='SYS_P5055'
    ORDER BY 1;
    
    POLICY_NAME  OBJECT_OWNER  OBJECT_NAME  SUBOBJECT_NAME  OBJECT_TYPE      INHERITED_FROM        ENABLED  DELETED
    P22          SSB           ENTITIY_PR   SYS_P5055       TABLE PARTITION  POLICY NOT INHERITED  YES      NO

    What does the heat map say about this particular partition?
    select * 
    from dba_heat_map_segment
    where owner='SSB'
    and   object_name='ENTITIY_PR' 
    AND   subobject_name='SYS_P5055';
    
    
    No rows returned.
    
    So this partition hasn't actually been modified or accessed since the heat map was enabled.
    Let's access the partition with a query, and see if this will be registered in the heat map.
    SELECT SYSTIMESTAMP FROM DUAL; --> 25-JUN-21 10.04.30
    
    SELECT *
    FROM   SSB.ENTITIY_PR
    PARTITION (SYS_P5055)
    fetch first 10 rows only;
    
    Run the query against dba_heat_map_segment again, and there is now an entry in the heat map for my visit to the table:
    OWNER  OBJECT_NAME  SUBOBJECT_NAME  SEGMENT_WRITE_TIME  SEGMENT_READ_TIME    FULL_SCAN            LOOKUP_SCAN
    SSB    ENTITIY_PR   SYS_P5055                           25.06.2021 10:04:34  25.06.2021 10:04:34
    Notice that the column "LOOKUP_SCAN" has no value.
    Let's try to send a query to the database that will require a PK index lookup:
    SELECT *
    FROM   SSB.ENTITIY_PR
    PARTITION (SYS_P5055)
    WHERE systemid = 1145708618;
    
    Run the query against dba_heat_map_segment again, and you can now see that the lookup was recorded in the heat map:
    OWNER  OBJECT_NAME  SUBOBJECT_NAME  SEGMENT_WRITE_TIME  SEGMENT_READ_TIME    FULL_SCAN            LOOKUP_SCAN
    SSB    ENTITIY_PR   SYS_P5055                           25.06.2021 10:25:33  25.06.2021 10:25:33  25.06.2021 10:25:33
    If you wonder why the timestamps in dba_heat_map_segment keep incrementing even though you're not executing any queries, it is because the object is still in the shared pool. There is no such thing as purging a table from the shared pool, but you can purge the cursors that refer to it. To find those cursors, use the following query:
    set lines 200
    spool purge.sql
    set heading off
    set feedback off
    set verify off
    set echo off
    set pages 0
    set trimspool on
    
    select 'exec DBMS_SHARED_POOL.PURGE (''' || ADDRESS ||','|| HASH_VALUE || ''',''C'');'
    from V$SQLAREA where SQL_ID IN (SELECT SQL_ID FROM v$sqlarea where sql_text like 'SELECT * FROM%SSB.ENTITIY_PR%SYS_P5055%');
    select 'exit' from dual;
    exit
    
    Run the script and it will spool a file "purge.sql" to your current directory, which looks like this in my case:
    exec DBMS_SHARED_POOL.PURGE ('00000000BF56D560,1541949375','C');
    exec DBMS_SHARED_POOL.PURGE ('00000000C67994B8,3904135316','C');
    exec DBMS_SHARED_POOL.PURGE ('00000000C67973D8,27827680','C');
    
    Run these statements, and the cursors referring to ENTITIY_PR will be removed from memory, and the incrementing timestamps will stop. Obviously, if other users are actually using the object, you should be careful about purging the shared pool.
    After one full day of no access, let's check if the ADO policy has done its work:
    select p.TABLE_NAME, s.partition_name,s.tablespace_name, p.compression,p.num_rows,Round(sum(s.bytes)/1024/1024/1024,1) "GB"
    from dba_segments S join dba_tab_partitions p
    on (s.partition_name = p.partition_name)
    where s.segment_name='ENTITIY_PR'
    and   s.owner='SSB'
    and   s.partition_name in ( 'SYS_P5055')
    and   s.segment_name = p.table_name
    group by p.table_name,s.partition_name,s.tablespace_name,p.compression,p.num_rows
    order by 5 desc;
    
    The output confirms that the partition has been compressed, and if you check the output from the same query earlier in this post, you'll notice that the size of the partition has dropped from 3,3G to 1,3G, a 60% reduction:
    TABLE_NAME  PARTITION_NAME  TABLESPACE_NAME  COMPRESSION  NUM_ROWS  GB
    ENTITIY_PR  SYS_P5055       SSB              ENABLED      43448193  1,3
    Now that the policy has been enforced, what is its status?
    SELECT policy_name,
           object_owner,
           object_name,
           subobject_name,
           object_type,
           inherited_from,
           enabled,
           deleted
    FROM   dba_ilmobjects
    WHERE  object_owner='SSB'
    AND    object_name='ENTITIY_PR'
    AND    subobject_name='SYS_P5055'
    ORDER BY 1;
    
    POLICY_NAME  OBJECT_OWNER  OBJECT_NAME  SUBOBJECT_NAME  OBJECT_TYPE      INHERITED_FROM        ENABLED  DELETED
    P22          SSB           ENTITIY_PR   SYS_P5055       TABLE PARTITION  POLICY NOT INHERITED  NO       NO
    Note how the rule is no longer enabled, now that it has been executed.
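    A side note: if you are testing and don't want to wait a full day plus the maintenance window, you can ask Oracle to evaluate and execute the ADO policies for an object right away with DBMS_ILM.EXECUTE_ILM. A minimal sketch, assuming the owner and table from this example:
    DECLARE
      v_task_id NUMBER;
    BEGIN
      -- Evaluate and execute the ADO policies on the object immediately
      DBMS_ILM.EXECUTE_ILM(owner       => 'SSB',
                           object_name => 'ENTITIY_PR',
                           task_id     => v_task_id);
      DBMS_OUTPUT.PUT_LINE('ILM task id: ' || v_task_id);
    END;
    /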
    Further reading:
    • The official Oracle 19c ADO documentation
    • As always, the excellently written tutorials from Tim Hall
    • Examples from the book "Oracle Database 12c New Features", Chapter 8, by Robert G. Freeman, published by Oracle Press

    Monday, March 16, 2020

    How to specify default compression for a tablespace



    CREATE BIGFILE TABLESPACE star_data DATAFILE
    '/oradata/proddb01/star_data.dbf' SIZE 1024M AUTOEXTEND ON NEXT 128M MAXSIZE 12T
    LOGGING
    DEFAULT
    TABLE COMPRESS FOR OLTP
    INDEX COMPRESS ADVANCED LOW
    ONLINE
    EXTENT MANAGEMENT LOCAL AUTOALLOCATE
    BLOCKSIZE 8K
    SEGMENT SPACE MANAGEMENT AUTO
    FLASHBACK ON;

    Note that the syntax used for tablespaces
    TABLE COMPRESS FOR OLTP
    
    is equivalent to the ROW STORE COMPRESS ADVANCED clause of CREATE TABLE.
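
    To verify the tablespace defaults afterwards, query the data dictionary; def_tab_compression, compress_for, def_index_compression and index_compress_for are the relevant dba_tablespaces columns in 12c:
    SELECT tablespace_name,
           def_tab_compression,
           compress_for,
           def_index_compression,
           index_compress_for
    FROM   dba_tablespaces
    WHERE  tablespace_name = 'STAR_DATA';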

    Read more about tablespaces with default compression attributes in the 12.2 documentation

    How to compress an existing table online



    The ONLINE keyword makes it simple to compress an existing table online:
    alter table scott.emp
    move online
    tablespace new_tablespace
    row store compress advanced;
    

    If you omit the TABLESPACE clause, Oracle rebuilds the table in the tablespace it currently resides in; the table is reorganized and compressed in place.
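
    To confirm the result, check the compression and compress_for columns in dba_tables:
    SELECT table_name, tablespace_name, compression, compress_for
    FROM   dba_tables
    WHERE  owner = 'SCOTT'
    AND    table_name = 'EMP';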

    Thursday, November 16, 2017

    Basic compression, advanced compression, HCC - an overview


    Basic Compression:


    Basic compression was released in Oracle 9iR2. It is part of the Enterprise Edition and comes at no extra licensing cost.

    If you specify only the keyword COMPRESS, it is equivalent to specifying ROW STORE COMPRESS BASIC and enables basic table compression.

    Basic Compression

    CREATE TABLE SCOTT.test
    (
    id NUMBER,
    fname VARCHAR2(32),
    ename VARCHAR2(32),
    added_date DATE
    )
    TABLESPACE USERS
    COMPRESS;

    The keyword COMPRESS is kept for backward compatibility, but the current preferred syntax is
    ROW STORE COMPRESS
    or
    ROW STORE COMPRESS BASIC
    Basic compression does not apply compression to DML operations after the initial load.
    With basic compression, Oracle will attempt to compress data during direct-path INSERT operations, but not during conventional-path INSERT operations. Make sure you fulfill the requirements for direct-path inserts; a small illustration follows.
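
    A minimal sketch using the SCOTT.TEST table above; only the direct-path load ends up compressed:
    -- Conventional-path insert: rows are stored uncompressed
    INSERT INTO scott.test VALUES (1, 'John', 'Doe', SYSDATE);

    -- Direct-path insert (note the APPEND hint): rows are compressed
    INSERT /*+ APPEND */ INTO scott.test
    SELECT level, 'fname_'||level, 'ename_'||level, SYSDATE
    FROM   dual
    CONNECT BY level <= 100000;
    COMMIT;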

    Advanced Compression:


    Advanced compression was released in Oracle 11gR1 in 2007, and was called "OLTP Table Compression" at the time. You need to purchase the Advanced Compression option from Oracle to use it.

    When you enable table compression by specifying ROW STORE COMPRESS ADVANCED, you enable Advanced Row Compression. Oracle Database compresses data during all DML operations on the table. This form of compression is recommended for OLTP environments.

    Advanced Compression

    CREATE TABLE SCOTT.test
    (
      id          NUMBER,
      fname       VARCHAR2(32),
      ename       VARCHAR2(32),
      added_date  DATE
    )
    TABLESPACE USERS
    ROW STORE COMPRESS ADVANCED;
    

    In earlier releases, Advanced Row Compression was called OLTP table compression and was enabled using COMPRESS FOR OLTP. This syntax is still supported for backward compatibility.


    Hybrid Columnar Compression


    For certain storage systems, Oracle has introduced Hybrid Columnar Compression, or HCC.


    COLUMN STORE COMPRESS FOR QUERY is useful in data warehousing environments. Valid values are LOW and HIGH, with HIGH providing a higher compression ratio. The default is HIGH:

    HCC Compression - non-dormant data

    CREATE TABLE SCOTT.test
    (
      id          NUMBER,
      fname       VARCHAR2(32),
      ename       VARCHAR2(32),
      added_date  DATE
    )
    TABLESPACE USERS
    COLUMN STORE COMPRESS FOR QUERY HIGH | LOW;
    

    COLUMN STORE COMPRESS FOR ARCHIVE uses higher compression ratios than COLUMN STORE COMPRESS FOR QUERY, and is useful for compressing data that will be stored for long periods of time. Valid values are LOW and HIGH, with HIGH providing the highest possible compression ratio. The default is LOW:

    HCC Compression - dormant data

    CREATE TABLE SCOTT.test
    (
      id          NUMBER,
      fname       VARCHAR2(32),
      ename       VARCHAR2(32),
      added_date  DATE
    )
    TABLESPACE USERS
    COLUMN STORE COMPRESS FOR ARCHIVE HIGH | LOW;
    

    If you attempt to create a table with HCC on a non-supported storage system, you will receive the following error message:
    ORA-64307:  Exadata Hybrid Columnar Compression is not supported for tablespaces on this storage type
    


    Sources:
    • Create Table
    • Oracle Advanced Compression with Oracle 12cR2 - an excellent document

    Wednesday, August 30, 2017

    How to move a partition online from Oracle 12.1 and onwards


    From Oracle version 12.1 and onwards, moving partitions and subpartitions became a non-blocking DDL operation. DML against the table can continue while the subobject is being moved. Oracle guarantees that global indexes are maintained during the move operation, so you no longer have to specify the "UPDATE INDEXES ONLINE" clause.

    Remember that skipping this clause will trigger Asynchronous Global Index Maintenance during the nightly maintenance window; you can check for affected indexes as shown below.
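
    A global index carrying such stale entries is flagged in the orphaned_entries column of dba_indexes (12.1+):
    SELECT index_name, status, orphaned_entries
    FROM   dba_indexes
    WHERE  owner = 'MYUSER'
    AND    table_name = 'MYTABLE';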

    To move a subpartition to a new tablespace:
    ALTER TABLE MYUSER.MYTABLE 
    MOVE SUBPARTITION P_201312 ONLINE
    TABLESPACE MYTABSPC;
    

    To compress the subpartition data at the same time, use
    ALTER TABLE MYUSER.MYTABLE 
    MOVE SUBPARTITION P_201312_P_OPLSUM ONLINE 
    TABLESPACE MYTABSPC 
    ROW STORE COMPRESS ADVANCED;
    

    To move a partition to a new tablespace using a DOP of 4 while updating any global indexes defined on the table:
    ALTER TABLE MYUSER.MYTABLE
    MOVE PARTITION P_2018_05 ONLINE
    TABLESPACE NEWTABLESPACE
    PARALLEL 4;
    

    To compress the partition data at the same time, use
    ALTER TABLE MYUSER.MYTABLE
    MOVE PARTITION P_2018_06 ONLINE
    TABLESPACE NEWTABLESPACE
    ROW STORE COMPRESS ADVANCED
    PARALLEL 4;
    

    Remember to change the properties for the partition and the table, too:
    ALTER TABLE MYUSER.MYTABLE
    MODIFY DEFAULT ATTRIBUTES FOR PARTITION P201312
    TABLESPACE MYTABSPC;
    
    ALTER TABLE MYUSER.MYTABLE 
    MODIFY DEFAULT ATTRIBUTES 
    TABLESPACE MYTABSPC;
    

    A simple query to find how much space the partitions are using would be:
    select partition_name,tablespace_name, round(sum(bytes)/1024/1024/1024,1) "GB"
    from dba_segments
    where segment_name='MYTABLE'
    group by partition_name,tablespace_name
    order by 1;
    

    Example output:
    PARTITION_NAME  TABLESPACE_NAME  GB
    SYS_SUBP102302  DATA2            8,9
    SYS_SUBP102303  DATA2            2
    SYS_SUBP102304  DATA2            0
    SYS_SUBP120071  DATA2            7,9
    SYS_SUBP120072  DATA             1,8
    SYS_SUBP120073  DATA2            0
    SYS_SUBP12401   DATA             129,5


    Source: Oracle Documentation, New Features list, Version 12.1.0.1. See section "2.2.3.4 ONLINE Move Partition"

    Wednesday, February 1, 2017

    How to use gzip and tar together

    To compress and tar a directory in one command, use the following syntax:
    # tar -czvf mytarfile.tar.gz mydir
    
    which will tar and compress the directory "mydir" with its contents.

    To extract and untar a .tar.gz file in one command, use the following syntax:
    # tar -zxvf mytarfile.tar.gz
    
    To extract and untar a .tar.gz file in one command, into a specific directory, use the following syntax:
    # tar -zxvf mytarfile.tar.gz -C /data1/mydir
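
    To list the contents of a .tar.gz file without extracting it, use the following syntax:
    # tar -tzvf mytarfile.tar.gz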
    

    Wednesday, February 12, 2014

    ORA-39726: unsupported add/drop column operation on compressed tables when attempting to drop column on a (previously) compressed table

    My customer had a partitioned table that was previously compressed with the FOR ALL OPERATIONS option.
    One of the developers wanted to drop a column on the table, but received the following error message:

    ORA-39726: unsupported add/drop column operation on compressed tables

    According to Oracle, this is expected behaviour. Oracle Support note 1068820.1 "Error ORA-39726 Drop Column Operation On Compressed Tables 10.2 Release" explains:

    "In release 10g, a drop column attempt on a compressed table must raise an ORA-39726 error if the column has to be physically removed from the data blocks.
    Hence "DROP COLUMN" and "DROP COLUMN UNUSED" are both illegal because they may have to touch the datablocks. SET UNUSED is OK because it is just a data
    dictionary operation (as is DROP COLUMN on virtual columns)."

    The Oracle support note concludes:

    "In 11g it is allowed to drop columns from a compressed table IF compatible is set to 11.1 or higher AND table was created with the "compress for all OLTP" option but even in this situation there is no real drop but internally the database sets the column UNUSED to avoid long-running decompression and recompression operations."

    My database unfortunately has compatible set to 11.0. So the only option at this point is to use the syntax

    ALTER TABLE owner.table_name SET UNUSED COLUMN column_name;

    This will render the column as good as dropped; another column with the same name can then be added to the table, as sketched below.
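
    A minimal sketch, using hypothetical placeholder names:
    ALTER TABLE owner.table_name SET UNUSED COLUMN old_col;
    -- A new column with the same name can be added right away:
    ALTER TABLE owner.table_name ADD (old_col NUMBER);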
    The number of unused columns for a table can be tracked with

    SELECT *
    FROM USER_UNUSED_COL_TABS
    WHERE TABLE_NAME='table_name';