Monday, July 14, 2025

How to find when a PostgreSQL table was last vacuumed

SELECT
schemaname,
relname,
last_vacuum,
last_autovacuum,
last_analyze,
last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'documents';
   schemaname   | relname  | last_vacuum |        last_autovacuum        |         last_analyze          |       last_autoanalyze
----------------+----------+-------------+-------------------------------+-------------------------------+-------------------------------
 scott          | documents|             | 2025-06-25 02:05:20.692869+02 | 2025-07-12 22:04:05.119051+02 | 2025-06-17 08:41:12.567351+02
(1 row)
Explanation of columns:

  • last_vacuum: Last manual vacuum.
  • last_autovacuum: Last automatic vacuum.
  • last_analyze: Last manual analyze.
  • last_autoanalyze: Last automatic analyze.

    What does ANALYZE do in PostgreSQL?

    ANALYZE collects statistics about the contents of a table (or all tables), such as:
  • Number of rows
  • Distribution of values
  • Most common values
  • Histogram of value ranges
  • Null fractions These statistics are stored in pg_statistic and are used by the query planner to make decisions (e.g., which index to use, join strategies, etc.).

    Syntax to run ANALYZE

    Analyze the entire database (all tables in all schemas):
    ANALYZE;
    
    Analyze a specific table:
    ANALYZE my_table;
    
    Analyze specific columns only:
    ANALYZE my_table (column1, column2);
    
  • Friday, July 11, 2025

    PostgreSQL: how to tail the alert log of the day with filtering for long-running statements

    I have set the parameter log_min_duration_statement to 2 seconds and reloaded the config:
    postgres=# alter system set log_min_duration_statement='2s';
    ALTER SYSTEM
    
    postgres=# SELECT pg_reload_conf();
     pg_reload_conf
    ----------------
     t
    (1 row)
    
    postgres=# show log_min_duration_statement;
     log_min_duration_statement
    ----------------------------
     2s
    (1 row)
    
    
    I expect every statement that lasts longer than 2 seconds to be logged in the postgreSQL server main log.

    From a different session, I execute the following in any database in the cluster:
    postgres=# SELECT pg_sleep(3);
     pg_sleep
    ----------
    
    (1 row)
    
    From another session, tail the log from today:
    tail -F /var/log/postgres/postgresql-2025-07-11_000000.log | grep --line-buffered "duration:.*statement:"
    
    Output
    2025-07-11 09:45:42 CEST [173349]: [5-1] user=postgres,db=postgres,app=psql,client=[local] LOG:  duration: 3003.698 ms  statement: SELECT pg_sleep(3);
    

    Wednesday, July 9, 2025

    How to list all subdirectories and exclude certain subdirectories in a tree-like fashion

    In this example I want to see all subdirectories that starts with "dir", but also to exclude the directory "dirdat":
    cd $ORACLE_GG
    tree -L 2 --noreport $(ls -d dir* | grep -v '^dirdat$')
    
  • ls -d dir*: lists all dir* entries (only directories if they match)
  • grep -v '^dirdat$': removes dirdat from the list
  • $(...): expands the filtered list as arguments to tree
  • Thursday, June 19, 2025

    Postgres: List all RANGE partitions, sorted by their range start value

    My table contains 180 partitons, and looks like this when described in psql:
    mydb=> \d mytable
          Partitioned table "myschema.mytable"
          Column       |           Type           | Collation | Nullable |           Default            
    -------------------+--------------------------+-----------+----------+------------------------------
     id                | bigint                   |           | not null | generated always as identity
     ad_updtud_log_id  | bigint                   |           | not null | 
     period            | integer                  |           | not null | 
     created           | timestamp with time zone |           | not null | 
    Partition key: RANGE (id)
    Indexes:
        "mytable_pkey" PRIMARY KEY, btree (id, periode)
        "mytable_id_index" btree (im_oppsum_logg_id)
        "mytable_opprettet_index" btree (opprettet)
    Foreign-key constraints:
        "fk_mytable" FOREIGN KEY (ad_updtud_log_id, period) REFERENCES mytable2(id, period)
    Number of partitions: 180 (Use \d+ to list them.)
    
    This is how you can list all partitions belonging to a range-partitioned table in postgreSQL, based on the values:
    SELECT
        child.relname AS partition_name,
        regexp_replace(pg_get_expr(child.relpartbound, child.oid), '.*FROM \(''(\d+).*', '\1')::bigint AS range_start,
        pg_get_expr(child.relpartbound, child.oid) AS partition_range
    FROM pg_inherits
    JOIN pg_class parent ON pg_inherits.inhparent = parent.oid
    JOIN pg_class child ON pg_inherits.inhrelid = child.oid
    JOIN pg_namespace nm ON nm.oid = child.relnamespace
    WHERE parent.relname = 'mytable'
      AND nm.nspname = 'myschema'
    ORDER BY range_start;
    
    Result (excerpt only):
       partition_name   | range_start | partition_range                  
    ---------------------------------------+-------------+--------------------------------------------
     mytable_0	    |           0 | FOR VALUES FROM ('0') TO ('30000000')
     mytable_30000000   |    30000000 | FOR VALUES FROM ('30000000') TO ('60000000')
     mytable_60000000   |    60000000 | FOR VALUES FROM ('60000000') TO ('90000000')
     mytable_90000000   |    90000000 | FOR VALUES FROM ('90000000') TO ('120000000')
     mytable_120000000  |   120000000 | FOR VALUES FROM ('120000000') TO ('150000000')
     mytable_150000000  |   150000000 | FOR VALUES FROM ('150000000') TO ('180000000')
     mytable_180000000  |   180000000 | FOR VALUES FROM ('180000000') TO ('210000000')
     mytable_210000000  |   210000000 | FOR VALUES FROM ('210000000') TO ('240000000')
     mytable_240000000  |   240000000 | FOR VALUES FROM ('240000000') TO ('270000000')
     mytable_270000000  |   270000000 | FOR VALUES FROM ('270000000') TO ('300000000')
     mytable_300000000  |   300000000 | FOR VALUES FROM ('300000000') TO ('330000000')
     mytable_330000000  |   330000000 | FOR VALUES FROM ('330000000') TO ('360000000')
     mytable_360000000  |   360000000 | FOR VALUES FROM ('360000000') TO ('390000000')
     mytable_390000000  |   390000000 | FOR VALUES FROM ('390000000') TO ('420000000')
     mytable_420000000  |   420000000 | FOR VALUES FROM ('420000000') TO ('450000000')
     mytable_450000000  |   450000000 | FOR VALUES FROM ('450000000') TO ('480000000')
     

    Thursday, June 5, 2025

    PostgreSQL search_path basics

    Similar to Oracles alter session set current_schema=<schema_name>, in PostgreSQL, search_path is a session-level setting that determines the order in which schemas are searched when you reference database objects without a schema name. It consists of a list of schema names. When you run a query like
    SELECT * FROM mytable;
    PostgreSQL checks each schema in the list — in order — to find mytable.

    Default value is:
     "$user", public 
    This means:
    1. Look for a schema named after the current user.
    2. If not found or not accessible, try the public schema.

    How to view and set it

    Check your current search_path:
     SHOW search_path; 
    Set it for the session:
     SET search_path TO schema1, public; 
    or
     SET search_path TO schema1, schema2, public; 
    The last example is important in cases where you logon to a database with a user with no matching schema. Consider the following example:
     psql -h server1.oric.no -U jim -d musicdb
    
    I am logging onto the database "musicdb" with a user called "jim". By default, jim will have its own user followed by public, in his search path:
    musicdb=> show search_path;
       search_path
    -----------------
     "$user", public
    (1 row)
    
    I have already given user jim the privilges needed to "see" the objects created by schema "music" which exists in the database "musicdb".

    For convenience, add schema "music" to the search_path:
    musicdb=> set search_path to 'music','$user','public';
    SET
    musicdb=> show search_path;
              search_path
    -------------------------------
     music, "$user", public
    (1 row)
    
    The "current_user" and "current_schema" functions will now return the actual user name, and the first match in the search_path, respectivly:
    
    musicdb=> select current_user, current_schema;
     current_user | current_schema
    --------------+----------------
     jim          | music
    (1 row)
    
    Why is it important?
    It controls where PostgreSQL looks first for unqualified object names. It allows you to skip schema prefixes when working with other schemas' objects

    How to limit connections in a postgres database

    This is how you can restrict connections for a specific database to zero:
    postgres=# alter database db01 connection limit 0;
    ALTER DATABASE
    
    Verify
    SELECT datname, datconnlimit FROM pg_database WHERE datname = 'db01';
    -[ RECORD 1 ]+-------------
    datname      | db01
    datconnlimit | 0
    
    Set back to unlimited:
    ALTER DATABASE db01 CONNECTION LIMIT -1;
    ALTER DATABASE
    
    Verify:
    SELECT datname, datconnlimit FROM pg_database WHERE datname = 'db01';
    -[ RECORD 1 ]+-------------
    datname      | db01
    datconnlimit | -1
    
    To limit the connections for a specific user only:
     psql
    psql (15.13)
    Type "help" for help.
    
    postgres=# alter user read_db01 connection limit 0;
    ALTER ROLE
    postgres=# alter user read_db01 connection limit -1;
    ALTER ROLE
    postgres=#
    
    The current setting can be verified with:
     SELECT rolname, rolconnlimit
    FROM pg_roles
    WHERE rolname = 'read_db01';
          rolname      | rolconnlimit
    -------------------+--------------
     read_db01         |           -1
    
    or, list all users that does have restrictions on the number of connections:
     SELECT rolname, rolconnlimit
    FROM pg_roles
    WHERE rolconnlimit != -1;
          rolname      | rolconnlimit
    -------------------+--------------
     pganalyze         |            5
     read_db01         |            0
    (2 rows)
    

    create user in postgres - basic syntax

    I will create

  • a group role called common_users
  • a user called ro_user1
  • in a database called db01

    The read-only user ro_user1 should be able perform queries against all tables owned by the schema schema1.

    First, create the role common_users by logging onto the postgres (default) database
    psql
    
    CREATE ROLE common_users WITH
      NOLOGIN
      NOSUPERUSER
      INHERIT
      NOCREATEDB
      NOCREATEROLE
      NOREPLICATION
      NOBYPASSRLS;
    
    GRANT pg_read_all_stats TO common_users;
    
    Then, create the user ro_user1:
    create user ro_user1 password 'mysecretpassword';
    grant common_users to ro_user1;
    grant connect on database db01 to ro_user1;
    
    Log into the database db01 and revoke and grant some privileges:
    psql
    \connect db01
    revoke all on schema schema1 from ro_user1;
    grant usage on schema schema1 to ro_user1;
    grant select on all tables in schema schema1 to ro_user1;
    
    Confirm the privileges:
    \connect postgres
    select database_privs('ro_user1');