Wednesday, July 30, 2025

How to find the columns in a specific postgres data dictionary table

For example, to check which columns belongs to the view "pg_stat_wal_receiver"
SELECT column_name
FROM information_schema.columns
WHERE table_name = 'pg_stat_wal_receiver';
Output
     column_name
-----------------------
 pid
 last_msg_receipt_time
 latest_end_lsn
 latest_end_time
 sender_port
 receive_start_lsn
 receive_start_tli
 written_lsn
 flushed_lsn
 received_tli
 last_msg_send_time
 status
 slot_name
 sender_host
 conninfo
(15 rows)
Or use the psql metacommand "\d+":
echo '\d+ pg_stat_wal_receiver' |psql
Output
                                  View "pg_catalog.pg_stat_wal_receiver"
        Column         |           Type           | Collation | Nullable | Default | Storage  | Description
-----------------------+--------------------------+-----------+----------+---------+----------+-------------
 pid                   | integer                  |           |          |         | plain    |
 status                | text                     |           |          |         | extended |
 receive_start_lsn     | pg_lsn                   |           |          |         | plain    |
 receive_start_tli     | integer                  |           |          |         | plain    |
 written_lsn           | pg_lsn                   |           |          |         | plain    |
 flushed_lsn           | pg_lsn                   |           |          |         | plain    |
 received_tli          | integer                  |           |          |         | plain    |
 last_msg_send_time    | timestamp with time zone |           |          |         | plain    |
 last_msg_receipt_time | timestamp with time zone |           |          |         | plain    |
 latest_end_lsn        | pg_lsn                   |           |          |         | plain    |
 latest_end_time       | timestamp with time zone |           |          |         | plain    |
 slot_name             | text                     |           |          |         | extended |
 sender_host           | text                     |           |          |         | extended |
 sender_port           | integer                  |           |          |         | plain    |
 conninfo              | text                     |           |          |         | extended |
View definition:
 SELECT pid,
    status,
    receive_start_lsn,
    receive_start_tli,
    written_lsn,
    flushed_lsn,
    received_tli,
    last_msg_send_time,
    last_msg_receipt_time,
    latest_end_lsn,
    latest_end_time,
    slot_name,
    sender_host,
    sender_port,
    conninfo
   FROM pg_stat_get_wal_receiver() s(pid, status, receive_start_lsn, receive_start_tli, written_lsn, flushed_lsn, received_tli, last_msg_send_time, last_msg_receipt_time, latest_end_lsn, latest_end_time, slot_name, sender_host, sender_port, conninfo)
  WHERE pid IS NOT NULL;

Monday, July 28, 2025

How to list the postgres processes running on your server and what their jobs are

A neet little trick I learned from the series "Mirage of memory, part 2: PSS"
ps -fu postgres | grep $(pgrep -f postmaster)
postgres    1913       1  0 Jul17 ?        00:02:51 /usr/pgsql-15/bin/postmaster -D /var/lib/pgsql/15/data/
postgres    1982    1913  0 Jul17 ?        00:00:24 postgres: logger
postgres    2063    1913  0 Jul17 ?        00:00:18 postgres: checkpointer
postgres    2064    1913  0 Jul17 ?        00:00:03 postgres: background writer
postgres    2119    1913  0 Jul17 ?        00:00:07 postgres: walwriter
postgres    2120    1913  0 Jul17 ?        00:00:14 postgres: autovacuum launcher
postgres    2121    1913  0 Jul17 ?        00:00:01 postgres: archiver
postgres    2122    1913  0 Jul17 ?        00:00:00 postgres: logical replication launcher
postgres    2657    1913  0 Jul17 ?        00:02:05 postgres: postgres_exporter postgres localhost(48674) idle
The command lists all processes by user "postgres" and and greps for the output of the command "pgrep -f postmaster", which returns a list of process IDs

You could also count the number of processes by user "postgres" by simply typing
pgrep -c postmaster
9

Monday, July 14, 2025

How to find when a PostgreSQL table was last vacuumed

SELECT
schemaname,
relname,
last_vacuum,
last_autovacuum,
last_analyze,
last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'documents';
   schemaname   | relname  | last_vacuum |        last_autovacuum        |         last_analyze          |       last_autoanalyze
----------------+----------+-------------+-------------------------------+-------------------------------+-------------------------------
 scott          | documents|             | 2025-06-25 02:05:20.692869+02 | 2025-07-12 22:04:05.119051+02 | 2025-06-17 08:41:12.567351+02
(1 row)
Explanation of columns:

  • last_vacuum: Last manual vacuum.
  • last_autovacuum: Last automatic vacuum.
  • last_analyze: Last manual analyze.
  • last_autoanalyze: Last automatic analyze.

    What does ANALYZE do in PostgreSQL?

    ANALYZE collects statistics about the contents of a table (or all tables), such as:
  • Number of rows
  • Distribution of values
  • Most common values
  • Histogram of value ranges
  • Null fractions These statistics are stored in pg_statistic and are used by the query planner to make decisions (e.g., which index to use, join strategies, etc.).

    Syntax to run ANALYZE

    Analyze the entire database (all tables in all schemas):
    ANALYZE;
    
    Analyze a specific table:
    ANALYZE my_table;
    
    Analyze specific columns only:
    ANALYZE my_table (column1, column2);
    
  • Friday, July 11, 2025

    PostgreSQL: how to tail the alert log of the day with filtering for long-running statements

    I have set the parameter log_min_duration_statement to 2 seconds and reloaded the config:
    postgres=# alter system set log_min_duration_statement='2s';
    ALTER SYSTEM
    
    postgres=# SELECT pg_reload_conf();
     pg_reload_conf
    ----------------
     t
    (1 row)
    
    postgres=# show log_min_duration_statement;
     log_min_duration_statement
    ----------------------------
     2s
    (1 row)
    
    
    I expect every statement that lasts longer than 2 seconds to be logged in the postgreSQL server main log.

    From a different session, I execute the following in any database in the cluster:
    postgres=# SELECT pg_sleep(3);
     pg_sleep
    ----------
    
    (1 row)
    
    From another session, tail the log from today:
    tail -F /var/log/postgres/postgresql-2025-07-11_000000.log | grep --line-buffered "duration:.*statement:"
    
    Output
    2025-07-11 09:45:42 CEST [173349]: [5-1] user=postgres,db=postgres,app=psql,client=[local] LOG:  duration: 3003.698 ms  statement: SELECT pg_sleep(3);
    

    Wednesday, July 9, 2025

    How to list all subdirectories and exclude certain subdirectories in a tree-like fashion

    In this example I want to see all subdirectories that starts with "dir", but also to exclude the directory "dirdat":
    cd $ORACLE_GG
    tree -L 2 --noreport $(ls -d dir* | grep -v '^dirdat$')
    
  • ls -d dir*: lists all dir* entries (only directories if they match)
  • grep -v '^dirdat$': removes dirdat from the list
  • $(...): expands the filtered list as arguments to tree
  • Thursday, June 19, 2025

    Postgres: List all RANGE partitions, sorted by their range start value

    My table contains 180 partitons, and looks like this when described in psql:
    mydb=> \d mytable
          Partitioned table "myschema.mytable"
          Column       |           Type           | Collation | Nullable |           Default            
    -------------------+--------------------------+-----------+----------+------------------------------
     id                | bigint                   |           | not null | generated always as identity
     ad_updtud_log_id  | bigint                   |           | not null | 
     period            | integer                  |           | not null | 
     created           | timestamp with time zone |           | not null | 
    Partition key: RANGE (id)
    Indexes:
        "mytable_pkey" PRIMARY KEY, btree (id, periode)
        "mytable_id_index" btree (im_oppsum_logg_id)
        "mytable_opprettet_index" btree (opprettet)
    Foreign-key constraints:
        "fk_mytable" FOREIGN KEY (ad_updtud_log_id, period) REFERENCES mytable2(id, period)
    Number of partitions: 180 (Use \d+ to list them.)
    
    This is how you can list all partitions belonging to a range-partitioned table in postgreSQL, based on the values:
    SELECT
        child.relname AS partition_name,
        regexp_replace(pg_get_expr(child.relpartbound, child.oid), '.*FROM \(''(\d+).*', '\1')::bigint AS range_start,
        pg_get_expr(child.relpartbound, child.oid) AS partition_range
    FROM pg_inherits
    JOIN pg_class parent ON pg_inherits.inhparent = parent.oid
    JOIN pg_class child ON pg_inherits.inhrelid = child.oid
    JOIN pg_namespace nm ON nm.oid = child.relnamespace
    WHERE parent.relname = 'mytable'
      AND nm.nspname = 'myschema'
    ORDER BY range_start;
    
    Result (excerpt only):
       partition_name   | range_start | partition_range                  
    ---------------------------------------+-------------+--------------------------------------------
     mytable_0	    |           0 | FOR VALUES FROM ('0') TO ('30000000')
     mytable_30000000   |    30000000 | FOR VALUES FROM ('30000000') TO ('60000000')
     mytable_60000000   |    60000000 | FOR VALUES FROM ('60000000') TO ('90000000')
     mytable_90000000   |    90000000 | FOR VALUES FROM ('90000000') TO ('120000000')
     mytable_120000000  |   120000000 | FOR VALUES FROM ('120000000') TO ('150000000')
     mytable_150000000  |   150000000 | FOR VALUES FROM ('150000000') TO ('180000000')
     mytable_180000000  |   180000000 | FOR VALUES FROM ('180000000') TO ('210000000')
     mytable_210000000  |   210000000 | FOR VALUES FROM ('210000000') TO ('240000000')
     mytable_240000000  |   240000000 | FOR VALUES FROM ('240000000') TO ('270000000')
     mytable_270000000  |   270000000 | FOR VALUES FROM ('270000000') TO ('300000000')
     mytable_300000000  |   300000000 | FOR VALUES FROM ('300000000') TO ('330000000')
     mytable_330000000  |   330000000 | FOR VALUES FROM ('330000000') TO ('360000000')
     mytable_360000000  |   360000000 | FOR VALUES FROM ('360000000') TO ('390000000')
     mytable_390000000  |   390000000 | FOR VALUES FROM ('390000000') TO ('420000000')
     mytable_420000000  |   420000000 | FOR VALUES FROM ('420000000') TO ('450000000')
     mytable_450000000  |   450000000 | FOR VALUES FROM ('450000000') TO ('480000000')
     

    Thursday, June 5, 2025

    PostgreSQL search_path basics

    Similar to Oracles alter session set current_schema=<schema_name>, in PostgreSQL, search_path is a session-level setting that determines the order in which schemas are searched when you reference database objects without a schema name. It consists of a list of schema names. When you run a query like
    SELECT * FROM mytable;
    PostgreSQL checks each schema in the list — in order — to find mytable.

    Default value is:
     "$user", public 
    This means:
    1. Look for a schema named after the current user.
    2. If not found or not accessible, try the public schema.

    How to view and set it

    Check your current search_path:
     SHOW search_path; 
    Set it for the session:
     SET search_path TO schema1, public; 
    or
     SET search_path TO schema1, schema2, public; 
    The last example is important in cases where you logon to a database with a user with no matching schema. Consider the following example:
     psql -h server1.oric.no -U jim -d musicdb
    
    I am logging onto the database "musicdb" with a user called "jim". By default, jim will have its own user followed by public, in his search path:
    musicdb=> show search_path;
       search_path
    -----------------
     "$user", public
    (1 row)
    
    I have already given user jim the privilges needed to "see" the objects created by schema "music" which exists in the database "musicdb".

    For convenience, add schema "music" to the search_path:
    musicdb=> set search_path to 'music','$user','public';
    SET
    musicdb=> show search_path;
              search_path
    -------------------------------
     music, "$user", public
    (1 row)
    
    The "current_user" and "current_schema" functions will now return the actual user name, and the first match in the search_path, respectivly:
    
    musicdb=> select current_user, current_schema;
     current_user | current_schema
    --------------+----------------
     jim          | music
    (1 row)
    
    Why is it important?
    It controls where PostgreSQL looks first for unqualified object names. It allows you to skip schema prefixes when working with other schemas' objects