Thursday, June 5, 2025

PostgreSQL search_path basics

Similar to Oracle's ALTER SESSION SET CURRENT_SCHEMA=<schema_name>, search_path in PostgreSQL is a session-level setting that determines the order in which schemas are searched when you reference database objects without a schema name. It consists of a list of schema names. When you run a query like
SELECT * FROM mytable;
PostgreSQL checks each schema in the list — in order — to find mytable.

The default value is:
 "$user", public 
This means:
  1. Look for a schema named after the current user.
  2. If not found or not accessible, try the public schema.

How to view and set it

Check your current search_path:
 SHOW search_path; 
Set it for the session:
 SET search_path TO schema1, public; 
or
 SET search_path TO schema1, schema2, public; 
The last example is important in cases where you log on to a database as a user that has no matching schema. Consider the following example:
 psql -h server1.oric.no -U jim -d musicdb
I am logging on to the database "musicdb" as a user called "jim". By default, jim's search path contains "$user" (a schema with the same name as the user, if one exists) followed by public:
musicdb=> show search_path;
   search_path
-----------------
 "$user", public
(1 row)
I have already given user jim the privileges needed to "see" the objects in the schema "music", which exists in the database "musicdb".

For convenience, add schema "music" to the search_path:
musicdb=> set search_path to 'music','$user','public';
SET
musicdb=> show search_path;
          search_path
-------------------------------
 music, "$user", public
(1 row)
The "current_user" and "current_schema" functions will now return the actual user name and the first existing schema in the search_path, respectively:

musicdb=> select current_user, current_schema;
 current_user | current_schema
--------------+----------------
 jim          | music
(1 row)
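To see how the search path is actually resolved, the built-in current_schemas() function can help. The following is a minimal sketch that reuses the table name mytable from the earlier example; it assumes such a table exists in one of the schemas on the path.

```sql
-- Schemas from search_path that actually exist, in lookup order
SELECT current_schemas(false);

-- Same list, including implicitly searched schemas such as pg_catalog
SELECT current_schemas(true);

-- Which schema does the unqualified name "mytable" resolve to?
-- to_regclass() returns NULL (so no row here) if the name is not visible.
SELECT n.nspname AS resolved_schema, c.relname
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.oid = to_regclass('mytable');
```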

Why is it important?

It controls where PostgreSQL looks first for unqualified object names, and it allows you to skip schema prefixes when working with objects in other schemas.
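A plain SET only lasts for the current session. If the same search_path is wanted every time jim connects to musicdb, one option is to store it as a per-role, per-database default. This is a sketch using the names from the example above; it must be run by jim himself or by a role allowed to alter jim.

```sql
-- Store a default search_path for role jim, applied only in database musicdb.
-- It takes effect for new sessions, not for the current one.
ALTER ROLE jim IN DATABASE musicdb SET search_path TO music, "$user", public;

-- List stored per-role / per-database defaults
SELECT r.rolname, d.datname, s.setconfig
FROM pg_db_role_setting s
LEFT JOIN pg_roles r ON r.oid = s.setrole
LEFT JOIN pg_database d ON d.oid = s.setdatabase;

-- Remove the stored default again
ALTER ROLE jim IN DATABASE musicdb RESET search_path;
```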

How to limit connections in a postgres database

This is how you can restrict connections for a specific database to zero:
postgres=# alter database db01 connection limit 0;
ALTER DATABASE
Verify:
SELECT datname, datconnlimit FROM pg_database WHERE datname = 'db01';
-[ RECORD 1 ]+-------------
datname      | db01
datconnlimit | 0
Set back to unlimited:
ALTER DATABASE db01 CONNECTION LIMIT -1;
ALTER DATABASE
Verify:
SELECT datname, datconnlimit FROM pg_database WHERE datname = 'db01';
-[ RECORD 1 ]+-------------
datname      | db01
datconnlimit | -1
To limit the connections for a specific user only:
 psql
psql (15.13)
Type "help" for help.

postgres=# alter user read_db01 connection limit 0;
ALTER ROLE
postgres=# alter user read_db01 connection limit -1;
ALTER ROLE
postgres=#
The current setting can be verified with:
 SELECT rolname, rolconnlimit
FROM pg_roles
WHERE rolname = 'read_db01';
      rolname      | rolconnlimit
-------------------+--------------
 read_db01         |           -1
or, list all users that have restrictions on the number of connections:
 SELECT rolname, rolconnlimit
FROM pg_roles
WHERE rolconnlimit != -1;
      rolname      | rolconnlimit
-------------------+--------------
 pganalyze         |            5
 read_db01         |            0
(2 rows)
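Note that a connection limit is only checked when a new session is established: sessions that are already connected stay connected, and superusers are not subject to the limit. The sketch below, using db01 from the example above, shows one way to compare the limit with the current number of client sessions and, if needed, to disconnect the remaining sessions.

```sql
-- Current client connections per database next to the configured limit
SELECT d.datname,
       d.datconnlimit,
       count(a.pid) AS current_connections
FROM pg_database d
LEFT JOIN pg_stat_activity a
       ON a.datid = d.oid
      AND a.backend_type = 'client backend'
GROUP BY d.datname, d.datconnlimit
ORDER BY d.datname;

-- Optionally terminate the sessions still connected to db01
-- (everything except the session running this statement)
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE datname = 'db01'
  AND pid <> pg_backend_pid();
```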

Create user in postgres - basic syntax

I will create

  • a group role called common_users
  • a user called ro_user1
  • in a database called db01

    The read-only user ro_user1 should be able to perform queries against all tables in the schema schema1.

    First, create the role common_users by logging on to the postgres (default) database:
    psql
    
    CREATE ROLE common_users WITH
      NOLOGIN
      NOSUPERUSER
      INHERIT
      NOCREATEDB
      NOCREATEROLE
      NOREPLICATION
      NOBYPASSRLS;
    
    GRANT pg_read_all_stats TO common_users;
    
    Then, create the user ro_user1:
    create user ro_user1 password 'mysecretpassword';
    grant common_users to ro_user1;
    grant connect on database db01 to ro_user1;
    
    Log into the database db01 and revoke and grant some privileges:
    psql
    \connect db01
    revoke all on schema schema1 from ro_user1;
    grant usage on schema schema1 to ro_user1;
    grant select on all tables in schema schema1 to ro_user1;
    
    Confirm the privileges (database_privs is a user-defined helper function, not a PostgreSQL built-in):
    \connect postgres
    select database_privs('ro_user1');
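If no helper function is installed, the built-in privilege inquiry functions can be used instead. The sketch below is meant to be run while connected to db01. Also note that GRANT ... ON ALL TABLES only covers tables that exist at the time of the grant; the last statement shows one way to cover future tables. The role name schema1_owner is hypothetical and stands for whichever role creates tables in schema1.

```sql
-- Does ro_user1 have USAGE on the schema?
SELECT has_schema_privilege('ro_user1', 'schema1', 'USAGE');

-- For each table or view in schema1: may ro_user1 SELECT from it?
SELECT c.relname,
       has_table_privilege('ro_user1', c.oid, 'SELECT') AS can_select
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'schema1'
  AND c.relkind IN ('r', 'p', 'v', 'm', 'f')
ORDER BY c.relname;

-- Cover tables created later by schema1_owner (hypothetical role name)
ALTER DEFAULT PRIVILEGES FOR ROLE schema1_owner IN SCHEMA schema1
  GRANT SELECT ON TABLES TO ro_user1;
```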
    
    Tuesday, June 3, 2025

    PostgreSQL Memory Parameters and how they relate

    Here are some of the important memory parameters in a PostgreSQL server, and how they relate to one another.

    shared_buffers

    The amount of memory the database server uses for shared memory buffers.

    The PostgreSQL documentation suggests starting by allocating 25% of the available memory to the shared database memory pool:

    If you have a dedicated database server with 1GB or more of RAM, a reasonable starting value for shared_buffers is 25% of the memory in your system... because PostgreSQL also relies on the operating system cache, it is unlikely that an allocation of more than 40% of RAM to shared_buffers will work better than a smaller amount.

    It also points out the necessity of considering database checkpointing:

    Larger settings for shared_buffers usually require a corresponding increase in max_wal_size, in order to spread out the process of writing large quantities of new or changed data over a longer period of time.

    max_wal_size

    Controls the maximum size the Write-Ahead Logging (WAL) files can grow to before triggering a checkpoint. Checkpoints are relatively expensive operations, so we do not want them to occur too often. On the other hand, too infrequent checkpointing may increase recovery times. max_wal_size can be set to balance performance and recovery time by influencing how often checkpoints occur.
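One way to judge whether max_wal_size is large enough is to compare checkpoints triggered by checkpoint_timeout with checkpoints that were requested, for example because the WAL volume reached max_wal_size. This is a sketch that assumes PostgreSQL 16 or older, where these counters live in pg_stat_bgwriter; newer versions expose them in a different view.

```sql
-- A high share of requested checkpoints can indicate that max_wal_size
-- is reached before checkpoint_timeout expires.
SELECT checkpoints_timed,
       checkpoints_req,
       round(100.0 * checkpoints_req
             / nullif(checkpoints_timed + checkpoints_req, 0), 1) AS pct_requested
FROM pg_stat_bgwriter;
```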

    work_mem

    Sets the base maximum amount of memory to be used by a query operation (such as a sort or hash table) before writing to temporary disk files.

    The documentation points out that a complex query might perform several sort and hash operations at the same time, with each operation generally being allowed to use as much memory as this value specifies before it starts to write data into temporary files, and that several running sessions could be executing such operations at the same time. So even if the 6MB specified as its value does not seem like much, it could mean significant memory usage on a busy system.

    It is loosely comparable to pga_aggregate_target in an Oracle database, with one important difference: pga_aggregate_target is a target for the total memory of all private global areas on the server, whereas work_mem is a limit per sort or hash operation. After the introduction of pga_aggregate_target in Oracle 9i, the per-operation parameters used back then, for example sort_area_size and hash_area_size, were no longer necessary.
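Because work_mem can be changed per session or per transaction, its effect on a single sort is easy to observe. The following is a minimal sketch using generate_series as stand-in data; the values 1MB and 256MB are arbitrary and chosen only to make the difference visible. Compare the "Sort Method" line in the two plans: a sort that does not fit in work_mem spills to disk.

```sql
BEGIN;

-- Small work_mem: the sort is likely to spill to temporary files
SET LOCAL work_mem = '1MB';
EXPLAIN (ANALYZE, COSTS OFF)
SELECT g FROM generate_series(1, 1000000) AS g ORDER BY g DESC;

-- Larger work_mem: the same sort is likely to stay in memory
SET LOCAL work_mem = '256MB';
EXPLAIN (ANALYZE, COSTS OFF)
SELECT g FROM generate_series(1, 1000000) AS g ORDER BY g DESC;

-- SET LOCAL only lasts until the end of the transaction
ROLLBACK;
```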

    maintenance_work_mem

    Specifies the maximum amount of memory to be used by maintenance operations, such as VACUUM, CREATE INDEX, and ALTER TABLE ADD FOREIGN KEY.

    It can be set higher than work_mem: Since only one of these operations can be executed at a time by a database session, and an installation normally doesn't have many of them running concurrently, it's safe to set this value significantly larger than work_mem. Larger settings might improve performance for vacuuming and for restoring database dumps.

    autovacuum_work_mem

    Specifies the maximum amount of memory to be used by each autovacuum worker process. If this value is specified without units, it is taken as kilobytes. It defaults to -1, indicating that the value of maintenance_work_mem should be used instead. The setting has no effect on the behavior of VACUUM when run in other contexts.

    Here are my settings for a server with 16GB of memory:

     Parameter            | Value
    ----------------------+-------
     shared_buffers       | 4GB
     max_wal_size         | 8GB
     work_mem             | 6MB
     maintenance_work_mem | 479MB
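These values can be set in postgresql.conf, or from a superuser session with ALTER SYSTEM. The sketch below shows the latter, using the values from the settings listed above. shared_buffers only takes effect after a server restart, while the other three are picked up by a configuration reload.

```sql
ALTER SYSTEM SET shared_buffers = '4GB';          -- requires a restart
ALTER SYSTEM SET max_wal_size = '8GB';
ALTER SYSTEM SET work_mem = '6MB';
ALTER SYSTEM SET maintenance_work_mem = '479MB';

-- Reload the configuration for the parameters that do not need a restart
SELECT pg_reload_conf();

-- Check the active values and whether a restart is still pending
SELECT name, setting, unit, context, pending_restart
FROM pg_settings
WHERE name IN ('shared_buffers', 'max_wal_size',
               'work_mem', 'maintenance_work_mem');
```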


    Source: PostgreSQL documentation

    How to set up huge pages on a PostgreSQL server

    The Postgres Documentation states:

    The use of huge pages results in smaller page tables and less CPU time spent on memory management, increasing performance.

    This is how I set up huge pages on one of my PostgreSQL servers.

    First, check the server's huge page size. My huge pages are 2M each:
     grep Hugepagesize /proc/meminfo
    Hugepagesize:       2048 kB
    
    On my 16GB server, I would like to start with 25% of the total memory for shared_buffers. With the server stopped, ask postgres how many huge pages that setting requires:
    su - postgres
    sudo systemctl stop postgresql-15.service
    postgres --shared-buffers=4096MB -D $PGDATA -C shared_memory_size_in_huge_pages
    2102
    
    Update /etc/sysctl.conf. Add
    vm.nr_hugepages=2102
    
    Reload the kernel parameters:
    sysctl -p
    
    Now, tell the postgres server to use huge pages, if they are available.

    Add the following directive to the config file postgresql.conf
    huge_pages = try
    
    Add the following lines to /etc/security/limits.conf so that postgres can lock down the memory set aside for huge pages:
    postgres soft memlock unlimited
    postgres hard memlock unlimited
    
    Reboot the server and verify that huge pages are being used:
    cat /proc/meminfo | grep Huge
    
    Interestingly, the parameter huge_page_size should only be used if you wish to override the default huge page size on your system. The default is zero (0). When set to 0, the default huge page size on the system will be used. In my case this is what I want, so the parameter can be left at its default.
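The related settings can also be checked from a database session once the server is running. This is a small sketch, assuming PostgreSQL 15 or newer, where shared_memory_size_in_huge_pages is available as a read-only parameter. Keep in mind that huge_pages = try falls back to regular pages without an error if the request fails, so these statements show the configuration rather than proof that huge pages are in use; /proc/meminfo remains the place to confirm that.

```sql
-- Configured behavior: off, on or try
SHOW huge_pages;

-- Number of huge pages needed for the current shared memory size
SHOW shared_memory_size_in_huge_pages;

-- 0 means the system default huge page size is used
SHOW huge_page_size;
```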

    Friday, May 23, 2025

    How to find constraints on a table in postgres



    SELECT
        con.conname AS constraint_name,
        CASE con.contype
            WHEN 'p' THEN 'PRIMARY KEY'
            WHEN 'u' THEN 'UNIQUE'
            WHEN 'f' THEN 'FOREIGN KEY'
            WHEN 'c' THEN 'CHECK'
            WHEN 'x' THEN 'EXCLUSION'
            -- cast to text; otherwise the CASE is resolved as "char" and the labels are truncated
            ELSE con.contype::text
        END AS constraint_type,
        rel.relname AS table_name,
        pg_get_constraintdef(con.oid) AS definition
    FROM
        pg_constraint con
    JOIN
        pg_class rel ON rel.oid = con.conrelid
    JOIN
        pg_namespace nsp ON nsp.oid = rel.relnamespace
    WHERE
        nsp.nspname = 'owner'            -- replace with the schema name
        AND rel.relname = 'table_name'   -- replace with the table name
    ORDER BY
        constraint_name;
    

    Thursday, May 22, 2025

    PostgreSQL: difference between \copy and COPY

    If you put a statement like this in a file:
    COPY at_locations
    FROM '/external/data/geolocations_at_fixed.csv'
    WITH (
        FORMAT csv,
        DELIMITER ';',
        NULL 'NULL',
        HEADER false
    );
    
    and execute it like this:
    psql -h myserver.oric.no -U vinmonopolet -d vinmonopolet
    vinmonopolet=> \i cp_at.sql
    
    You will see the error
    ERROR:  must be superuser or have privileges of the pg_read_server_files role to COPY from a file
    HINT:  Anyone can COPY to stdout or from stdin. psql's \copy command also works for anyone.
    
    This is because the COPY command runs server-side, and the PostgreSQL server expects to be able to access the file and its entire path on the server.

    This in turn requires that the user executing the COPY command is a superuser or is a member of the pg_read_server_files role.

    As the error message says, we could use the client-side command \copy instead. Put this in the same script:
    \copy vinmonopolet.at_locations FROM '/external/data/geolocations_at_fixed.csv' WITH (FORMAT csv, DELIMITER ';', NULL 'NULL', HEADER false)
    
    and execute either with the \i as above, or use the -f flag directly at the prompt:
    psql -h myserver.oric.no -U vinmonopolet -d vinmonopolet -f cp_at.sql
    Password for user vinmonopolet:
    COPY 5
    
    This works because the \copy command is interpreted client-side. psql reads the file from your local machine (in my case, the "local machine" was indeed the postgres server, but the point stands), and pipes the content into the database. And it succeeds because no elevated privileges are needed for this method.
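If the server-side COPY is preferred, for example because the file only exists on the database server, the alternative named in the error message is membership in pg_read_server_files. The following sketch, run as a superuser, uses the role name from the example above. Treat it as an option to weigh rather than a recommendation: the role allows reading any file the server's operating system user can access.

```sql
-- Allow vinmonopolet to run COPY ... FROM '<file on the server>'
GRANT pg_read_server_files TO vinmonopolet;

-- Take the privilege away again once the load is done
REVOKE pg_read_server_files FROM vinmonopolet;
```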