Showing posts with label Postgres. Show all posts

Tuesday, June 3, 2025

PostgreSQL Memory Parameters and how they relate

Here are some of the important memory parameters in a PostgreSQL server, and how they relate to one another.

shared_buffers

The amount of memory the database server uses for shared memory buffers.

The PostgreSQL documentation suggests starting by allocating 25% of the available memory to the shared buffer pool:

If you have a dedicated database server with 1GB or more of RAM, a reasonable starting value for shared_buffers is 25% of the memory in your system... because PostgreSQL also relies on the operating system cache, it is unlikely that an allocation of more than 40% of RAM to shared_buffers will work better than a smaller amount.
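The 25% rule is simple arithmetic. A minimal sketch, assuming a dedicated 16 GB server (total_mb is an assumption; substitute your own RAM size):

```shell
# Back-of-the-envelope sizing for shared_buffers, following the 25% rule.
# total_mb=16384 assumes a 16 GB server; substitute your own RAM size.
total_mb=16384
bufs_mb=$((total_mb / 4))
echo "suggested shared_buffers = ${bufs_mb}MB"
```

This matches the 4GB used on my own 16 GB server later in this post.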

It also points out the necessity of considering database checkpointing:

Larger settings for shared_buffers usually require a corresponding increase in max_wal_size, in order to spread out the process of writing large quantities of new or changed data over a longer period of time.

max_wal_size

Controls the maximum size the Write-Ahead Logging (WAL) files can grow before triggering a checkpoint. Checkpoints are relatively expensive operations, so we do not want them to occur too often. On the other hand, too infrequent checkpointing may increase recovery times. max_wal_size can be set to balance performance and recovery time by influencing how often checkpoints occur.
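Since max_wal_size can be changed with a reload (no restart needed), raising it is straightforward. A sketch; 8GB is the value from my own 16 GB server, pick one that fits your WAL volume:

```sql
-- Raise the checkpoint-triggering WAL ceiling, then reload the config.
ALTER SYSTEM SET max_wal_size = '8GB';
SELECT pg_reload_conf();
```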

work_mem

Sets the base maximum amount of memory to be used by a query operation (such as a sort or hash table) before writing to temporary disk files.

The documentation points out that a complex query might perform several sort and hash operations at the same time, with each operation generally being allowed to use as much memory as this value specifies before it starts writing data to temporary files, and that several running sessions could be executing such operations concurrently. So even though a value like 6MB does not seem like much, it can add up to significant memory usage on a busy system.
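Because work_mem applies per operation, a common pattern is to keep the server-wide default modest and raise it only in the session that needs it. A sketch:

```sql
-- Keep the global default small; raise it only where a big sort benefits.
SET work_mem = '256MB';   -- affects this session only
-- ... run the memory-hungry report here ...
RESET work_mem;           -- back to the server default
```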

It is similar to pga_aggregate_target in an Oracle database: the amount of memory set aside for all private global areas on the server. After this parameter was introduced in Oracle 9i, the individual private global area parameters used until then, for example sort_area_size and hash_area_size, were no longer necessary.

maintenance_work_mem

Specifies the maximum amount of memory to be used by maintenance operations, such as VACUUM, CREATE INDEX, and ALTER TABLE ADD FOREIGN KEY.

It can be set higher than work_mem: Since only one of these operations can be executed at a time by a database session, and an installation normally doesn't have many of them running concurrently, it's safe to set this value significantly larger than work_mem. Larger settings might improve performance for vacuuming and for restoring database dumps.
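The same per-session trick works for maintenance_work_mem, for example before a large index build. A sketch; the table and index names here are made up:

```sql
-- Speed up a one-off index build by giving this session more memory.
SET maintenance_work_mem = '1GB';
CREATE INDEX CONCURRENTLY idx_orders_created_at ON orders (created_at);
RESET maintenance_work_mem;
```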

autovacuum_work_mem

Specifies the maximum amount of memory to be used by each autovacuum worker process. If this value is specified without units, it is taken as kilobytes. It defaults to -1, indicating that the value of maintenance_work_mem should be used instead. The setting has no effect on the behavior of VACUUM when run in other contexts.

Here is a table with my settings for my 16 GB server:

Parameter              Value
--------------------   -----
shared_buffers         4GB
max_wal_size           8GB
work_mem               6MB
maintenance_work_mem   479MB


Source: PostgreSQL documentation

How to set up huge pages on a PostgreSQL server

The Postgres Documentation states:

The use of huge pages results in smaller page tables and less CPU time spent on memory management, increasing performance.

This is how I set up huge pages on one of my PostgreSQL servers.

First, check the server's huge page size. My huge pages are 2M each:
 grep Hugepagesize /proc/meminfo
Hugepagesize:       2048 kB
On my 16G server, I would like to start with 25% of the total memory as available for shared_buffers:
su - postgres
sudo systemctl stop postgresql-15.service
postgres --shared-buffers=4096MB -D $PGDATA -C shared_memory_size_in_huge_pages
2102
Update /etc/sysctl.conf. Add
vm.nr_hugepages=2102
Reload the kernel parameters:
sysctl -p
Now, tell the postgres server to use huge pages, if they are available.

Add the following directive to the config file postgresql.conf
huge_pages = try
Add the following lines to /etc/security/limits.conf so that postgres can lock the memory set aside for huge pages:
postgres soft memlock unlimited
postgres hard memlock unlimited
Reboot the server and verify that huge pages are being used:
cat /proc/meminfo | grep Huge
Interestingly, the parameter huge_page_size should only be used if you wish to override the default huge page size on your system. It defaults to zero (0), meaning the system's default huge page size is used. In my case this is what I want, so the parameter can be left alone.
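You can also check from within psql. SHOW huge_pages reports the setting itself, and on PostgreSQL 15 and later the read-only parameter shared_memory_size_in_huge_pages reports how many huge pages the server wants (the same number the postgres -C command above computed):

```sql
-- 'try' means: use huge pages if the OS provides enough of them.
SHOW huge_pages;
-- Number of huge pages needed for shared memory (PostgreSQL 15+).
SHOW shared_memory_size_in_huge_pages;
```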

Friday, May 23, 2025

How to find constraints on a table in postgres



SELECT
    con.conname AS constraint_name,
    CASE con.contype
        WHEN 'p' THEN 'PRIMARY KEY'
        WHEN 'u' THEN 'UNIQUE'
        WHEN 'f' THEN 'FOREIGN KEY'
        WHEN 'c' THEN 'CHECK'
        WHEN 'x' THEN 'EXCLUSION'
        ELSE con.contype::text
    END AS constraint_type,
    rel.relname AS table_name,
    pg_get_constraintdef(con.oid) AS definition
FROM
    pg_constraint con
JOIN
    pg_class rel ON rel.oid = con.conrelid
JOIN
    pg_namespace nsp ON nsp.oid = rel.relnamespace
WHERE
    nsp.nspname = 'owner'
    AND rel.relname = 'table_name'
ORDER BY
    constraint_name;

Thursday, May 22, 2025

PostgreSQL: difference between \copy and COPY

If you put a statement like this in a file:
COPY at_locations
FROM '/external/data/geolocations_at_fixed.csv'
WITH (
    FORMAT csv,
    DELIMITER ';',
    NULL 'NULL',
    HEADER false
);
and execute it like this:
psql -h myserver.oric.no -U vinmonopolet -d vinmonopolet
vinmonopolet=> \i cp_at.sql
You will see the error
ERROR:  must be superuser or have privileges of the pg_read_server_files role to COPY from a file
HINT:  Anyone can COPY to stdout or from stdin. psql's \copy command also works for anyone.
This is because the COPY command runs server-side: the PostgreSQL server expects to be able to access the file, at the full path given, on the server itself.

This in turn requires that the user executing the COPY command is a superuser or a member of the pg_read_server_files role.

As the error message says, we could use the client-side command \copy instead. Put this in the same script
\copy vinmonopolet.at_locations FROM '/external/data/geolocations_at_fixed.csv' WITH (FORMAT csv, DELIMITER ';', NULL 'NULL', HEADER false)
and execute either with the \i as above, or use the -f flag directly at the prompt:
psql -h myserver.oric.no -U vinmonopolet -d vinmonopolet -f cp_at.sql
Password for user vinmonopolet:
COPY 5
This works because the \copy command is interpreted client-side: psql reads the file from your local machine (in my case, the "local machine" was indeed the postgres server, but the point stands) and pipes the contents into the database. It succeeds because no elevated privileges are needed for this method.

Wednesday, April 9, 2025

How to mimic sqlplus variable behaviour in PostgreSQL

Here is how you can mimic the usage of variables at the prompt using psql and PostgreSQL, like you would with Oracle and sqlplus/sqlcl:
\connect mydb
\prompt 'Enter value for queryid: ' v_quid
\echo You entered: :v_quid
SELECT query_sql_text FROM query_store.query_texts_view WHERE query_text_id = :'v_quid';
Run it:
psql -h pgserver1.oric.no -U dbadmin -d postgres -f find_rest_of_statement.sql

Enter value for queryid: -7644745457188850826
You entered: -7644745457188850826
                                                    query_sql_text                                                    
----------------------------------------------------------------------------------------------------------------------
      THE QUERY TEXT WILL SHOW UP HERE :-)                                                                                                    

Wednesday, April 2, 2025

How to check the number of rows inserted into the different partitions in a PostgreSQL partitioned table

Create the table:
CREATE TABLE mytable
(
    id bigserial,
    key text COLLATE pg_catalog."default" NOT NULL,
    outdata jsonb,
    partno integer,
    koffset bigint,
    handled_at timestamp with time zone,
    inserted_at timestamp with time zone NOT NULL DEFAULT now(),
    kyear integer NOT NULL, -- Ensure it's NOT NULL to work properly with partitions
    nk boolean NOT NULL,
    CONSTRAINT mytable_pkey PRIMARY KEY (id,kyear)
)
PARTITION BY RANGE (kyear);
Create the partitions. Use ranges that adhere to the YYYYMM format:
CREATE TABLE mytable_202501
PARTITION OF mytable
FOR VALUES FROM (202501) TO (202502);

CREATE TABLE mytable_202502
PARTITION OF mytable
FOR VALUES FROM (202502) TO (202503);
Insert some test values:
INSERT INTO mytable (key, outdata, partno, koffset, handled_at, kyear, nk)
VALUES
    ('A123', '{"data": "test1"}', 101, 500, NOW(), 202501, TRUE),
    ('B456', '{"data": "test2"}', 102, 600, NOW(), 202501, FALSE),
    ('C789', '{"data": "test3"}', 103, 700, NOW(), 202501, TRUE);

INSERT INTO mytable (key, outdata, partno, koffset, handled_at, kyear, nk)
VALUES
    ('D111', '{"data": "test4"}', 104, 800, NOW(), 202502, FALSE),
    ('E222', '{"data": "test5"}', 105, 900, NOW(), 202502, TRUE),
    ('F333', '{"data": "test6"}', 106, 1000, NOW(), 202502, FALSE);
Confirm that the data is indeed there:
select * from mytable;
Finally, verify that the rows have been committed to two different partitions:
musicdb=> SELECT relname, n_tup_ins
musicdb-> FROM pg_stat_all_tables
musicdb-> WHERE relname IN ('mytable_202501', 'mytable_202502');
    relname     | n_tup_ins
----------------+-----------
 mytable_202501 |         3
 mytable_202502 |         3
(2 rows)
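An alternative that counts the rows currently sitting in each partition (rather than the cumulative insert statistics from pg_stat_all_tables) is to group on the hidden tableoid column:

```sql
-- tableoid identifies the physical partition each row was routed to.
SELECT tableoid::regclass AS partition, count(*)
FROM mytable
GROUP BY 1
ORDER BY 1;
```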

Monday, January 27, 2025

How to install and configure PG Bouncer on Linux x86_64

Install the PGBouncer software


List packages already installed on the system:
yum list installed |grep bounce
Search repository:
 yum search bounce
=========================================================== Name & Summary Matched: bounce ============================================================
pgbouncer.x86_64 : Lightweight connection pooler for PostgreSQL
=============================================================== Summary Matched: bounce ===============================================================
Install:
 yum install pgbouncer.x86_64
Verify installation:
 yum list installed |grep bounce
pgbouncer.x86_64                                              1.24.0-42PGDG.rhel8                     @YOURREPOSITORY_PostgreSQL_PostgreSQL_common_RHEL8_x86_64
Done with the installation!


Configure pgbouncer for connections made by the user "music" to my database called "musicdb"

[root] chown postgres:postgres /etc/pgbouncer/ -R
Create the file /etc/pgbouncer/pgbouncer.ini:
;
; pgbouncer config file
;
[databases]
postgres    = port=5432 dbname=musicdb
[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
admin_users = music
auth_type   = scram-sha-256
auth_file   = /postgres_data/config/users.txt
logfile     = /postgres_data/config/pgbouncer.log
pidfile     = /postgres_data/config/pgbouncer.pid
server_reset_query = DISCARD ALL;
pool_mode = session
default_pool_size = 5
log_pooler_errors = 0
Find the SCRAM authentication key:
psql
select '"'||rolname||'" "'||rolpassword||'"' from pg_authid where rolname='music';
  
Create the file /postgres_data/config/users.txt

vi /postgres_data/config/users.txt

Add the following, and paste the value of the key you found in the query above, in between the second pair of quotation marks:

"music" "SCRAM-SHA-256_your_SCRAM_key_here"


Make sure port 6432 isn't occupied
 ss -ltpn | grep 6432 --> no output? Go ahead!
As the postgres user, start the pgbouncer process
pgbouncer -d /etc/pgbouncer/pgbouncer.ini
The port will now be occupied:
ss -ltpn | grep 6432
LISTEN 0      128        127.0.0.1:6432      0.0.0.0:*    users:(("pgbouncer",pid=392231,fd=12))
You can now connect to your pgbouncer instance. There are a number of different show-commands:
psql -p 6432 -h 127.0.0.1 -U music pgbouncer -c "show config"
Password for user music:
             key             |                         value                          |                        default                         | changeable
-----------------------------+--------------------------------------------------------+--------------------------------------------------------+------------
 admin_users                 | music                                                  |                                                        | yes
 application_name_add_host   | 0                                                      | 0                                                      | yes
 auth_dbname                 |                                                        |                                                        | yes
 auth_file                   | /var/lib/pgsql/users.txt                               |                                                        | yes
And some control commands:
psql -p 6432 -h 127.0.0.1 -U music pgbouncer -c "reload"
Password for user music:
RELOAD

To connect through the pgbouncer rather than a direct connection, use the syntax below, where you exchange the database name ("musicdb") with "pgbouncer":
  psql -p 6432 -h 127.0.0.1 -U music -d pgbouncer
Password for user music:
psql (15.10, server 1.24.0/bouncer)
WARNING: psql major version 15, server major version 1.24.
         Some psql features might not work.
Type "help" for help.

pgbouncer=# \conninfo
You are connected to database "pgbouncer" as user "music" on host "127.0.0.1" at port "6432".

Friday, December 27, 2024

How to check why a PostgreSQL server won't start

journalctl -u postgresql-15.service
Example output:
Dec 27 07:15:50 psl0part-prod-pgsql02.skead.no systemd[1]: Unit postgresql-15.service entered failed state.
Dec 27 07:15:50 psl0part-prod-pgsql02.skead.no systemd[1]: postgresql-15.service failed.
Dec 27 07:22:01 psl0part-prod-pgsql02.skead.no systemd[1]: Starting PostgreSQL 15 database server...
Dec 27 07:22:01 psl0part-prod-pgsql02.skead.no postmaster[65528]: 
2024-12-27 06:22:01.703 GMT [65528] 
LOG:  syntax error in file "/path/to/file/postgresql.auto.conf" line
Dec 27 07:22:01 psl0part-prod-pgsql02.skead.no systemd[1]: postgresql-15.service: main process exited, code=exited, status=1/FAILURE
Dec 27 07:22:01 psl0part-prod-pgsql02.skead.no systemd[1]: Failed to start PostgreSQL 15 database server.
Dec 27 07:22:01 psl0part-prod-pgsql02.skead.no systemd[1]: Unit postgresql-15.service entered failed state.
Dec 27 07:22:01 psl0part-prod-pgsql02.skead.no systemd[1]: postgresql-15.service failed.
In this case, the problem is a syntax error in the file postgresql.auto.conf.

Friday, July 19, 2024

How to check if your PostgreSQL server is part of a replication setup

If the server is down, check the entries in postgresql.auto.conf.

The following postgresql.auto.conf is from a standby server:
wal_level = 'replica'
hot_standby = on
hot_standby_feedback = 'on'
primary_slot_name = 'stby1'
max_connections = 636

primary_conninfo      = 'host=pgserver01.oric.no port=5432 user=replicate password=mysecretpassword'
promote_trigger_file  = '/tmp/MasterNow'
The primary server may have a very simple postgresql.auto.conf file:
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
wal_level = 'replica'
If the server is up, use queries:

1. On the primary server, check that data is shipped over to the replica server:
[postgres@pgserver01.oric.no /pgdata/data]$ echo 'select usename,state,sync_state,sync_priority,client_hostname from pg_stat_replication;' | psql
  usename  |   state   | sync_state | sync_priority |        client_hostname
-----------+-----------+------------+---------------+--------------------------------
 replicate | streaming | async      |             0 | pgserver02.oric.no
(1 row)
2. The following query should return f (false) on primary server, t (true) on the replica server:
[postgres@pgserver01 /pgdata/data/log]$ echo "select pg_is_in_recovery();" | psql
 pg_is_in_recovery
-------------------
 f
(1 row)

[postgres@pgserver02 /pgdata/data/log]$  echo "select pg_is_in_recovery();" | psql
 pg_is_in_recovery
-------------------
 t
(1 row)

Wednesday, March 6, 2024

How to remove all postgres packages on a Linux RH server

Logged in as root on the server you would like to remove all postgres packages from:

Stop and disable the current postgres services (in this case, postgres version 15):
systemctl stop postgresql-15
systemctl disable postgresql-15
Remove the packages using "yum remove":
rpm -qa | grep postgres | xargs yum remove -y

Thursday, September 14, 2023

How to install an extension in a PostgreSQL database

Are the desired extensions available in our installation?
[postgres@myserver.no ~]$ echo "SELECT * FROM pg_available_extensions;" | psql | egrep 'cube|earthdistance';
 cube               | 1.5             |                   | data type for multidimensional cubes
 earthdistance      | 1.1             |                   | calculate great-circle distances on the surface of the Earth
Yes they are.

Source : https://www.postgresql.org/docs/current/view-pg-available-extensions.html

Connect to the relevant database and create the extensions in the schema you want them:
postgres=# \connect mydb1
You are now connected to database "mydb1" as user "postgres".
mydb1=# create extension if not exists cube with schema myschema1 cascade;
CREATE EXTENSION
mydb1=# create extension if not exists earthdistance  with schema myschema1 cascade;
CREATE EXTENSION
mydb1=# exit

Check if they have indeed been installed:
[postgres@myserver.no ~]$ psql mydb1
psql (15.4)
Type "help" for help.

mydb1=# \dx
                                                List of installed extensions
        Name        | Version |       Schema        |                              Description
--------------------+---------+---------------------+------------------------------------------------------------------------
 cube               | 1.5     | myschema1           | data type for multidimensional cubes
 earthdistance      | 1.1     | myschema1           | calculate great-circle distances on the surface of the Earth
 pg_qualstats       | 2.0.4   | public              | An extension collecting statistics about quals
 pg_stat_statements | 1.10    | public              | track planning and execution statistics of all SQL statements executed
 plpgsql            | 1.0     | pg_catalog          | PL/pgSQL procedural language
(5 rows)

CREATE EXTENSION is documented here

Friday, December 2, 2022

How to alter a column to use a sequence generated number

In my example, mytable contains a column called id, which is of datatype bigint.

If you want to use a sequence to automatically populate the column at each insert, you can alter the column to support this with the syntax below:
ALTER TABLE myschema.mytable
    ALTER COLUMN id SET DEFAULT nextval('myschema.mysequence'::regclass);
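If the sequence does not exist yet, you can create it and tie its lifecycle to the column in one go (the names below are the same illustrative ones as above):

```sql
-- OWNED BY makes the sequence depend on the column, so it is dropped
-- automatically together with the column or table.
CREATE SEQUENCE myschema.mysequence OWNED BY myschema.mytable.id;
ALTER TABLE myschema.mytable
    ALTER COLUMN id SET DEFAULT nextval('myschema.mysequence'::regclass);
```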

Friday, October 21, 2022

Rename schema in PostgreSQL: syntax

\connect db01
alter schema schema1 rename to schema1_old;
Documented here

Rename database in PostgreSQL: syntax

alter database db01 rename to db01_old;
Documented here

See this post for how to create a new database in PostgreSQL

Wednesday, October 19, 2022

How to prevent a user from logging in to a postgres database

alter role scott with nologin;
ALTER ROLE
Remember that in postgres, "users" and "roles" are used interchangeably, so this would also work:
alter user scott with nologin;
ALTER ROLE
In either case, the postgres server will echo "ALTER ROLE" back to the administrator.

To see the result of such an operation:
echo "\du" | psql
Example output:
                                               List of roles
            Role name            |                         Attributes                         |      Member of
---------------------------------+------------------------------------------------------------+---------------------
 scott                           | Cannot login                                              +| {business_users}
 

You can also query the postgres data dictionary for the answer, like this:
postgres=# select rolcanlogin from pg_roles where rolname='jim';
 rolcanlogin
-------------
 t
(1 row)

postgres=# select rolcanlogin from pg_roles where rolname='scott';
 rolcanlogin
-------------
 f
(1 row)
where t and f indicate true (the user can log in) and false (the user cannot), respectively.

Wednesday, August 31, 2022

Rename table in postgreSQL: syntax

Rename a table:
alter table myschema.mytable rename to mytable_old;
Change ownership for a table:
alter table myschema.mytable owner to anotherschema;

Friday, August 26, 2022

How to generate insert statements for a table in PostgreSQL

Thanks to Carlos Becker for pointing out the following solution on how to generate insert statements for a table in PostgreSQL:
pg_dump mydb -n 'myschema' -a -Fp -t myschema.mytable --column-inserts | gzip > /data1/insert_statements.sql.gz

How to find the oid for a table in PostgreSQL

Connect to the relevant database using the connection tool of your choice, then execute the following to find the oid for a table:
mydb # select 'myschema.mytable'::regclass::oid;
   oid
----------
 19561436
(1 row)

Thursday, August 25, 2022

Query to find schema size in PostgreSQL

SELECT pg_size_pretty(sum(pg_relation_size(quote_ident(schemaname) || '.' || quote_ident(tablename)))::bigint) FROM pg_tables 
WHERE schemaname = 'yourschema';
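Related one-liners in the same vein: the size of a whole database, and of a single table including its indexes and TOAST data (the names here are placeholders):

```sql
-- Whole database, pretty-printed
SELECT pg_size_pretty(pg_database_size('yourdb'));
-- One table including indexes and TOAST data
SELECT pg_size_pretty(pg_total_relation_size('yourschema.yourtable'));
```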

Wednesday, August 24, 2022

Does PostgreSQL support global indexes on partitioned tables?

PostgreSQL does not support global indexes, i.e. indexes that span all partitions of a partitioned table. The documentation states:

Although all partitions must have the same columns as their partitioned parent, partitions may have their own indexes, constraints and default values, distinct from those of other partitions.

and

Create an index on the key column(s), as well as any other indexes you might want, on the partitioned table. ... This automatically creates a matching index on each partition, and any partitions you create or attach later will also have such an index. An index or unique constraint declared on a partitioned table is “virtual” in the same way that the partitioned table is: the actual data is in child indexes on the individual partition tables.

A proposal to add such a feature seems to have been made back in 2019, but currently no implementation exists.
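In practice this means you simply create the index on the partitioned parent, and PostgreSQL cascades a matching local index to each partition. A sketch with illustrative names:

```sql
-- One statement on the parent; each partition gets its own local index.
CREATE INDEX idx_measurements_logdate ON measurements (logdate);
-- Note: a UNIQUE index on a partitioned table must include
-- the partition key column(s), since there is no global index
-- to enforce uniqueness across partitions.
```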

I also found this comparison between Oracle and PostgreSQL that elaborates on the topic a bit more.