Wednesday, February 24, 2021

PostgreSQL: How to set number of parallel workers in a session

SET max_parallel_workers_per_gather=num;

The default is num=2, which gives you up to 3 processes in total (1 leader + 2 workers). If set to 3, you will have up to 4 processes in total. This parameter can be set on a per-session basis.

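Remember the global cap on parallelism that is represented by the parameter max_parallel_workers. Default is 8. Setting max_parallel_workers_per_gather higher than max_parallel_workers has no additional effect, since the workers are taken from that pool. max_parallel_workers is in turn limited by max_worker_processes (default 8), which can only be changed in the configuration file and requires a full restart of the postgres server.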
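How these limits combine can be sketched as simple arithmetic. The function names below are made up for illustration, and max_worker_processes is the server-wide pool of background workers from which parallel workers are drawn. The result is only an upper bound: the planner may ask for fewer workers, and fewer may be free at run time.

```shell
#!/bin/bash
# Upper bound on workers for one Gather node: the smallest of the three limits.
# Arguments: max_parallel_workers_per_gather, max_parallel_workers, max_worker_processes
max_workers_for_gather() {
  local limit=$1
  if [ "$2" -lt "$limit" ]; then limit=$2; fi
  if [ "$3" -lt "$limit" ]; then limit=$3; fi
  echo "$limit"
}

# Upper bound on processes working on the query: the workers plus the leader.
max_processes_for_gather() {
  echo $(( $(max_workers_for_gather "$1" "$2" "$3") + 1 ))
}

max_processes_for_gather 2 8 8    # prints 3
max_processes_for_gather 3 8 8    # prints 4
max_processes_for_gather 16 8 8   # prints 9
```

The last call shows the cap at work: asking for 16 workers per gather still cannot give more than 8 workers plus the leader.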

The documentation can be found here

See also "When can parallel query be used?" in the documentation

This post is based on the PostgreSQL 11 server.

Friday, February 19, 2021

How to find the number of huge pages to set on a PostgreSQL server

Thanks to Ibrar Ahmed for posting the article "Tune Linux Kernel Parameters For PostgreSQL Optimization"

Log on to the server as the OS user that owns the PostgreSQL server. On my server, this user is called "postgres":
su - postgres
Create a file called find_hp.sh. Insert the following:
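#!/bin/bash
# The first line of postmaster.pid holds the PID of the postmaster process
pid=$(head -1 "$PGDATA/postmaster.pid")
echo "Pid:             $pid"
# Peak virtual memory size of the postmaster, in kB
peak=$(grep ^VmPeak "/proc/$pid/status" | awk '{ print $2 }')
echo "VmPeak:          $peak kB"
# Size of one huge page, in kB
hps=$(grep ^Hugepagesize /proc/meminfo | awk '{ print $2 }')
echo "Hugepagesize:    $hps kB"
# Integer division: any remainder is discarded
hp=$((peak/hps))
echo "Set Huge Pages:  $hp"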
Make sure the environment variable $PGDATA is set. Give the script execution rights:
chmod 755 find_hp.sh
Execute it, and it will tell you how many huge pages you need:
 ./find_hp.sh
Pid:             128678
VmPeak:          68986484 kB
Hugepagesize:    2048 kB
Set Huge Pages:  33684
I can now proceed to allow for 33684 huge pages on my system by running the following as root:
sysctl -w vm.nr_hugepages=33684
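Note that shell arithmetic is integer-only, so the division in the script discards the remainder: 68986484 / 2048 is roughly 33684.8. If you want the setting to cover the whole peak, rounding up is a small change. A minimal sketch using the numbers from the output above (the function names are made up for illustration):

```shell
#!/bin/bash
# Whole huge pages that fit in the peak size (remainder discarded).
# Arguments: peak size in kB, huge page size in kB
huge_pages_floor() {
  echo $(( $1 / $2 ))
}

# Huge pages needed to cover the peak size completely (rounded up).
huge_pages_ceil() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

huge_pages_floor 68986484 2048   # prints 33684
huge_pages_ceil  68986484 2048   # prints 33685
```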

Tuesday, February 9, 2021

How to loop through tables in a schema in PostgreSQL and show estimated number of rows

To loop through all tables in a schema called "myschema", in a database called "proddb01" you can put the following in a script called "find_rows.sh":
for a in $(echo "\t \dt+ myschema.*" | psql proddb01 | awk -F '[|]' '{ print $2 }'); do
 echo "SELECT relname, reltuples::BIGINT AS estimate FROM pg_class WHERE oid='myschema.$a'::regclass;" | psql proddb01
done
Make the script executable and run it:
chmod 755 find_rows.sh
./find_rows.sh
Example output:
        relname           | estimate
----------------------------+----------
 table1                   |        0
(1 row)

           relname            | estimate
------------------------------+----------
 table2                       | 65525596
(1 row)

        relname        | estimate
-----------------------+-----------
 table3                | 153588080
(1 row)

      relname       | estimate
--------------------+----------
 table4             |        1
(1 row)
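The awk step in the loop is what turns the table listing into a list of names: it splits each line on "|" and takes the second field. A small sketch of that step on its own, so it can be tried without a database. The sample lines are made up to resemble tuples-only \dt+ output, and the function name is hypothetical:

```shell
#!/bin/bash
# Read psql tuples-only output on stdin and print the second column
# (the table name in \dt output) with surrounding spaces removed.
# Lines without a "|" separator, such as status messages, are skipped.
extract_table_names() {
  awk -F '[|]' 'NF > 1 { gsub(/^ +| +$/, "", $2); print $2 }'
}

# Made-up sample standing in for the output of: \t \dt+ myschema.*
sample='Tuples only is on.
 myschema | table1 | table | postgres | 0 bytes |
 myschema | table2 | table | postgres | 4321 MB |'

printf '%s\n' "$sample" | extract_table_names
```

In the original loop the trimming is not needed, because the shell's word splitting in the for statement already drops the surrounding spaces.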

How to turn off output generated by psql

I guess this could be viewed as the equivalent of Oracle's "set" statements in sqlplus, for example "set heading off verify off feedback off echo off":
proddb01=# \dn
           List of schemas
           Name           |  Owner
--------------------------+----------
 sales                    | postgres
 hr                       | postgres
 manufacturing            | postgres
 public                   | postgres
 (4 rows)
Turn off unnecessary output like this:
proddb01=# \t
Tuples only is on.
Try again:
proddb01=# \dn
 sales                    | postgres
 hr                       | postgres
 manufacturing            | postgres
 public                   | postgres
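In a script, the same effect can be had with command-line switches instead of typing \t in the session. A minimal sketch, assuming you want plain values with no decoration; psql_quiet and the PSQL variable are names made up for this example:

```shell
#!/bin/bash
# Run psql with reduced output:
#   -X  do not read ~/.psqlrc
#   -q  quiet: suppress informational messages
#   -t  tuples only: no column headers or row-count footer
#   -A  unaligned output: no padding around values
# PSQL is a hypothetical override so the wrapper can be tried without a
# server; it defaults to the real psql binary.
psql_quiet() {
  "${PSQL:-psql}" -X -q -t -A "$@"
}

# Example (requires a reachable database called proddb01):
#   psql_quiet -c '\dn' proddb01
```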

How to extract estimated number of rows from a PostgreSQL table

prod=# SELECT reltuples::BIGINT AS estimate FROM pg_class WHERE relname='yourtable';
 estimate
----------
 42223028
(1 row)
To include the name of the tables in case you want to check several tables in one go:
prod=# select relname,relowner,reltuples::BIGINT AS estimate FROM pg_class WHERE relname in ('mytable','yourtable');

         relname         | relowner | estimate
-------------------------+----------+----------
            mytable      |    16724 |        0
          yourtable      |    16724 |        0


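Unlike Oracle, which folds unquoted identifiers to upper case, PostgreSQL folds them to lower case, and that is how the names are stored in pg_class. The comparison against relname is case-sensitive, so you should not use WHERE relname='YOURTABLE', but stick to lower case (unless the table was created with a quoted, mixed-case name).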

Friday, February 5, 2021

Find duplicate file names

You can find data files with the same file name, located on different file systems, in an Oracle database by running this query in sqlplus:
set lines 200
col "file_name" format a30
col "tablespace" format a30

set trimspool on
spool duplicates.lst
alter session set nls_language='american';

select t.name "tablespace",
trim(
            substr(f.name,
                (instr(f.name,'/', -1, 1) +1)
                )
               ) "file_name", count(*)
from v$datafile f join v$tablespace t
on (f.ts# = t.ts#)
group by t.name,
         trim(
            substr(f.name,
                (instr(f.name,'/', -1, 1) +1)
                )
               )
having count(*) > 1;



exit
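The query strips everything up to the last "/" and then counts how often each remaining name occurs. The same idea can be sketched in the shell for a plain list of paths. The paths below are made up, the function name is hypothetical, and unlike the query this sketch groups by file name only, not by tablespace:

```shell
#!/bin/bash
# Read full paths on stdin (one per line) and print every base name that
# occurs more than once, followed by its count.
duplicate_basenames() {
  awk -F/ 'NF > 0 { count[$NF]++ }
           END { for (name in count) if (count[name] > 1) print name, count[name] }' | sort
}

printf '%s\n' \
  /u01/oradata/proddb01/users01.dbf \
  /u02/oradata/proddb01/users01.dbf \
  /u02/oradata/proddb01/system01.dbf | duplicate_basenames
# prints: users01.dbf 2
```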