Wednesday, February 24, 2021

PostgreSQL: How to set number of parallel workers in a session

SET max_parallel_workers_per_gather=num;

Default is num=2, which gives you 3 processes in total (1 leader process + 2 workers). If set to num=3, you will have 4 processes in total. This parameter can be set on a per-session basis.

Remember the global cap on parallelism represented by the parameter max_parallel_workers, which defaults to 8: the total number of parallel workers active in the cluster at any one time cannot exceed it. max_parallel_workers is in turn limited by max_worker_processes, which can only be changed in the configuration file and requires a restart of the postgres server.
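For example, in a psql session (a minimal sketch; mytable is just a placeholder for a sufficiently large table):
proddb01=# SET max_parallel_workers_per_gather = 4;
proddb01=# SHOW max_parallel_workers_per_gather;
 max_parallel_workers_per_gather
---------------------------------
 4
(1 row)

proddb01=# EXPLAIN SELECT count(*) FROM mytable;
If the planner chooses a parallel plan, the EXPLAIN output will contain a Gather node with "Workers Planned: 4".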

The documentation can be found here

See also "When can parallel query be used?" in the documentation

This post is based on PostgreSQL 11.

Friday, February 19, 2021

How to find the number of huge pages to set on a PostgreSQL server

Thanks to Ibrar Ahmed for posting the article "Tune Linux Kernel Parameters For PostgreSQL Optimization"

Log on to the server as the os user that owns the PostgreSQL server. On my server, this user is called "postgres":
su - postgres
Create a file called find_hp.sh. Insert the following:
#!/bin/bash
# Read the postmaster pid from the first line of postmaster.pid
pid=`head -1 $PGDATA/postmaster.pid`
echo "Pid:            $pid"
# Peak virtual memory used by the postmaster, in kB
peak=`grep ^VmPeak /proc/$pid/status | awk '{ print $2 }'`
echo "VmPeak:         $peak kB"
# Size of one huge page on this system, in kB
hps=`grep ^Hugepagesize /proc/meminfo | awk '{ print $2 }'`
echo "Hugepagesize:   $hps kB"
# Number of huge pages needed to cover the peak usage
hp=$((peak/hps))
echo "Set Huge Pages: $hp"
Make sure the environment variable $PGDATA is set. Give the script execution rights:
chmod 755 find_hp.sh
Execute it, and it will tell you how many huge pages you need:
 ./find_hp.sh
Pid:             128678
VmPeak:          68986484 kB
Hugepagesize:    2048 kB
Set Huge Pages:  33684
I can now proceed to allow for 33684 huge pages on my system with
sysctl -w vm.nr_hugepages=33684
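Note that sysctl -w only lasts until the next reboot. A minimal sketch of how the setting is typically made persistent (assuming a standard Linux setup; on your distribution the right place may be a file under /etc/sysctl.d/ instead):
echo "vm.nr_hugepages=33684" >> /etc/sysctl.conf
sysctl -p
You will also want huge_pages = try (the default) or huge_pages = on in postgresql.conf, followed by a restart, so that PostgreSQL actually requests huge pages for its shared memory.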

Tuesday, February 9, 2021

How to loop through tables in a schema in PostgreSQL and show estimated number of rows

To loop through all tables in a schema called "myschema" in a database called "proddb01", put the following in a script called "find_rows.sh":
#!/bin/bash
# \t gives tuples-only output and \dt+ lists the tables in myschema; awk extracts
# the table name column, then pg_class is queried for each table's row estimate.
for a in `echo "\t \dt+ myschema.*" | psql proddb01 | awk -F '[|]' '{ print $2 }'`; do
  echo "SELECT relname, reltuples::BIGINT AS estimate FROM pg_class WHERE relname='$a';" | psql proddb01
done
Give the script execution rights and run it:
chmod 755 find_rows.sh
./find_rows.sh
Example output:
        relname           | estimate
----------------------------+----------
 table1                   |        0
(1 row)

           relname            | estimate
------------------------------+----------
 table2                       | 65525596
(1 row)

        relname        | estimate
-----------------------+-----------
 table3                | 153588080
(1 row)

      relname       | estimate
--------------------+----------
 table4             |        1
(1 row)
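If you prefer to avoid the shell loop, the same information can be fetched in a single query by joining pg_class with pg_namespace (a sketch; adjust the schema name as needed):
proddb01=# SELECT c.relname, c.reltuples::BIGINT AS estimate
           FROM pg_class c JOIN pg_namespace n ON n.oid = c.relnamespace
           WHERE n.nspname = 'myschema' AND c.relkind = 'r'
           ORDER BY c.relname;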

How to turn off output generated by psql

I guess this could be viewed as the equivalent of Oracle's "set" statements in sqlplus, for example "set heading off verify off feedback off echo off":
proddb01=# \dn
           List of schemas
           Name           |  Owner
--------------------------+----------
 sales                    | postgres
 hr                       | postgres
 manufacturing            | postgres
 public                   | postgres
(4 rows)
Turn off unnecessary output like this:
proddb01=# \t
Tuples only is on.
Try again:
proddb01=# \dn
 sales                    | postgres
 hr                       | postgres
 manufacturing            | postgres
 public                   | postgres
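The same can be achieved non-interactively with the -t flag (add -A for unaligned output):
psql -tA proddb01 -c "\dn"
This prints only the rows, which is handy in shell scripts like the one above.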

How to extract estimated number of rows from a PostgreSQL table

prod=# SELECT reltuples::BIGINT AS estimate FROM pg_class WHERE relname='yourtable';
 estimate
----------
 42223028
(1 row)
To include the table names, in case you want to check several tables in one go:
prod=# SELECT relname, reltuples::BIGINT AS estimate FROM pg_class WHERE relname IN ('mytable','yourtable');

  relname  | estimate
-----------+----------
 mytable   |        0
 yourtable |        0
(2 rows)


Unlike Oracle, which folds unquoted identifiers to upper case, PostgreSQL folds them to lower case, so table names are stored in the catalogs in lower case (unless they were created with double quotes). Use WHERE relname='yourtable', not WHERE relname='YOURTABLE'.
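Also keep in mind that reltuples is only an estimate, maintained by VACUUM and ANALYZE. If the statistics might be stale, refresh them first:
prod=# ANALYZE yourtable;
prod=# SELECT reltuples::BIGINT AS estimate FROM pg_class WHERE relname='yourtable';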

Friday, February 5, 2021

Find duplicate file names

You can find data files that share the same file name but reside in different directories or file systems by running this query in SQL*Plus against an Oracle database:
set lines 200
col "file_name" format a30
col "tablespace" format a30

set trimspool on
spool duplicates.lst
alter session set nls_language='american';

select t.name "tablespace",
       trim(substr(f.name, instr(f.name, '/', -1, 1) + 1)) "file_name",
       count(*)
from v$datafile f join v$tablespace t
on (f.ts# = t.ts#)
group by t.name,
         trim(substr(f.name, instr(f.name, '/', -1, 1) + 1))
having count(*) > 1;

exit
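Save the script as, for example, duplicate_files.sql (the name is just my choice here) and run it as a privileged user:
sqlplus -S / as sysdba @duplicate_files.sql
The result is written to duplicates.lst by the spool command.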

Friday, January 29, 2021

How to manually register missing log sequences on a standby database

I have previously documented how to identify gaps in your standby database's log sequence.
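One quick way to check for gaps directly on the standby is to query v$archive_gap in sqlplus (a minimal example):
SQL> SELECT thread#, low_sequence#, high_sequence# FROM v$archive_gap;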

After a successful rescue operation of my standby database, I had a 3-day lag behind the primary. It's easy to identify the missing sequences using the Data Guard broker command below:
show database "prod_stby" RecvQEntries

Output from this command was (abbreviated):
STANDBY_RECEIVE_QUEUE
      STATUS  RESETLOGS_ID  THREAD  LOG_SEQ       TIME_GENERATED       TIME_COMPLETED  FIRST_CHANGE#   NEXT_CHANGE#  SIZE (KBs)
 NOT_APPLIED     894886266       1   372460  01/28/2021 16:46:01  01/28/2021 16:47:35  6196037493227  6196037506094       20840
 NOT_APPLIED     894886266       1   372462  01/28/2021 17:02:35  01/28/2021 18:31:30  6196037647447  6196038652945     1350187
 NOT_APPLIED     894886266       1   372463  01/28/2021 18:31:30  01/28/2021 19:09:06  6196038652945  6196039875468     1051704
 .
 .
 .

The logfiles were physically present in the Flash Recovery Area:
cd /fra/PROD_STBY/archivelog
find . -name "*372108*"
./2021_01_25/o1_mf_1_372108_j0x8j1fc_.arc
There are two ways to inform the standby database about the presence of the logfile:

1. RMAN.
On the standby database:
rman target /
Verify that the standby database does not recognize the archivelog:
list archivelog sequence between 372106 and 372107;

using target database control file instead of recovery catalog
List of Archived Log Copies for database with db_unique_name PROD_STBY
=====================================================================

Key     Thrd Seq     S Low Time
------- ---- ------- - ---------
371883  1    372106  A 25-JAN-21
        Name: /fra/PROD_STBY/archivelog/2021_01_25/o1_mf_1_372106_j0x6x0cj_.arc

The output above confirms that sequence 372106 exists, and that sequence 372107 does not. 

To catalog the missing file:
RMAN> catalog start with '/fra/PROD_STBY/archivelog/2021_01_25/o1_mf_1_372107_j0x7k8lh_.arc';

using target database control file instead of recovery catalog
searching for all files that match the pattern /fra/PROD_STBY/archivelog/2021_01_25/o1_mf_1_372107_j0x7k8lh_.arc

List of Files Unknown to the Database
=====================================
File Name: /fra/PROD_STBY/archivelog/2021_01_25/o1_mf_1_372107_j0x7k8lh_.arc

Do you really want to catalog the above files (enter YES or NO)? YES
cataloging files...
cataloging done

List of Cataloged Files
=======================
File Name: /fra/PROD_STBY/archivelog/2021_01_25/o1_mf_1_372107_j0x7k8lh_.arc

Confirm again, and you'll see that the new file is registered:
RMAN> list archivelog sequence between 372106 and 372107;

List of Archived Log Copies for database with db_unique_name PROD_STBY
=====================================================================

Key     Thrd Seq     S Low Time
------- ---- ------- - ---------
371883  1    372106  A 25-JAN-21
        Name: /fra/PROD_STBY/archivelog/2021_01_25/o1_mf_1_372106_j0x6x0cj_.arc

371956  1    372107  A 25-JAN-21
        Name: /fra/PROD_STBY/archivelog/2021_01_25/o1_mf_1_372107_j0x7k8lh_.arc

If the number of logfiles missing is large, use a shortcut to register them all:
RMAN> catalog start with '/fra/PROD_STBY/archivelog/2021_01_25';
The above command will register all logfiles in the directory /fra/PROD_STBY/archivelog/2021_01_25
 
2. sqlplus:
SQL> alter database register logfile '/fra/PROD_STBY/archivelog/2021_01_25/o1_mf_1_372107_j0x7k8lh_.arc';
If you tail the alert log of the database, you'll see that the standby database quickly picks up the missing logfiles.
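Once the logs are registered and applied, the progress can be verified from sqlplus on the standby (a minimal check; adjust the thread number to your environment):
SQL> SELECT max(sequence#) FROM v$archived_log WHERE applied='YES' AND thread#=1;
The value should move forward as the apply process works through the backlog.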