Showing posts with label Postgres. Show all posts

Wednesday, March 6, 2024

How to remove all postgres packages on a Linux RH server

While logged in as root on the server from which you would like to remove all postgres packages:

Stop and disable the current postgres services (in this case, postgres version 15):
systemctl stop postgresql-15
systemctl disable postgresql-15
Remove the packages using "yum remove":
rpm -qa | grep postgres | xargs yum remove -y

Thursday, September 14, 2023

How to install an extension in a postgreSQL database

Are the desired extensions available in our installation?
[postgres@myserver.no ~]$ echo "SELECT * FROM pg_available_extensions;" | psql | egrep 'cube|earthdistance';
 cube               | 1.5             |                   | data type for multidimensional cubes
 earthdistance      | 1.1             |                   | calculate great-circle distances on the surface of the Earth
Yes they are.

Source : https://www.postgresql.org/docs/current/view-pg-available-extensions.html

Connect to the relevant database and create the extensions in the schema you want them:
postgres=# \connect mydb1
You are now connected to database "mydb1" as user "postgres".
mydb1=# create extension if not exists cube with schema myschema1 cascade;
CREATE EXTENSION
mydb1=# create extension if not exists earthdistance  with schema myschema1 cascade;
CREATE EXTENSION
mydb1=# exit

Check if they have indeed been installed:
[postgres@myserver.no ~]$ psql mydb1
psql (15.4)
Type "help" for help.

mydb1=# \dx
                                                List of installed extensions
        Name        | Version |       Schema        |                              Description
--------------------+---------+---------------------+------------------------------------------------------------------------
 cube               | 1.5     | myschema1           | data type for multidimensional cubes
 earthdistance      | 1.1     | myschema1           | calculate great-circle distances on the surface of the Earth
 pg_qualstats       | 2.0.4   | public              | An extension collecting statistics about quals
 pg_stat_statements | 1.10    | public              | track planning and execution statistics of all SQL statements executed
 plpgsql            | 1.0     | pg_catalog          | PL/pgSQL procedural language
(5 rows)

Create extension is documented here
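Once installed, a quick smoke test of the earthdistance extension can be run (assuming myschema1 is on your search_path; the coordinates below are just illustrative longitude/latitude pairs):

```sql
-- points are taken as (longitude, latitude);
-- the <@> operator returns the great-circle distance in statute miles
SELECT point(-87.6298, 41.8781) <@> point(-73.9352, 40.7306) AS miles;
```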

Friday, December 2, 2022

How to alter a column to use a sequence generated number

In my example, mytable contains a column called id, which is of datatype bigint.

If you want to use a sequence to automatically populate the column at each insert, you can alter the column to support this with the below syntax:
ALTER TABLE myschema.mytable
    ALTER COLUMN id SET DEFAULT nextval('myschema.mysequence'::regclass);
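If the sequence does not exist yet, it must be created first. A minimal sketch, where the OWNED BY linkage is an optional extra:

```sql
CREATE SEQUENCE IF NOT EXISTS myschema.mysequence OWNED BY myschema.mytable.id;

ALTER TABLE myschema.mytable
    ALTER COLUMN id SET DEFAULT nextval('myschema.mysequence'::regclass);

-- subsequent inserts that omit id will now draw the next value from the sequence
```

Tying the sequence to the column with OWNED BY makes the sequence drop automatically if the column (or table) is dropped.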

Friday, October 21, 2022

Rename schema in PostgreSQL: syntax

\connect db01
alter schema schema1 rename to schema1_old;
Documented here

Rename database in PostgreSQL: syntax

alter database db01 rename to db01_old;
Documented here

See this post for how to create a new database in PostgreSQL
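Note that you cannot rename the database you are currently connected to, and no other sessions may be connected to it while it is renamed. A sketch using the names from the example above:

```sql
\connect postgres
alter database db01 rename to db01_old;
```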

Wednesday, October 19, 2022

How to prevent a user from logging into a postgres database

alter role scott with nologin;
ALTER ROLE
Remember that in postgres, "users" and "roles" are used interchangeably (a user is simply a role with the LOGIN attribute), so this would also work:
alter user scott with nologin;
ALTER ROLE
In either case, the postgres server will echo "ALTER ROLE" back to the administrator.

To see the result of such an operation:
echo "\du" | psql
Example output:
                                               List of roles
            Role name            |                         Attributes                         |      Member of
---------------------------------+------------------------------------------------------------+---------------------
 scott                           | Cannot login                                              +| {business_users}
 

You can also query the postgres data dictionary for the answer, like this:
postgres=# select rolcanlogin from pg_roles where rolname='jim';
 rolcanlogin
-------------
 t
(1 row)

postgres=# select rolcanlogin from pg_roles where rolname='scott';
 rolcanlogin
-------------
 f
(1 row)
where t and f indicate that the user can and cannot log in, respectively.
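To let the user log in again, simply reverse the operation:

```sql
alter role scott with login;
```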

Wednesday, August 31, 2022

Rename table in postgreSQL: syntax

Rename a table:
alter table myschema.mytable rename to mytable_old;
Change ownership of a table (the new owner must be a role):
alter table myschema.mytable owner to another_role;

Friday, August 26, 2022

How to generate insert statements for a table in PostgreSQL

Thanks to Carlos Becker for pointing out the following solution on how to generate insert statements for a table in PostgreSQL:
pg_dump mydb -n 'myschema' -a -Fp -t myschema.mytable --column-inserts | gzip > /data1/insert_statements.sql.gz

How to find the oid for a table in PostgreSQL

While connected to the relevant database using the connection tool of your choice, execute the following to find the oid for a table:
mydb # select 'myschema.mytable'::regclass::oid;
   oid
----------
 19561436
(1 row)
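The same oid can also be looked up in pg_class, which is what the regclass cast consults behind the scenes:

```sql
select oid from pg_class
where relname = 'mytable'
  and relnamespace = 'myschema'::regnamespace;
```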

Thursday, August 25, 2022

Query to find schema size in PostgreSQL

SELECT pg_size_pretty(sum(pg_relation_size(quote_ident(schemaname) || '.' || quote_ident(tablename)))::bigint) FROM pg_tables 
WHERE schemaname = 'yourschema';
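Note that pg_relation_size only counts the main data fork of each table. To also include indexes and TOAST data, pg_total_relation_size can be used instead; a sketch based on pg_class:

```sql
SELECT pg_size_pretty(sum(pg_total_relation_size(c.oid))::bigint)
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'yourschema'
  AND c.relkind = 'r';  -- 'r' = ordinary tables
```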

Wednesday, August 24, 2022

Does PostgreSQL support global indexes on partitioned tables?

PostgreSQL does not support Global Indexes, i.e. indexes that span all partitions in a partitioned table. The documentation states:

Although all partitions must have the same columns as their partitioned parent, partitions may have their own indexes, constraints and default values, distinct from those of other partitions.

and

Create an index on the key column(s), as well as any other indexes you might want, on the partitioned table. ... This automatically creates a matching index on each partition, and any partitions you create or attach later will also have such an index. An index or unique constraint declared on a partitioned table is “virtual” in the same way that the partitioned table is: the actual data is in child indexes on the individual partition tables.

A proposal to add such a feature seems to have been put forward back in 2019, but currently no implementation exists.

I also found this comparison between Oracle and PostgreSQL that elaborates on the topic a bit more.

Friday, April 22, 2022

How to set up replication between two postgreSQL servers

Compared to Oracle, setup of a postgreSQL physical standby database is quite simple. With two identically provisioned postgreSQL 13 servers created, the below steps must be performed.
Master: test01-sandbox-pgsql.mydomain.com
Slave: test02-sandbox-pgsql.mydomain.com

1. Configure the master database cluster

As the postgres software owner, execute the following:
echo "CREATE ROLE replicate WITH REPLICATION LOGIN password 'mypassword';" | psql
echo "grant app_user to replicate;" | psql
echo "alter system set wal_level = replica;" | psql
echo "select pg_reload_conf();" | psql
echo "alter system set synchronous_commit = local;" | psql
echo "alter system set track_commit_timestamp = on;" |  psql
2. Restart the postgres server
systemctl restart postgresql-13.service
3. Configure the slave database cluster

As the root user, stop the postgres server
systemctl stop postgresql-13.service
4. Remove any files created in the deployment of the slave. They will be replaced anyway
rm -rf /data/*
5. take a base backup of the master database cluster
pg_basebackup -h test01-sandbox-pgsql.mydomain.com -D /data/ -P -U replicate --wal-method=stream
6. Set some parameters
echo "hot_standby = on" >> /pgconfig/postgresql.auto.conf 

echo "
primary_conninfo      = 'host=test01-sandbox-pgsql.mydomain.com port=5432 user=replicate password=mypassword'
promote_trigger_file  = '/tmp/MasterNow'
" >>  /pgconfig/postgresql.conf
7. Create a file necessary for the master/slave role determination
touch /data/standby.signal
8. Start the postgres server
systemctl start postgresql-13.service
9. Verification checks

On the master, to check that everything works fine, use the dictionary view pg_stat_replication, which, according to the documentation "contain one row per WAL sender process, showing statistics about replication to that sender's connected standby server. Only directly connected standbys are listed; no information is available about downstream standby servers."
 echo 'select usename,state,sync_state,sync_priority,client_hostname from pg_stat_replication;' | psql
  usename  |   state   | sync_state | sync_priority |          client_hostname
-----------+-----------+------------+---------------+-----------------------------------
 replicate | streaming | async      |             0 | til0dbgr-sandbox-pgsql02.skead.no
(1 row)
To check the slave status, use the dictionary view pg_stat_wal_receiver. According to the documentation, the view will "contain only one row, showing statistics about the WAL receiver from that receiver's connected server":
psql
psql (13.6)
Type "help" for help.

postgres=# \x
Expanded display is on.
postgres=# select status,sender_host,conninfo from pg_stat_wal_receiver;
-[ RECORD 1 ]--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
status      | streaming
sender_host | test01-sandbox-pgsql.mydomain.com
conninfo    | user=replicate password=******** channel_binding=prefer dbname=replication host=test01-sandbox-pgsql.mydomain.com port=5432 fallback_application_name=walreceiver sslmode=prefer sslcompression=0 ssl_min_protocol_version=TLSv1.2 gssencmode=prefer krbsrvname=postgres target_session_attrs=any
Check that recovery is in progress:
 echo 'SELECT pg_is_in_recovery();' | psql
 pg_is_in_recovery
-------------------
 t
(1 row)
The equivalent query on the master node would state "f" (false). Read more about
  • pg_stat_replication
  • pg_stat_wal_receiver
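When the time comes to fail over, the slave can be promoted via the trigger file configured in step 6 above (the pg_ctl path below is an assumption for a PG 13 Red Hat install):

```shell
# Creating the trigger file configured as promote_trigger_file promotes the standby:
touch /tmp/MasterNow

# Alternatively, promote explicitly with pg_ctl (binary path assumed):
#   /usr/pgsql-13/bin/pg_ctl promote -D /data
```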

    Note that some important changes to replication were introduced in PostgreSQL version 12.

    Wednesday, April 20, 2022

    Oracle vs PostgreSQL replication terminology

    Functionality | Oracle | PostgreSQL
    A standby server that can accept connections and serves read-only queries | Active Data Guard | Hot Standby
    A standby server that cannot be connected to until it is promoted to a master server | Physical Standby Server | Warm Standby
    A data-modifying transaction is not considered committed until all servers have committed the transaction | Maximum Availability | Synchronous
    Allow some delay between the time of a commit and its propagation to the other servers. Some transactions might be lost in the switch to a backup server, and load-balanced servers might return slightly stale results | Maximum Performance | Asynchronous
    Do not allow any transactions to be unprotected at any time | Maximum Protection |

  • PostgreSQL 11 documentation here
  • Oracle 12cR2 documentation here

    Wednesday, April 6, 2022

    Wrappers for pg_dump and pg_restore

    Export:

    #!/bin/bash
    # pg_dump dumps a database as a text file or to other formats.
    export SCRIPT_NAME=`basename $0`
    export HOST=`uname -n`
    export TS=`date +\%m.\%d.\%y\_%H_%M_%S`
    export RUN_DATE=`date +\%m.\%d.\%y`
    export RUN_DIR=.
    export LOG_DIR=/tmp
    export DUMPDIR=/pgdata/export
    export JOB_NAME="ExportPGDB"
    export VERSION=1.0.0
    export LOGFILE=${LOG_DIR}/${SCRIPT_NAME}_${TS}.log
    
    exec &> ${LOGFILE}
    echo "Starting export job at " `date`
    pg_dump -Fd musicdb -n music -v -f ${DUMPDIR}/mussikkdbexport -j 2
    echo "Ending job at " `date`
    exit
    

    Import:

    #!/bin/bash
    # pg_restore - restore a PostgreSQL database from an archive file created by pg_dump
    export SCRIPT_NAME=`basename $0`
    export HOST=`uname -n`
    export TS=`date +\%m.\%d.\%y\_%H_%M_%S`
    export RUN_DATE=`date +\%m.\%d.\%y`
    export RUN_DIR=.
    export LOG_DIR=/tmp
    export DUMPDIR=/pgdata/export
    export JOB_NAME="ImportPGDB"
    export VERSION=1.0.0
    export LOGFILE=${LOG_DIR}/${SCRIPT_NAME}_${TS}.log
    exec &> ${LOGFILE}
    echo "Starting job at " `date`
    pg_restore ${DUMPDIR}/mussikkdbexport -C -c -d postgres -j 4 -v
    pg_restore ${DUMPDIR}/mussikkdbexport -c -d musicdb -j 4 -v
    echo "Ending job at " `date`
    exit
    


    More articles on pg_dump and pg_restore:

  • How to export and import a database in PostgreSQL
  • How to export and import a schema in PostgreSQL
  • How to list the contents of a custom format PostgreSQL export file
  • How to export and import a schema using the directory format
  • How to export a single table using different format in PostgreSQL
    Friday, March 11, 2022

    How to create a partial index on a table in PostgreSQL

    I was asked by one of my customers to advise on how to create an index for a new column called "sk_type", of datatype varchar, with only 3 distinct values in addition to NULL. Let's call them STRING1, STRING2, STRING3.

    The table is already partitioned on the column refyear. Here is the table DDL:
    CREATE TABLE event
    (
        id bigint NOT NULL DEFAULT nextval('hendelselager.hendelse_id_seq'::regclass),
        skpl character varying(8)  COLLATE pg_catalog."default",
        refyear integer NOT NULL,
        pnum bigint NOT NULL,
        ksystem character varying(128) COLLATE pg_catalog."default" NOT NULL,
        category character varying(128) COLLATE pg_catalog."default" NOT NULL,
        event character varying(128) COLLATE pg_catalog."default" NOT NULL,
        tstamp timestamp without time zone NOT NULL,
        ip_id character varying(128) COLLATE pg_catalog."default" NOT NULL,
        details jsonb,
        CONSTRAINT e_pkey PRIMARY KEY (refyear, event, id),
        CONSTRAINT uc_e01 UNIQUE (refyear, pnum, ksystem, category, event, ip_id)
    ) PARTITION BY RANGE (refyear);
    
    The distribution of values was expected to be very skewed right from the start:
    * STRING1 - 95%
    * STRING2 - 5%
    * STRING3 - < 0.5%
    * NULL - < 0.1%

    In the documentation I found that perhaps a partial index would be ideal in this situation. A partial index is defined as

    an index built over a subset of a table; the subset is defined by a conditional expression (called the predicate of the partial index). The index contains entries only for those table rows that satisfy the predicate.

    According to the documentation, a partial index could be useful in cases where you want to avoid indexing common values:

    Since a query searching for a common value (one that accounts for more than a few percent of all the table rows) will not use the index anyway, there is no point in keeping those rows in the index at all. This reduces the size of the index, which will speed up those queries that [actually] do use the index. It will also speed up many table update operations because the index does not need to be updated in all cases

    In other words, we would only index rows which have column value different from 'STRING1'.

    First, add the new column:
    ALTER TABLE event
    ADD sk_type character varying(8);
    
    Next, create the partial index. Here is the syntax I used in a sandbox environment:
    CREATE INDEX sk_type_idx ON event (refyear,sk_type)
     WHERE NOT (sk_type = 'STRING1');
    
    This approach comes with a caveat, which may or may not be acceptable:

    Observe that this kind of partial index requires that the common values be predetermined, so such partial indexes are best used for data distributions that do not change. The indexes can be recreated occasionally to adjust for new data distributions, but this adds maintenance effort.

    The official PostgreSQL 11 documentation can be found here
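One thing worth verifying is that the planner will actually consider the partial index: it is only used when the query's WHERE clause can be proven to imply the index predicate. A sketch using EXPLAIN on the example table:

```sql
-- can use sk_type_idx: sk_type = 'STRING2' implies sk_type <> 'STRING1'
EXPLAIN SELECT * FROM event WHERE refyear = 2022 AND sk_type = 'STRING2';

-- cannot use sk_type_idx: rows with 'STRING1' are not in the index
EXPLAIN SELECT * FROM event WHERE refyear = 2022 AND sk_type = 'STRING1';
```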

    Wednesday, January 5, 2022

    How to set a dynamic parameter in a postgreSQL database cluster

    As with Oracle, some parameters may be set dynamically in a postgreSQL database cluster. A postgreSQL database cluster uses a parameter file called postgresql.conf. This file holds the cluster-wide parameters. If you set a dynamic parameter using the ALTER SYSTEM SET command, the parameter will be written to another file called postgresql.auto.conf, whose values always override those in postgresql.conf.

    Before the change, postgresql.auto.conf looks like this:
    log_line_prefix = '[%m] – %p %q- %u@%h:%d – %a'
    wal_level = 'replica'
    hot_standby = 'on'
    
    I then make a change to the system configuration:
    alter system set hot_standby_feedback=on;
    ALTER SYSTEM
    
    After this change, the file postgresql.auto.conf has another entry:
    log_line_prefix = '[%m] – %p %q- %u@%h:%d – %a'
    wal_level = 'replica'
    hot_standby = 'on'
    hot_standby_feedback = 'on'
    
    I will then have to reload the database using the function pg_reload_conf() for the new parameter to take effect:
    postgres=#  select pg_reload_conf();
     pg_reload_conf
    ----------------
     t
    (1 row)
    
    The current logfile for the postgreSQL database cluster records this fact:
    [2022-01-03 14:45:23.127 CET] – 1683 LOG:  received SIGHUP, reloading configuration files
    [2022-01-03 14:45:23.129 CET] – 1683 LOG:  parameter "hot_standby_feedback" changed to "on"
    
    For details, check the documentation
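Whether a reload is sufficient, or a full restart is required, can be checked in pg_settings: parameters with context 'sighup' take effect on reload, while 'postmaster' parameters need a restart:

```sql
select name, context from pg_settings where name = 'hot_standby_feedback';
```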

    How to find out if a hot standby postgres database is receiving logs

     select pg_is_in_recovery();
     pg_is_in_recovery
    -------------------
     t
    (1 row)
    
    https://www.postgresql.org/docs/11/functions-admin.html#FUNCTIONS-RECOVERY-INFO-TABLE

    Wednesday, December 15, 2021

    How to find tables with specific column data types in a PostgreSQL

    Thanks to Bart Gawrych for blogging this very useful post, on which my own post is based.

    I needed to find all tables in a specific schema that used json or jsonb data type columns. Here is my query, saved in a file called "find_cols.sql":
    select col.table_schema,
           col.table_name,
           col.ordinal_position as column_id,
           col.column_name,
           col.data_type
    from information_schema.columns col
    join information_schema.tables tab on tab.table_schema = col.table_schema
                                       and tab.table_name = col.table_name
                                       and tab.table_type = 'BASE TABLE'
    where col.table_schema in ('myschema')
          and col.data_type IN ( 'json','jsonb' )
    order by col.table_schema,
             col.table_name,
             col.ordinal_position;
    
    
    Execute it in psql directly on the server, like this:
    psql mydb -f find_cols.sql
    

    Tuesday, October 26, 2021

    How to only insert rows that do not violate a specific constraint

    PostgreSQL offers an alternative to raising an exception upon a unique key violation: ON CONFLICT ON CONSTRAINT
    INSERT INTO sales.sales_history 
    SELECT * from sales.sales_history_p201801 
    ON CONFLICT ON CONSTRAINT uc_sh1 DO NOTHING;
    
    This comes in handy when transferring rows from an old table to a partially populated new table. Duplicate rows are simply ignored.

    Documentation for PostgreSQL 11 can be found here.
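The conflict target can also be given as a column list instead of a named constraint; PostgreSQL then infers a matching unique index. A sketch, where sales_id is a hypothetical column covered by a unique index:

```sql
INSERT INTO sales.sales_history
SELECT * FROM sales.sales_history_p201801
ON CONFLICT (sales_id) DO NOTHING;
```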

    Monday, October 18, 2021

    How to export a single table in PostgreSQL

    Example of exporting a single table. In the directory-format:
    pg_dump -Fd mydb -t myschema.mytable -v -f  /data/mydirectory -j 2
    
    Where
    -Fd = export into a directory-format archive
    -t = table name
    -v = verbose
    -f = the target directory
    -j = number of export jobs
    
    Note that -f can be omitted for file-based output formats.
    Since I am using the directory output format, however, I am required to use it, and it specifies the target directory instead of a file. In this case the directory is created by pg_dump and must not already exist.

    In plain text file format:
    pg_dump mydb -n 'myschema' -a -Fp  -t mytable | gzip > mydb_myschema_mytable.gz
    
    Where
    -Fp = plain text format
    -a  = data only, no metadata. In the case of plain text, this means no "create table" statement is generated
    -t  = table name
    
    When the export is done, you may unpack the mydb_myschema_mytable.gz and you will observe that the file contains COPY directives for each row.

    In custom format:
    pg_dump mydb -n 'myschema' -a -Fc  -t mytable | gzip > mydb_myschema_mytable_custom.gz
    
    Where all the options are identical to the plain text format, except for
    -Fc = custom data format