Tuesday, September 14, 2021

How to export and import a schema using the directory format

What exactly is the directory format used by pg_dump? 

 The documentation states: 

  Dumps can be output in script or archive file formats

 and 

The alternative archive file formats must be used with pg_restore to rebuild the database. 

Do they bring any advantages over the plain-text format?

Archive file formats allow pg_restore to

  • be selective about what is restored
  • reorder the items before they are restored
  • be portable across architectures
  • examine the contents of an archive
  • use parallel workers during export and import

    The documentation states that pg_dump, used with one of the archive file formats and combined with pg_restore, provides a flexible archival and transfer mechanism.

    There are two types of archive file format output:

  • The “custom” format (-Fc)
  • The “directory” format (-Fd)

    Both of these allow for selection and reordering of all archived items, support parallel restore, and are compressed by default, but note that only the “directory” format supports parallelization of the export utility pg_dump.
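    As an example of that selectivity, here is a minimal sketch of listing an archive's contents and restoring only chosen items. The archive path and database name are the ones used in the export example below; the list file name is my own:
    pg_restore -l /export/musicdbexport > archive.list
    # edit archive.list and comment out (prefix with ;) the items you do not want restored
    pg_restore -L archive.list -d musicdb /export/musicdbexport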

    In other words, moving data with pg_dump + pg_restore should be an easy transition for most seasoned Oracle DBAs; the usage of these utilities is quite similar to Oracle Data Pump.


    Export schema "music" in database "musicdb" using the directory format archive:
    pg_dump -Fd musicdb -n music -v -f /export/musicdbexport -j 2
    I am using two parallel workers, indicated by the -j flag.
    You need to point to a valid parent path, which here is /export. pg_dump will create the subdirectory "musicdbexport" for you; according to the documentation, it should not exist beforehand.

    If you list the path after export, you'll see files like these:
    cd /export/musicdbexport
    ls -alh
    total 40K
    drwxr-xr-x 3 postgres postgres 4.0K Sep  7 16:22 ..
    -rw-r--r-- 1 postgres postgres 8.0K Sep  7 16:22 toc.dat
    -rw-r--r-- 1 postgres postgres   53 Sep  7 16:22 3756.dat.gz
    -rw-r--r-- 1 postgres postgres  292 Sep  7 16:22 3754.dat.gz
    -rw-r--r-- 1 postgres postgres   67 Sep  7 16:22 3751.dat.gz
    -rw-r--r-- 1 postgres postgres  142 Sep  7 16:22 3748.dat.gz
    -rw-r--r-- 1 postgres postgres   25 Sep  7 16:22 3750.dat.gz
    drwx------ 2 postgres postgres 4.0K Sep  7 16:22 .
    -rw-r--r-- 1 postgres postgres   88 Sep  7 16:22 3752.dat.gz
    
    Transfer these files, including the toc.dat file, to the destination server.
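    For example, with rsync (the destination host name is a placeholder; the target path /pgdata/export matches the import example below):
    rsync -av /export/musicdbexport/ destinationserver:/pgdata/export/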

    Import schema "music" in database "musicdb" using the directory format archive:
    pg_restore /pgdata/export -C -c -d postgres -j 4 -v
    -C means "create the target database", -c means "drop the database objects before recreating them", and -j sets the number of jobs started in parallel.
    Remember that the file toc.dat must exist in the directory, otherwise you'll see the error message pg_restore: [archiver] directory "/pgdata/export" does not appear to be a valid archive ("toc.dat" does not exist)

    Remember, you don't have to drop and recreate the database; that's only one of your options. You could also import the schema "music" directly into an existing database:
    pg_restore /pgdata/export -d musicdb -j 4 -v
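    Note that for this variant the target database must already exist; if it doesn't, create it first (a sketch, adjust owner and encoding to taste):
    createdb musicdb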

    If you examine the log afterwards, you'll typically see output like this:
    pg_restore: connecting to database for restore
    
    So pg_restore simply connects to the default database (named "postgres") in order to create the database from which the schema "music" was exported.

    Following that, we see how the database "musicdb" is created:
    pg_restore: processing item 3759 ENCODING ENCODING
    pg_restore: processing item 3760 STDSTRINGS STDSTRINGS
    pg_restore: processing item 3761 SEARCHPATH SEARCHPATH
    pg_restore: processing item 3762 DATABASE musicdb
    pg_restore: creating DATABASE "musicdb"
    


    Then pg_restore connects to the new database and creates the schema there:
    pg_restore: connecting to new database "musicdb"
    pg_restore: processing item 10 SCHEMA music
    pg_restore: creating SCHEMA "music"
    


    After that, the objects are imported one by one:
    pg_restore: processing item 202 TABLE albums
    pg_restore: creating TABLE "music.albums"
    
    As with schema imports in Oracle, any users (roles) that own or have grants on the imported objects must exist beforehand for the ownership and GRANT statements to execute successfully.

    In my case, these errors were thrown because I did not take care to create a user called "music" up front:

    pg_restore: [archiver (db)] Error while PROCESSING TOC:
    pg_restore: [archiver (db)] Error from TOC entry 202; 1259 45112865 TABLE albums music
    pg_restore: [archiver (db)] could not execute query: ERROR:  role "music" does not exist
        Command was: ALTER TABLE music.albums OWNER TO music;
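    A simple way to avoid these errors is to create the role on the target cluster before running pg_restore (a sketch; add LOGIN, a password and other attributes as needed):
    -- create the owning role up front so ALTER ... OWNER TO music succeeds
    CREATE ROLE music;
    Alternatively, pg_restore can be run with the --no-owner flag to skip the ownership commands altogether.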
    


    pg_dump is documented here
    pg_restore is documented here
    Wednesday, September 8, 2021

    How to analyze a table in PostgreSQL

    Just like Oracle, PostgreSQL stores statistics about tables in an internal catalog called pg_statistic. This is used by the query planner to determine the most efficient execution plan. To gather fresh statistics on a table, use the following simple syntax:
    \connect mydatabase
    
    analyze verbose myschema.mytable;
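    To check when a table was last analyzed, manually or by autovacuum, you can query the statistics view pg_stat_user_tables (a sketch using the same placeholder names):
    SELECT schemaname, relname, last_analyze, last_autoanalyze
    FROM pg_stat_user_tables
    WHERE schemaname = 'myschema' AND relname = 'mytable';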
    
    Documentation here

    Thursday, August 26, 2021

    What is vacuuming in PostgreSQL and how do I vacuum a table?

    Under the heading "Vacuuming Basics" in the PostgreSQL documentation, you can read, among other things:
    PostgreSQL's VACUUM command has to process each table on a regular basis for several reasons:
    * To recover or reuse disk space occupied by updated or deleted rows.
    * To update data statistics used by the PostgreSQL query planner.
    * To update the visibility map, which speeds up index-only scans.
    * To protect against loss of very old data due to transaction ID wraparound or multixact ID wraparound.
    

    Each of these reasons dictates performing VACUUM operations of varying frequency and scope.


    Further explanation can be found together with the syntax description for VACUUM:

    VACUUM reclaims storage occupied by dead tuples.
    In normal PostgreSQL operation, tuples that are deleted or obsoleted by an update are not physically removed from their table; they remain present until a VACUUM is done. Therefore it's necessary to do VACUUM periodically, especially on frequently-updated tables.

    Vacuum a specific table in verbose mode:
    psql mydb
    You are now connected to database "mydb" as user "postgres".
    
    vacuum (verbose, analyze) myschema.mytable;
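    The same operation can also be run from the shell with the vacuumdb wrapper utility (a sketch using the names from the example above):
    vacuumdb --verbose --analyze --table='myschema.mytable' mydb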

    Friday, August 20, 2021

    How to fetch first n rows only in PostgreSQL

    The equivalent of Oracle's "fetch first n rows only" in PostgreSQL is:
    select *
    from myschema.mytable
    limit 10;
    
    If you want to select the next 20 rows, skipping the rows you already selected the first time, use the keyword OFFSET:
    select *
    from myschema.mytable
    limit 20 offset 10;
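    PostgreSQL also accepts the SQL-standard FETCH FIRST syntax (since version 8.4), so the Oracle wording carries over almost unchanged:
    select *
    from myschema.mytable
    offset 10 rows
    fetch first 20 rows only;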
    
    The PostgreSQL 11 documentation for this feature can be found here

    Wednesday, August 18, 2021

    Solution for no ocijdbc18 in java.library.path when using sqlcl

    sqlnet.ora:
    NAMES.DIRECTORY_PATH= (TNSNAMES, EZCONNECT)
    NAMES.DEFAULT_DOMAIN = oric.no
    
    tnsnames.ora
    testdb01,testdb01.oric.no =
      (DESCRIPTION =
        (ADDRESS = (PROTOCOL = TCP)(HOST = testserver.oric.no)(PORT = 1521))
        (CONNECT_DATA =
          (SERVER = DEDICATED)
          (SERVICE_NAME = testdb01)
        )
      )
      


    Using tnsping:
    C:\>tnsping testdb01
    
    TNS Ping Utility for 64-bit Windows: Version 19.0.0.0.0 - Production on 17-AUG-2021 12:42:14
    
    Used parameter files:
    C:\app\client\vegard\network\admin\sqlnet.ora
    Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = testserver.oric.no)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = testdb01)))
    OK (40 msec)
    
    Using sqlplus:
    C:\>sqlplus system/mypassword@testdb01
    
    SQL*Plus: Release 19.0.0.0.0 - Production on Tue Aug 17 12:39:47 2021
    Version 19.3.0.0.0
    
    Last Successful login time: Tue Aug 17 2021 12:36:35 +02:00
    
    Connected to:
    Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production
    Version 18.14.0.0.0
    
    Using sqlcl:
    C:\>sql.exe system/mypassword@testdb01
    
    SQLcl: Release 19.2.1 Production on Tue Aug 17 12:44:36 2021
    
    Copyright (c) 1982, 2021, Oracle.  All rights reserved.
    
      USER          = system
      URL           = jdbc:oracle:oci8:@testdb01
      Error Message = no ocijdbc18 in java.library.path
      USER          = system
      URL           = jdbc:oracle:thin:@testdb01
      Error Message = IO Error: could not resolve the connect identifier  "testdb01
      USER          = system
      URL           = jdbc:oracle:thin:@testdb01:1521/testdb01
      Error Message = IO Error: could not resolve the connect identifier  "testdb01:1521/testdb01"
    Username? (RETRYING) ('system/*********@testdb01'?)
    
    Why is sqlcl having trouble connecting, when the listener is accepting requests and connections through sqlplus succeed?

    My experience is that sqlcl doesn't seem to accept multiple comma-separated aliases in a single tnsnames.ora entry.

    Solution: Split the tnsnames.ora entry into two separate entries.

    Change it from
    testdb01, testdb01.oric.no =
      (DESCRIPTION =
        (ADDRESS = (PROTOCOL = TCP)(HOST = testserver.oric.no)(PORT = 1521))
        (CONNECT_DATA =
          (SERVER = DEDICATED)
          (SERVICE_NAME = testdb01)
        )
      )
      
    to
    testdb01 =
      (DESCRIPTION =
        (ADDRESS = (PROTOCOL = TCP)(HOST = testserver.oric.no)(PORT = 1521))
        (CONNECT_DATA =
          (SERVER = DEDICATED)
          (SERVICE_NAME = testdb01)
        )
      )
    
    testdb01.oric.no =
      (DESCRIPTION =
        (ADDRESS = (PROTOCOL = TCP)(HOST = testserver.oric.no)(PORT = 1521))
        (CONNECT_DATA =
          (SERVER = DEDICATED)
          (SERVICE_NAME = testdb01)
        )
      )
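    If editing tnsnames.ora is not practical, an alternative workaround (my own addition, not part of the original fix) is to bypass the alias lookup entirely and connect sqlcl with an EZConnect string:
    C:\>sql.exe system/mypassword@//testserver.oric.no:1521/testdb01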
      

    Monday, August 2, 2021

    How to work around ORA-38323: policy conflicts with policy

    You try to enable a tablespace-wide ADO policy, and it fails with the following error:
    ORA-38323: policy conflicts with policy 42
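    The statement provoking the error is a tablespace-level DEFAULT ILM ADD POLICY, for example the one added successfully at the end of this post:
    ALTER TABLESPACE DATA1
    DEFAULT ILM ADD POLICY
    ROW STORE COMPRESS ADVANCED
    GROUP AFTER 30 DAYS OF NO ACCESS;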
    
    If you look up the error with oerr ora 38323, Oracle will give you the following cause and action:
    * Cause: An attempt was made to create a policy with the same action
             type and based on the same statistic as another policy on the
             object.
     *Action: Use a different action and/or statistic for the new policy,
              or delete the old policy.
    
    Verify that there is a policy on the tablespace already:
    SELECT * 
    FROM DBA_ILMPOLICIES 
    WHERE TABLESPACE IS NOT NULL;
    
    Result:
    POLICY_NAME  POLICY_TYPE    TABLESPACE  ENABLED  DELETED
    P43          DATA MOVEMENT  DATA1       YES      NO

    There is indeed a policy named P43. What kind of policy is it?
    SELECT policy_name,action_type,scope,compression_level,condition_type,condition_days,policy_subtype
    FROM DBA_ILMDATAMOVEMENTPOLICIES 
    WHERE POLICY_NAME IN (SELECT POLICY_NAME FROM DBA_ILMPOLICIES WHERE TABLESPACE IS NOT NULL); 
    
    Result:
    POLICY_NAME  ACTION_TYPE  SCOPE  COMPRESSION_LEVEL  CONDITION_TYPE    CONDITION_DAYS  POLICY_SUBTYPE
    P43          COMPRESSION  GROUP  ADVANCED           LAST ACCESS TIME  1               DISK

    So the policy is a directive to the Oracle server to compress all objects after one day of no access. If you need to change this policy, the old one must be dropped first:
    ALTER TABLESPACE DATA1 DEFAULT ILM DELETE POLICY P43;
    
    You are now free to add a new ADO policy to the tablespace:
    ALTER TABLESPACE DATA1
    DEFAULT ILM ADD POLICY
    ROW STORE COMPRESS ADVANCED
    GROUP AFTER 30 DAYS OF NO ACCESS;
    

    How to add a default ADO compression policy to a tablespace

    Create the tablespace:
    CREATE BIGFILE TABLESPACE DATA1 DATAFILE
      '/oradata/mydb/data1.dbf' SIZE 1024M AUTOEXTEND ON NEXT 256M MAXSIZE 2T
    DEFAULT
    COMPRESS FOR OLTP
    EXTENT MANAGEMENT LOCAL AUTOALLOCATE
    SEGMENT SPACE MANAGEMENT AUTO
    DEFAULT ILM ADD POLICY 
    COMPRESS FOR ALL OPERATIONS 
    GROUP 
    AFTER 1 DAYS OF NO ACCESS;
    
    Notice the "GROUP" keyword. It specifies the scope of the ADO policy, which can be a group of related objects, a segment, or individual rows, indicated by the keywords GROUP, SEGMENT, and ROW respectively.

    According to the documentation, by default, this will give you advanced compression on heap tables, standard compression for indexes and LOW for LOB segments created in the tablespace.

    Any table created in this tablespace from now on will inherit the ADO policy. Let's see it in action:
    create table sh.test_ado_tabspc_compression as select * from sh.sales;
    
    Table created.
    
    Verify:
    SELECT policy_name,
           object_owner,
           object_name,
           subobject_name,
           object_type,
           inherited_from,
           enabled,
           deleted
    FROM   dba_ilmobjects
    WHERE  object_owner='SH'
    AND    object_name='TEST_ADO_TABSPC_COMPRESSION'
    ORDER BY 1;
    
    Result:
    POLICY_NAME  OBJECT_OWNER  OBJECT_NAME                  SUBOBJECT_NAME  OBJECT_TYPE  INHERITED_FROM  ENABLED  DELETED
    P43          SH            TEST_ADO_TABSPC_COMPRESSION                  TABLE        TABLESPACE      YES      NO

    The table has inherited the ADO policy from the tablespace it was created in.
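    For comparison, a policy can also be attached directly to a single table at segment level instead of being inherited from the tablespace. A sketch of my own (the table is the source table used above, and the 90-day condition is arbitrary):
    ALTER TABLE sh.sales ILM ADD POLICY
    ROW STORE COMPRESS ADVANCED
    SEGMENT AFTER 90 DAYS OF NO MODIFICATION;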