Wednesday, February 23, 2022

When explaining a query that is accessing a partitioned table, what does the Pstart=KEY or Pstop=KEY indicate?

Pstart=KEY or Pstop=KEY indicates that the exact partition cannot be determined at compile time, but will most likely be found at run time.

Some earlier step in the plan is producing one or more values for the partition key, so that pruning can take place.

Example: I have a composite partitioned table, with a locally partitioned index:
create table published_documents(
  UNIQUE_ID                   VARCHAR2(160 BYTE) NOT NULL,
  REGYEAR                     NUMBER(18),
  DOCUMENT_TYPE               VARCHAR2(100 CHAR),
  DOCUMENT_NAME               VARCHAR2(1000 CHAR),
  TOPIC                       VARCHAR2(30 CHAR),
  VALID                       CHAR(1 BYTE),
  VERSION                     NUMBER(18),
  DATA_XML                    CLOB,
  FORMAT                      VARCHAR2(1000 CHAR),
  PERIOD                      VARCHAR2(1000 CHAR)
)
PARTITION BY LIST (DOCUMENT_TYPE)
SUBPARTITION BY LIST (PERIOD)
...
);

create index pub_docs_idx1 on published_documents
(regyear, document_type, period)
  local;
Send the following query to the database:
select  document_type, count(*)
from myuser.published_documents
partition(LEGAL)
group by document_type;

The output is as expected:
DOCUMENT_TYPE        COUNT(*)
-------------------- --------
Affidavit                7845
Amending Agreement      29909
Contract                 6647

The query results in the following execution plan:
-------------------------------------------------------------------------------------------------------------
| Id  | Operation             | Name                | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |                     |     4 |   128 |   195M  (1)| 02:07:06 |       |       |
|   1 |  PARTITION LIST SINGLE|                     |     4 |   128 |   195M  (1)| 02:07:06 |   KEY |   KEY | 
|   2 |   HASH GROUP BY       |                     |     4 |   128 |   195M  (1)| 02:07:06 |       |       |
|   3 |    PARTITION LIST ALL |                     |  2198M|    65G|   195M  (1)| 02:07:03 |     1 |   114 |
|   4 |     TABLE ACCESS FULL | PUBLISHED_DOCUMENTS |  2198M|    65G|   195M  (1)| 02:07:03 |   KEY |   KEY |
-------------------------------------------------------------------------------------------------------------
When we specify a named partition, we can see how the optimizer limits its search to the partition mentioned in the query, but it does not yet know how many subpartitions to scan. Since there is no predicate on the PERIOD column, all 114 subpartitions must be scanned.

Note that the text "TABLE ACCESS FULL" in step 4 can be somewhat confusing: we are only talking about a full scan of the partition called "LEGAL", not of the entire table.

In my experience, specifying the partition name directly is rather unusual, and mostly done by DBAs.
Let's try it with a predicate that is more likely to be sent to the Oracle server by a user or a batch program:
select document_type, period, count(*)
from myuser.published_documents
where period = '2018-01'
group by document_type, period;
The output is as expected:
DOCUMENT_TYPE        PERIOD   COUNT(*)
-------------------- ------- ---------
Affidavit            2018-01      7845
Amending Agreement   2018-01     29909
Contract             2018-01      6647
Payroll              2018-01      7824
HA_related           2018-01     36608
Banking              2018-01     14167
IT                   2018-01      4094

The rows in the output above belong to many different partitions, but they are all from the period 2018-01.

The explain plan for this query would be:
---------------------------------------------------------------------------------------------------------------------
| Id  | Operation               | Name                      | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |                           |    50 |  1950 |  6589K  (1)| 00:04:18 |       |       |
|   1 |  PARTITION LIST ALL     |                           |    50 |  1950 |  6589K  (1)| 00:04:18 |     1 |    11 |
|   2 |   HASH GROUP BY         |                           |    50 |  1950 |  6589K  (1)| 00:04:18 |       |       |
|   3 |    PARTITION LIST SINGLE|                           |  8122K|   302M|  6589K  (1)| 00:04:18 |       |       |
|*  4 |     INDEX SKIP SCAN     | PUB_DOCS_IDX1             |  8122K|   302M|  6589K  (1)| 00:04:18 |   KEY |   KEY |
---------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("PERIOD"='2018-01')
       filter("PERIOD"='2018-01')
Here, too, we see that the optimizer first selects all 11 partitions, but then uses the partitioned index PUB_DOCS_IDX1 to find the rows that match the string '2018-01'. The optimizer does not yet know how many index subpartitions to scan; this will be determined at run time.
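
A common way to see KEY in its simplest form is to filter on the partition key with a bind variable: the value is not known when the plan is built, so the partition can only be resolved at run time. The following is a minimal sketch against the table above; the bind variable name :doc_type is an assumption made for illustration.

```sql
-- :doc_type is a hypothetical bind variable; its value is unknown at parse time
EXPLAIN PLAN FOR
SELECT COUNT(*)
FROM   myuser.published_documents
WHERE  document_type = :doc_type;

-- Show the plan, including the Pstart and Pstop columns
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

With a literal in place of the bind variable, Pstart and Pstop would normally show a partition number rather than KEY.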

Thanks to

  • Jim Brull
  • Justin Cave
    Thursday, February 3, 2022

    Observation: rman saves files according to end date of backup

    I noticed the following: An archivelog file listed with the time 31.01.2022 23:57:18 will NOT be saved in the folder for 31.01.2022. Instead, it will be saved in the folder for 01.02.2022.
    Command issued in RMAN:
     list archivelog from time '31.01.2022' until time '01.02.2022';
     
    Output (excerpt)
    450706  1    450701  A 31.01.2022 23:56:46
            Name: /u04/fra/proddb01/archivelog/2022_01_31/o1_mf_1_450701__vwojnvs6_.arc
    
    450707  1    450702  A 31.01.2022 23:57:16
            Name: /u04/fra/proddb01/archivelog/2022_01_31/o1_mf_1_450702__vwokkx0p_.arc
    
    450708  1    450703  A 31.01.2022 23:57:18
            Name: /u04/fra/proddb01/archivelog/2022_02_01/o1_mf_1_450703__vx4cmycs_.arc
    

    The file /u04/fra/proddb01/archivelog/2022_02_01/o1_mf_1_450703__vx4cmycs_.arc has the timestamp Feb 1 00:05
    So in this case, the last file generated on 31.01 actually ended up in the folder for files generated on 01.02.
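
One way to check which timestamp the folder name follows is to compare the start, end and completion times of each archived log. This is a sketch only, assuming access to the V$ARCHIVED_LOG view and the same date window as the RMAN listing above:

```sql
-- Compare start, end and completion time of each archived log with its file name
SELECT sequence#,
       first_time,        -- time of the first change in the log
       next_time,         -- time of the first change in the following log
       completion_time,   -- time the archiving of the log completed
       name
FROM   v$archived_log
WHERE  first_time >= TO_DATE('31.01.2022', 'DD.MM.YYYY')
AND    first_time <  TO_DATE('02.02.2022', 'DD.MM.YYYY')
ORDER  BY sequence#;
```

If the observation holds, the date in the directory part of NAME should follow COMPLETION_TIME rather than FIRST_TIME.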

    Monday, January 31, 2022

    Simple SQL to list

    The following SQL lists the indexes defined on a table, along with the columns and their positioning:
    SELECT I.INDEX_NAME,I.INDEX_TYPE,I.NUM_ROWS,I.DEGREE, C.COLUMN_NAME,C.COLUMN_POSITION
    FROM DBA_INDEXES I JOIN DBA_IND_COLUMNS C
    ON (I.INDEX_NAME = C.INDEX_NAME)
    WHERE I.OWNER='MYSCHEMA'
    AND I.OWNER = C.INDEX_OWNER
    AND I.TABLE_NAME='MYTABLE'
    ORDER BY I.INDEX_NAME, C.COLUMN_POSITION;
    

    Wednesday, January 5, 2022

    How to set a dynamic parameter in a postgreSQL database cluster

    As with Oracle, some parameters may be set dynamically in a PostgreSQL database cluster. A PostgreSQL database cluster uses a parameter file called postgresql.conf, which holds the cluster-wide parameters. If you set a dynamic parameter using the ALTER SYSTEM SET command, the parameter is written to another file called postgresql.auto.conf, whose values always override the ones in postgresql.conf. Before the change, postgresql.auto.conf looks like this:
    log_line_prefix = '[%m] – %p %q- %u@%h:%d – %a'
    wal_level = 'replica'
    hot_standby = 'on'
    
    I then make a change to the system configuration:
    alter system set hot_standby_feedback=on;
    ALTER SYSTEM
    
    After this change, the file postgresql.auto.conf has another entry:
    log_line_prefix = '[%m] – %p %q- %u@%h:%d – %a'
    wal_level = 'replica'
    hot_standby = 'on'
    hot_standby_feedback = 'on'
    
    I will then have to reload the configuration using the function pg_reload_conf() for the new parameter to take effect:
    postgres=#  select pg_reload_conf();
     pg_reload_conf
    ----------------
     t
    (1 row)
    
    The current logfile for the postgreSQL database cluster records this fact:
    [2022-01-03 14:45:23.127 CET] – 1683 LOG:  received SIGHUP, reloading configuration files
    [2022-01-03 14:45:23.129 CET] – 1683 LOG:  parameter "hot_standby_feedback" changed to "on"
    
    For details, check the documentation
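
Whether a reload is enough, or a restart is needed, depends on the parameter. A quick check against the pg_settings view, sketched here for the parameter used above:

```sql
-- context = 'sighup' means a reload is sufficient;
-- context = 'postmaster' means the server must be restarted
SELECT name, setting, context, source, pending_restart
FROM   pg_settings
WHERE  name = 'hot_standby_feedback';
```

The source column indicates where the active value comes from, and pending_restart flags parameters that were changed in the configuration files but still require a restart.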

    How to find out if a hot standby postgres database is receiving logs

     select pg_is_in_recovery();
     pg_is_in_recovery
    -------------------
     t
    (1 row)
    
    https://www.postgresql.org/docs/11/functions-admin.html#FUNCTIONS-RECOVERY-INFO-TABLE
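
pg_is_in_recovery() only confirms that the server is running as a standby. To see whether WAL is actually arriving and being replayed, the recovery information functions and the pg_stat_wal_receiver view can be queried on the standby. A minimal sketch, assuming PostgreSQL 10 or later:

```sql
-- Last WAL location received and replayed, and the time of the last replayed transaction
SELECT pg_last_wal_receive_lsn(),
       pg_last_wal_replay_lsn(),
       pg_last_xact_replay_timestamp();

-- State of the WAL receiver process; no row is returned if no receiver is running
SELECT status, last_msg_receipt_time, latest_end_lsn
FROM   pg_stat_wal_receiver;
```

If the received location keeps advancing between two executions while the primary is generating changes, the standby is receiving logs.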

    Wednesday, December 22, 2021

    What is the difference between "force parallel" and "enable parallel" used in the "alter session" statement in Oracle?

    What is the difference between these two statements?
    ALTER SESSION ENABLE PARALLEL DML | DDL | QUERY;
    
    and
    ALTER SESSION FORCE PARALLEL DDL | DML | QUERY;
    
    Answer:

    The difference lies in the details: the ENABLE statement merely allows parallelization when a concrete parallel directive or parallel hint is used. If none is specified, Oracle will execute the statements sequentially. The FORCE statement will parallelize everything it can with the default DOP (degree of parallelism), without you having to state anything about this in your DML, DDL or query statements.

    If the default DOP isn't good enough for you (for example during an index rebuild), you can force your session to use a DOP higher than the default, like this:
    ALTER SESSION FORCE PARALLEL DDL | DML | QUERY PARALLEL 32;
    
    This will override any other DOP in the same session and use 32 parallel workers.

    Alter session in 19c is documented here
    The concept of forcing/enabling parallelization is explained here
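
The two modes can be contrasted with a short sketch. The table is the one used in the partitioning post above, and the DOP of 8 is an arbitrary value chosen for illustration:

```sql
-- ENABLE: parallel execution is permitted, but has to be requested,
-- here with a hint on the statement itself
ALTER SESSION ENABLE PARALLEL QUERY;
SELECT /*+ PARALLEL(d 8) */ COUNT(*)
FROM   myuser.published_documents d;

-- FORCE: the same query is parallelized without any hint
ALTER SESSION FORCE PARALLEL QUERY PARALLEL 8;
SELECT COUNT(*)
FROM   myuser.published_documents;

-- Show the parallel status of the current session (ENABLED, DISABLED or FORCED)
SELECT pq_status, pdml_status, pddl_status
FROM   v$session
WHERE  sid = SYS_CONTEXT('USERENV', 'SID');
```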

    Tuesday, December 21, 2021

    How to work around error [INS-08101] Unexpected error while executing the action at state: ‘supportedOSCheck’

    Thanks so much to Martin Berger for his blog post showing how to work around an error that shows up when you attempt to install Oracle 19c software on a RHEL8 distribution.

    I wanted to install Oracle software on a RH8.5 Linux server:

    cat /etc/redhat-release
    Red Hat Enterprise Linux release 8.5 (Ootpa)
      

    This is the error I received: [INS-08101] Unexpected error while executing the action at state: 'supportedOSCheck'

    In my case, the only thing I had to do was to add another environment variable:
    export CV_ASSUME_DISTID=OEL7.8
    
    Execute the installer again and you will see a different screen:
    ./runInstaller