Wednesday, March 8, 2017

How to partition a table using a virtual column and online redefinition


Consider the following table:
desc DOCUMENT_SUMMARY
Navn                                      Null?    Type
----------------------------------------- -------- ----------------------------
DOCUMENTNAME                              NOT NULL VARCHAR2(100)
SUMMARY                                            VARCHAR2(32)
A SELECT * FROM DOCUMENT_SUMMARY reveals contents of the following type (excerpt only):

DOCUMENT1:2015-08:9200060665151:9000010707472:9000044000014 9b37f5bcee4ce643058ee633359b3253
DOCUMENT2:2016-01:9200060665151:8837773777371:9000088000009 17bf7170217960f303755bdd4c648676

The table is quite big, and expected to grow.
The customer would like to get it partitioned.

The column DOCUMENTNAME already contains a string that is partly generated from a date, as revealed by the query below:

SELECT SUBSTR(DOCUMENTNAME,32,7) "datepart" 
FROM DOCUMENT_SUMMARY
FETCH FIRST 2 ROWS ONLY;

datepart
2015-01
2015-01


We could most likely use a substring of the value to create a new, virtual column and use it as a partition key.

So let's do that:
CREATE TABLE DOCUMENT_SUMMARY_INTERIM
(
  DOCUMENTNAME  VARCHAR2(100 BYTE),
  SUMMARY       VARCHAR2(32 BYTE),
  PERIOD AS (
     CAST(
        TO_DATE(SUBSTR(documentname,32,7),'YYYY-MM') 
        AS DATE)
  ) VIRTUAL
)
PARTITION BY RANGE(PERIOD)
INTERVAL(NUMTOYMINTERVAL(1,'MONTH') )
(
PARTITION P_INIT VALUES LESS THAN (TO_DATE('2013-01','YYYY-MM') )
)
TABLESPACE USERS
;

Run through a normal online redefinition, and the tables should be switched seamlessly.

Selecting from the new table shows that the date is being generated as a virtual column for every row:

SELECT * 
FROM DOCUMENT_SUMMARY
FETCH FIRST 2 ROWS ONLY;

DOKUMENTNAME SUMMARY PERIOD
DOCUMENT1:2015-08:9200060665151:9000010707472:9000044000014 f1081c88eb379b736954641589d5715b 01.08.2015
DOCUMENT2:2016-01:9200060665151:8837773777371:9000088000009 051722d803b2fbaca4192a047b7689a6 01.01.2016

You can verify your partition key attributes with this SQL:

SELECT K.COLUMN_NAME, C.DATA_TYPE,C.VIRTUAL_COLUMN ,PT.INTERVAL
FROM DBA_PART_KEY_COLUMNS K JOIN DBA_TAB_COLS C
    ON (K.COLUMN_NAME = C.COLUMN_NAME) 
        JOIN DBA_PART_TABLES PT ON PT.TABLE_NAME = C.TABLE_NAME
WHERE  K.NAME='DOCUMENT_SUMMARY' ;

COLUMN_NAME DATA_TYPE VIRTUAL_COLUMN INTERVAL
PERIODE DATE YES NUMTOYMINTERVAL(1,'MONTH')

Thursday, February 23, 2017

What is the INHERIT PRIVILEGES granted to PUBLIC in 12c?

I was checking which object privileges my user had made, and I found the following:

select * from user_tab_privs_made;

GRANTEE TABLE_NAME GRANTOR PRIVILEGE GRANTABLE HIERARCHY COMMON TYPE
PUBLIC MYUSER MYUSER INHERIT PRIVILEGES NO NO NO USER

I couldn't understand when this privilege had been made, and why the type was set to USER.

Searching the internet, I soon found that this is a new privilege in Oracle 12c. By default the privilege INHERIT PRIVILEGES is granted to PUBLIC, and it secures a weakness where a malicious user may accidently get access to a privilege that user shouldn't have.

For backward compability INHERIT PRIVILEGES is granted to PUBLIC in Oracle 12c, but it can and probably should be revoked.

Tim Hall has setup a simple example on how this new privilege can secure your database. Instead of trying to mimic his work, I will simply refer you to his site.

You can find his article at oracle-base.com

See also the release changes section for Oracle 12c


Wednesday, February 22, 2017

How to alter a subpartition template


Use the SET SUBPARTITION TEMPLATE clause, like this:
ALTER TABLE SCOTT.MYTABLE
SET SUBPARTITION TEMPLATE
  (SUBPARTITION SP_2014 VALUES ('2014') TABLESPACE USERS,
   SUBPARTITION SP_2015 VALUES ('2015') TABLESPACE USERS,
   SUBPARTITION SP_2016 VALUES ('2016') TABLESPACE USERS,
   SUBPARTITION SP_2017 VALUES ('2017') TABLESPACE USERS,
   SUBPARTITION SP_2018 VALUES ('2018') TABLESPACE USERS,
   SUBPARTITION SP_2019 VALUES ('2019') TABLESPACE USERS,
   SUBPARTITION SP_2020 VALUES ('2020') TABLESPACE USERS,
   SUBPARTITION SP_DEFAULT VALUES (default) TABLESPACE USERS
  );

Source: Oracle Documentation

Insert As Select statements

The syntax for an Insert As Select Statement would be:
INSERT INTO MY_INTERIM_TABLE (E_ID, ENAME,CREATED_DT, REVISION, FISCALYEAR, FPERIOD, PHONENBR)
                       SELECT E_ID, ENAME,CREATED_DT, REVISION, FISCALYEAR, TO_DATE(FPERIOD,'YYYY-MM'), PHONENBR
                       FROM   MY_ORIGINAL_TABLE
                       FETCH FIRST 5 ROWS ONLY;

The FETCH FIRST num ROWS ONLY is a convenient way to limit the number of rows when you are setting up tables for test purposes.

How to create a range partitioned table



Using interval partitioning:

A simple example of an Interval Range Partitioned table:
CREATE TABLE DOCUMENTS
(
  DOCNAME          VARCHAR2(100 BYTE),
  TRACKING#        VARCHAR2(32 BYTE),
  ESTABLISHED_DT   DATE
)
PARTITION BY RANGE (ESTABLISHED_DT)
INTERVAL
(
   NUMTOYMINTERVAL(1,'MONTH')
)
(
PARTITION P_INIT VALUES LESS THAN (TO_DATE('2014-03','YYYY-MM') )
)
TABLESPACE USERS;

If you use INTERVAL partitioning, you are not allowed to specify an overflow partition.

Without internval partitioning:

Before Oracle 11g, this was the only way to create a RANGE partitioned table:
You had to manually define your partitions, and remember to manually add new ones if you expected new partition key values to be inserted.

Here I am using a column of type NUMBER to implement range partitioning.
I am defining an overflow partition to catch values outside the range of expected values:

CREATE TABLE mytable
(
  ID                      NUMBER(18)           NOT NULL,
  f_name                  NUMBER(18)           NOT NULL,
  l_name                  NUMBER(18)           NOT NULL,
  created_year            NUMBER(4)            NOT NULL
   ....
)
PARTITION BY RANGE (created_year)
(
PARTITION P_2003 values less than (2004) TABLESPACE 2003_DATA,
PARTITION P_2004 values less than (2005) TABLESPACE 2004_DATA,
PARTITION P_2005 values less than (2006) TABLESPACE 2005_DATA,
PARTITION P_DEFAULT values less than (MAXVALUE)
);

How to speed up statistics gathering on a large table - inter object parallelism

When you use parallelism to analyze a specific object, Oracle refer to this as "Inter Object Parallelism".
In short, the Degree Of Parallelism (DOP) is decided based on the object's DEGREE setting, OR the DEGREE Directive in the dbms_stats.gather_table_stats procedure.

In my example, the partitioned table HR.SALES was created with DEGREE=1, which is the default.

I then collected statistics on it, like this:

BEGIN
  SYS.DBMS_STATS.GATHER_TABLE_STATS (
     OwnName        => 'HR'
    ,TabName        => 'SALES'
    ,Degree            => DBMS_STATS.DEFAULT_DEGREE 
);
END;
/
When dbms_stats.gather_table_stats was executed with DEGREE=DBMS_STATS.DEFAULT_DEGREE, Oracle used a DOP of 8, which is calculated using the formula

PARALLEL_THREADS_PER_CPU (2) * CPU_COUNT (4)

If you look in v$sql, this Call to dbms_stats triggers a parallel SQL:
/* SQL Analyze(0) */
select /*+ full(t) parallel(t,8) parallel_index(t,8) dbms_stats cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring xmlindex_sel_idx_tbl no_substrb_pad */
I found that for a table sized 100G, the statistics gathering dropped from 3 hours to 47 minutes using the DEGREE directive.

Note that Oracle will still collect statistics one partition at a time:

“You should note that setting the DEGREE for a partitioned table means that multiple parallel sever processes will be used to gather statistics on each partition but the statistics will not be gathered concurrently on the different partitions. Statistics will be gathered on each partition one after the other”.

Sources: Whitepaper "Best Practices for Gathering Optimizer Statistics with Oracle Database 12c"
Oracles Documentation about dbms_stats

Oracle 12c feature: Online statistics gathering

Oracle 12c introduced the concept of "online statitistics gathering".

The white paper "Best Practices for Gathering Optimizer Statistics with Oracle Database 12c" released in June 2013, explains:


"In Oracle Database 12c, online statistics gathering "piggybacks" statistics gather as part of a direct-path data loading operation"

and

Gathering statistics as part of the data loading operation, means no additional full data scan is required to have statistics available immediately after the data is loaded.”

and

"online statistics gathering was designed to have a minimal impact on the performance of a direct path load operation it can only occur when data is being loaded into an empty object"

This can be a huge time saver in database where regular heap-organized tables are loaded and truncated as a part of their ETL Processing.

Note that some restrictions apply:

1. Histograms or index statistics are not gathered as a part of online statistics gathering, since they require additional data scans

2. Online statistics gathering will only follow a Direct-path load:

   * CTAS - Create Table As Select
   * IAS - Insert As Select
   * INSERT INTO /* +APPEND */

3. The base tables which are loaded must be empty


Any negative consequence of the first restriction can be mitigated by collecting historgram data on the table *after* the online statistics have been collected. Oracle does this without re-gathering the base column statistics:
exec dbms_stats.gather_table_stats(OwnName => 'SCOTT',TabName => 'DEPT', options=>'GATHER AUTO');

The easiest way to check if the statistics in a column has been gathered via the Online statisitcs feature, is to check the "histogram" and the "notes" column in the user|dba|all _tab_col_statistics view:

set lines 200
col column_name format a30
col notes format a30
select column_name,num_distinct,num_nulls,histogram,notes
from dba_tab_col_statistics
where table_name='DEPT';

If output in the histogram column says "NONE" and the notes columns says "STATS_ON_LOAD", Online statistics gathering has indeed been utilized.

If the output of the same columns says "HYBRID" and "HISTOGRAM_ONLY", it indicates that histograms were gathered without regathering basic column statistics.