Wednesday, February 22, 2017

Oracle 12c feature: Online statistics gathering

Oracle 12c introduced the concept of "online statitistics gathering".

The white paper "Best Practices for Gathering Optimizer Statistics with Oracle Database 12c" released in June 2013, explains:


"In Oracle Database 12c, online statistics gathering "piggybacks" statistics gather as part of a direct-path data loading operation"

and

Gathering statistics as part of the data loading operation, means no additional full data scan is required to have statistics available immediately after the data is loaded.”

and

"online statistics gathering was designed to have a minimal impact on the performance of a direct path load operation it can only occur when data is being loaded into an empty object"

This can be a huge time saver in database where regular heap-organized tables are loaded and truncated as a part of their ETL Processing.

Note that some restrictions apply:

1. Histograms or index statistics are not gathered as a part of online statistics gathering, since they require additional data scans

2. Online statistics gathering will only follow a Direct-path load:

   * CTAS - Create Table As Select
   * IAS - Insert As Select
   * INSERT INTO /* +APPEND */

3. The base tables which are loaded must be empty


Any negative consequence of the first restriction can be mitigated by collecting historgram data on the table *after* the online statistics have been collected. Oracle does this without re-gathering the base column statistics:
exec dbms_stats.gather_table_stats(OwnName => 'SCOTT',TabName => 'DEPT', options=>'GATHER AUTO');

The easiest way to check if the statistics in a column has been gathered via the Online statisitcs feature, is to check the "histogram" and the "notes" column in the user|dba|all _tab_col_statistics view:

set lines 200
col column_name format a30
col notes format a30
select column_name,num_distinct,num_nulls,histogram,notes
from dba_tab_col_statistics
where table_name='DEPT';

If output in the histogram column says "NONE" and the notes columns says "STATS_ON_LOAD", Online statistics gathering has indeed been utilized.

If the output of the same columns says "HYBRID" and "HISTOGRAM_ONLY", it indicates that histograms were gathered without regathering basic column statistics.

No comments:

Post a Comment