Wednesday, February 22, 2017

How to speed up statistics gathering on a large table - inter object parallelism

When you use parallelism to analyze a specific object, Oracle refer to this as "Inter Object Parallelism".
In short, the Degree Of Parallelism (DOP) is decided based on the object's DEGREE setting, OR the DEGREE Directive in the dbms_stats.gather_table_stats procedure.

In my example, the partitioned table HR.SALES was created with DEGREE=1, which is the default.

I then collected statistics on it, like this:

BEGIN
  SYS.DBMS_STATS.GATHER_TABLE_STATS (
     OwnName        => 'HR'
    ,TabName        => 'SALES'
    ,Degree            => DBMS_STATS.DEFAULT_DEGREE 
);
END;
/
When dbms_stats.gather_table_stats was executed with DEGREE=DBMS_STATS.DEFAULT_DEGREE, Oracle used a DOP of 8, which is calculated using the formula

PARALLEL_THREADS_PER_CPU (2) * CPU_COUNT (4)

If you look in v$sql, this Call to dbms_stats triggers a parallel SQL:
/* SQL Analyze(0) */
select /*+ full(t) parallel(t,8) parallel_index(t,8) dbms_stats cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring xmlindex_sel_idx_tbl no_substrb_pad */
I found that for a table sized 100G, the statistics gathering dropped from 3 hours to 47 minutes using the DEGREE directive.

Note that Oracle will still collect statistics one partition at a time:

“You should note that setting the DEGREE for a partitioned table means that multiple parallel sever processes will be used to gather statistics on each partition but the statistics will not be gathered concurrently on the different partitions. Statistics will be gathered on each partition one after the other”.

Sources: Whitepaper "Best Practices for Gathering Optimizer Statistics with Oracle Database 12c"
Oracles Documentation about dbms_stats

No comments:

Post a Comment