Showing posts with label Statistics. Show all posts
Showing posts with label Statistics. Show all posts

Wednesday, June 12, 2019

How to use dbms_stats.report_col_usage



set long 10000
select dbms_stats.report_col_usage('SALES', 'ORDERS') from dual;
exit

Output could look something like this:
DBMS_STATS.REPORT_COL_USAGE('SALES','ORDERS')
--------------------------------------------------------------------------------
LEGEND:
.......

EQ         : Used in single table EQuality predicate
RANGE      : Used in single table RANGE predicate
LIKE       : Used in single table LIKE predicate
NULL       : Used in single table is (not) NULL predicate
EQ_JOIN    : Used in EQuality JOIN predicate
NONEQ_JOIN : Used in NON EQuality JOIN predicate
FILTER     : Used in single table FILTER predicate
JOIN       : Used in JOIN predicate

DBMS_STATS.REPORT_COL_USAGE('SALES','ORDERS')
--------------------------------------------------------------------------------
GROUP_BY   : Used in GROUP BY expression
...............................................................................

###############################################################################

COLUMN USAGE REPORT FOR SALES.ORDERS
.......................................................

1. ORDER_DATE                          : EQ RANGE
2. ORDER_ID                            : EQ
3. ITEM_NAME                           : EQ
###############################################################################

This makes it easy to see which columns could benefit from an index.

Thursday, May 2, 2019

What exactly is meant by "global statistics" on a partitioned table?


From the whitepaper "Understanding Optimizer Statistics":

"When dealing with partitioned tables the Optimizer relies on both the statistics for the entire table
(global statistics) as well as the statistics for the individual partitions (partition statistics) to select a
good execution plan for a SQL statement. If the query needs to access only a single partition, the
Optimizer uses only the statistics of the accessed partition. If the query access more than one partition,
it uses a combination of global and partition statistics."

Tuesday, April 9, 2019

A workaround for ORA-54033: column to be modified is used in a virtual column expression



Applicable for Oracle 12.1.0.2.

My customer reports that an attempt to execute a simple DDL statement failed:
ALTER TABLE mytable
MODIFY (COL2 DATE );

The error thrown was: ORA-54033: column to be modified is used in a virtual column expression.

Quite correctly, a look at the table revealed a virtual column:

set lines 200
col owner format a15
col table_name format a30
col column_name format a30
col data_default format a60

select owner,table_name,column_name, data_Default
from dba_tab_cols
where table_name='MYTABLE'
and hidden_column='YES';

Output:
OWNER           TABLE_NAME      COLUMN_NAME                    DATA_DEFAULT                                                                    
--------------- --------------- ------------------------------ -----------------------------------------------
SCOTT           MYTABLE         SYS_STSC13O20ML6_5OD25YOF16STK SYS_OP_COMBINED_HASH("COL1","COL2","COL3","COL4")             
1 row selected.


You can also query the DBA_STAT_EXTENSION to get similar information:
set lines 200
col extension_name format a30
col extension format a50

SELECT EXTENSION_NAME, EXTENSION,creator,droppable
FROM DBA_STAT_EXTENSIONS
WHERE TABLE_NAME='MYTABLE';

Output:
EXTENSION_NAME                 EXTENSION                      CREATOR DROPPABLE
------------------------------ ------------------------------ ------- ---------
SYS_STSC13O20ML6_5OD25YOF16STK ("COL1","COL2","COL3","COL4")  SYSTEM  YES      
1 row selected.

So in order to modify the existing column, the extended statistics will have to be dropped and recreated:
set lines 200
set serveroutput on
set timing on

BEGIN
  dbms_stats.drop_extended_stats('SCOTT', 'MYTABLE', '(COL1,COL2,COL3,COL4)');
END;
/

ALTER TABLE mytable
MODIFY (COL2 DATE );

select dbms_stats.create_extended_stats('SCOTT', 'MYTABLE', '(COL1,COL2,COL3,COL4)') 
from dual;

Wednesday, April 3, 2019

How to gather histograms on a column of a table




Use the METHOD_OPT directive in DBMS_STATS.GATHER_TABLE_STATS to collect histograms for column ARTICLE_NAME in table SALES_Q1_RESULTS. In this example, 6 buckets are defined.
Check the view DBA_TAB_COLUMNS to confirm whether or not there are histograms collected on the column:

SELECT  NUM_DISTINCT,DENSITY,NUM_NULLS,NUM_BUCKETS,HISTOGRAM
FROM    DBA_TAB_COLUMNS 
WHERE   TABLE_NAME = 'SALES_Q1_RESULTS' 
AND     COLUMN_NAME = 'ARTICLE_NAME';

NUM_DISTINCT DENSITY NUM_NULLS NUM_BUCKETS HISTOGRAM
0 0 0 0 NONE
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS (OWNNAME => 'SH', TABNAME => 'SALES_Q1_RESULTS', METHOD_OPT => 'FOR COLUMNS SIZE 6 ARTICLE_NAME');
END;
/
SELECT  NUM_DISTINCT,DENSITY,NUM_NULLS,NUM_BUCKETS,HISTOGRAM
FROM    DBA_TAB_COLUMNS 
WHERE   TABLE_NAME = 'SALES_Q1_RESULTS' 
AND     COLUMN_NAME = 'ARTICLE_NAME';
NUM_DISTINCT DENSITY NUM_NULLS NUM_BUCKETS HISTOGRAM
7 0.166666666666667 0 6 HEIGHT BALANCED


Read more in Oracles Documentation

Thursday, February 14, 2019

Syntax for using the dbms_stats.gather_index_stats procedure




How to analyze an index using dbms_stats:
BEGIN
  SYS.DBMS_STATS.GATHER_INDEX_STATS (
     OwnName           => 'SCOTT'
    ,IndName           => 'STATUS_IDX_1'
    ,Estimate_Percent  => DBMS_STATS.AUTO_SAMPLE_SIZE
    ,Degree            => DBMS_STATS.DEFAULT_DEGREE 
    ,No_Invalidate     => DBMS_STATS.AUTO_INVALIDATE);
END;
/

Official documentation here

Thursday, June 21, 2018

What is the meaning of INCREMENTAL_STALENESS in dbms_stats?



INCREMENTAL_STALENESS means "how we decide if a partition's statistics are stale".

Its values can be either USE_STALE_PERCENT or USE_LOCKED_STATS:

USE_STALE_PERCENT - a partition/subpartition is not considered as stale if DML changes are less than the
STALE_PERCENT preference value.

USE_LOCKED_STATS - locked partitions/subpartitions statistics are not considered as stale, regardless of DML changes

NULL - this is the default value, meaning a partition or subpartition is considered as stale as long as it has any DML changes

STALE_PERCENT is the value that determine the percentage of rows in a table that have to change stats before the statistics are deemed stale and should be regathered. Default = 10%


I have used the following script to implement this in one of my 12.2 database.
The last preference (DEGREE) set has nothing to do with incremental statistics, but I have found that using the default degree of parallelism is a good setting for most databases

define owner = 'SCOTT';
define table_name = 'EMP';
BEGIN
   dbms_stats.set_table_prefs('&&owner','&&table_name','INCREMENTAL','TRUE');
   dbms_stats.set_table_prefs('&&owner','&&table_name','INCREMENTAL_STALENESS','USE_STALE_PERCENT, USE_LOCKED_STATS');
   dbms_stats.set_table_prefs('&&owner','&&table_name','DEGREE','DBMS_STATS.DEFAULT_DEGREE');
END;
/


Verfiy your settings with the following script::
 
define owner = 'SCOTT';
define table_name = 'EMP';
set lines 200
set verify off
col "incremental" format a20
col "incr staleness" format a40
col "incremental level" format a10
col "stale percent" format a20
col "synopsis type" format a40
col "degree" format a30
col "granularity" format a20

select dbms_stats.get_prefs('INCREMENTAL','&&owner','&&table_name') "incremental",
        dbms_stats.get_prefs('INCREMENTAL_STALENESS','&&owner','&&table_name') "incr staleness",
        dbms_stats.get_prefs('INCREMENTAL_LEVEL','&&owner','&&table_name') "incremental level",
        dbms_stats.get_prefs('STALE_PERCENT','&&owner','&&table_name') "stale percent",
        dbms_stats.get_prefs('APPROXIMATE_NDV_ALGORITHM','&&owner','&&table_name') "synopsis type",
        dbms_stats.get_prefs('DEGREE','&&owner','&&table_name') "degree" ,
        dbms_stats.get_prefs('GRANULARITY','&&owner','&&table_name') "granularity"
FROM DUAL
/

Example output:

incremental incr staleness incremental level stale percent synopsis type degree granularity
TRUE USE_STALE_PERCENT, USE_LOCKED_STATS PARTITION 10 REPEAT OR HYPERLOGLOG 32 AUTO


Source: "Understanding Optimizer Statistics With Oracle Database 12c Release 2"

Tuesday, April 10, 2018

More on finding failed autostats jobs



In a previous post, I showed how to check if an autostats job had succeeded.

Recently I had to unlock table stats on a very large partitioned table, and then lock down older partition stats explicitly, to allow Oracle to gather statistics for the most recent partitions automatically with the auto stats job.

I wanted to know if Oracle was able to analyze the partitions within the designated 4 hours of weekday window.
When you read the code snippet below, please keep in mind that I have defined my weekday windows to start at 23:59 every evening.

I found the following query to be useful:

select distinct 
       opid, 
       target,
       target_type, 
       to_char(start_Time,'dd.mm.yyyy hh24:mi:ss') "starttime",
       to_char(end_time,'dd.mm.yyyy hh24:mi:ss')   "endtime",
       status
from dba_optstat_operation_tasks 
where opid IN (
            SELECT ID
            FROM DBA_OPTSTAT_OPERATIONS
            WHERE OPERATION LIKE 'gather_database_stats (auto)'
            AND TO_CHAR(START_TIME,'dd.mm.yyyy hh24:mi') = '09.04.2018 23:59'
)
and target like '%SALES%';
The result:

OPID TARGET TARGET_TYPE starttime endtime STATUS
47815
SH.SALES TABLE 09.04.2018 23:59:39 10.04.2018 03:25:17 COMPLETED
47815
SH.SALES.P_SALES_DATA_2016 TABLE PARTITION 10.04.2018 03:25:41 10.04.2018 03:59:00 TIMED OUT
47815
SH.SALES.P_SALES_DATA_MAX TABLE PARTITION 10.04.2018 03:25:33 10.04.2018 03:25:33 COMPLETED

All in all, the autostat job started on April 9th, 2018 at 23:59, had the following outcome:
select status,count(*)
from dba_optstat_operation_tasks 
where opid IN (
            SELECT ID
            FROM DBA_OPTSTAT_OPERATIONS
            WHERE OPERATION LIKE 'gather_database_stats (auto)'
            AND TO_CHAR(START_TIME,'dd.mm.yyyy hh24:mi') = '09.04.2018 23:59'
)
group by status;

STATUS COUNT(*)
FAILED
1
TIMED OUT
12
COMPLETED
339

From the output you can now proceed to find out why the analysis failed or timed out. The query below will list these objects for you:

select distinct opid, 
       target,
       target_type, 
    to_char(start_Time,'dd.mm.yyyy hh24:mi:ss') "starttime",
    to_char(end_time,'dd.mm.yyyy hh24:mi:ss')   "endtime",
    status
from dba_optstat_operation_tasks 
where opid IN (
            SELECT ID
            FROM DBA_OPTSTAT_OPERATIONS
            WHERE OPERATION LIKE 'gather_database_stats (auto)'
            AND TO_CHAR(START_TIME,'dd.mm.yyyy hh24:mi') = '09.04.2018 23:59'
)
and STATUS IN ('TIMED OUT','FAILED');

Wednesday, February 14, 2018

How to view default settings for dbms_stats



Thanks to the author of the blog "Thinking out loud" for writing this piece of code. It makes it easy to view the default settings that applies when gathering schema statistics:

SELECT 
  username,
  DBMS_STATS.get_prefs(ownname=>USER,pname=>'INCREMENTAL') incr,
  DBMS_STATS.get_prefs(ownname=>USER,pname=>'GRANULARITY') grty,
  DBMS_STATS.get_prefs(ownname=>USER,pname=>'STALE_PERCENT') "stale%",
  DBMS_STATS.get_prefs(ownname=>USER,pname=>'ESTIMATE_PERCENT') "estimate%",
  DBMS_STATS.get_prefs(ownname=>USER,pname=>'CASCADE') cascade,
  DBMS_STATS.get_prefs(pname=>'METHOD_OPT') method_opt,
  DBMS_STATS.get_prefs(pname=>'DEGREE') degree,
  DBMS_STATS.get_prefs(pname=>'CONCURRENT') CONC
FROM dba_users
WHERE username IN ('BILL','BOB','SCOTT','JANE')
ORDER BY username;

USERNAME INCR GRTY STALE% ESTIMATE% CASCADE METHOD_OPT DEGREE CONC
BILL TRUE AUTO 10 DBMS_STATS.AUTO_SAMPLE_SIZE DBMS_STATS.AUTO_CASCADE FOR ALL COLUMNS SIZE AUTO DBMS_STATS.DEFAULT_DEGREE OFF
BOB TRUE AUTO 10 DBMS_STATS.AUTO_SAMPLE_SIZE DBMS_STATS.AUTO_CASCADE FOR ALL COLUMNS SIZE AUTO DBMS_STATS.DEFAULT_DEGREE OFF
SCOTT TRUE AUTO 10 DBMS_STATS.AUTO_SAMPLE_SIZE DBMS_STATS.AUTO_CASCADE FOR ALL COLUMNS SIZE AUTO DBMS_STATS.DEFAULT_DEGREE OFF
JANE TRUE AUTO 10 DBMS_STATS.AUTO_SAMPLE_SIZE DBMS_STATS.AUTO_CASCADE FOR ALL COLUMNS SIZE AUTO DBMS_STATS.DEFAULT_DEGREE OFF

Tuesday, April 4, 2017

How to enable incremental statistics gathering for a large partitioned table

What is incremental statistics?

It's a way to improve performance of global statisics gathering on large partitioned tables.

Incremental statistics allows the database to scan the table once to gather partition statistics and from there to derive global statistics by aggregating partition-level statistics.

During later statistics gathering, the database only needs to scan stale partitions and then update their statistics.
From there the database can derive global statistics from the fresh partition statistics, which in turn saves a full table scan.

To view the current settings:

SQL> select dbms_Stats.get_prefs('INCREMENTAL','SCOTT','EMP') "paramvalue" from dual;

paramvalue
--------------------
FALSE

To change the setting:
SQL> exec dbms_stats.set_table_prefs(ownname=>'SCOTT',tabname=>'EMP',pname=>'INCREMENTAL',pvalue=>'TRUE');

Verify:
SQL> select dbms_Stats.get_prefs('INCREMENTAL','SCOTT','EMP') "paramvalue" from dual;

paramvalue
--------------------
TRUE

For more information about incremental statistics and how it can benefit large table, read this from the Oracle Documentation

Wednesday, February 22, 2017

How to speed up statistics gathering on a large table - inter object parallelism

When you use parallelism to analyze a specific object, Oracle refer to this as "Inter Object Parallelism".
In short, the Degree Of Parallelism (DOP) is decided based on the object's DEGREE setting, OR the DEGREE Directive in the dbms_stats.gather_table_stats procedure.

In my example, the partitioned table HR.SALES was created with DEGREE=1, which is the default.

I then collected statistics on it, like this:

BEGIN
  SYS.DBMS_STATS.GATHER_TABLE_STATS (
     OwnName        => 'HR'
    ,TabName        => 'SALES'
    ,Degree            => DBMS_STATS.DEFAULT_DEGREE 
);
END;
/
When dbms_stats.gather_table_stats was executed with DEGREE=DBMS_STATS.DEFAULT_DEGREE, Oracle used a DOP of 8, which is calculated using the formula

PARALLEL_THREADS_PER_CPU (2) * CPU_COUNT (4)

If you look in v$sql, this Call to dbms_stats triggers a parallel SQL:
/* SQL Analyze(0) */
select /*+ full(t) parallel(t,8) parallel_index(t,8) dbms_stats cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring xmlindex_sel_idx_tbl no_substrb_pad */
I found that for a table sized 100G, the statistics gathering dropped from 3 hours to 47 minutes using the DEGREE directive.

Note that Oracle will still collect statistics one partition at a time:

“You should note that setting the DEGREE for a partitioned table means that multiple parallel sever processes will be used to gather statistics on each partition but the statistics will not be gathered concurrently on the different partitions. Statistics will be gathered on each partition one after the other”.

Sources: Whitepaper "Best Practices for Gathering Optimizer Statistics with Oracle Database 12c"
Oracles Documentation about dbms_stats

Oracle 12c feature: Online statistics gathering

Oracle 12c introduced the concept of "online statitistics gathering".

The white paper "Best Practices for Gathering Optimizer Statistics with Oracle Database 12c" released in June 2013, explains:


"In Oracle Database 12c, online statistics gathering "piggybacks" statistics gather as part of a direct-path data loading operation"

and

Gathering statistics as part of the data loading operation, means no additional full data scan is required to have statistics available immediately after the data is loaded.”

and

"online statistics gathering was designed to have a minimal impact on the performance of a direct path load operation it can only occur when data is being loaded into an empty object"

This can be a huge time saver in database where regular heap-organized tables are loaded and truncated as a part of their ETL Processing.

Note that some restrictions apply:

1. Histograms or index statistics are not gathered as a part of online statistics gathering, since they require additional data scans

2. Online statistics gathering will only follow a Direct-path load:

   * CTAS - Create Table As Select
   * IAS - Insert As Select
   * INSERT INTO /* +APPEND */

3. The base tables which are loaded must be empty


Any negative consequence of the first restriction can be mitigated by collecting historgram data on the table *after* the online statistics have been collected. Oracle does this without re-gathering the base column statistics:
exec dbms_stats.gather_table_stats(OwnName => 'SCOTT',TabName => 'DEPT', options=>'GATHER AUTO');

The easiest way to check if the statistics in a column has been gathered via the Online statisitcs feature, is to check the "histogram" and the "notes" column in the user|dba|all _tab_col_statistics view:

set lines 200
col column_name format a30
col notes format a30
select column_name,num_distinct,num_nulls,histogram,notes
from dba_tab_col_statistics
where table_name='DEPT';

If output in the histogram column says "NONE" and the notes columns says "STATS_ON_LOAD", Online statistics gathering has indeed been utilized.

If the output of the same columns says "HYBRID" and "HISTOGRAM_ONLY", it indicates that histograms were gathered without regathering basic column statistics.

Wednesday, February 15, 2017

How to lock table statstics in Oracle

Use dbms_stats.lock_table_stats

conn scott
exec dbms_stats.lock_table_stats('SCOTT','DEPT');

Verify the setting:

select table_name,object_type,stattype_locked 
from user_tab_statistics;

TABLE_NAME                     OBJECT_TYPE                          STATTYPE_LOCKED
------------------------------ ------------------------------------ ---------------
DEPT                           TABLE                                ALL
EMP                            TABLE
BONUS                          TABLE
SALGRADE                       TABLE
RECEIVED_DOCUMENTS             TABLE

Lock stats on a specific partition:
exec dbms_stats.lock_partition_stats('SCOTT','RECEIVED_DOCUMENTS','SETTELM');
Verify the setting:
select table_name,partition_name,subpartition_name,object_type,stattype_locked 
from user_tab_statistics 
where table_name='RECEIVED_DOCUMENTS';

TABLE_NAME                     PARTITION_NAME  SUBPARTITION_NAME OBJECT_TYPE          STATTYPE_LOCKED
------------------------------ --------------- ----------------- -------------------- ---------------
RECEIVED_DOCUMENTS                                               TABLE
RECEIVED_DOCUMENTS             APPLICANT                         PARTITION
RECEIVED_DOCUMENTS             APPLICATIONS                      PARTITION
RECEIVED_DOCUMENTS             SETTELM                           PARTITION            ALL

To reverse the process:
exec dbms_stats.unlock_table_stats('SCOTT','DEPT);
exec dbms_stats.unlock_partition_stats('SCOTT','RECEIVED_DOCUMENTS','SETTELM');

Tuesday, December 27, 2016

How to analyze a specific partition only using dbms_stats

To analyze only specific partitions, including subpartitions, use the dbms_stats.gather_table_stats while specifying the directives PART_NAME and GRANULARITY.

The following query reveals that several of the recently added subpartitions in the COMPLETED_ORDERS partition have not yet been analyzed:
SELECT P.PARTITION_NAME "partition name",P.NUM_ROWS "partition rows",P.LAST_ANALYZED "last analyzed", SP.SUBPARTITION_NAME "subpart name", SP.NUM_ROWS "subpartition rows",SP.LAST_ANALYZED "subpart last analyzed"
FROM DBA_TAB_PARTITIONS P INNER JOIN DBA_TAB_SUBPARTITIONS SP
ON (P.TABLE_NAME = SP.TABLE_NAME)
WHERE P.TABLE_NAME='ORDERS'
AND P.PARTITION_NAME = SP.PARTITION_NAME
AND P.PARTITION_NAME = 'COMPLETED_ORDERS'
AND SP.NUM_ROWS IS NULL
ORDER BY P.PARTITION_NAME DESC,SP.SUBPARTITION_NAME;

Output:

partition name partition rows last analyzed subpart name subpartition rows subpart last analyzed
COMPLETED_ORDERS
2527257
19/10/2015 07:17:44 CO_SP_2017    
COMPLETED_ORDERS
2527257
19/10/2015 07:17:44 CO_SP_2018    
COMPLETED_ORDERS
2527257
19/10/2015 07:17:44 CO_SP_2019    
COMPLETED_ORDERS
2527257
19/10/2015 07:17:44 CO_SP_2020    

To analyze them, use dbms_stats with the PART_NAME directive, and the GRANULARITY directive set to "SUBPARTITION"
BEGIN
  SYS.DBMS_STATS.GATHER_TABLE_STATS (
     OwnName        => 'SCOTT'
    ,TabName        => 'ORDERS'
    ,partname       => 'COMPLETED_ORDERS'
    ,Estimate_Percent  => DBMS_STATS.AUTO_SAMPLE_SIZE
    ,method_opt        => 'FOR ALL COLUMNS SIZE AUTO'
    ,Degree            => DBMS_STATS.AUTO_DEGREE
    ,Cascade           => DBMS_STATS.AUTO_CASCADE
    ,granularity       => 'SUBPARTITION'
    ,No_Invalidate     => DBMS_STATS.AUTO_INVALIDATE);
END;
/

After statistics gathering, execute the query again, but without the "AND SP.NUM_ROWS IS NULL" predicative:

partition name partition rows last analyzed subpart name subpartition rows subpart last analyzed
COMPLETED_ORDERS
2527257
19/10/2015 07:17:44 CO_SP_2017
0
27/12/2016 09:10:04
COMPLETED_ORDERS
2527257
19/10/2015 07:17:44 CO_SP_2018
0
27/12/2016 09:10:05
COMPLETED_ORDERS
2527257
19/10/2015 07:17:44 CO_SP_2019
0
27/12/2016 09:10:07
COMPLETED_ORDERS
2527257
19/10/2015 07:17:44 CO_SP_2020
0
27/12/2016 09:10:08


The 12.2 version of dbms_stats.gather_table_stats is documented here

Thursday, January 22, 2015

A complete script for gathering system, data dictionary and fixed objects stats

alter session set nls_language='AMERICAN';
set timing on
set serveroutput on
set lines 200
col pname format a20

BEGIN
DBMS_STATS.GATHER_DICTIONARY_STATS (
Estimate_Percent  => DBMS_STATS.AUTO_SAMPLE_SIZE
,method_opt        => 'FOR ALL COLUMNS SIZE AUTO'
,Degree            => NULL
,Cascade           => DBMS_STATS.AUTO_CASCADE
,granularity       => 'AUTO'
,No_Invalidate     => DBMS_STATS.AUTO_INVALIDATE);
END;
/

BEGIN
DBMS_STATS.GATHER_FIXED_OBJECTS_STATS;
END;
/

SELECT PNAME, PVAL1 FROM SYS.AUX_STATS$;

PROMPT Gather workload system stats, sample for 1 hour
BEGIN
DBMS_STATS.GATHER_SYSTEM_STATS (
GATHERING_MODE=>'INTERVAL',
interval => 60,
statid   => 'DAYTIME');
END;
/

SELECT PNAME, PVAL1 FROM SYS.AUX_STATS$;
exit

Tim Hall has written a good article about how the gather_system_stats procedure works, read it at www.oracle-base.com

Wednesday, June 25, 2014

How to gather dictionary statistics and fixed-objects statistics

Use the procedures

dbms_stats.gather_dictionary_stats
and
dbms_stats.gather_fixed_objects_stats

Examples:
BEGIN
   DBMS_STATS.GATHER_DICTIONARY_STATS (
     estimate_percent  => DBMS_STATS.AUTO_SAMPLE_SIZE
     ,method_opt        => 'FOR ALL COLUMNS SIZE AUTO'
     ,degree            => NULL
     ,cascade           => DBMS_STATS.AUTO_CASCADE
     ,granularity       => 'AUTO'
     ,no_Invalidate     => DBMS_STATS.AUTO_INVALIDATE);
 END;
 /

 BEGIN
   DBMS_STATS.GATHER_FIXED_OBJECTS_STATS;
 END;
 /

Used in a simple shell script:
#!/usr/bin/ksh
# Gather dictionary and fixed objects stats as recommended by oracle

cat << EoF > ${SCRIPT_BASE}/sql/gather_dict_stats.sql
alter session set nls_date_format='DD-MM-YYYY HH24:MI:SS';
spool ${SCRIPT_BASE}/log/gather_dict_stats.log append
set timing on
select 'Start time: ' || sysdate as "startime" from dual;
execute dbms_stats.gather_dictionary_stats(estimate_percent=>DBMS_STATS.AUTO_SAMPLE_SIZE,method_opt=>'FOR ALL COLUMNS SIZE AUTO',degree=>DBMS_STATS.DEFAULT_DEGREE,granularity=>'ALL',cascade=>DBMS_STATS.AUTO_CASCADE,options=>'GATHER AUTO',no_invalidate=>DBMS_STATS.AUTO_INVALIDATE);
execute dbms_stats.gather_fixed_objects_stats(NULL);
select 'End time: ' || sysdate as "endtime" from dual;
exit
EoF
sqlplus / as sysdba @${SCRIPT_BASE}/sql/gather_dict_stats.sql
rm -r ${SCRIPT_BASE}/sql/gather_dict_stats.sql
exit


Good sources for further reading: Maria Colgan's blog

Tuesday, June 3, 2014

How to check progress of a long running statistics gathering job

If you have a long-running statistics job running, you can check it from v$session_longops:

For example, you execute:
SQL> EXECUTE dbms_stats.gather_dictionary_stats;
PL/SQL procedure successfully completed.

Check progress with:
SQL> select sofar, totalwork,units,start_time,time_remaining,message  
     from v$session_longops
     where opname = 'Gather Dictionary Schema Statistics';

SOFAR  TOTALWORK UNITS                START_TIM TIME_REMAINING MESSAGE
---------- ---------- -------------------- --------- -------------- ------------------------------------------------
       423        423 Objects              03-JUN-14              0 Gather Dictionary Schema Statistics: Dictionary Schema : 423 out of 423 Objects done

Thursday, October 31, 2013

Syntax for using the dbms_stats.gather_table_stats and gather_schema_stats procedures

To use the default options:

BEGIN
  SYS.DBMS_STATS.GATHER_TABLE_STATS (
     OwnName        => 'SCOTT'
    ,TabName        => 'EMP'
    ,Estimate_Percent  => DBMS_STATS.AUTO_SAMPLE_SIZE
    ,method_opt        => 'FOR ALL COLUMNS SIZE AUTO'
    ,Degree            => NULL
    ,Cascade           => DBMS_STATS.AUTO_CASCADE
    ,granularity       => 'AUTO'
    ,No_Invalidate     => DBMS_STATS.AUTO_INVALIDATE);
END;
/

BEGIN
    DBMS_STATS.GATHER_SCHEMA_STATS (
    ownname          => 'SCOTT',
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
    method_opt       => 'FOR ALL COLUMNS SIZE AUTO',
    degree           => DBMS_STATS.DEFAULT_DEGREE,
    granularity      => 'ALL',
    cascade          => DBMS_STATS.AUTO_CASCADE,
    options          => 'GATHER', 
    no_invalidate => DBMS_STATS.AUTO_INVALIDATE);
END;
/

Estimate_percent means percentage of rows to estamite. NULL means compute statistics, and gives the most accurate results, but may be completely innappropriate with large tables. Use the constant DBMS_STATS.AUTO_SAMPLE_SIZE to let Oracle figure out the appropriate number autmatically. In my opinion, this is sufficient for most situations.

Granularity means Granularity of statistics to collect (only relevant if the table is partitioned).
'ALL' - gathers all (subpartition, partition, and global) statistics
'AUTO' - determines the granularity based on the partitioning type. (DEFAULT)
'DEFAULT' - This option is obsolete. You should use the 'GLOBAL AND PARTITION' for this functionality.
'GLOBAL' - gathers global statistics
'GLOBAL AND PARTITION' - gathers the global and partition level statistics. No subpartition level statistics are gathered even if it is a composite partitioned object
'PARTITION' - gathers partition-level statistics
'SUBPARTITION' - gathers subpartition-level statistics

cascade means whether or not statistics for the indexes on this table should be gathered.
Using this option is equivalent to running the GATHER_INDEX_STATS Procedure on each of the table's indexes.
Use the constant DBMS_STATS.AUTO_CASCADE to have Oracle determine whether index statistics to be collected or not (DEFAULT).

degree is the degree of parallelism. The default is NULL, and instructs Oracle to use the DOP (Degree of Parallelism) specified by the DEGREE-clause when the table or index was created. By using the constant DBMS_STATS.DEFAULT_DEGREE you tell Oracle to calculate the DOP for you, using the formula (PARALLEL_THREADS_PER_CPU * CPU_COUNT). Oracle may still choose to use serial execution, if the size of the Object doesn't warrant a parallel execution. See this post for more information about the DEGREE setting.


method_opt means "collect histograms"
Can be:
FOR ALL [INDEXED | HIDDEN] COLUMNS [size_clause]
FOR COLUMNS [size clause] column|attribute [size_clause] [,column|attribute [size_clause]...]

size_clause: SIZE {integer | REPEAT | AUTO | SKEWONLY}

integer : Number of histogram buckets. Must be in the range [1,254].
'REPEAT' - Collects histograms only on the columns that already have histograms.
'AUTO' - Oracle determines the columns to collect histograms based on data distribution and the workload of the columns.
'SKEWONLY' - Oracle determines the columns to collect histograms based on the data distribution of the columns.
The default is FOR ALL COLUMNS SIZE AUTO. Note that you can change what's default using the SET_PARAM Procedure.

option GATHER AUTO is the default. GATHER forces a complete re-gathering of all tables

What is a histogram?
Column statistics may be stored as histograms. These histograms provide accurate estimates of the distribution of column data. Histograms provide improved selectivity estimates in the presence of data skew, resulting in optimal execution plans with nonuniform data distributions.

About privileges:
To call this procedure you must either a) be owner of the table or b) have been granted the ANALYZE ANY privilege.

See also this post about how to gather statistics on a specific partition only

Source: Oracle documentation