Monday, December 18, 2017

How to make use of the GATHER_PLAN_STATISTICS hint

Use the gather_plan_statistics hint to view the optimizer's estimation of rows vs. the actual returned number of rows.

Notice the case - gather_plan_statistics - I had trouble getting it to work when stated in capital letters as GATHER_PLAN_STATISTICS. So stick to lower case!

Use the hint in the SELECT part of your query, for example:
select /*+ gather_plan_statistics */
from mytable t1 join mytable t2....

set linesize 200
set pagesize 0
select * from table
(dbms_xplan.display_cursor(NULL,NULL,'TYPICAL ALLSTATS LAST'));
exit

Alternatively, after the query has run to completion, check the cursor cache for the details of your specific SQL ID, and use the display_cursor with the SQL ID explicitly set:
set linesize 200
set pagesize 0
select * from table
(dbms_xplan.display_cursor( 'dnyrjuumj8psq',1,'TYPICAL ALLSTATS LAST'));
exit

Either way will work and give you the following example output:

Plan hash value: 4147659309

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name    | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |         |      1 |        |       |  1862 (100)|          |   3012 |00:08:14.63 |     576K|    553K|       |       |          |
|   1 |  SORT ORDER BY                           |         |      1 |      2 |   294 |  1862   (1)| 00:00:01 |   3012 |00:08:14.63 |     576K|    553K|   372K|   372K|  330K (0)|
|*  2 |   FILTER                                 |         |      1 |        |       |            |          |   3012 |00:08:45.50 |     576K|    553K|       |       |          |
|   3 |    NESTED LOOPS                          |         |      1 |      2 |   294 |  1861   (1)| 00:00:01 |   3012 |00:08:45.50 |     576K|    553K|       |       |          |
|   4 |     NESTED LOOPS                         |         |      1 |      2 |   294 |  1861   (1)| 00:00:01 |   3012 |00:08:45.34 |     573K|    553K|       |       |          |
|   5 |      NESTED LOOPS                        |         |      1 |      2 |   270 |  1859   (1)| 00:00:01 |   3012 |00:08:45.31 |     573K|    553K|       |       |          |
|*  6 |       TABLE ACCESS BY INDEX ROWID BATCHED| T1      |      1 |      1 |    31 |     3   (0)| 00:00:01 |    439 |00:00:00.27 |      89 |     73 |       |       |          |
|*  7 |        INDEX RANGE SCAN                  | T1_IDX4 |      1 |      4 |       |     2   (0)| 00:00:01 |    440 |00:00:00.01 |       3 |      1 |       |       |          |
|*  8 |       TABLE ACCESS BY INDEX ROWID BATCHED| T2      |    439 |      2 |   208 |  1856   (1)| 00:00:01 |   3012 |00:08:54.15 |     573K|    553K|       |       |          |
|*  9 |        INDEX RANGE SCAN                  | T2_IDX4 |    439 |  23118 |       |    69   (0)| 00:00:01 |     12M|00:00:54.73 |   38436 |  37439 |       |       |          |
|* 10 |      INDEX UNIQUE SCAN                   | T3_IDX1 |   3012 |      1 |       |     0   (0)|          |   3012 |00:00:00.01 |       9 |      2 |       |       |          |
|  11 |     TABLE ACCESS BY INDEX ROWID          | T3      |   3012 |      1 |    12 |     1   (0)| 00:00:01 |   3012 |00:00:00.01 |    3012 |      2 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Notice the column "E-Rows". It deviates substantially from the column A-Rows, which are the actual rows returned.

So the optimizer is clearly not able to make a good estimation about the number of rows.

Let's add a couple of statements which will give the optimizer better statistics. This particular database is not certified with any of the 12.1 adaptive tuning features, so I will have to feed the optimizer using other methods:

alter session set statistics_level='ALL';
alter session set optimizer_dynamic_sampling=11;

After running it a second time, the following plan is generated:

Plan hash value: 1509429641

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name   | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |        |      1 |        |       |   735K(100)|          |   3012 |00:09:44.25 |     572K|    551K|       |       |          |
|   1 |  SORT ORDER BY                           |        |      1 |   2986 |   428K|   735K  (1)| 00:00:29 |   3012 |00:09:44.25 |     572K|    551K|   372K|   372K|  330K (0)|
|*  2 |   FILTER                                 |        |      1 |        |       |            |          |   3012 |00:09:44.25 |     572K|    551K|       |       |          |
|*  3 |    HASH JOIN                             |        |      1 |   2986 |   428K|   735K  (1)| 00:00:29 |   3012 |00:09:44.25 |     572K|    551K|  1066K|  1066K|  954K (0)|
|   4 |     NESTED LOOPS                         |        |      1 |   2986 |   393K|   735K  (1)| 00:00:29 |   3012 |00:09:43.01 |     569K|    547K|       |       |          |
|   5 |      NESTED LOOPS                        |        |      1 |     10M|   393K|   735K  (1)| 00:00:29 |     12M|00:00:38.92 |   38488 |  37114 |       |       |          |
|*  6 |       TABLE ACCESS BY INDEX ROWID BATCHED| T1     |      1 |    439 | 13609 |    72   (0)| 00:00:01 |    439 |00:00:00.01 |      76 |      0 |       |       |          |
|*  7 |        INDEX RANGE SCAN                  | T1_IDX4|      1 |    440 |       |     2   (0)| 00:00:01 |    440 |00:00:00.01 |       3 |      0 |       |       |          |
|*  8 |       INDEX RANGE SCAN                   | T2_IDX4|    439 |  23118 |       |    69   (0)| 00:00:01 |     12M|00:00:28.47 |   38412 |  37114 |       |       |          |
|*  9 |      TABLE ACCESS BY INDEX ROWID         | T2     |     12M|      7 |   728 |  1856   (1)| 00:00:01 |   3012 |00:08:47.65 |     531K|    510K|       |       |          |
|  10 |     TABLE ACCESS FULL                    | T3     |      1 |  74866 |   877K|   572   (1)| 00:00:01 |  74866 |00:00:01.22 |    3022 |   3017 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Consider the same column - the E-Rows and the A-Rows, and how much more accurate the optimizer is estimating the rows to be returned.
Notice also that the former of these two plans generates an index lookup on the T3 table, while the ladder generates a full table scan. The FTS proved to be the most efficient, since the runtime dropped from 15 to 9 minutes.

NOTE: You can only get the results you expect when you actually execute the statement. If you try to add the hint along with the EXPLAIN PLAN FOR statement, you will see warnings as these in your output:
Note
-----
   - Warning: basic plan statistics not available. These are only collected when:
       * hint 'gather_plan_statistics' is used for the statement or
       * parameter 'statistics_level' is set to 'ALL', at session or system level

These will of course be shown even if you do set the statistics_level to ALL. As Mathew McPeak points out in an aswer on stackoverflow.com
The way you are using it, you are printing the plan of the last statement you explained, not the last statement you executed. And "explain" will not execute the query, so it will not benefit from a gather_plan_statistics hint.

1 comment:

  1. Thank You Vegard,

    Your blog is having good articles. Please share more posts on Performance tuning. Expecting more examples and scenarios on Baselines and outlines..

    Appreciate your help..



    Thank You,
    Buk Lau

    ReplyDelete