Oracle Tips and Tricks — David Fitzjarrell

November 13, 2016

“If You Change This, Right Here …”

Filed under: Performance,stats — dfitzjarrell @ 10:47

"changes are so frightening."
-- Norton Juster, The Phantom Tollbooth

It isn’t unusual for Oracle to expand functionality with each new release; a short list includes AWR, ADDM, the SYSAUX tablespace, Flashback Query, Flashback Database and Edition-based Redefinition. Oracle 12.1 is no different as it provides an interesting performance enhancement, the in-memory column store, that can increase performance by reducing physical I/O load by making tables memory-resident (to simplify the description somewhat). In addition to the in-memory access to table data Oracle has introduced a new optimization called “Vector Transformation, including Vector Optimization” for in-memory operations. Even when this new optimization isn’t in use there are new code paths to take advantage of this technology. As a result execution plans can change; let’s look at an example originally provided by Jonathan Lewis that illustrates this.

(In-memory is fairly easy to configure. Set inmemory_size to a ‘reasonable’ non-zero value and then restart the database:


SQL> alter system set inmemory_size=500m scope=spfile;

System altered.

SQL>

For a complete reference to in-memory configuration read here.)

Two tables are created, neither one especially noteworthy, an index is created on each table and extended statistics are generated on the (n1,n2) column grouping:


SQL> 
SQL> create table t1
  2  nologging
  3  as
  4  with generator as (
  5  	     select  --+ materialize
  6  		     rownum id
  7  	     from dual
  8  	     connect by
  9  		     level  
SQL> create table t2
  2  nologging
  3  as
  4  with generator as (
  5  	     select  --+ materialize
  6  		     rownum id
  7  	     from dual
  8  	     connect by
  9  		     level 'T1',
  5  		     method_opt       => 'for columns (n1,n2) size 1'
  6  	     );
  7  end;
  8  /
SQL> 

Executing the following query and displaying the execution plan shows no surprises since neither table has been altered to be in-memory:


SQL> 
SQL> select
  2  	     /*+
  3  		     qb_name(main)
  4  	     */
  5  	     count(*)
  6  from    (
  7  	     select
  8  		     /*+ qb_name(inline) */
  9  		     distinct t1.v1, t2.v1
 10  	     from
 11  		     t1,t2
 12  	     where
 13  		     t1.n1 = 50
 14  	     and     t1.n2 = 50
 15  	     and     t2.rand = t1.id
 16  	     )
 17  ;

      9912                                                                                                                                            
SQL> 
SQL> select * from table(dbms_xplan.display_cursor);

SQL_ID  a7gb3ncf922zz, child number 0                                                                                                                 
-------------------------------------                                                                                                                 
select         /*+                 qb_name(main)         */                                                                                           
count(*) from    (         select                 /*+ qb_name(inline)                                                                                 
*/                 distinct t1.v1, t2.v1         from                                                                                                 
t1,t2         where                 t1.n1 = 50         and     t1.n2 =                                                                                
50         and     t2.rand = t1.id         )                                                                                                          
                                                                                                                                                      
Plan hash value: 1718706536                                                                                                                           
                                                                                                                                                      
-------------------------------------------------------------------------------------------------                                                     
| Id  | Operation                               | Name  | Rows  | Bytes | Cost (%CPU)| Time     |                                                     
-------------------------------------------------------------------------------------------------                                                     
|   0 | SELECT STATEMENT                        |       |       |       |  5127 (100)|          |                                                     
|   1 |  SORT AGGREGATE                         |       |     1 |       |            |          |                                                     
|   2 |   VIEW                                  |       | 10001 |       |  5127   (1)| 00:00:01 |                                                     
|   3 |    HASH UNIQUE                          |       | 10001 |   351K|  5127   (1)| 00:00:01 |                                                     
|*  4 |     HASH JOIN                           |       | 10001 |   351K|  5125   (1)| 00:00:01 |                                                     
|*  5 |      TABLE ACCESS BY INDEX ROWID BATCHED| T1    |   100 |  2100 |     3   (0)| 00:00:01 |                                                     
|*  6 |       INDEX RANGE SCAN                  | T1_N1 |   100 |       |     1   (0)| 00:00:01 |                                                     
|   7 |      TABLE ACCESS FULL                  | T2    |  1000K|    14M|  5113   (1)| 00:00:01 |                                                     
-------------------------------------------------------------------------------------------------                                                     
                                                                                                                                                      
Predicate Information (identified by operation id):                                                                                                   
---------------------------------------------------                                                                                                   
                                                                                                                                                      
   4 - access("T2"."RAND"="T1"."ID")                                                                                                                  
   5 - filter("T1"."N2"=50)                                                                                                                           
   6 - access("T1"."N1"=50)                                                                                                                           
                                                                                                                                                      
SQL> 

Having the extended statistics allowed the optimizer to estimate the correct cardinality between columns n1 and n2, providing a better initial plan. Now things will get a bit interesting; table T2 is altered and put in-memory:


SQL> 
SQL> alter table t2 inmemory;
SQL> 

The same query is executed again and the plan changes because of T2 now being in-memory, even though the new vector transformation has not been applied:


SQL> 
SQL> select
  2  	     /*+
  3  		     qb_name(main)
  4  	     */
  5  	     count(*)
  6  from    (
  7  	     select
  8  		     /*+ qb_name(inline) */
  9  		     distinct t1.v1, t2.v1
 10  	     from
 11  		     t1,t2
 12  	     where
 13  		     t1.n1 = 50
 14  	     and     t1.n2 = 50
 15  	     and     t2.rand = t1.id
 16  	     )
 17  ;

      9912                                                                                                                                            
SQL> 
SQL> select * from table(dbms_xplan.display_cursor);

SQL_ID  a7gb3ncf922zz, child number 0                                                                                                                 
-------------------------------------                                                                                                                 
select         /*+                 qb_name(main)         */                                                                                           
count(*) from    (         select                 /*+ qb_name(inline)                                                                                 
*/                 distinct t1.v1, t2.v1         from                                                                                                 
t1,t2         where                 t1.n1 = 50         and     t1.n2 =                                                                                
50         and     t2.rand = t1.id         )                                                                                                          
                                                                                                                                                      
Plan hash value: 106371239                                                                                                                            
                                                                                                                                                      
----------------------------------------------------------------------------------------------------                                                  
| Id  | Operation                                | Name    | Rows  | Bytes | Cost (%CPU)| Time     |                                                  
----------------------------------------------------------------------------------------------------                                                  
|   0 | SELECT STATEMENT                         |         |       |       |   223 (100)|          |                                                  
|   1 |  SORT AGGREGATE                          |         |     1 |       |            |          |                                                  
|   2 |   VIEW                                   |         | 10001 |       |   223  (15)| 00:00:01 |                                                  
|   3 |    HASH UNIQUE                           |         | 10001 |   351K|   223  (15)| 00:00:01 |                                                  
|*  4 |     HASH JOIN                            |         | 10001 |   351K|   222  (14)| 00:00:01 |                                                  
|   5 |      JOIN FILTER CREATE                  | :BF0000 |   100 |  2100 |     3   (0)| 00:00:01 |
|*  6 |       TABLE ACCESS BY INDEX ROWID BATCHED| T1      |   100 |  2100 |     3   (0)| 00:00:01 |                                                  
|*  7 |        INDEX RANGE SCAN                  | T1_N1   |   100 |       |     1   (0)| 00:00:01 |                                                  
|   8 |      JOIN FILTER USE                     | :BF0000 |  1000K|    14M|   209  (11)| 00:00:01 |
|*  9 |       TABLE ACCESS INMEMORY FULL         | T2      |  1000K|    14M|   209  (11)| 00:00:01 |                                                  
----------------------------------------------------------------------------------------------------                                                  
                                                                                                                                                      
Predicate Information (identified by operation id):                                                                                                   
---------------------------------------------------                                                                                                   
                                                                                                                                                      
   4 - access("T2"."RAND"="T1"."ID")                                                                                                                  
   6 - filter("T1"."N2"=50)                                                                                                                           
   7 - access("T1"."N1"=50)                                                                                                                           
   9 - inmemory(SYS_OP_BLOOM_FILTER(:BF0000,"T2"."RAND"))
       filter(SYS_OP_BLOOM_FILTER(:BF0000,"T2"."RAND"))
                                                                                                                                                      
SQL> 

The cost of the tablescan was reduced considerably, from 5113 to 209 but that isn’t the most thought-provoking part of the plan, however, as a serial Bloom filter is used for the hash join due to the in-memory code path. With a table stored ‘normally’ (not in-memory) a Bloom filter would not be effective; with the in-memory option a Bloom filter can use enhancements such as simultaneous multiple probe access into table t2 (in this example) in addition to the benefits of avoiding disk access, disk buffer activity and row-by-row data retrieval. All of those avoidances reduce CPU time considerably when compared to a ‘regular’ table, one that is not in-memory.

The in-memory column store is a very nice addition to an already powerful database, made even better by new optimizer transformations that speed data access by providing execution paths not available for tables accessed from disk. ‘Experiment’ with the in-memory option to see what performance improvements it might provide; you might be pleasantly surprised at what you find.

Sometimes unexpected change is good.

January 8, 2016

“Say What?!?!?”

Filed under: General,Indexes,Performance — dfitzjarrell @ 16:04

"The only thing you can do easily is be wrong, and that's hardly worth the effort." 
Norton Juster, The Phantom Tollbooth

Oracle can lie to you. Not like a disreputable used-car salesman but more like the ‘little white lie’ sometimes told in order to hide less-than-desirable parts of the truth. And it’s not Oracle, really, it’s the optimizer and it does it by reporting query plans that may not accurately report the execution path. Sometimes we can make Oracle tell lies that it doesn’t know its telling, by using features not everyone uses. Take, for example, NLS settings. There have been improvements in National Language Support in most Oracle releases, and those improvements can introduce behavior that causes Oracle to lie to you. Let’s look at an example where this is true.

Jonathan Lewis brought this example to light in a recent blog post where he dislplayed the plan using dbms_xplan. After reading the post I decided to add to it by using autotrace to generate the plan, just to see if the two plans matched. Let’s look at the modified example:


SQL> --
SQL> --  Setup NLS parameters
SQL> --
SQL> ALTER session SET nls_sort=binary_ci;

Session altered.

SQL> ALTER session SET nls_comp=linguistic;

Session altered.

SQL> 
SQL> --
SQL> -- Create table, populate table and
SQL> -- create index
SQL> --
SQL> CREATE TABLE log_data(
  2    account_id NUMBER,
  3    log_type NUMBER,
  4    sys_name VARCHAR2(30),
  5    log_time TIMESTAMP,
  6    msg varchar2(4000)
  7  )
  8  nologging
  9  ;

Table created.

SQL> 
SQL> insert /*+ append */ into log_data(
  2    account_id,
  3    log_type,
  4    sys_name,
  5    log_time,
  6    msg
  7  )
  8  select
  9  	     5,
 10  	     2,
 11  	     dbms_random.string('a',1),
 12  	     sysdate + dbms_random.value,
 13  	     rpad('x',200)
 14  from
 15  	     dual
 16  connect by
 17  	     level  user,
  4  		     tabname	      =>'LOG_DATA',
  5  		     method_opt       => 'for all columns size 1'
  6  	     );
  7  end;
  8  /

PL/SQL procedure successfully completed.

SQL> 

The setup is complete so let’s see what autotrace reports as the execution plan:


SQL> 
SQL> column msg format a20
SQL> 
SQL> --
SQL> -- Use autotrace to generate the plan
SQL> --
SQL> set autotrace on
SQL> 
SQL> SELECT
  2  	     *
  3  FROM
  4    (
  5  	 SELECT
  6  	     sys_name, log_time,  substr(msg,1,40) msg
  7  	 FROM log_data
  8  	 WHERE
  9  	   account_id=5
 10  	   AND log_type=2
 11  	   AND sys_name='a'
 12  	 ORDER BY
 13  	   log_time  desc
 14    )
 15  WHERE
 16    rownum <= 10
 17  ;

SYS_NAME                       LOG_TIME                                                                    MSG                                        
------------------------------ --------------------------------------------------------------------------- --------------------                       
A                              09-JAN-16 02.42.54.000000 PM                                                x                                          
a                              09-JAN-16 02.41.02.000000 PM                                                x                                          
a                              09-JAN-16 02.40.54.000000 PM                                                x                                          
a                              09-JAN-16 02.36.38.000000 PM                                                x                                          
A                              09-JAN-16 02.36.02.000000 PM                                                x                                          
a                              09-JAN-16 02.34.15.000000 PM                                                x                                          
a                              09-JAN-16 02.31.15.000000 PM                                                x                                          
A                              09-JAN-16 02.30.59.000000 PM                                                x                                          
a                              09-JAN-16 02.30.36.000000 PM                                                x                                          
a                              09-JAN-16 02.29.53.000000 PM                                                x                                          

10 rows selected.


Execution Plan
----------------------------------------------------------                                                                                            
Plan hash value: 1444908817                                                                                                                           
                                                                                                                                                      
--------------------------------------------------------------------------------------------                                                          
| Id  | Operation                       | Name     | Rows  | Bytes | Cost (%CPU)| Time     |                                                          
--------------------------------------------------------------------------------------------                                                          
|   0 | SELECT STATEMENT                |          |    10 |  1120 |    13   (0)| 00:00:01 |                                                          
|*  1 |  COUNT STOPKEY                  |          |       |       |            |          |                                                          
|   2 |   VIEW                          |          |    11 |  1232 |    13   (0)| 00:00:01 |                                                          
|*  3 |    SORT ORDER BY STOPKEY        |          |    11 |  2453 |    13   (0)| 00:00:01 |                                                          
|   4 |     TABLE ACCESS BY INDEX ROWID | LOG_DATA |  1000 |   217K|    13   (0)| 00:00:01 |                                                          
|*  5 |      INDEX RANGE SCAN DESCENDING| LOG_DATE |    11 |       |     2   (0)| 00:00:01 |                                                          
--------------------------------------------------------------------------------------------                                                          
                                                                                                                                                      
Predicate Information (identified by operation id):                                                                                                   
---------------------------------------------------                                                                                                   
                                                                                                                                                      
   1 - filter(ROWNUM<=10)                                                                                                                             
   3 - filter(ROWNUM 
SQL> set autotrace off
SQL> 

SQL> 
SQL> --
SQL> -- Now use dbms_xplan to generate the plan
SQL> --
SQL> SELECT /*+ gather_plan_statistics */
  2  	     *
  3  FROM
  4    (
  5  	 SELECT
  6  	     sys_name, log_time,  substr(msg,1,40) msg
  7  	 FROM log_data
  8  	 WHERE
  9  	   account_id=5
 10  	   AND log_type=2
 11  	   AND sys_name='a'
 12  	 ORDER BY
 13  	   log_time  desc
 14    )
 15  WHERE
 16    rownum 'allstats last'));

PLAN_TABLE_OUTPUT                                                                                                                                     
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  36qykj7j9tsrp, child number 0                                                                                                                 
-------------------------------------                                                                                                                 
SELECT /*+ gather_plan_statistics */         * FROM   (     SELECT                                                                                    
   sys_name, log_time,  substr(msg,1,40) msg     FROM log_data                                                                                        
WHERE       account_id=5       AND log_type=2       AND sys_name='a'                                                                                  
 ORDER BY       log_time  desc   ) WHERE   rownum <= 10                                                                                               
                                                                                                                                                      
Plan hash value: 1444908817                                                                                                                           
                                                                                                                                                      
---------------------------------------------------------------------------------------------------------------------------------                     
| Id  | Operation                       | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |                     

PLAN_TABLE_OUTPUT                                                                                                                                     
------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------                     
|   0 | SELECT STATEMENT                |          |      1 |        |     10 |00:00:00.01 |     979 |       |       |          |                     
|*  1 |  COUNT STOPKEY                  |          |      1 |        |     10 |00:00:00.01 |     979 |       |       |          |                     
|   2 |   VIEW                          |          |      1 |     11 |     10 |00:00:00.01 |     979 |       |       |          |                     
|*  3 |    SORT ORDER BY STOPKEY        |          |      1 |     11 |     10 |00:00:00.01 |     979 |  2048 |  2048 | 2048  (0)|                     
|   4 |     TABLE ACCESS BY INDEX ROWID | LOG_DATA |      1 |   1000 |    975 |00:00:00.01 |     979 |       |       |          |                     
|*  5 |      INDEX RANGE SCAN DESCENDING| LOG_DATE |      1 |     11 |    975 |00:00:00.01 |       5 |       |       |          |                     
---------------------------------------------------------------------------------------------------------------------------------                     
                                                                                                                                                      
Predicate Information (identified by operation id):                                                                                                   
---------------------------------------------------                                                                                                   

PLAN_TABLE_OUTPUT                                                                                                                                     
------------------------------------------------------------------------------------------------------------------------------------------------------
                                                                                                                                                      
   1 - filter(ROWNUM<=10)                                                                                                                             
   3 - filter(ROWNUM 

Notice that both methods report the same execution plan; it’s the access predicates that differ, and autotrace reports the index as the access point rather than the hidden column in the table. Historically it’s been dbms_xplan.display_cursor that’s been the ‘truth’, but in this case it may be autotrace reporting the actual predicates in use.

And that’s the truth.

July 28, 2015

It Isn’t The Index

Filed under: Indexes,Performance — dfitzjarrell @ 15:36

 


"If you want sense, you'll have to make it yourself."
- Norton Juster, The Phantom Tollbooth

One of the more confusing statistics in Oracle is, apparently, one called the clustering factor. Associated with an index (as evidenced by the columns in DBA_INDEXES) it’s actually dependent on the table data, more specifically the distance between ‘jumps’ for a given index key. Let’s define a ‘jump’ to be the number of blocks between rows containing the given index key starting with the first block found containing that key. That’s confusing but don’t despair, this will be explained in detail.

Oracle, in its infinite wisdom, computes a clustering factor for an index based upon the key value and the scattering of the rows containing that key across the table data. Looking at a simple example let’s look at a simple non-unique index (indexes for which the clustering factor matters) and the associated table data. The example starts by building the table and loading the data:


SQL> 
SQL> create table clust_fact(
  2  	     id      number,
  3  	     nm      varchar2(40),
  4  	     dt      date,
  5  	     sq      number);

Table created.

SQL> 
SQL> begin
  2  	     for i in 1..1000000 loop
  3  		     insert into clust_fact(id,nm,dt,sq)
  4  		     values(mod(i,77), 'Record '||i, sysdate - mod(i, 77), i);
  5  	     end loop;
  6  
  7  	     commit;
  8  end;
  9  /

PL/SQL procedure successfully completed.

SQL> 

Next, build the index and generate the statistics:


SQL>
SQL> create index clust_fact_idx on clust_fact(id);

Index created.

SQL> 
SQL> exec dbms_stats.gather_table_stats(user, 'CLUST_FACT', cascade=>true);

PL/SQL procedure successfully completed.

SQL> 

Query the clustering factor computed by Oracle:


SQL> select index_name, clustering_factor
  2  from user_indexes
  3  where table_name = 'CLUST_FACT';

INDEX_NAME                     CLUSTERING_FACTOR                                                                                    
------------------------------ -----------------                                                                                    
CLUST_FACT_IDX                            370731                                                                                    

SQL> 

The clustering factor provides a representation of how far Oracle needs to ‘jump’, on average, between blocks containing the index key value. To understand this a bit better let’s look at the code Oracle actually uses to compute the clustering factor and explain what it does:


SQL> 
SQL> --
SQL> -- 1 block limit (Oracle default)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),1) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 

Oracle uses a function named sys_op_countchg, which takes two parameters, the first 15 bytes of the rowid and the number of blocks to traverse to trigger the count to increase; by default Oracle chooses to increment the clustering factor counter every time the block address changes. If three rows, for example, exist in the current block the counter does not increment; if the next three rows are in a different block the counter is incremented by 1 (Oracle doesn’t change blocks for the next two reads after the change). This keeps going througout the table to compute the final clustering factor. Knowing the function used by Oracle and its parameters it is possible to modify the call to accept larger values for the number of block changes before the clustering factor counter is incremented. To allow Oracle to consider 3 blocks ‘close enough’ to not change the counter the query is modified as follows:


SQL> 
SQL> --
SQL> -- 3 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),3) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 

Given how the values were computed for the ID column the clustering factor doesn’t change when we increase the block change count Oracle considers to be within range of the starting block address. Possible values for this second parameter are 1, 3, 5, 9 and 16. Going through the list of possible values it’s obvious with the current data loading the clustering factor won’t improve no matter which value we choose:


SQL> 
SQL> --
SQL> -- 5 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),5) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 
SQL> --
SQL> -- 9 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),9) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),16) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 

Since how the table data is ordered affects the clustering factor the table is reloaded with data ordered using the ID column:


SQL> 
SQL> create table clust_tmp as select * from clust_fact order by id;

Table created.

SQL> 
SQL> truncate table clust_fact;

Table truncated.

SQL> 
SQL> insert into clust_fact select * from clust_tmp;

1000000 rows created.

SQL> 
SQL> commit;

Commit complete.

SQL> 
SQL> exec dbms_stats.gather_table_stats(user, 'CLUST_FACT', cascade=>true);

PL/SQL procedure successfully completed.

SQL> 

Querying the clustering factor reveals it has significantly decreased since the keys are much closer together:


SQL> 
SQL> select index_name, clustering_factor
  2  from user_indexes
  3  where table_name = 'CLUST_FACT';

INDEX_NAME                     CLUSTERING_FACTOR                                                                                    
------------------------------ -----------------                                                                                    
CLUST_FACT_IDX                              4889                                                                                    

SQL> 
SQL> --
SQL> -- 1 block limit (Oracle default)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),1) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4889                                                                                                                          

SQL> 
SQL> --
SQL> -- 3 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),3) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4888                                                                                                                          

SQL> 
SQL> --
SQL> -- 5 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),5) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4888                                                                                                                          

SQL> 
SQL> --
SQL> -- 9 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),9) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4885                                                                                                                          

SQL> 
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),16) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4881                                                                                                                          

SQL> 

It is worth noting that the clustering factor slightly decreases with larger values for the number of blocks Oracle can skip before updating the change count. The changes are not huge, and only differ by 8 from the value Oracle calculates by default to setting the ‘blocks to skip’ parameter to 16, its highest possible value. Since the original load created the SQ values in sequence the table is reloaded with the SQ values in order and a new concatenated index using both the ID and SQ columns is created. This is done to show how the clustering factor behaves for other indexes when it’s ‘optimized’ for one index:


SQL> 
SQL> drop table clust_tmp purge;

Table dropped.

SQL> 
SQL> create table clust_tmp as select * from clust_fact order by sq;

Table created.

SQL> 
SQL> truncate table clust_fact;

Table truncated.

SQL> 
SQL> insert into clust_fact select * from clust_tmp;

1000000 rows created.

SQL> 
SQL> commit;

Commit complete.

SQL> 
SQL> create index clust_fact_idx2 on clust_fact(id, sq);

Index created.

SQL> 
SQL> exec dbms_stats.gather_table_stats(user, 'CLUST_FACT', cascade=>true);

PL/SQL procedure successfully completed.

SQL> 

Querying the clustering factor the value for the original table load again appears and, again, doesn’t change even when the block skip parameter to the sys_op_countchg function is increased through the available values:


SQL> 
SQL> select index_name, clustering_factor
  2  from user_indexes
  3  where table_name = 'CLUST_FACT';

INDEX_NAME                     CLUSTERING_FACTOR                                                                                    
------------------------------ -----------------                                                                                    
CLUST_FACT_IDX2                           370731                                                                                    
CLUST_FACT_IDX                            370731                                                                                    

SQL> 
SQL> --
SQL> -- 1 block limit (Oracle default)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),1) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 
SQL> --
SQL> -- 3 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),3) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 
SQL> --
SQL> -- 5 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),5) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 
SQL> --
SQL> -- 9 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),9) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),16) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 
SQL> --
SQL> -- 1 block limit (Oracle default)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),1) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 
SQL> --
SQL> -- 3 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),3) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 
SQL> --
SQL> -- 5 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),5) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 
SQL> --
SQL> -- 9 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),9) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),16) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 

Ordering the data by ID shows that for both indexes the clustering factor decreases, since the indexes both include the ID column. The presence of the SQ column causes the second index clustering factor to be greater when the default parameters are used; interestingly as the block skip count increases for the second index the clustering factor decreases to that of the single-column index:


SQL> 
SQL> drop table clust_tmp purge;

Table dropped.

SQL> 
SQL> create table clust_tmp as select * from clust_fact order by id;

Table created.

SQL> 
SQL> truncate table clust_fact;

Table truncated.

SQL> 
SQL> insert into clust_fact select * from clust_tmp;

1000000 rows created.

SQL> 
SQL> commit;

Commit complete.

SQL> 
SQL> exec dbms_stats.gather_table_stats(user, 'CLUST_FACT', cascade=>true);

PL/SQL procedure successfully completed.

SQL> 
SQL> select index_name, clustering_factor
  2  from user_indexes
  3  where table_name = 'CLUST_FACT';

INDEX_NAME                     CLUSTERING_FACTOR                                                                                    
------------------------------ -----------------                                                                                    
CLUST_FACT_IDX2                            75994                                                                                    
CLUST_FACT_IDX                              4889                                                                                    

SQL> 
SQL> --
SQL> -- 1 block limit (Oracle default)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),1) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4889                                                                                                                          

SQL> 
SQL> --
SQL> -- 3 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),3) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4889                                                                                                                          

SQL> 
SQL> --
SQL> -- 5 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),5) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4889                                                                                                                          

SQL> 
SQL> --
SQL> -- 9 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),9) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4886                                                                                                                          

SQL> 
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),16) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4882                                                                                                                          

SQL> 
SQL> --
SQL> -- 1 block limit (Oracle default)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),1) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
     75994                                                                                                                          

SQL> 
SQL> --
SQL> -- 3 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),3) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4967                                                                                                                          

SQL> 
SQL> --
SQL> -- 5 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),5) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4861                                                                                                                          

SQL> 
SQL> --
SQL> -- 9 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),9) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4861                                                                                                                          

SQL> 
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),16) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4861                                                                                                                          

SQL> 

For multiple indexes on the same table, where no column is common between them, the clustering factor can skyrocket for some indexes when the table data is ordered to favor one particular index. That example wasn’t run and is left for you to investigate. Let’s do something similar and change the order of the columns in the second index and see how that affects the clustering factor:


SQL%gt; select index_name, clustering_factor
  2  from user_indexes
  3  where table_name = 'CLUST_FACT';

INDEX_NAME                     CLUSTERING_FACTOR                                                                                    
------------------------------ -----------------                                                                                    
CLUST_FACT_IDX2                           999810                                                                                    
CLUST_FACT_IDX                              4889                                                                                    

SQL> 
SQL> --
SQL> -- 1 block limit (Oracle default)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),1) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4889                                                                                                                          

SQL> 
SQL> --
SQL> -- 3 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),3) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4889                                                                                                                          

SQL> 
SQL> --
SQL> -- 5 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),5) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4889                                                                                                                          

SQL> 
SQL> --
SQL> -- 9 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),9) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4886                                                                                                                          

SQL> 
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),16) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4882                                                                                                                          

SQL> 
SQL> --
SQL> -- 1 block limit (Oracle default)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),1) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    999810                                                                                                                          

SQL> 
SQL> --
SQL> -- 3 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),3) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    999810                                                                                                                          

SQL> 
SQL> --
SQL> -- 5 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),5) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    999810                                                                                                                          

SQL> 
SQL> --
SQL> -- 9 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),9) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    999810                                                                                                                          

SQL> 
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),16) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    999810                                                                                                                          

SQL> 

Even though there is a common column between the indexes the column order matters; ordering the data by the key column for the single-column index caused the clustering factor for the second index to be much greater than it was when the leading column for the second index was the ID column.

Although associated with an index, the clustering factor is affected by the table data so no series of index rebuilds will ‘fix’ a ‘bad’ value. The table data needs to be ordered to favor an index (or indexes if they contain the column used to order the table data). This, of course, improves performance for the positively affected indexes and, unfortunately, adversely affects other indexes created against that same table. Care must be taken when deciding whether or not to order the table data; ‘reasonable’ performance metrics must be decided upon before embarking on such a change, and it should be done in test or dev before considering it in production. It may be that overall ‘acceptable’ performance across all of the indexes is better than ‘outstanding’ performance for one index that causes ‘terrible’ performance for all others. It’s not a simple decision and needs to be carefully considered.

If you want to implement the modified clustering factor computed by modifying the query Oracle uses to generate this statistic you will need to manually update the index statistics using the DBMS_STATS package using one of the SET_INDEX_STATS procedures:


SQL> 
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> 
SQL> column clf new_value cl_f
SQL> 
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),16) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4861                                                                                                                          

SQL> exec dbms_stats.set_index_stats(ownname => 'BING', indname => 'CLUST_FACT_IDX2', clstfct => &cl_f);

PL/SQL procedure successfully completed.

SQL> 

Using the new_value functionality in SQL*Plus saves the computed value into a reusable variable to pass to procedures and functions that need it. The above code would set the clustering factor for the CLUST_FACT_IDX2 to 4861, replacing the Oracle-calculated value of 75994. Again, care must be taken when undertaking such a change since you cannot predict the resulting performance change.

Since this post was originally written Oracle has implemented an enhancement to DBMS_STATS to allow the DBA to tell Oracle how many table blocks can expect to be cached. The SET_TABLE_PREFS, SET_SCHEMA_PREFS and SET_DATABASE_PREFS procedures have been modified to accept the TABLE_CACHED_BLOCKS parameter; this tells Oracle how many index blocks will be cached so these can be ignored when incrementing the clustering factor when statistics are gathered. Looking at an example we see:


SQL> create table mytab(
  2  myid    number,
  3  mydata  varchar2(40),
  4  mydate  date)
  5  cache;

Table created.

SQL> 
SQL> begin
  2  	     for i in 1..1000000 loop
  3  		     insert into mytab
  4  		     values(trunc(9999*(dbms_random.value)), 'Line '||i, trunc(sysdate - mod(i, 4637)));
  5  	     end loop;
  6  end;
  7  /

PL/SQL procedure successfully completed.

SQL> 
SQL> commit;

Commit complete.

SQL> 
SQL> create index mytab_idx
  2  on mytab(myid, mydate);

Index created.

SQL> 
SQL> exec dbms_stats.gather_table_stats(ownname=>'BING', tabname=>'MYTAB', cascade=>true, estimate_percent=>null)

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_index_stats(ownname=>user, indname=>'MYTAB_IDX', estimate_percent=> null);

PL/SQL procedure successfully completed.

SQL> 
SQL> select table_name, index_name, clustering_factor
  2  from user_indexes
  3  where table_name = 'MYTAB';

TABLE_NAME                     INDEX_NAME                     CLUSTERING_FACTOR                                                     
------------------------------ ------------------------------ -----------------                                                     
MYTAB                          MYTAB_IDX                                 996238                                                     

SQL> 

For no adjustment to the default TABLE_CACHED_BLOCKS parameter we see a really large clustering factor (and this is due to the random nature of the MYID values generated). Let’s set TABLE_CACHED_BLOCKS to 47 for MYTAB and see what Oracle calculates for the clustering factor:


SQL> exec dbms_stats.set_table_prefs(ownname=>'BING', tabname=>'MYTAB', pname=>'TABLE_CACHED_BLOCKS', pvalue=>47);

PL/SQL procedure successfully completed.

SQL> 
SQL> commit;

Commit complete.

SQL> 
SQL> exec dbms_stats.gather_table_stats(ownname=>'BING', tabname=>'MYTAB', cascade=>true, estimate_percent=>null)

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_index_stats(ownname=>user, indname=>'MYTAB_IDX', estimate_percent=> null);

PL/SQL procedure successfully completed.

SQL> 
SQL> select table_name, index_name, clustering_factor
  2  from user_indexes
  3  where table_name = 'MYTAB';

TABLE_NAME                     INDEX_NAME                     CLUSTERING_FACTOR                                                     
------------------------------ ------------------------------ -----------------                                                     
MYTAB                          MYTAB_IDX                                 987551                                                     

SQL> 

The clustering factor has decreased somewhat (again it’s not a dramatic decrease due to the random nature of the MYID values). Let’s set the parameter to its maximum value (255) and see what Oracle provides as a clustering factor:


SQL> exec dbms_stats.set_table_prefs(ownname=>'BING', tabname=>'MYTAB', pname=>'TABLE_CACHED_BLOCKS', pvalue=>255);

PL/SQL procedure successfully completed.

SQL> 
SQL> commit;

Commit complete.

SQL> 
SQL> exec dbms_stats.gather_table_stats(ownname=>'BING', tabname=>'MYTAB', cascade=>true, estimate_percent=>null)

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_index_stats(ownname=>user, indname=>'MYTAB_IDX', estimate_percent=> null);

PL/SQL procedure successfully completed.

SQL> 
SQL> select table_name, index_name, clustering_factor
  2  from user_indexes
  3  where table_name = 'MYTAB';

TABLE_NAME                     INDEX_NAME                     CLUSTERING_FACTOR                                                     
------------------------------ ------------------------------ -----------------                                                     
MYTAB                          MYTAB_IDX                                 939451                                                     

SQL> 

It’s reduced again, although not dramatically, but it is less than it was originally. Using this new parameter to DBMS_STATS.SET_TABLE_PREFS, DBMS_STATS.SET_SCHEMA_PREFS and DBMS_STATS.SET_DATABASE_PREFS procedures can potentially fix a pessimistic clustering factor and improve index selection and, in turn, improve query performance.

The clustering factor determines which index Oracle will choose for a query against a given table. Changing the clustering factor will influence the CBO calculations sometimes for the better, sometimes for the worse. Plan carefully and test, test, test before second-guessing Oracle.

Which just makes sense.

May 26, 2015

Stop That!

Filed under: General,Performance — dfitzjarrell @ 13:05

"It seems to me that almost everything is a waste of time."
- Norton Juster, The Phantom Tollbooth 

Jonathan Lewis has coined a term and its definition that, I believe, we’ll be using quite a bit:



Stoptimisation - the art of optimisation by not doing the things you don't need to do.

It’s a term we’ve needed for some time now, given the pendulum swing back to the days of yore (meaning pre-9i) with respect to database, application and query tuning. I’m seeing more and more posts in forums and newsgroups asking about tasks that I thought were long since relegated to the ‘gee, that’s really useless’ pile. Let’s look at a few of them and see why you probably shouldn’t be doing them,



Regular Index Rebuilds

Index rebuilds, for the most part, are busy work for a DBA. Yes, they can ‘clean house’ with respect to empty index blocks buried in the index tree but since the index will simply grow again to the size before the rebuild and the empty index blocks will again be scattered across the index tree I see no point in performing regularly scheduled index rebuilds. In fact, for non-partitioned indexes, I see no real reason to rebuild at all since it:



*	Locks the table and index until the rebuild is finished
*	Invalidates cursors using the associated table
*	Serves no useful purpose

Yes, there are times when local indexes on a partitioned table will need to be rebuilt, but those times are usually due to DDL against the partitioned table (dropping partitions, adding/splitting partitions, swapping partitions). The listed DDL actions can make local index partitions unusable, requiring them to be rebuilt. It’s not a ‘size’ thing, it’s not a ‘performance’ thing, it’s because the unusable index interferes with table access and thus production processes. What is troubling is the number of DBAs asking about regularly scheduled index rebuilds based on antiquated ‘criteria’ such as B-tree depth or index size. Even MOS has taken down it’s original document on when to rebuild an index and replaced it with a more responsible version that better addresses the topic. Still, there are DBAs firmly convinced that scheduling index rebuilds once a week, once a month through cron or DBMS_SCHEDULER must certainly improve the performance of queries and production processes. It doesn’t, and it really just inconveniences the end users while the index rebuilds are in progress. Since most application tables have more than one index in place, rebuilding indexes for a given table can take hours, and lock the table for the entire time, essentially shutting down production for the duration of the rebuilds. I can’t see how that improves performance.



Continually 'Tweaking' Database Parameters

There are DBAs who can’t seem to be satisfied with the performance of the database, and constantly hunt for ‘problem’ areas whether or not they actually affect performance. This is commonly referred to as Compulsive Tuning Disorder (henceforth referred to as CTD). CTD creates an endless cycle of ‘tune this, now tune that, now tune something else’, all based on the most minute deviationsa in wait statistics. CTD is one of the biggest time wasters for a DBA; it’s like Ponce De Leon’s search for the Fountain Of Youth. Not every wait statistic needs to be perfectly adjusted, and it’s impossible to do in the real world. It all boils down to what the DBA considers performance versus what the end-users consider performance. In the throes of CTD EVERYTHING is a potential ‘problem’ even if it doesn’t affect the end-user experience. On the end-user side a performance problem exists when processes take longer than they expect. Once the end-user problem has been addressed tuning should stop since further adjustments won’t provide any additional benefit. Wait statistics will never be ideal in a production system; multiple users accessing and modifying data will cause concurrency waits, no matter how small, and it’s useless to address such waits to get the response time perfect. CTD turns the most reliable of DBAs into a micro-manager of epic proportions and the efforts expended to ‘tweak’ parameters that have no bearing on the actual system performance are, honestly, useless.



Table Reorganization

As tables get larger and larger queries accessing data can gradually take longer to complete, notably if the query relies on a full table scan to fetch that data. There are still some DBAs who are convinced that reorganizing the table data will go a long way in improving performance. One of those reorganizations involves sorting the table data to improve the clustering factor. Yes, the clustering factor for an index is calculated based on the ‘location’ of the table data with respect to the sorted index keys, but ‘improving’ the clustering factor for one index usually makes the clustering factor worse for every other index against that table. Another point to make is that these are heap tables, with no real data order established; inserted data can go anywhere there is room in any data block associated with that table. Sorting the data only lasts until the first insert afterwards, after which index keys again become scattered through the table data. Years ago vendors argued that sorting the data was the best way to improve performance, and ‘suggested’ that their product be used on a regular basis to maintain this wonderful sorted order. The problems with that idea are that the data can be sorted for one index built against that table, not all and that the data will gradually return to its unsorted order as the end-users process data. This becomes, like CTD, a never-ending cycle of ‘sort, sort, sort, sort, sort’ all under the misguided notion that it’s making things better. It can’t, since every time the data is sorted the production system is rendered unavailable. If the end-users can’t work it doesn’t matter how fast the queries run AFTER the sort process has run. Unfortunately once the DBA has installed such blinders he or she is unaware of the inconvenience such processes create. The end-users need to work, not wait for some DBA to sort a pile of data only to have it return to being an unsorted pile a day or two down the road. The DBA should be working to create meaningful and lasting changes to improve performance; query tuning, plan stability and index analysis are worthwhile tasks that can produce tangible benefits that last far longer than the tasks it took to create them.

There are probably other time-wasters for the DBA that aren’t listed here; knowing what SHOULD be done versus what CAN be done is the hallmark of a great DBA. Looking again at the term introduced in this post should direct you toward useful tuning tasks that provide long-term benefit and that don’t require constant attention or repeated execution. Nothing is perfect, especially your database, so don’t try to make it so because you’ll only end up on The Endless Tuning Treadmill suffering from Compulsive Tuning Disorder. “Stoptimization” is what the DBA needs to be concerned with, and that’s not doing the things you don’t need to do. There is plenty of REAL work for the DBA to accomplish day-to-day. Don’t get wrapped up in the minutiae; it’s simply not worth the effort.


"For there's always something to do to keep you from what you really should be doing, ..."
- Norton Juster, The Phantom Tollbooth 

Ain’t that the truth.

March 30, 2015

Sound Advice

Filed under: Performance — dfitzjarrell @ 09:33

"There is much worth noticing that often escapes the eye."
- Norton Juster, The Phantom Tollbooth

Oracle has provided views to assist in managing the SGA and PGA settings when you’re not using Automatic Memory Management (AMM) and a view to manage memory_target when you are using AMM. These views were designed to be fairly easy to interpret but it seems that there has been confusion of late in how to actually use the recommendations Oracle is presenting. Let’s fix that.

Starting with 10gR2 Oracle provided both V$SGA_TARGET_ADVICE and V$PGA_TARGET_ADVICE to assist the DBA in sizing the SGA and PGA based on how the database was using memory. The definitions of these views (as of 11.2.0.4) are shown below:


SQL> desc v$sga_target_advice
 Name                                                                     Null?    Type
 ------------------------------------------------------------------------ -------- -------------------------------------------------
 SGA_SIZE                                                                          NUMBER
 SGA_SIZE_FACTOR                                                                   NUMBER
 ESTD_DB_TIME                                                                      NUMBER
 ESTD_DB_TIME_FACTOR                                                               NUMBER
 ESTD_PHYSICAL_READS                                                               NUMBER

SQL> desc v$pga_target_advice
 Name                                                                     Null?    Type
 ------------------------------------------------------------------------ -------- -------------------------------------------------
 PGA_TARGET_FOR_ESTIMATE                                                           NUMBER
 PGA_TARGET_FACTOR                                                                 NUMBER
 ADVICE_STATUS                                                                     VARCHAR2(3)
 BYTES_PROCESSED                                                                   NUMBER
 ESTD_TIME                                                                         NUMBER
 ESTD_EXTRA_BYTES_RW                                                               NUMBER
 ESTD_PGA_CACHE_HIT_PERCENTAGE                                                     NUMBER
 ESTD_OVERALLOC_COUNT                                                              NUMBER

SQL>

Nothing unusual or esoteric here, however it appears that interpreting the provided advice can be confusing. Looking first at the contents of V$SGA_TARGET_ADVICE let’s see what Oracle reports:


SQL> select * From v$sga_target_advice
  2  /

  SGA_SIZE SGA_SIZE_FACTOR ESTD_DB_TIME ESTD_DB_TIME_FACTOR ESTD_PHYSICAL_READS
---------- --------------- ------------ ------------------- -------------------
       520              .5        65793              5.8208             2579079
       780             .75        23911              2.1155             2068005
      1040               1        11303                   1             1134459
      1300            1.25         9162               .8106              970643
      1560             1.5         9098               .8049              970643
      1820            1.75         9058               .8014              970643
      2080               2         9058               .8014              970643

7 rows selected.

SQL>

Based on resource usage since startup this view provides information on SGA memory settings that may improve performance if adopted. The key element in any of these views is the size factor, with a size factor of 1 being the current setting. Looking at the above output it’s possible that physical reads can be reduced a small amount if sga_target is increased from 1040 MB to 1300 MB; any further increase to sga_target will show no improvement. This is proven by the ESTD_PHYSICAL_READS column, the value of which stays constant for an SGA sized at 1300 MB or greater. In addition the ESTD_DB_TIME_FACTOR doesn’t improve much at greater SGA sizes indicating that further increases provide no real benefit. Ideally the goal is to find the setting providing substantial benefit and use it; the problem appears to be ‘where do I stop looking?’ With the above output we stopped looking for improvement at the first entry providing the lowest estimated physical reads. All entries below that provided no additional benefit.

If sga_target and sga_max_size are set diffently (to allow some degree of dynamic SGA management) it may not be necessary to restart the database since sga_target may be able to be set without adjusting sga_max_size. This view may, however, provide an SGA setting that requires both sga_target and sga_max_size be adjusted, and in that case the database will need to be restarted.

In a similar fashion the V$PGA_TARGET_ADVICE view can also be used to ‘fine tune’ overall PGA settings for the database:


SQL> select * From v$pga_target_advice;

PGA_TARGET_FOR_ESTIMATE PGA_TARGET_FACTOR ADV BYTES_PROCESSED  ESTD_TIME ESTD_EXTRA_BYTES_RW ESTD_PGA_CACHE_HIT_PERCENTAGE ESTD_OVERALLOC_COUNT
----------------------- ----------------- --- --------------- ---------- ------------------- ----------------------------- --------------------
               73400320              .125 ON       7.4905E+10   37173880          1395450880                            98                   88
              146800640               .25 ON       7.4905E+10   36716537           456747008                            99                    6
              293601280                .5 ON       7.4905E+10   36643991           307845120                           100                    0
              440401920               .75 ON       7.4905E+10   36494007                   0                           100                    0
              587202560                 1 ON       7.4905E+10   36494007                   0                           100                    0
              704643072               1.2 ON       7.4905E+10   36494007                   0                           100                    0
              822083584               1.4 ON       7.4905E+10   36494007                   0                           100                    0
              939524096               1.6 ON       7.4905E+10   36494007                   0                           100                    0
             1056964608               1.8 ON       7.4905E+10   36494007                   0                           100                    0
             1174405120                 2 ON       7.4905E+10   36494007                   0                           100                    0
             1761607680                 3 ON       7.4905E+10   36494007                   0                           100                    0

PGA_TARGET_FOR_ESTIMATE PGA_TARGET_FACTOR ADV BYTES_PROCESSED  ESTD_TIME ESTD_EXTRA_BYTES_RW ESTD_PGA_CACHE_HIT_PERCENTAGE ESTD_OVERALLOC_COUNT
----------------------- ----------------- --- --------------- ---------- ------------------- ----------------------------- --------------------
             2348810240                 4 ON       7.4905E+10   36494007                   0                           100                    0
             3523215360                 6 ON       7.4905E+10   36494007                   0                           100                    0
             4697620480                 8 ON       7.4905E+10   36494007                   0                           100                    0

14 rows selected.

SQL>

Again it’s the target factor that identifies the current setting; working toward larger target factors may show improvement in the PGA cache hit percentage which means processes will see more optimized (one-pass) reads. Like the V$SGA_TARGET_ADVICE view the method is to find a setting that provides improvement. In this example the PGA setting is already optimal (the cache hit percentage is 100 and 0 extra bytes will need to be read). Not every database will be so lucky; this view is provided to allow the DBA to make as few changes as possible to achieve improved performance and throughput, eliminating the need for trial and error PGA management. In this case the database will need to be restarted after the parameter values have been changed.

To restate the methodology of using these two views:


1.	Find the row referencing the current setting; this will be the row where the size factor is equal to 1
2.	Scan the data for the entry just before the improvement estimates no longer change; this will be the setting
	Oracle has calculated to provide the most benefit
3.	Implement that setting, restarting the database, if necessary, to effect the change

In a similar fashion when using Automatic Memory Management, by setting memory_target and memory_max_target, the V$MEMORY_TARGET_ADVICE view can be used to tune the overall memory settings. Output from this view looks like this:


SQL> select * From v$memory_target_advice;

MEMORY_SIZE MEMORY_SIZE_FACTOR ESTD_DB_TIME ESTD_DB_TIME_FACTOR    VERSION     CON_ID
----------- ------------------ ------------ ------------------- ---------- ----------
        392                .25       143586              1.0063          0          0
        784                 .5       142687                   1          0          0
       1176                .75       142687                   1          0          0
       1568                  1       142687                   1          0          0
       1960               1.25       142687                   1          0          0
       2352                1.5       142687                   1          0          0
       2744               1.75       142687                   1          0          0
       3136                  2       142687                   1          0          0

8 rows selected.

SQL>

The same methodology is used here, and if memory_max_target is larger than the suggested memory_target value the setting can be changed without restarting the database. Like the SGA settings if memory_max_target needs to also be adjusted this will require a database restart. Note that if you’re using Linux and hugepages using AMM is not supported.

Managing database memory settings doesn’t need to be a gruelling task of ‘try this, see if it works, it doesn’t so try another setting and repeat’. Using the provided memory advisors can make the DBA’s job much easier. Hopefully this has made that task easier, freeing the DBA to address more pressing issues.

Sometimes it’s good to take things under advisement.

March 12, 2015

Extra-cost Option?

Filed under: General,Performance — dfitzjarrell @ 09:14

"Just because you have a choice, it doesn't mean that any of them 'has' to be right."
 Norton Juster, The Phantom Tollbooth 

In an Oracle forum that I frequent a user posted that he found cardinality values to be severely inflated when using an analytic function. This user posted an example that will be used here to demonstrate the problem. It will also be used to demonstrate another issue that can make this problem even more confusing, but the latter issue is one that can easily be fixed.

The example script contains the following code, which includes statements not executed by the user posting the problem (creating a new user to run the example from, using an additional statistics gathering method to verify the method wasn’t part of the cause and modifying the dynamic sampling setting), but which have no effect on the results generated:



connect bing/#############

spool wrong_card_ex.log
select * From v$version;

set echo on linesize 132

create table t1
as
select * From all_objects;

exec dbms_stats.gather_table_stats(user, 'T1')

EXPLAIN PLAN FOR
SELECT object_type
     , object_name
FROM (
	  SELECT object_type
	       , object_name
	  FROM   t1
)
WHERE object_name = 'DUAL';
SELECT * FROM TABLE(dbms_xplan.display);

set autotrace on

select object_type, object_name
from
(select object_type, object_name from t1)
where object_name = 'DUAL';

select object_type, object_name, object_rank
from
(select object_type, object_name, row_number() over (partition by object_type order by created) object_rank from t1)
where object_name = 'DUAL';

set autotrace off

connect bing/#############

set echo on linesize 132

alter session set optimizer_dynamic_sampling=11;

EXPLAIN PLAN FOR
SELECT object_type
     , object_name
     , object_rank
FROM (
	  SELECT object_type
	       , object_name
	       , ROW_NUMBER() OVER (PARTITION BY object_type ORDER BY created) object_rank
	  FROM   t1
)
WHERE object_name = 'DUAL';
SELECT * FROM TABLE(dbms_xplan.display);

set autotrace on

select object_type, object_name
from
(select object_type, object_name from t1)
where object_name = 'DUAL';

select object_type, object_name, object_rank
from
(select object_type, object_name, row_number() over (partition by object_type order by created) object_rank from t1)
where object_name = 'DUAL';

set autotrace off

exec dbms_stats.gather_schema_stats(user)

EXPLAIN PLAN FOR
SELECT object_type
     , object_name
FROM (
	  SELECT object_type
	       , object_name
	  FROM   t1
)
WHERE object_name = 'DUAL';
SELECT * FROM TABLE(dbms_xplan.display);

set autotrace on

select object_type, object_name
from
(select object_type, object_name from t1)
where object_name = 'DUAL';

select object_type, object_name, object_rank
from
(select object_type, object_name, row_number() over (partition by object_type order by created) object_rank from t1)
where object_name = 'DUAL';

set autotrace off

connect bing/#############

set echo on linesize 132

alter session set optimizer_dynamic_sampling=11;

EXPLAIN PLAN FOR
SELECT object_type
     , object_name
     , object_rank
FROM (
	  SELECT object_type
	       , object_name
	       , ROW_NUMBER() OVER (PARTITION BY object_type ORDER BY created) object_rank
	  FROM   t1
)
WHERE object_name = 'DUAL';
SELECT * FROM TABLE(dbms_xplan.display);

set autotrace on

select object_type, object_name
from
(select object_type, object_name from t1)
where object_name = 'DUAL';

select object_type, object_name, object_rank
from
(select object_type, object_name, row_number() over (partition by object_type order by created) object_rank from t1)
where object_name = 'DUAL';

set autotrace off

spool off

Nothing unusual there, really, other than the second run of the code minus the table creation. That was done to test whether gathering schema stats over gathering table stats made a difference in the cardinality numbers; it did not. Let’s look at a run of this script in an 11.2.0.4 database and see what cardinalities Oracle generates:


BANNER                                                                                                                              
--------------------------------------------------------------------------------                                                    
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production                                                        
PL/SQL Release 11.2.0.4.0 - Production                                                                                              
CORE	11.2.0.4.0	Production                                                                                                          
TNS for 64-bit Windows: Version 11.2.0.4.0 - Production                                                                             
NLSRTL Version 11.2.0.4.0 - Production                                                                                              

SQL> 
SQL> create table t1
  2  as
  3  select * From all_objects;

Table created.

SQL> 
SQL> exec dbms_stats.gather_table_stats(user, 'T1')

PL/SQL procedure successfully completed.

SQL> 
SQL> EXPLAIN PLAN FOR
  2  SELECT object_type
  3  	  , object_name
  4  FROM (
  5  	       SELECT object_type
  6  		    , object_name
  7  	       FROM   t1
  8  )
  9  WHERE object_name = 'DUAL';

Explained.

SQL> SELECT * FROM TABLE(dbms_xplan.display);

PLAN_TABLE_OUTPUT                                                                                                                   
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3617692013                                                                                                         
                                                                                                                                    
--------------------------------------------------------------------------                                                          
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |                                                          
--------------------------------------------------------------------------                                                          
|   0 | SELECT STATEMENT  |      |     2 |    72 |   274   (1)| 00:00:04 |                                                          
|*  1 |  TABLE ACCESS FULL| T1   |     2 |    72 |   274   (1)| 00:00:04 |                                                          
--------------------------------------------------------------------------                                                          
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 

PLAN_TABLE_OUTPUT                                                                                                                   
------------------------------------------------------------------------------------------------------------------------------------
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 

13 rows selected.

SQL> 
SQL> set autotrace on
SQL> 
SQL> select object_type, object_name
  2  from
  3  (select object_type, object_name from t1)
  4  where object_name = 'DUAL';

OBJECT_TYPE         OBJECT_NAME                                                                                                     
------------------- ------------------------------                                                                                  
TABLE               DUAL                                                                                                            
SYNONYM             DUAL                                                                                                            


Execution Plan
----------------------------------------------------------                                                                          
Plan hash value: 3617692013                                                                                                         
                                                                                                                                    
--------------------------------------------------------------------------                                                          
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |                                                          
--------------------------------------------------------------------------                                                          
|   0 | SELECT STATEMENT  |      |     2 |    72 |   274   (1)| 00:00:04 |                                                          
|*  1 |  TABLE ACCESS FULL| T1   |     2 |    72 |   274   (1)| 00:00:04 |                                                          
--------------------------------------------------------------------------                                                          
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 


Statistics
----------------------------------------------------------                                                                          
          1  recursive calls                                                                                                        
          0  db block gets                                                                                                          
        983  consistent gets                                                                                                        
          0  physical reads                                                                                                         
          0  redo size                                                                                                              
        674  bytes sent via SQL*Net to client                                                                                       
        499  bytes received via SQL*Net from client                                                                                 
          2  SQL*Net roundtrips to/from client                                                                                      
          0  sorts (memory)                                                                                                         
          0  sorts (disk)                                                                                                           
          2  rows processed                                                                                                         

SQL> 
SQL> select object_type, object_name, object_rank
  2  from
  3  (select object_type, object_name, row_number() over (partition by object_type order by created) object_rank from t1)
  4  where object_name = 'DUAL';

OBJECT_TYPE         OBJECT_NAME                    OBJECT_RANK                                                                      
------------------- ------------------------------ -----------                                                                      
SYNONYM             DUAL                                     1                                                                      
TABLE               DUAL                                    43                                                                      


Execution Plan
----------------------------------------------------------                                                                          
Plan hash value: 2273146475                                                                                                         
                                                                                                                                    
------------------------------------------------------------------------------------                                                
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |                                                
------------------------------------------------------------------------------------                                                
|   0 | SELECT STATEMENT    |      | 68395 |  2738K|       |  1042   (1)| 00:00:13 |                                                
|*  1 |  VIEW               |      | 68395 |  2738K|       |  1042   (1)| 00:00:13 |                                                
|   2 |   WINDOW SORT       |      | 68395 |  2938K|  3784K|  1042   (1)| 00:00:13 |                                                
|   3 |    TABLE ACCESS FULL| T1   | 68395 |  2938K|       |   274   (1)| 00:00:04 |                                                
------------------------------------------------------------------------------------                                                
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 


Statistics
----------------------------------------------------------                                                                          
          1  recursive calls                                                                                                        
          0  db block gets                                                                                                          
        982  consistent gets                                                                                                        
          0  physical reads                                                                                                         
          0  redo size                                                                                                              
        753  bytes sent via SQL*Net to client                                                                                       
        499  bytes received via SQL*Net from client                                                                                 
          2  SQL*Net roundtrips to/from client                                                                                      
          1  sorts (memory)                                                                                                         
          0  sorts (disk)                                                                                                           
          2  rows processed                                                                                                         

SQL> 
SQL> set autotrace off
SQL> 
SQL> connect bing/##########
Connected.
SQL> 
SQL> set echo on linesize 132
SQL> 
SQL> alter session set optimizer_dynamic_sampling=11;

Session altered.

SQL> 
SQL> EXPLAIN PLAN FOR
  2  SELECT object_type
  3  	  , object_name
  4  	  , object_rank
  5  FROM (
  6  	       SELECT object_type
  7  		    , object_name
  8  		    , ROW_NUMBER() OVER (PARTITION BY object_type ORDER BY created) object_rank
  9  	       FROM   t1
 10  )
 11  WHERE object_name = 'DUAL';

Explained.

SQL> SELECT * FROM TABLE(dbms_xplan.display);

PLAN_TABLE_OUTPUT                                                                                                                   
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2273146475                                                                                                         
                                                                                                                                    
------------------------------------------------------------------------------------                                                
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |                                                
------------------------------------------------------------------------------------                                                
|   0 | SELECT STATEMENT    |      | 68395 |  2738K|       |  1042   (1)| 00:00:13 |                                                
|*  1 |  VIEW               |      | 68395 |  2738K|       |  1042   (1)| 00:00:13 |                                                
|   2 |   WINDOW SORT       |      | 68395 |  2938K|  3784K|  1042   (1)| 00:00:13 |                                                
|   3 |    TABLE ACCESS FULL| T1   | 68395 |  2938K|       |   274   (1)| 00:00:04 |                                                
------------------------------------------------------------------------------------                                                
                                                                                                                                    

PLAN_TABLE_OUTPUT                                                                                                                   
------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            

19 rows selected.

SQL> 
SQL> set autotrace on
SQL> 
SQL> select object_type, object_name
  2  from
  3  (select object_type, object_name from t1)
  4  where object_name = 'DUAL';

OBJECT_TYPE         OBJECT_NAME                                                                                                     
------------------- ------------------------------                                                                                  
TABLE               DUAL                                                                                                            
SYNONYM             DUAL                                                                                                            


Execution Plan
----------------------------------------------------------                                                                          
Plan hash value: 3617692013                                                                                                         
                                                                                                                                    
--------------------------------------------------------------------------                                                          
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |                                                          
--------------------------------------------------------------------------                                                          
|   0 | SELECT STATEMENT  |      |     2 |    72 |   274   (1)| 00:00:04 |                                                          
|*  1 |  TABLE ACCESS FULL| T1   |     2 |    72 |   274   (1)| 00:00:04 |                                                          
--------------------------------------------------------------------------                                                          
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            


Statistics
----------------------------------------------------------                                                                          
         12  recursive calls                                                                                                        
          0  db block gets                                                                                                          
       1788  consistent gets                                                                                                        
          1  physical reads                                                                                                         
          0  redo size                                                                                                              
        674  bytes sent via SQL*Net to client                                                                                       
        500  bytes received via SQL*Net from client                                                                                 
          2  SQL*Net roundtrips to/from client                                                                                      
          0  sorts (memory)                                                                                                         
          0  sorts (disk)                                                                                                           
          2  rows processed                                                                                                         

SQL> 

So far, so good, as the cardinality accurately reflects the actual number of rows for DUAL in the ALL_OBJECTS view. Let’s add the ROW_NUMBER() analytic function and see what changes:


SQL> select object_type, object_name, object_rank
  2  from
  3  (select object_type, object_name, row_number() over (partition by object_type order by created) object_rank from t1)
  4  where object_name = 'DUAL';

OBJECT_TYPE         OBJECT_NAME                    OBJECT_RANK                                                                      
------------------- ------------------------------ -----------                                                                      
SYNONYM             DUAL                                     1                                                                      
TABLE               DUAL                                    43                                                                      


Execution Plan
----------------------------------------------------------                                                                          
Plan hash value: 2273146475                                                                                                         
                                                                                                                                    
------------------------------------------------------------------------------------                                                
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |                                                
------------------------------------------------------------------------------------                                                
|   0 | SELECT STATEMENT    |      | 68395 |  2738K|       |  1042   (1)| 00:00:13 |                                                
|*  1 |  VIEW               |      | 68395 |  2738K|       |  1042   (1)| 00:00:13 |                                                
|   2 |   WINDOW SORT       |      | 68395 |  2938K|  3784K|  1042   (1)| 00:00:13 |                                                
|   3 |    TABLE ACCESS FULL| T1   | 68395 |  2938K|       |   274   (1)| 00:00:04 |                                                
------------------------------------------------------------------------------------                                                
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            


Statistics
----------------------------------------------------------                                                                          
          7  recursive calls                                                                                                        
          0  db block gets                                                                                                          
        989  consistent gets                                                                                                        
          0  physical reads                                                                                                         
          0  redo size                                                                                                              
        753  bytes sent via SQL*Net to client                                                                                       
        500  bytes received via SQL*Net from client                                                                                 
          2  SQL*Net roundtrips to/from client                                                                                      
          1  sorts (memory)                                                                                                         
          0  sorts (disk)                                                                                                           
          2  rows processed                                                                                                         

SQL> 
SQL> set autotrace off

Interestingly (to put it mildly) the cardinality has jumped to over 30,000 times its original value. Maybe it’s the way the stats were gathered; another run is made, this time gathering stats for the entire schema (fortunately it contains only this one table):


SQL> 
SQL> exec dbms_stats.gather_schema_stats(user)

PL/SQL procedure successfully completed.

SQL> 
SQL> EXPLAIN PLAN FOR
  2  SELECT object_type
  3  	  , object_name
  4  FROM (
  5  	       SELECT object_type
  6  		    , object_name
  7  	       FROM   t1
  8  )
  9  WHERE object_name = 'DUAL';

Explained.

SQL> SELECT * FROM TABLE(dbms_xplan.display);

PLAN_TABLE_OUTPUT                                                                                                                   
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3617692013                                                                                                         
                                                                                                                                    
--------------------------------------------------------------------------                                                          
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |                                                          
--------------------------------------------------------------------------                                                          
|   0 | SELECT STATEMENT  |      |     2 |    72 |   274   (1)| 00:00:04 |                                                          
|*  1 |  TABLE ACCESS FULL| T1   |     2 |    72 |   274   (1)| 00:00:04 |                                                          
--------------------------------------------------------------------------                                                          
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 

PLAN_TABLE_OUTPUT                                                                                                                   
------------------------------------------------------------------------------------------------------------------------------------
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            

17 rows selected.

SQL> 
SQL> set autotrace on
SQL> 
SQL> select object_type, object_name
  2  from
  3  (select object_type, object_name from t1)
  4  where object_name = 'DUAL';

OBJECT_TYPE         OBJECT_NAME                                                                                                     
------------------- ------------------------------                                                                                  
TABLE               DUAL                                                                                                            
SYNONYM             DUAL                                                                                                            


Execution Plan
----------------------------------------------------------                                                                          
Plan hash value: 3617692013                                                                                                         
                                                                                                                                    
--------------------------------------------------------------------------                                                          
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |                                                          
--------------------------------------------------------------------------                                                          
|   0 | SELECT STATEMENT  |      |     2 |    72 |   274   (1)| 00:00:04 |                                                          
|*  1 |  TABLE ACCESS FULL| T1   |     2 |    72 |   274   (1)| 00:00:04 |                                                          
--------------------------------------------------------------------------                                                          
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            


Statistics
----------------------------------------------------------                                                                          
          0  recursive calls                                                                                                        
          0  db block gets                                                                                                          
        983  consistent gets                                                                                                        
          0  physical reads                                                                                                         
          0  redo size                                                                                                              
        674  bytes sent via SQL*Net to client                                                                                       
        500  bytes received via SQL*Net from client                                                                                 
          2  SQL*Net roundtrips to/from client                                                                                      
          0  sorts (memory)                                                                                                         
          0  sorts (disk)                                                                                                           
          2  rows processed                                                                                                         

SQL> 
SQL> select object_type, object_name, object_rank
  2  from
  3  (select object_type, object_name, row_number() over (partition by object_type order by created) object_rank from t1)
  4  where object_name = 'DUAL';

OBJECT_TYPE         OBJECT_NAME                    OBJECT_RANK                                                                      
------------------- ------------------------------ -----------                                                                      
SYNONYM             DUAL                                     1                                                                      
TABLE               DUAL                                    43                                                                      


Execution Plan
----------------------------------------------------------                                                                          
Plan hash value: 2273146475                                                                                                         
                                                                                                                                    
------------------------------------------------------------------------------------                                                
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |                                                
------------------------------------------------------------------------------------                                                
|   0 | SELECT STATEMENT    |      | 68395 |  2738K|       |  1042   (1)| 00:00:13 |                                                
|*  1 |  VIEW               |      | 68395 |  2738K|       |  1042   (1)| 00:00:13 |                                                
|   2 |   WINDOW SORT       |      | 68395 |  2938K|  3784K|  1042   (1)| 00:00:13 |                                                
|   3 |    TABLE ACCESS FULL| T1   | 68395 |  2938K|       |   274   (1)| 00:00:04 |                                                
------------------------------------------------------------------------------------                                                
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            


Statistics
----------------------------------------------------------                                                                          
          0  recursive calls                                                                                                        
          0  db block gets                                                                                                          
        982  consistent gets                                                                                                        
          0  physical reads                                                                                                         
          0  redo size                                                                                                              
        753  bytes sent via SQL*Net to client                                                                                       
        500  bytes received via SQL*Net from client                                                                                 
          2  SQL*Net roundtrips to/from client                                                                                      
          1  sorts (memory)                                                                                                         
          0  sorts (disk)                                                                                                           
          2  rows processed                                                                                                         

SQL> 
SQL> set autotrace off
SQL> 
SQL> connect bing/##########
Connected.
SQL> 
SQL> set echo on linesize 132
SQL> 
SQL> alter session set optimizer_dynamic_sampling=11;

Session altered.

SQL> 
SQL> EXPLAIN PLAN FOR
  2  SELECT object_type
  3  	  , object_name
  4  	  , object_rank
  5  FROM (
  6  	       SELECT object_type
  7  		    , object_name
  8  		    , ROW_NUMBER() OVER (PARTITION BY object_type ORDER BY created) object_rank
  9  	       FROM   t1
 10  )
 11  WHERE object_name = 'DUAL';

Explained.

SQL> SELECT * FROM TABLE(dbms_xplan.display);

PLAN_TABLE_OUTPUT                                                                                                                   
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2273146475                                                                                                         
                                                                                                                                    
------------------------------------------------------------------------------------                                                
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |                                                
------------------------------------------------------------------------------------                                                
|   0 | SELECT STATEMENT    |      | 68395 |  2738K|       |  1042   (1)| 00:00:13 |                                                
|*  1 |  VIEW               |      | 68395 |  2738K|       |  1042   (1)| 00:00:13 |                                                
|   2 |   WINDOW SORT       |      | 68395 |  2938K|  3784K|  1042   (1)| 00:00:13 |                                                
|   3 |    TABLE ACCESS FULL| T1   | 68395 |  2938K|       |   274   (1)| 00:00:04 |                                                
------------------------------------------------------------------------------------                                                
                                                                                                                                    

PLAN_TABLE_OUTPUT                                                                                                                   
------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            

19 rows selected.

SQL> 
SQL> set autotrace on
SQL> 
SQL> select object_type, object_name
  2  from
  3  (select object_type, object_name from t1)
  4  where object_name = 'DUAL';

OBJECT_TYPE         OBJECT_NAME                                                                                                     
------------------- ------------------------------                                                                                  
TABLE               DUAL                                                                                                            
SYNONYM             DUAL                                                                                                            


Execution Plan
----------------------------------------------------------                                                                          
Plan hash value: 3617692013                                                                                                         
                                                                                                                                    
--------------------------------------------------------------------------                                                          
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |                                                          
--------------------------------------------------------------------------                                                          
|   0 | SELECT STATEMENT  |      |     2 |    72 |   274   (1)| 00:00:04 |                                                          
|*  1 |  TABLE ACCESS FULL| T1   |     2 |    72 |   274   (1)| 00:00:04 |                                                          
--------------------------------------------------------------------------                                                          
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            


Statistics
----------------------------------------------------------                                                                          
          0  recursive calls                                                                                                        
          0  db block gets                                                                                                          
        983  consistent gets                                                                                                        
          0  physical reads                                                                                                         
          0  redo size                                                                                                              
        674  bytes sent via SQL*Net to client                                                                                       
        499  bytes received via SQL*Net from client                                                                                 
          2  SQL*Net roundtrips to/from client                                                                                      
          0  sorts (memory)                                                                                                         
          0  sorts (disk)                                                                                                           
          2  rows processed                                                                                                         

SQL> 
SQL> select object_type, object_name, object_rank
  2  from
  3  (select object_type, object_name, row_number() over (partition by object_type order by created) object_rank from t1)
  4  where object_name = 'DUAL';

OBJECT_TYPE         OBJECT_NAME                    OBJECT_RANK                                                                      
------------------- ------------------------------ -----------                                                                      
SYNONYM             DUAL                                     1                                                                      
TABLE               DUAL                                    43                                                                      


Execution Plan
----------------------------------------------------------                                                                          
Plan hash value: 2273146475                                                                                                         
                                                                                                                                    
------------------------------------------------------------------------------------                                                
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |                                                
------------------------------------------------------------------------------------                                                
|   0 | SELECT STATEMENT    |      | 68395 |  2738K|       |  1042   (1)| 00:00:13 |                                                
|*  1 |  VIEW               |      | 68395 |  2738K|       |  1042   (1)| 00:00:13 |                                                
|   2 |   WINDOW SORT       |      | 68395 |  2938K|  3784K|  1042   (1)| 00:00:13 |                                                
|   3 |    TABLE ACCESS FULL| T1   | 68395 |  2938K|       |   274   (1)| 00:00:04 |                                                
------------------------------------------------------------------------------------                                                
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            


Statistics
----------------------------------------------------------                                                                          
          0  recursive calls                                                                                                        
          0  db block gets                                                                                                          
        982  consistent gets                                                                                                        
          0  physical reads                                                                                                         
          0  redo size                                                                                                              
        753  bytes sent via SQL*Net to client                                                                                       
        499  bytes received via SQL*Net from client                                                                                 
          2  SQL*Net roundtrips to/from client                                                                                      
          1  sorts (memory)                                                                                                         
          0  sorts (disk)                                                                                                           
          2  rows processed                                                                                                         

SQL> 
SQL> set autotrace off
SQL> 
SQL> spool off

In both attempts (gathering table stats and gathering schema stats) the dynamic sampling setting was changed from 2 (the default for this database) to 11, and both autotrace and dbms_xplan were used to generate the execution plans. To clear out the environment set by autotrace the user was logged in again before running any SQL statements. No change to the results was noticed (meaning that the generated plan didn’t change using one method over the other) so autotrace had no ‘adverse’ affect on the outcome. Oracle has recently proclaimed this as a bug:


Bug 20591362 - WRONG CARDINALITY WHEN ANALYTICS FUNCTION IS USED

which can be found on MOS. In a nutshell the CBO ignores the selectivity of the WHERE clause when an analytic fuunction is introduced into a query and, as a result, computes the incorrect cardinality.

On a side note if your 11.2.0.x installation does not include Oracle JVM you can run afoul of even stranger numbers:


BANNER                                                                                                                              
--------------------------------------------------------------------------------                                                    
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production                                                        
PL/SQL Release 11.2.0.4.0 - Production                                                                                              
CORE	11.2.0.4.0	Production                                                                                                          
TNS for 64-bit Windows: Version 11.2.0.4.0 - Production                                                                             
NLSRTL Version 11.2.0.4.0 - Production                                                                                              

SQL> 
SQL> create table t1
  2  as
  3  select * From all_objects;

Table created.

SQL> 
SQL> exec dbms_stats.gather_table_stats(user, 'T1')

PL/SQL procedure successfully completed.

SQL> 
SQL> EXPLAIN PLAN FOR
  2  SELECT object_type
  3  	  , object_name
  4  FROM (
  5  	       SELECT object_type
  6  		    , object_name
  7  	       FROM   t1
  8  )
  9  WHERE object_name = 'DUAL';

Explained.

SQL> SELECT * FROM TABLE(dbms_xplan.display);

PLAN_TABLE_OUTPUT                                                                                                                   
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3617692013                                                                                                         
                                                                                                                                    
--------------------------------------------------------------------------                                                          
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |                                                          
--------------------------------------------------------------------------                                                          
|   0 | SELECT STATEMENT  |      |     1 |    26 |    56   (0)| 00:00:01 |                                                          
|*  1 |  TABLE ACCESS FULL| T1   |     1 |    26 |    56   (0)| 00:00:01 |                                                          
--------------------------------------------------------------------------                                                          
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 

PLAN_TABLE_OUTPUT                                                                                                                   
------------------------------------------------------------------------------------------------------------------------------------
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 

13 rows selected.

SQL> 
SQL> set autotrace on
SQL> 
SQL> select object_type, object_name
  2  from
  3  (select object_type, object_name from t1)
  4  where object_name = 'DUAL';

OBJECT_TYPE         OBJECT_NAME                                                                                                     
------------------- ------------------------------                                                                                  
TABLE               DUAL                                                                                                            
SYNONYM             DUAL                                                                                                            


Execution Plan
----------------------------------------------------------                                                                          
Plan hash value: 3617692013                                                                                                         
                                                                                                                                    
--------------------------------------------------------------------------                                                          
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |                                                          
--------------------------------------------------------------------------                                                          
|   0 | SELECT STATEMENT  |      |     1 |    26 |    56   (0)| 00:00:01 |                                                          
|*  1 |  TABLE ACCESS FULL| T1   |     1 |    26 |    56   (0)| 00:00:01 |                                                          
--------------------------------------------------------------------------                                                          
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 


Statistics
----------------------------------------------------------                                                                          
          1  recursive calls                                                                                                        
          0  db block gets                                                                                                          
        191  consistent gets                                                                                                        
          0  physical reads                                                                                                         
          0  redo size                                                                                                              
        674  bytes sent via SQL*Net to client                                                                                       
        500  bytes received via SQL*Net from client                                                                                 
          2  SQL*Net roundtrips to/from client                                                                                      
          0  sorts (memory)                                                                                                         
          0  sorts (disk)                                                                                                           
          2  rows processed                                                                                                         

SQL> 
SQL> select object_type, object_name, object_rank
  2  from
  3  (select object_type, object_name, row_number() over (partition by object_type order by created) object_rank from t1)
  4  where object_name = 'DUAL';

OBJECT_TYPE         OBJECT_NAME                    OBJECT_RANK                                                                      
------------------- ------------------------------ -----------                                                                      
SYNONYM             DUAL                                     1                                                                      
TABLE               DUAL                                    43                                                                      


Execution Plan
----------------------------------------------------------                                                                          
Plan hash value: 2273146475                                                                                                         
                                                                                                                                    
------------------------------------------------------------------------------------                                                
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |                                                
------------------------------------------------------------------------------------                                                
|   0 | SELECT STATEMENT    |      | 14281 |   571K|       |   190   (2)| 00:00:03 |                                                
|*  1 |  VIEW               |      | 14281 |   571K|       |   190   (2)| 00:00:03 |                                                
|   2 |   WINDOW SORT       |      | 14281 |   474K|   632K|   190   (2)| 00:00:03 |                                                
|   3 |    TABLE ACCESS FULL| T1   | 14281 |   474K|       |    56   (0)| 00:00:01 |                                                
------------------------------------------------------------------------------------                                                
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 


Statistics
----------------------------------------------------------                                                                          
          1  recursive calls                                                                                                        
          0  db block gets                                                                                                          
        190  consistent gets                                                                                                        
          0  physical reads                                                                                                         
          0  redo size                                                                                                              
        753  bytes sent via SQL*Net to client                                                                                       
        500  bytes received via SQL*Net from client                                                                                 
          2  SQL*Net roundtrips to/from client                                                                                      
          1  sorts (memory)                                                                                                         
          0  sorts (disk)                                                                                                           
          2  rows processed                                                                                                         

SQL> 
SQL> set autotrace off
SQL> 
SQL> connect bing/##########
Connected.
SQL> 
SQL> set echo on linesize 132
SQL> 
SQL> alter session set optimizer_dynamic_sampling=11;

Session altered.

SQL> 
SQL> EXPLAIN PLAN FOR
  2  SELECT object_type
  3  	  , object_name
  4  	  , object_rank
  5  FROM (
  6  	       SELECT object_type
  7  		    , object_name
  8  		    , ROW_NUMBER() OVER (PARTITION BY object_type ORDER BY created) object_rank
  9  	       FROM   t1
 10  )
 11  WHERE object_name = 'DUAL';

Explained.

SQL> SELECT * FROM TABLE(dbms_xplan.display);

PLAN_TABLE_OUTPUT                                                                                                                   
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2273146475                                                                                                         
                                                                                                                                    
------------------------------------------------------------------------------------                                                
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |                                                
------------------------------------------------------------------------------------                                                
|   0 | SELECT STATEMENT    |      | 14281 |   571K|       |   190   (2)| 00:00:03 |                                                
|*  1 |  VIEW               |      | 14281 |   571K|       |   190   (2)| 00:00:03 |                                                
|   2 |   WINDOW SORT       |      | 14281 |   474K|   632K|   190   (2)| 00:00:03 |                                                
|   3 |    TABLE ACCESS FULL| T1   | 14281 |   474K|       |    56   (0)| 00:00:01 |                                                
------------------------------------------------------------------------------------                                                
                                                                                                                                    

PLAN_TABLE_OUTPUT                                                                                                                   
------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            

19 rows selected.

SQL> 
SQL> set autotrace on
SQL> 
SQL> select object_type, object_name
  2  from
  3  (select object_type, object_name from t1)
  4  where object_name = 'DUAL';

OBJECT_TYPE         OBJECT_NAME                                                                                                     
------------------- ------------------------------                                                                                  
TABLE               DUAL                                                                                                            
SYNONYM             DUAL                                                                                                            


Execution Plan
----------------------------------------------------------                                                                          
Plan hash value: 3617692013                                                                                                         
                                                                                                                                    
--------------------------------------------------------------------------                                                          
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |                                                          
--------------------------------------------------------------------------                                                          
|   0 | SELECT STATEMENT  |      |     1 |    26 |    56   (0)| 00:00:01 |                                                          
|*  1 |  TABLE ACCESS FULL| T1   |     1 |    26 |    56   (0)| 00:00:01 |                                                          
--------------------------------------------------------------------------                                                          
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            


Statistics
----------------------------------------------------------                                                                          
         11  recursive calls                                                                                                        
          0  db block gets                                                                                                          
        385  consistent gets                                                                                                        
          0  physical reads                                                                                                         
          0  redo size                                                                                                              
        674  bytes sent via SQL*Net to client                                                                                       
        499  bytes received via SQL*Net from client                                                                                 
          2  SQL*Net roundtrips to/from client                                                                                      
          0  sorts (memory)                                                                                                         
          0  sorts (disk)                                                                                                           
          2  rows processed                                                                                                         

SQL> 
SQL> select object_type, object_name, object_rank
  2  from
  3  (select object_type, object_name, row_number() over (partition by object_type order by created) object_rank from t1)
  4  where object_name = 'DUAL';

OBJECT_TYPE         OBJECT_NAME                    OBJECT_RANK                                                                      
------------------- ------------------------------ -----------                                                                      
SYNONYM             DUAL                                     1                                                                      
TABLE               DUAL                                    43                                                                      


Execution Plan
----------------------------------------------------------                                                                          
Plan hash value: 2273146475                                                                                                         
                                                                                                                                    
------------------------------------------------------------------------------------                                                
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |                                                
------------------------------------------------------------------------------------                                                
|   0 | SELECT STATEMENT    |      | 14281 |   571K|       |   190   (2)| 00:00:03 |                                                
|*  1 |  VIEW               |      | 14281 |   571K|       |   190   (2)| 00:00:03 |                                                
|   2 |   WINDOW SORT       |      | 14281 |   474K|   632K|   190   (2)| 00:00:03 |                                                
|   3 |    TABLE ACCESS FULL| T1   | 14281 |   474K|       |    56   (0)| 00:00:01 |                                                
------------------------------------------------------------------------------------                                                
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            


Statistics
----------------------------------------------------------                                                                          
          7  recursive calls                                                                                                        
          0  db block gets                                                                                                          
        196  consistent gets                                                                                                        
          0  physical reads                                                                                                         
          0  redo size                                                                                                              
        753  bytes sent via SQL*Net to client                                                                                       
        499  bytes received via SQL*Net from client                                                                                 
          2  SQL*Net roundtrips to/from client                                                                                      
          1  sorts (memory)                                                                                                         
          0  sorts (disk)                                                                                                           
          2  rows processed                                                                                                         

SQL> 
SQL> set autotrace off
SQL> 
SQL> exec dbms_stats.gather_schema_stats(user)

PL/SQL procedure successfully completed.

SQL> 
SQL> EXPLAIN PLAN FOR
  2  SELECT object_type
  3  	  , object_name
  4  FROM (
  5  	       SELECT object_type
  6  		    , object_name
  7  	       FROM   t1
  8  )
  9  WHERE object_name = 'DUAL';

Explained.

SQL> SELECT * FROM TABLE(dbms_xplan.display);

PLAN_TABLE_OUTPUT                                                                                                                   
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3617692013                                                                                                         
                                                                                                                                    
--------------------------------------------------------------------------                                                          
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |                                                          
--------------------------------------------------------------------------                                                          
|   0 | SELECT STATEMENT  |      |     1 |    26 |    56   (0)| 00:00:01 |                                                          
|*  1 |  TABLE ACCESS FULL| T1   |     1 |    26 |    56   (0)| 00:00:01 |                                                          
--------------------------------------------------------------------------                                                          
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 

PLAN_TABLE_OUTPUT                                                                                                                   
------------------------------------------------------------------------------------------------------------------------------------
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            

17 rows selected.

SQL> 
SQL> set autotrace on
SQL> 
SQL> select object_type, object_name
  2  from
  3  (select object_type, object_name from t1)
  4  where object_name = 'DUAL';

OBJECT_TYPE         OBJECT_NAME                                                                                                     
------------------- ------------------------------                                                                                  
TABLE               DUAL                                                                                                            
SYNONYM             DUAL                                                                                                            


Execution Plan
----------------------------------------------------------                                                                          
Plan hash value: 3617692013                                                                                                         
                                                                                                                                    
--------------------------------------------------------------------------                                                          
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |                                                          
--------------------------------------------------------------------------                                                          
|   0 | SELECT STATEMENT  |      |     1 |    26 |    56   (0)| 00:00:01 |                                                          
|*  1 |  TABLE ACCESS FULL| T1   |     1 |    26 |    56   (0)| 00:00:01 |                                                          
--------------------------------------------------------------------------                                                          
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            


Statistics
----------------------------------------------------------                                                                          
          0  recursive calls                                                                                                        
          0  db block gets                                                                                                          
        191  consistent gets                                                                                                        
          0  physical reads                                                                                                         
          0  redo size                                                                                                              
        674  bytes sent via SQL*Net to client                                                                                       
        499  bytes received via SQL*Net from client                                                                                 
          2  SQL*Net roundtrips to/from client                                                                                      
          0  sorts (memory)                                                                                                         
          0  sorts (disk)                                                                                                           
          2  rows processed                                                                                                         

SQL> 
SQL> select object_type, object_name, object_rank
  2  from
  3  (select object_type, object_name, row_number() over (partition by object_type order by created) object_rank from t1)
  4  where object_name = 'DUAL';

OBJECT_TYPE         OBJECT_NAME                    OBJECT_RANK                                                                      
------------------- ------------------------------ -----------                                                                      
SYNONYM             DUAL                                     1                                                                      
TABLE               DUAL                                    43                                                                      


Execution Plan
----------------------------------------------------------                                                                          
Plan hash value: 2273146475                                                                                                         
                                                                                                                                    
------------------------------------------------------------------------------------                                                
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |                                                
------------------------------------------------------------------------------------                                                
|   0 | SELECT STATEMENT    |      | 14281 |   571K|       |   190   (2)| 00:00:03 |                                                
|*  1 |  VIEW               |      | 14281 |   571K|       |   190   (2)| 00:00:03 |                                                
|   2 |   WINDOW SORT       |      | 14281 |   474K|   632K|   190   (2)| 00:00:03 |                                                
|   3 |    TABLE ACCESS FULL| T1   | 14281 |   474K|       |    56   (0)| 00:00:01 |                                                
------------------------------------------------------------------------------------                                                
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            


Statistics
----------------------------------------------------------                                                                          
          0  recursive calls                                                                                                        
          0  db block gets                                                                                                          
        190  consistent gets                                                                                                        
          0  physical reads                                                                                                         
          0  redo size                                                                                                              
        753  bytes sent via SQL*Net to client                                                                                       
        499  bytes received via SQL*Net from client                                                                                 
          2  SQL*Net roundtrips to/from client                                                                                      
          1  sorts (memory)                                                                                                         
          0  sorts (disk)                                                                                                           
          2  rows processed                                                                                                         

SQL> 
SQL> set autotrace off
SQL> 
SQL> connect bing/##########
Connected.
SQL> 
SQL> set echo on linesize 132
SQL> 
SQL> alter session set optimizer_dynamic_sampling=11;

Session altered.

SQL> 
SQL> EXPLAIN PLAN FOR
  2  SELECT object_type
  3  	  , object_name
  4  	  , object_rank
  5  FROM (
  6  	       SELECT object_type
  7  		    , object_name
  8  		    , ROW_NUMBER() OVER (PARTITION BY object_type ORDER BY created) object_rank
  9  	       FROM   t1
 10  )
 11  WHERE object_name = 'DUAL';

Explained.

SQL> SELECT * FROM TABLE(dbms_xplan.display);

PLAN_TABLE_OUTPUT                                                                                                                   
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2273146475                                                                                                         
                                                                                                                                    
------------------------------------------------------------------------------------                                                
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |                                                
------------------------------------------------------------------------------------                                                
|   0 | SELECT STATEMENT    |      | 14281 |   571K|       |   190   (2)| 00:00:03 |                                                
|*  1 |  VIEW               |      | 14281 |   571K|       |   190   (2)| 00:00:03 |                                                
|   2 |   WINDOW SORT       |      | 14281 |   474K|   632K|   190   (2)| 00:00:03 |                                                
|   3 |    TABLE ACCESS FULL| T1   | 14281 |   474K|       |    56   (0)| 00:00:01 |                                                
------------------------------------------------------------------------------------                                                
                                                                                                                                    

PLAN_TABLE_OUTPUT                                                                                                                   
------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            

19 rows selected.

SQL> 
SQL> set autotrace on
SQL> 
SQL> select object_type, object_name
  2  from
  3  (select object_type, object_name from t1)
  4  where object_name = 'DUAL';

OBJECT_TYPE         OBJECT_NAME                                                                                                     
------------------- ------------------------------                                                                                  
TABLE               DUAL                                                                                                            
SYNONYM             DUAL                                                                                                            


Execution Plan
----------------------------------------------------------                                                                          
Plan hash value: 3617692013                                                                                                         
                                                                                                                                    
--------------------------------------------------------------------------                                                          
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |                                                          
--------------------------------------------------------------------------                                                          
|   0 | SELECT STATEMENT  |      |     1 |    26 |    56   (0)| 00:00:01 |                                                          
|*  1 |  TABLE ACCESS FULL| T1   |     1 |    26 |    56   (0)| 00:00:01 |                                                          
--------------------------------------------------------------------------                                                          
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            


Statistics
----------------------------------------------------------                                                                          
          0  recursive calls                                                                                                        
          0  db block gets                                                                                                          
        191  consistent gets                                                                                                        
          0  physical reads                                                                                                         
          0  redo size                                                                                                              
        674  bytes sent via SQL*Net to client                                                                                       
        500  bytes received via SQL*Net from client                                                                                 
          2  SQL*Net roundtrips to/from client                                                                                      
          0  sorts (memory)                                                                                                         
          0  sorts (disk)                                                                                                           
          2  rows processed                                                                                                         

SQL> 
SQL> select object_type, object_name, object_rank
  2  from
  3  (select object_type, object_name, row_number() over (partition by object_type order by created) object_rank from t1)
  4  where object_name = 'DUAL';

OBJECT_TYPE         OBJECT_NAME                    OBJECT_RANK                                                                      
------------------- ------------------------------ -----------                                                                      
SYNONYM             DUAL                                     1                                                                      
TABLE               DUAL                                    43                                                                      


Execution Plan
----------------------------------------------------------                                                                          
Plan hash value: 2273146475                                                                                                         
                                                                                                                                    
------------------------------------------------------------------------------------                                                
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |                                                
------------------------------------------------------------------------------------                                                
|   0 | SELECT STATEMENT    |      | 14281 |   571K|       |   190   (2)| 00:00:03 |                                                
|*  1 |  VIEW               |      | 14281 |   571K|       |   190   (2)| 00:00:03 |                                                
|   2 |   WINDOW SORT       |      | 14281 |   474K|   632K|   190   (2)| 00:00:03 |                                                
|   3 |    TABLE ACCESS FULL| T1   | 14281 |   474K|       |    56   (0)| 00:00:01 |                                                
------------------------------------------------------------------------------------                                                
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            


Statistics
----------------------------------------------------------                                                                          
          0  recursive calls                                                                                                        
          0  db block gets                                                                                                          
        190  consistent gets                                                                                                        
          0  physical reads                                                                                                         
          0  redo size                                                                                                              
        753  bytes sent via SQL*Net to client                                                                                       
        500  bytes received via SQL*Net from client                                                                                 
          2  SQL*Net roundtrips to/from client                                                                                      
          1  sorts (memory)                                                                                                         
          0  sorts (disk)                                                                                                           
          2  rows processed                                                                                                         

SQL> 
SQL> set autotrace off
SQL> 
SQL> spool off

Notice that the number of rows reported in the execution plans for the non-analytic query are half of what they are supposed to be, and the inflated number of rows is roughly one-fourth of the previously reported value. In the absence of the JVM the data dictionary is ‘short’ around 50,000 rows which can skew the selectivity down to around 1 for the example query. Installing the JVM, using dbca, fixes that issue.

Until this bug is fixed by a one-off patch pay close attention to the row estimates produced for queries using analytic functions; at the least this little ‘inconvenience’ could make troubleshooting performance issues difficult. You may need to eliminate the analytic function to reveal a better estimate of the number of rows returned so you’re not chasing an issue that doesn’t really exist.

Numbers never lie, until they do.

February 23, 2015

Merge, Right?

Filed under: General,Performance — dfitzjarrell @ 15:33

"There are no wrong roads to anywhere."
 - Norton Juster, The Phantom Tollbooth 

Oracle can do some, well, strange things when fixing issues caused by the optimizer. For 10g releases up to 10.2.0.x Oracle chose to silently ignore a message and eliminate an outer join on the MERGE statement under certain conditions. Let’s examine this and see exactly what Oracle implements.

Occasionally in a 10046 trace file generated from 10g releases prior to 10.2.0.x and procesed by the tkprof utility you may see something similar to this:


SELECT 1 FROM DUAL WHERE <some condition here>
 
Error encountered: ORA-00904

where <some condition here> is the ‘ON’ condition from a MERGE statement executed by that session. That seems odd, I know, and Oracle silently ignores that ORA-00904 condition as it is never displayed when executing the MERGE. Why would Oracle do this? Looking at the 10053 trace file generated from the following example:


SQL> merge into t1_merge t1
  2          using (select 1 merged_key, lpad('y', 21, 'y') xtra  from dual) t2
  3          on (t1.merge_key = t2.merged_key)
  4  when matched then update set t1.xtra_stuff = t2.xtra
  5  when not matched then insert (merge_key, more_txt, xtra_stuff) values (t2.merged_key, t2.merged_key, t2.xtra);

1 row merged.

SQL>

we see the following:


PARSE ERROR #5:len=36 dep=1 uid=60 oct=3 lid=60 tim=1266258229533447 err=904
SELECT 1 FROM DUAL WHERE t1.merge_key = t2.merged_key
CLOSE #5:c=0,e=4,dep=1,type=0,tim=1266258229533578
Registered qb: MRG$1 0x84379c (PARSER)
---------------------
QUERY BLOCK SIGNATURE
---------------------
...
====================== END SQL Statement Dump ======================
=====================
PARSING IN CURSOR #1 len=302 dep=0 uid=1158 oct=189 lid=1158 tim=33189168479100 hv=1190985157 ad='1ec71da8'
merge into t1_merge t1
        using (select 1 merged_key, lpad('y', 21, 'y') xtra  from dual) t2
        on (t1.merge_key = t2.merged_key)
when matched then update set t1.xtra_stuff = t2.xtra
when not matched then insert (merge_key, more_txt, xtra_stuff) values (t2.merged_key, t2.merged_key, t2.xtra)
END OF STMT
PARSE #1:c=40000,e=51569,p=4,cr=29,cu=0,mis=1,r=0,dep=0,og=2,tim=33189168479094
BINDS #1:
EXEC #1:c=10000,e=558,p=0,cr=7,cu=3,mis=0,r=1,dep=0,og=2,tim=33189168479778
WAIT #1: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=2135111 tim=33189168479862
WAIT #1: nam='SQL*Net message from client' ela= 419 driver id=1650815232 #bytes=1 p3=0 obj#=2135111 tim=33189168480324
STAT #1 id=1 cnt=2 pid=0 pos=1 obj=0 op='MERGE  T1_MERGE (cr=7 pr=0 pw=0 time=425 us)'
STAT #1 id=2 cnt=1 pid=1 pos=1 obj=0 op='VIEW  (cr=7 pr=0 pw=0 time=178 us)'
STAT #1 id=3 cnt=1 pid=2 pos=1 obj=0 op='NESTED LOOPS OUTER (cr=7 pr=0 pw=0 time=175 us)'
STAT #1 id=4 cnt=1 pid=3 pos=1 obj=0 op='FAST DUAL  (cr=0 pr=0 pw=0 time=1 us)'
STAT #1 id=5 cnt=1 pid=3 pos=2 obj=2135111 op='TABLE ACCESS FULL T1_MERGE (cr=7 pr=0 pw=0 time=170 us)'

Notice there is no outer join in the plan steps. Additionally if we look at the outline data we see:


  /*+
    BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('10.2.0.4')
      FIRST_ROWS(10)
      OUTLINE_LEAF(@"SEL$5428C7F1")
      MERGE(@"SEL$2")
      MERGE(@"SEL$3")
      OUTLINE_LEAF(@"SEL$4")
      OUTLINE_LEAF(@"MRG$1")
      OUTLINE(@"SEL$1")
      OUTLINE(@"SEL$2")
      NO_ACCESS(@"MRG$1" "from$_subquery$_008"@"MRG$1")
      NO_ACCESS(@"MRG$1" "T2"@"MRG$1")
      FULL(@"MRG$1" "T1"@"MRG$1")
      LEADING(@"MRG$1" "from$_subquery$_008"@"MRG$1" "T2"@"MRG$1" "T1"@"MRG$1")
      USE_MERGE_CARTESIAN(@"MRG$1" "T2"@"MRG$1")
      USE_MERGE_CARTESIAN(@"MRG$1" "T1"@"MRG$1")
    END_OUTLINE_DATA
  */

The outline data in these cases shows the outer join isn’t used as Oracle re-writes the query into this:


merge into t1
using (select 1 merged_key, lpad('y', 21, 'y') xtra from dual) t2
on (1 = 0)
when matched then update set t1.xtra = t2.xtra
when not matched then insert (merge_key, more_txt, xtra_stuff) values (t2.merged_key, t2.merged_key, t2.xtra);

Inerestingly the reason for this behavior is available in the online documentationL


To insert all of the source rows into the table, you can use a constant filter predicate in the ON clause condition. An example of a 
constant filter predicate is ON (0=1). Oracle Database recognizes such a predicate and makes an unconditional insert of all source 
rows into the table. This approach is different from omitting the merge_update_clause. In that case, the database still must perform 
a join. With constant filter predicate, no join is performed.

This appears to be fixed in 10.2.0.x and later releases as the ORA-00904 error does not occur in either of the trace files. MOS document id 244055.1 lists this modification as an ‘enhancement’ in 10g. The document also lists the bug that requires this work-around is fixed in 10.1.0.1.0. Even though 10g in all its forms is an old release there still may be sites using it.

In 11.2.0.4 this same construct generates no ORA-00904 error, nor does the ‘dummy’ query appear. Notice that the join is restored as evidenced by the execution plan:


----- Current SQL Statement for this session (sql_id=5zv0cdsdp0bf4) -----
merge into t1_merge t1
	using (select 1 merged_key, lpad('y', 21, 'y') xtra  from dual) t2
	on (t1.merge_key = t2.merged_key)
when matched then update set t1.xtra_stuff = t2.xtra
when not matched then insert (merge_key, more_txt, xtra_stuff) values (t2.merged_key, t2.merged_key, t2.xtra)
sql_text_length=289
sql=merge into t1_merge t1
	using (select 1 merged_key, lpad('y', 21, 'y') xtra  from dual) t2
	on (t1.merge_key = t2.merged_key)
when matched then update set t1.xtra_stuff = t2.xtra
when not matched then insert (merge_key, more_txt, xtra_stuff) values (t2.mer
sql=ged_key, t2.merged_key, t2.xtra)
----- Explain Plan Dump -----
----- Plan Table -----
 
============
Plan Table
============
----------------------------------------+-----------------------------------+
| Id  | Operation             | Name    | Rows  | Bytes | Cost  | Time      |
----------------------------------------+-----------------------------------+
| 0   | MERGE STATEMENT       |         |       |       |     6 |           |
| 1   |  MERGE                | T1_MERGE|       |       |       |           |
| 2   |   VIEW                |         |       |       |       |           |
| 3   |    NESTED LOOPS OUTER |         |     1 |    31 |     6 |  00:00:01 |
| 4   |     TABLE ACCESS FULL | DUAL    |     1 |     2 |     2 |  00:00:01 |
| 5   |     TABLE ACCESS FULL | T1_MERGE|     1 |    29 |     4 |  00:00:01 |
----------------------------------------+-----------------------------------+
Predicate Information:
----------------------
5 - filter("T1"."MERGE_KEY"=CASE  WHEN (ROWID IS NOT NULL) THEN 1 ELSE 1 END )

The outline data where this is fixed shows the join is used:


  /*+
    BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('10.2.0.4')
      DB_VERSION('11.2.0.4')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$5428C7F1")
      MERGE(@"SEL$2")
      MERGE(@"SEL$3")
      OUTLINE_LEAF(@"SEL$4")
      OUTLINE_LEAF(@"MRG$1")
      OUTLINE(@"SEL$1")
      OUTLINE(@"SEL$2")
      OUTLINE(@"SEL$3")
      NO_ACCESS(@"MRG$1" "from$_subquery$_008"@"MRG$1")
      NO_ACCESS(@"MRG$1" "T2"@"MRG$1")
      FULL(@"MRG$1" "T1"@"MRG$1")
      LEADING(@"MRG$1" "from$_subquery$_008"@"MRG$1" "T2"@"MRG$1" "T1"@"MRG$1")
      USE_MERGE_CARTESIAN(@"MRG$1" "T2"@"MRG$1")
      USE_MERGE_CARTESIAN(@"MRG$1" "T1"@"MRG$1")
      PQ_DISTRIBUTE(@"MRG$1" "T2"@"MRG$1" NONE NONE)
      PQ_DISTRIBUTE(@"MRG$1" "T1"@"MRG$1" NONE NONE)
      FULL(@"SEL$5428C7F1" "DUAL"@"SEL$3")
      FULL(@"SEL$5428C7F1" "T1"@"SEL$2")
      LEADING(@"SEL$5428C7F1" "DUAL"@"SEL$3" "T1"@"SEL$2")
      USE_NL(@"SEL$5428C7F1" "T1"@"SEL$2")
    END_OUTLINE_DATA
  */

Oracle has, apparently, seen the ‘error of its ways’ with this ‘enhancement’ and re-thought how MERGE should be executed. Of course you may be stuck running an older version of 10g and could be affected by this ‘enhancement’. Even migrating to 10.2.0.4 can eliminate this strange query from your MERGE implementations. Know that this isn’t really a performance problem, but it can be disconcerting to see such essentially useless query text in trace files you may generate. Knowing where it comes from hopefully will make it easier to assess performance issues requiring 10046 and 10053 trace files.

It really helps to pay close attention when merging.

February 3, 2015

How Can I Compress Thee

Filed under: Exadata,General,Performance — dfitzjarrell @ 10:26

“You can swim all day in the Sea of Knowledge and not get wet.” 
― Norton Juster, The Phantom Tollbooth 

In previous posts compression options have been discussed, and now it’s time to see how Oracle performs basic compression. It isn’t really compression, it’e de-duplication, but it does result in space savings for data that won’t be modified after it’s ‘compressed’. Let’s look at how Oracle saves space with your data.

Oracle de-duplicates the data by finding common strings, tokenizing them and using the token identifier in the string to reduce the row length. So, what does that mean? Looking at an example might help; a table is built and populated as follows:


--
-- Create and populate the table
--
create table comptst(
	tstcol1	varchar2(4),
	tstcol2 varchar2(6),
	tstcol3	number(8,2),
	tstcol4	varchar2(10));

insert into comptst
values ('ZZXZ', 'bbddff', 199.44, 'PENDING');

insert into comptst
values ('ZZXZ', 'ghijkl', 43.08, 'PENDING');

insert into comptst
values ('ZZXZ', 'bbddff', 881.02, 'PENDING');

insert into comptst
values ('ZZXZ', 'bbddff', 54.97, 'PENDING');

commit;

insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;

commit;
				--
-- Compress the table with BASIC compression
--
alter table comptst compress basic;
alter table comptst move;

[The table was compressed after the data was inserted which required two steps, the first to set the compression level and the second, a table move in place, to actually compress the data. Had the table been built as compressed and direct path inserts used the data would have been compressed without further action.] Since the initial 4-row insert was re-inserted multiple times there is a lot of duplication in the data, and since Oracle de-duplicates rows to produce the effect of compression there should be a lot of data in a block dump indicating this. There is, and the first piece of that data is the following line:


  perm_9ir2[4]={ 0 2 3 1 }

Oracle builds a token table for each data block; this provides a reference for each data string that occurs more than once in the block. Additionally Oacle can re-arrange the column values in that token table so that multiple column values can be turned into a single token and, thus, a single reference. The line shown above indicates what column values map to the table positions in the token table for this block; in this case column 0 maps to the data in table column 0, column 1 maps to the data in table column 2, column 2 maps to data column 3 and column 3 maps to data column 1. Let’s look at the unique data that was inserted:


('ZZXZ', 'bbddff', 199.44, 'PENDING');
('ZZXZ', 'ghijkl', 43.08, 'PENDING');
('ZZXZ', 'bbddff', 881.02, 'PENDING');
('ZZXZ', 'bbddff', 54.97, 'PENDING');

Since these data rows are duplicated in each block every column is a potential compression token. Two values occur in every row, ‘ZZXZ’ and ‘PENDING’, so it should be expected that tokens for those values will be found in each of the compressed data rows. As mentioned previously Oracle builds a token table in each block so there are two tables in this block, the first, starting at offset 0, is the token table that has 7 rows and the second, starting at offset 7, is the actual table data and there are 721 rows:


0x24:pti[0]	nrow=7		offs=0
0x28:pti[1]	nrow=721	offs=7

Oracle has a clue with this implementation of compression and can create a token that includes a data value and a token, from the same token table, to reduce that row length even further. The examples provided here won’t be demonstrating that but know that it is possible. Now let’s look at the first row in this block for the data table:


tab 1, row 0, @0x1f31
tl: 5 fb: --H-FL-- lb: 0x0  cc: 4
col  0: [ 4]  5a 5a 58 5a
col  1: [ 7]  50 45 4e 44 49 4e 47
col  2: [ 6]  62 62 64 64 66 66
col  3: [ 3]  c1 37 62
bindmp: 2c 00 01 04 02

The actual column lengths are supplied between the square brackets for each column; the total length should be the sum of those values plus 7 bytes, 4 of those for the column lengths, one for the lock byte, one for the flag byte and one for the column count. Using that information the total length should be 24 bytes; the block dump provides a different total length of 5, as reported by the tl entry. There is a line at the end of the row dump labeled bindmp (a binary dump of the row contents) revealing the actual contents of those 5 bytes. As expected there is the lock byte (0x2c), the number of columns at this location (0x01) and two bytes representing the token, reporting that 4 columns are in this token and that the reference row in the token table is row 2. So, let’s look at table 0, row 2:


tab 0, row 2, @0x1f5c
tl: 10 fb: --H-FL-- lb: 0x0  cc: 4
col  0: [ 4]  5a 5a 58 5a
col  1: [ 7]  50 45 4e 44 49 4e 47
col  2: [ 6]  62 62 64 64 66 66
col  3: [ 3]  c1 37 62
bindmp: 00 b3 04 04 05 06 cb c1 37 62

It looks almost like the data row, but the total token length is 10 bytes. Looking at the bindmp the first two bytes indicate this token is used 179 times in this block, the third byte indicates that 4 columns are in this token, the two bytes after that report that the first two columns are also tokens, 0x04 and 0x05. Going back to the token table we see that those tokens are:


tab 0, row 4, @0x1f66
tl: 7 fb: --H-FL-- lb: 0x0  cc: 1
col  0: [ 4]  5a 5a 58 5a
bindmp: 00 04 cc 5a 5a 58 5a
tab 0, row 5, @0x1f76
tl: 10 fb: --H-FL-- lb: 0x0  cc: 1
col  0: [ 7]  50 45 4e 44 49 4e 47
bindmp: 00 04 cf 50 45 4e 44 49 4e 47

These are single-column tokens, and each is used 4 times in this block. This is how Oracle reduced the row length from 24 bytes to 5 to save block space. Working through the block dump it’s now possible to re-construct the 24 bytes of data the row originally contained even though it now is only 5 bytes in length.


"The way you see things depends a great deal on where you look at them from."
 - Norton Juster, The Phantom Tollbooth 

We see that Oracle doesn’t actually compress data, it replaces duplicate values with tokens and, through those tokens, reconstructs the data at query time by using the row directory and the actual row pieces in each block. Depending on the select list some tokens won’t be accessed if that data isn’t required. Of course all of this re-constructing can be expensive at the CPU level, and for full table scans of large tables performance can be an issue, especially with the “cache buffers chains” latch because Oracle is performing fewer “consistent gets – examination”. This is because Oracle has to pin blocks for a longer period due to the reconstruction. On the plus side the number of physical reads can decrease since the data is in a smaller number of blocks and can stay in the cache longer. Using basic compression is a trade-off between size and performance, and for extremely large tables or in cases where the compression savings are quite large (meaning the data is compressed more) queries may become CPU-intensive rather than I/O intensive. The good and the bad need to be weighed carefully when making the decision to use compression; choose wisely. Space is relatively inexpensive when compared to end-user satisfaction. The DBA’s idea of performance and the end-users ideas of performance use different criteria, and it’s really the end-users idea that should take precendence.

Anyone up for a swim?

September 8, 2014

Blooms

Filed under: Exadata,General,Performance — dfitzjarrell @ 09:41

In Oracle releases 10.2.0.x and later join processing can be made more efficient by the use of Bloom filters, primarily to reduce traffic between parallel query slaves. What is a Bloom filter? Named after Burton Howard Bloom, who came up with the concept in the 1970s, it’s an efficient data structure used to quickly determine if an element has a high probability of being a member of a given set. It’s based on a bit array that allows for rapid searches and returns one of two results: either the element is probably in the set (which can produce false positives) or the element is definitely not in the set. The filter cannot produce false negatives, and the incidence of false positives is relatively rare. Another advantage to Bloom filters is their small size relative to other data structures used for similar purposes (self-balancing binary search trees, hash tables, or linked lists). The possibility of false positives necessitates the addition of another filter to eliminate them from the results, yet such a filter doesn’t add appreciably to the process time and, therefore, goes relatively unnoticed.

When Bloom filters are used for a query on an Exadata system, the filter predicate and the storage predicate will list the SYS_OP_Bloom_FILTER function as being called. This function includes the additional filter to eliminate any false positives that could be returned. It’s the storage predicate that provides the real power of the Bloom filter on Exadata. Using a Bloom filter to pre-join the tables at the storage server level reduces the volume of data the database servers need to process and can significantly reduce the execution time of the given query.

An example of Bloom filters in action follows; a script named Bloom_fltr_ex.sql was executed to generate this output on an Exadata system; the script can be emailed on request:


SQL> --
SQL> -- Create sample tables
SQL> --
SQL> -- Create them parallel, necessary 
SQL> -- to get a Smart Scan on these tables
SQL> --
SQL> create table emp(
  2  	empid   number,
  3  	empnmvarchar2(40),
  4  	empsal  number,
  5  	empssn  varchar2(12),
  6  	constraint emp_pk primary key (empid)
  7  ) parallel 4;

Table created.

SQL>
SQL> create table emp_dept(
  2  	empid   number,
  3  	empdept number,
  4  	emploc  varchar2(60),
  5    constraint emp_dept_pk primary key(empid)
  6  ) parallel 4;

Table created.

SQL>
SQL> create table dept_info(
  2  	deptnum number,
  3  	deptnm  varchar2(25),
  4    constraint dept_info_pk primary key(deptnum)
  5  ) parallel 4;

Table created.

SQL>
SQL> --
SQL> -- Load sample tables with data
SQL> --
SQL> begin
  2  	     for i in 1..2000000 loop
  3  		     insert into emp
  4  		     values(i, 'Fnarm'||i, (mod(i, 7)+1)*1000, mod(i,10)||mod(i,10)||mod(i,10)||'-'||mod(i,10)||mod(i,10)||'-'||mod(i,10)||mod(i,10)||mod(i,10)||mod(i,10));
  5  		     insert into emp_dept
  6  		values(i, (mod(i,8)+1)*10, 'Zanzwalla'||(mod(i,8)+1)*10);
  7  		     commit;
  8  	     end loop;
  9  	     insert into dept_info
 10  	     select distinct empdept, case when empdept = 10 then 'SALES'
 11  					   when empdept = 20 then 'PROCUREMENT'
 12  					   when empdept = 30 then 'HR'
 13  					   when empdept = 40 then 'RESEARCH'
 14  					   when empdept = 50 then 'DEVELOPMENT'
 15  					   when empdept = 60 then 'EMPLOYEE RELATIONS'
 16  					   when empdept = 70 then 'FACILITIES'
 17  					   when empdept = 80 then 'FINANCE' end
 18  	     from emp_dept;
 19  
 20  end;
 21  /

PL/SQL procedure successfully completed.

SQL>
SQL> --
SQL> -- Run join query using Bloom filter
SQL> --
SQL> -- Generate execution plan to prove Bloom
SQL> -- filter usage
SQL> --
SQL> -- Also report query execution time
SQL> --
SQL> set autotrace on
SQL> set timing on
SQL>
SQL> select /*+ Bloom join 2 parallel 2 use_hash(empemp_dept) */ e.empid, e.empnm, d.deptnm, e.empsal, count(*) emps
  2  fromemp e join emp_depted on (ed.empid = e.empid) join dept_info d on (ed.empdept = d.deptnum)
  3  whereed.empdept = 20
  4  group by e.empid, e.empnm, d.deptnm, e.empsal;

     EMPID EMPNM                  DEPTNM                        EMPSAL       EMPS
---------- ---------------------- ------------------------- ---------- ----------
    904505 Fnarm904505            PROCUREMENT                     1000          1
    907769 Fnarm907769            PROCUREMENT                     3000          1
    909241 Fnarm909241            PROCUREMENT                     5000          1
    909505 Fnarm909505            PROCUREMENT                     3000          1
    909641 Fnarm909641            PROCUREMENT                     6000          1
    910145 Fnarm910145            PROCUREMENT                     6000          1
...
    155833 Fnarm155833            PROCUREMENT                     7000          1
    155905 Fnarm155905            PROCUREMENT                     2000          1
    151081 Fnarm151081            PROCUREMENT                     1000          1
    151145 Fnarm151145            PROCUREMENT                     2000          1

250000 rows selected.

Elapsed: 00:00:14.27

Execution Plan
----------------------------------------------------------
Plan hash value: 2313925751

------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name         | Rows | Bytes | Cost (%CPU)| Time     |   TQ  |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |              |  218K|    21M|  1378   (1)| 00:00:01 |       |      |            |
|   1 |  PX COORDINATOR                          |              |      |       |            |          |       |      |            |
|   2 |   PX SEND QC (RANDOM)                    | :TQ10003     |  218K|    21M|  1378   (1)| 00:00:01 | Q1,03 | P->S | QC (RAND)  |
|   3 |    HASH GROUP BY                         |              |  218K|    21M|  1378   (1)| 00:00:01 | Q1,03 | PCWP |            |
|   4 |     PX RECEIVE                           |              |  218K|    21M|  1378   (1)| 00:00:01 | Q1,03 | PCWP |            |
|   5 |      PX SEND HASH                        | :TQ10002     |  218K|    21M|  1378   (1)| 00:00:01 | Q1,02 | P->P | HASH       |
|   6 |       HASH GROUP BY                      |              |  218K|    21M|  1378   (1)| 00:00:01 | Q1,02 | PCWP |            |
|*  7 |        HASH JOIN                         |              |  218K|    21M|  1376   (1)| 00:00:01 | Q1,02 | PCWP |            |
|   8 |         PX RECEIVE                       |              |  218K|    11M|   535   (1)| 00:00:01 | Q1,02 | PCWP |            |
|   9 |          PX SEND BROADCAST               | :TQ10001     |  218K|    11M|   535   (1)| 00:00:01 | Q1,01 | P->P | BROADCAST  |
|  10 |           NESTED LOOPS                   |              |  218K|    11M|   535   (1)| 00:00:01 | Q1,01 | PCWP |            |
|  11 |            BUFFER SORT                   |              |      |       |            |          | Q1,01 | PCWC |            |
|  12 |             PX RECEIVE                   |              |      |       |            |          | Q1,01 | PCWP |            |
|  13 |              PX SEND BROADCAST           | :TQ10000     |      |       |            |          |       | S->P | BROADCAST  |
|  14 |               TABLE ACCESS BY INDEX ROWID| DEPT_INFO    |    1 |    27 |     1   (0)| 00:00:01 |       |      |            |
|* 15 |                INDEX UNIQUE SCAN         | DEPT_INFO_PK |    1 |       |     1   (0)| 00:00:01 |       |      |            |
|  16 |            PX BLOCK ITERATOR             |              |  218K|  5556K|   534   (1)| 00:00:01 | Q1,01 | PCWC |            |
|* 17 |             TABLE ACCESS STORAGE FULL    | EMP_DEPT     |  218K|  5556K|   534   (1)| 00:00:01 | Q1,01 | PCWP |            |
|  18 |         PX BLOCK ITERATOR                |              | 1657K|    75M|   839   (1)| 00:00:01 | Q1,02 | PCWC |            |
|* 19 |          TABLE ACCESS STORAGE FULL       | EMP          | 1657K|    75M|   839   (1)| 00:00:01 | Q1,02 | PCWP |            |
------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   7 - access("ED"."EMPID"="E"."EMPID")
  15 - access("D"."DEPTNUM"=20)
  17 - storage("ED"."EMPDEPT"=20)
       filter("ED"."EMPDEPT"=20)
  19 - storage(SYS_OP_Bloom_FILTER(:BF0000,"E"."EMPID"))
       filter(SYS_OP_Bloom_FILTER(:BF0000,"E"."EMPID"))

Note
-----
   - dynamic sampling used for this statement (level=2)


Statistics
----------------------------------------------------------
     60  recursive calls
    174  db block gets
  40753  consistent gets
  17710  physical reads
   2128  redo size
9437983  bytes sent via SQL*Net to client
 183850  bytes received via SQL*Net from client
  16668  SQL*Net roundtrips to/from client
      6  sorts (memory)
      0  sorts (disk)
 250000  rows processed

SQL>

In less than 15 seconds 250,000 rows were returned from a three table join of over 4 million rows; the Bloom filter made a dramatic difference in how this query was processed and provided exceptional performance given the volume of data queried.

Without the storage level execution of the Bloom filter, the query execution time increased by 2.33 seconds, a 16.3 percent increase. For longer execution times, this difference can be significant. It isn’t the Bloom filter that gives Exadata such power with joins, it’s the fact that Exadata can execute it not only at the database level but also at the storage level, something commodity hardware configurations can’t do.

Of course Bloom filters are not exclusive to Exadata, as the following run of the same script on a non-Exadata system shows (only the abbreviated output section is reproduced here):


SQL> select /*+ Bloom join 2 parallel 2 use_hash(emp emp_dept) */ e.empid, e.empnm, d.deptnm, e.empsal	  --, count(*) emps
  2  from emp e join emp_dept ed on (ed.empid = e.empid) join dept_info d on (ed.empdept = d.deptnum)
  3  where ed.empdept = 20;

     EMPID EMPNM                                    DEPTNM                        EMPSAL                                                              
---------- ---------------------------------------- ------------------------- ----------                                                              
   1670281 Fnarm1670281                             PROCUREMENT                     5000                                                              
   1670289 Fnarm1670289                             PROCUREMENT                     6000                                                              
   1670297 Fnarm1670297                             PROCUREMENT                     7000                                                              
   1670305 Fnarm1670305                             PROCUREMENT                     1000                                                              
   1670313 Fnarm1670313                             PROCUREMENT                     2000                                                              
   1670321 Fnarm1670321                             PROCUREMENT                     3000                                                              
   1670329 Fnarm1670329                             PROCUREMENT                     4000                                                              
   1670337 Fnarm1670337                             PROCUREMENT                     5000                                                              
   1670345 Fnarm1670345                             PROCUREMENT                     6000                                                              
   1670353 Fnarm1670353                             PROCUREMENT                     7000                                                              
   1670361 Fnarm1670361                             PROCUREMENT                     1000                                                              
...
   1857369 Fnarm1857369                             PROCUREMENT                     4000                                                              
   1857377 Fnarm1857377                             PROCUREMENT                     5000                                                              
   1857385 Fnarm1857385                             PROCUREMENT                     6000                                                              

250000 rows selected.

Elapsed: 00:00:54.86

Execution Plan
----------------------------------------------------------                                                                                            
Plan hash value: 2643012915                                                                                                                           
                                                                                                                                                      
----------------------------------------------------------------------------------------------------------------------------------                    
| Id  | Operation                            | Name         | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |                    
----------------------------------------------------------------------------------------------------------------------------------                    
|   0 | SELECT STATEMENT                     |              |   273K|    26M|  1378   (1)| 00:00:01 |        |      |            |                    
|   1 |  PX COORDINATOR                      |              |       |       |            |          |        |      |            |                    
|   2 |   PX SEND QC (RANDOM)                | :TQ10002     |   273K|    26M|  1378   (1)| 00:00:01 |  Q1,02 | P->S | QC (RAND)  |                    
|*  3 |    HASH JOIN                         |              |   273K|    26M|  1378   (1)| 00:00:01 |  Q1,02 | PCWP |            |                    
|   4 |     PX RECEIVE                       |              |   273K|    13M|   536   (1)| 00:00:01 |  Q1,02 | PCWP |            |                    
|   5 |      PX SEND BROADCAST               | :TQ10001     |   273K|    13M|   536   (1)| 00:00:01 |  Q1,01 | P->P | BROADCAST  |                    
|   6 |       NESTED LOOPS                   |              |   273K|    13M|   536   (1)| 00:00:01 |  Q1,01 | PCWP |            |                    
|   7 |        BUFFER SORT                   |              |       |       |            |          |  Q1,01 | PCWC |            |                    
|   8 |         PX RECEIVE                   |              |       |       |            |          |  Q1,01 | PCWP |            |                    
|   9 |          PX SEND BROADCAST           | :TQ10000     |       |       |            |          |        | S->P | BROADCAST  |                    
|  10 |           TABLE ACCESS BY INDEX ROWID| DEPT_INFO    |     1 |    27 |     1   (0)| 00:00:01 |        |      |            |                    
|* 11 |            INDEX UNIQUE SCAN         | DEPT_INFO_PK |     1 |       |     1   (0)| 00:00:01 |        |      |            |                    
|  12 |        PX BLOCK ITERATOR             |              |   273K|  6947K|   535   (1)| 00:00:01 |  Q1,01 | PCWC |            |                    
|* 13 |         TABLE ACCESS FULL            | EMP_DEPT     |   273K|  6947K|   535   (1)| 00:00:01 |  Q1,01 | PCWP |            |                    
|  14 |     PX BLOCK ITERATOR                |              |  2099K|    96M|   840   (1)| 00:00:01 |  Q1,02 | PCWC |            |                    
|* 15 |      TABLE ACCESS FULL               | EMP          |  2099K|    96M|   840   (1)| 00:00:01 |  Q1,02 | PCWP |            |                    
----------------------------------------------------------------------------------------------------------------------------------                    
                                                                                                                                                      
Predicate Information (identified by operation id):                                                                                                   
---------------------------------------------------                                                                                                   
                                                                                                                                                      
   3 - access("ED"."EMPID"="E"."EMPID")                                                                                                               
  11 - access("D"."DEPTNUM"=20)                                                                                                                       
  13 - filter("ED"."EMPDEPT"=20)                                                                                                                      
  15 - filter(SYS_OP_Bloom_FILTER(:BF0000,"E"."EMPID"))                                                                                               
                                                                                                                                                      
Note                                                                                                                                                  
-----                                                                                                                                                 
   - dynamic sampling used for this statement (level=2)                                                                                               


Statistics
----------------------------------------------------------                                                                                            
         33  recursive calls                                                                                                                          
        139  db block gets                                                                                                                            
      36224  consistent gets                                                                                                                          
      17657  physical reads                                                                                                                           
          0  redo size                                                                                                                                
    9526012  bytes sent via SQL*Net to client                                                                                                         
     183846  bytes received via SQL*Net from client                                                                                                   
      16668  SQL*Net roundtrips to/from client                                                                                                        
          5  sorts (memory)                                                                                                                           
          0  sorts (disk)                                                                                                                             
     250000  rows processed                                                                                                                           

SQL>

The only differences between the two plans involve the storage component of Exadata. Notice also the vast difference in elapsed times between Exadata and non-Exadata systems; Exadata ran the query returning 250,000 rows in just over 14 seconds and the non-Exadata system returned the same volume of data in just under 55 seconds. Both times are substantial improvements over not using a Bloom filter so any hardware running 11.2.0.2 or later will benefit from Bloom filters.

Of course there is a ‘gotcha’ with Bloom filters, and Oracle recognizes this as a bug in releases older than 12.1.0.2. Hash left outer joins, hash right outer joins and hash full outer joins do not use a Bloom Filter and for those interested the unpublished bug number is 17716301. MOS document 1919508.1 reports details on this behavior including an example. As reported in that document the patch can be backported to earlier releases when an SR is filed requesting it.

Bloom filters are a welcome addition to both Exadata and non-Exadata installations as they can improve query performance with parallel queries, as evidenced here. Now that you know how to spot when they are used you may find they are utilized more often than you expected.

It’s a bloomin’ miracle.

July 22, 2014

That’s … Huge!

Filed under: Exadata,General,Performance — dfitzjarrell @ 14:27

Recently I’ve noticed the occasional thread in Oracle newsgroups and lists asking about hugepages support in Linux, including ‘best practices’ for hugepages configuration. This information is out on that ‘world-wide web’ in various places; I’d rather put a lot of that information in this post to provide an easier way to get to it. I’ll cover what hugepages are, what they do, what they can’t do and how best to allocate them for your particular installation. Let’s get started.

“Normal” memory pages in Linux are 4 KB in size and are allocated as needed where the memory map will allow so they are likely not contiguous. Hugepages, in comparison, are 2 MB pages locked in memory and are allocated in a contiguous ‘chunk’; these are allocated at boot time using a parameter in the /etc/sysctl.conf file named vm.nr_hugepages for RedHat Linux kernel 2.6 and a parameter named vm.hugetlb_pool for RedHat Linux kernel 2.4. You must remember that each page is 2 MB is size, as this affects how many hugepages you need to allocate to cover all of the SGAs of running Oracle databases. Set this too high and your system might not boot; set this too low and you won’t get the benefit of using hugepages in one or more Oracle instances. [When use_large_pages is set to ONLY and insufficient hugepages are available the database won’t start; changing use_large_pages to TRUE allows the database to start but it won’t be using hugepages, which results in more paging activity, lowering performance.] Since hugepages are contiguous it stands to reason that performance will improve since Oracle won’t need to access memory segments across the entire range of installed memory. Hugepages also reduce the size of ‘page tables’ by reducing the overhead for page tables for each connection to the database. Page tables are part of the Linux memory management, created and used to map a virtual address to a physical memory location. For SGAs 100 MB and greater using hugepages is probably a good idea, and that idea gets better as the SGA size increases. Kevin Closson reported back in 2009 on the page table overhead for an Oracle database with an 8000 MB SGA running 500 simultaneous connections. The page table overhead reported was startling — almost 6.8 GB of memory used for page tables when hugepages were not configured. He ran the same test again, on the same Oracle database, only that test was run after hugepages were allocated and used by the Oracle database. The page table overhead was reduced to slightly less than 23 MB. That is a dramatic change in memory usage, one that can definitely improve performance.

Hugepages are available for IPC (Inter-Process Communication) shared memory; this is the ‘standard’ shared memory model (starting with UNIX System V) allowing multiple processes to access the same shared memory segment. There is also another form of shared memory segment, the memory-mapped file, and currently such segments are not supported by hugepages. Oracle, on Linux, gives you a choice of using hugepages or memory-mapped files and you implement that choice by selecting to use (or not use) Automatic Memory Management (AMM). Choosing AMM disallows hugepages since AMM uses memory-mapped files located in /dev/shm; you cannot use AMM and hugepages together. For databases with SGAs up to 100 MB using AMM is probably not a good idea; a script to determine how many hugepages to allocate won’t consider SGAs of less than 100 MB. For Oracle databases with larger SGAs hugepages is, in my opinion, the choice to make as you can significantly reduce shared memory overhead where page tables are concerned.

Calculating how many hugepages to allocate has been a difficult task in the past; Oracle supplies a script that can calculate that value for you provided you have all of the databases that you want to use hugepages running; the script from Oracle won’t consider databases you have created that aren’t up and available as it uses the ipcs -m command to return the allocated shared memory segments. The script is available from My Oracle Support in Doc ID 401749.1. It generates a list of IPC shared memory segment sizes owned by Oracle and computes the total number of hugepages required. It outputs the result to be used in setting the vm.nr_hugepages parameter in /etc/sysctl.conf. Once that setting is modified the system has to be rebooted before the changes are implemented. Sample output from that script is shown below:


This script is provided by Doc ID 401749.1 from My Oracle Support
(http://support.oracle.com) where it is intended to compute values for
the recommended HugePages/HugeTLB configuration for the current shared
memory segments. Before proceeding with the execution please note following:
 * For ASM instance, it needs to configure ASMM instead of AMM.
 * The 'pga_aggregate_target' is outside the SGA and
   you should accommodate this while calculating SGA size.
 * In case you changes the DB SGA size,
   as the new SGA will not fit in the previous HugePages configuration,
   it had better disable the whole HugePages,
   start the DB with new SGA size and run the script again.
And make sure that:
 * Oracle Database instance(s) are up and running
 * Oracle Database 11g Automatic Memory Management (AMM) is not setup
   (See Doc ID 749851.1)
 * The shared memory segments can be listed by command:
     # ipcs -m


Press Enter to proceed...

Recommended setting: vm.nr_hugepages = 25240

For Linux RedHat kernel 2.6 this is the number of hugepages, not the overall size of the memory allocated for those hugepages. For RedHat kernel 2.4 a different parameter, vm.hugetlb_pool, is set to the actual size that pool needs to be. Fortunately the script knows which kernel is running and reports the correct parameter and value to set.

After you have hugepages allocated you need to tell Oracle to use them exclusively. An init parameter, use_large_pages, should be set to ONLY to keep Oracle from allocating conventionally managed memory segments for the SGA. If your hugepages calculations are correct then all of the databases on your database server will start without error. Allocate too few hugepages and databases with use_large_pages set to ONLY won’t start after all of the existing hugepages are used. Setting use_large_pages to TRUE will allow a database to start without using hugepages but that’s not what you want to get better performance from Oracle.

Earlier I mentioned that allocating too many hugepages can be worse than allocating too few hugepages. Since each hugepage consumes 2 MB of space multiplying the vm.nr_hugepages setting by 2097152 will show how much memory is occupied by hugepages. It is possible to try to allocate all of the system memory as hugepages resulting in a system that will not boot in multi-user mode. The kernel isn’t configured to use hugepages; the more hugepages that are allocated the less conventionally managed memory is available. Take it all away and there is no place for the operating system to start necessary processes after the kernel is loaded at boot time. Using single-user mode will allow the system adminstrator to modify the offending entry and change it to a more reasonable value so the system can be rebooted. Of course you don’t want to end up in that situation; manually calculating such a value can result in errors so use the Oracle-supplied script to calculate the hugepages value you actually need.

Exadata systems run best when hugepages are allocated and used exclusively. The exachk script checks for such settings in each running database on the system and reports those that do not have use_large_pages set to ONLY. It also checks if the currently configured hugepages setting meets or exceeds the estimated number of hugepages required by all running instances on the given database (compute) node. If this check fails it most likely will fail for all available nodes on the system. Failing this check isn’t the end of the world but it does mean that the failing databases aren’t using memory as efficiently as they could. For each node run the script, make the reported changes then reboot. Before starting any of the databases ensure that all have use_large_pages set to ONLY; start the databases across the cluster using srvctl. The next run ef exachk should report that the hugepages check passed.

RAC databases need to be configured the same way Exadata databases are configured (since Exadata databases are RAC databases). Modify kernel parameters on every database server in the cluster, then reboot. Modify the spfile for each database in the cluster then use srvctl to start those databases cluster-wide. Presuming the hugepages settings are correct across the cluster there should be no error when starting the databases.

Using hugepages for Oracle databases on Linux is, to me, the only way to go if you want to increase database performance. Of course, if you have loads memory to spare you may not see the need. Most systems I’ve worked on don’t have that luxury so being as ‘frugal’ with memory as you can is good. Making the most of available resources is key and hugepages on Linux can definitely help with respect to Oracle.

Size does matter.

Next Page »

Create a free website or blog at WordPress.com.