Oracle Tips and Tricks — David Fitzjarrell

October 25, 2018

Tuning Time

Filed under: General,Performance — dfitzjarrell @ 15:42

"“You see,” he continued, beginning to feel better, “once there was no time at all, and people found it very inconvenient.
They never knew whether they were eating lunch or dinner, and they were always missing trains. So time was invented to
help them keep track of the day and get places when they should. When they began to count all the time that was available,
what with 60 seconds in a minute and 60 minutes in an hour and 24 hours in a day and 365 days in a year, it seemed as if
there was much more than could ever be used. ‘If there’s so much of it, it couldn’t be very valuable,’ was the general
opinion, and it soon fell into disrepute. People wasted it and even gave it away. Then we were given the job of seeing
that no one wasted time again,” he said, sitting up proudly.” 
-- Norton Juster, The Phantom Tollbooth

DBAs usually get questions on query performance when a query runs slower than the user would like and, as a result, questions on how best to go about tuning that query soon follow. End-users usually don’t like waiting for results, and queries that take longer than a few seconds tend to be considered suspect. What many end-users don’t know is that query tuning is a combination of art, science and (gasp) intuition. Elegant tuning solutions are art, based on science, and that science uses two of Oracle’s events, 10046 and 10053, to provide more detailed information. Which one is more beneficial depends on the query and the ‘bad’ execution plan. A 10046 trace can show what Oracle is waiting on and where time is being spent in the execution phase of running a query. Sometimes, however, the waits and elapsed time breakdowns aren’t enough, which is where the 10053 trace, a record of the decisions the optimizer made while building the plan, can be extremely helpful. Note that this post takes one particular situation and gives abbreviated steps in solving it; it’s not meant to be a treatise on the subject nor a detailed case study. With that in mind let’s look at a plan where that event can provide much-needed information.

Queries using nested subqueries can take a number of paths to return data; one possibility involves what Oracle calls ‘subquery unnesting’, a process where Oracle re-writes the nested query into an unnested join with the intent that this will improve data access and thus query performance. This can be done for several nested subqueries, resulting in one or more NESTED LOOPS joins in the execution plan. This is where the road can take a less than desirable turn; those joins can consume more time than the original subquery processing would have, making the query execution longer than it probably should be. If the query in question is a ‘one-off’, run once, maybe twice in a day, this may not be a real problem as a few seconds here and there won’t throw off the regular day’s work. If, on the other hand, this is a regularly run statement, the longer run times can stack up on each other and cause delays in processing, and, as the old adage states: “Time is money.” Since the query plan won’t give details on how Oracle eventually generated it, one course of action is setting the 10053 event in the current session and executing the problem statement.

The first step in tuning is to obtain the current execution plan and see if it makes ‘sense’. A query with joins will likely produce NESTED LOOPS or HASH JOIN steps. In this case the plan produced a NESTED LOOPS SEMI step that, on the face of it, made no sense as the query contained no join conditions as originally written. This could be a clue that the optimizer did something different and possibly unexpected with the plan. That is exactly what prompted this investigation, which used a 10053 event trace to ‘track down’ what the optimizer chose to do. This, in turn, led to a slight rewrite of the query to include the necessary hints to ‘nudge’ the optimizer toward a more efficient plan.
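As a minimal sketch of that first step, the plan of the statement most recently executed in a session can be pulled from the cursor cache with DBMS_XPLAN. The query shown is only a placeholder (the real statements are proprietary), and the example assumes the session has access to the underlying V$ views and that SERVEROUTPUT is off so the last cursor is the statement of interest:

```sql
-- Keep SQL*Plus from issuing its own calls after the statement.
set serveroutput off

-- Placeholder statement; substitute the query being tuned.
select count(*) from dual;

-- NULL, NULL requests the last statement executed by this session;
-- 'TYPICAL' is the default level of detail.
select * from table(dbms_xplan.display_cursor(null, null, 'TYPICAL'));
```

Comparing the operations in that output against the way the query was written is what exposes a step, such as NESTED LOOPS SEMI, that the text of the query does not explain.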

Setting event 10053 is not a difficult task; let’s look at a script that does this and then executes several SELECT statements before turning the 10053 ‘switch’ off (unfortunately the actual queries can’t be published as they are proprietary so placeholders will be used):


alter session set tracefile_identifier='MAILQUEUE';
alter session set events = '10053 trace name context forever, level 2';

[First query statement here]

[Second query statement here]

[Third query statement here]

[Fourth query statement here]

alter session set events = '10053 trace name context off';

The important parts of the above example are the lines prior to any query statements, which turn on the 10053 event and set a trace file identifier so the generated trace file can be easily identified, and the statement after the queries have run, which stops the 10053 event. Unlike the 10046 event, which is more easily read after processing through the tkprof utility, the 10053 trace file needs no additional work. It can be a bit daunting at first since it’s like no other trace file most DBAs have seen. Let’s go through the example, using the actual trace file generated by the unobfuscated script, and see what it reveals.
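Locating the generated file is usually the next chore. A small sketch, assuming release 11g or later (where V$DIAG_INFO is available) and that it is run from the same session that set the event:

```sql
-- Full path of the trace file the current session writes to.
-- The tracefile_identifier value (MAILQUEUE in the script above)
-- is embedded in the file name.
select value
from   v$diag_info
where  name = 'Default Trace File';
```

Running this before turning the event off saves searching the trace directory by timestamp afterward.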

The beginning of the trace file provides some basic information on the environment and provides query ‘signatures’ that the optimizer uses to identify query and subquery blocks:


Trace file ....
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options
ORACLE_HOME = ...
System name:	SunOS
Node name:	...
Release:	5.10
Version:	Generic_150400-61
Machine:	sun4v
Instance name: ...
Redo thread mounted by this instance: 1
Oracle process number: 1134
Unix process pid: 22631, image: oracle@... (TNS V1-V3)

Query Block Registry:
SEL$2 0x77a10ca0 (PARSER)
  SEL$F5BB74E1 0x77a198b0 (VIEW MERGE SEL$1; SEL$2) [FINAL]
SEL$1 0x77a198b0 (PARSER)
  SEL$F5BB74E1 0x77a198b0 (VIEW MERGE SEL$1; SEL$2) [FINAL]

:
    call(in-use=12456, alloc=347568), compile(in-use=95008, alloc=158520), execution(in-use=15312, alloc=16200)

End of Optimizer State Dump
Dumping Hints
=============
====================== END SQL Statement Dump ======================
*** SESSION ID:(758.2581) 2018-08-21 09:40:23.286
 
Registered qb: SEL$1 0x77a198b0 (PARSER)
---------------------
QUERY BLOCK SIGNATURE
---------------------
  signature (): qb_name=SEL$1 nbfros=1 flg=0
    fro(0): flg=4 objn=68513 hint_alias="DQ"@"SEL$1"

Registered qb: SEL$2 0x7751be68 (PARSER)
---------------------
QUERY BLOCK SIGNATURE
---------------------
  signature (): qb_name=SEL$2 nbfros=1 flg=0
    fro(0): flg=4 objn=69663 hint_alias="WPV"@"SEL$2"

Registered qb: SEL$3 0x7751b188 (PARSER)
---------------------
QUERY BLOCK SIGNATURE
---------------------
  signature (): qb_name=SEL$3 nbfros=1 flg=0
    fro(0): flg=4 objn=10290765 hint_alias="MRE"@"SEL$3"

SPM: statement not found in SMB
SPM: capture of plan baseline is OFF

This begins the portion of the trace file of interest, including the query [not posted due to confidentiality concerns] and the legend of abbreviations:


**************************
Automatic degree of parallelism (AUTODOP)
**************************
Automatic degree of parallelism is disabled: Parameter.
kkopqSetForceParallelProperties: Hint:no
Query: compute:yes forced:no forceDop:0
kkopqSetDopReason: Reason why we chose this DOP is: table property.
table property forces parallelism

Global Manual DOP: 1 - Rounded?: no
PM: Considering predicate move-around in query block SEL$1 (#0)
**************************
Predicate Move-Around (PM)
**************************
OPTIMIZER INFORMATION

******************************************
----- Current SQL Statement for this session (sql_id=*************) -----
[ Removed for confidentiality reasons ]
*******************************************
Legend
The following abbreviations are used by optimizer trace.
CBQT - cost-based query transformation
JPPD - join predicate push-down
OJPPD - old-style (non-cost-based) JPPD
FPD - filter push-down
PM - predicate move-around
CVM - complex view merging
SPJ - select-project-join
SJC - set join conversion
SU - subquery unnesting
OBYE - order by elimination
OST - old style star transformation
ST - new (cbqt) star transformation
CNT - count(col) to count(*) transformation
JE - Join Elimination
JF - join factorization
CBY - connect by
SLP - select list pruning
DP - distinct placement
VT - vector transformation
qb - query block
LB - leaf blocks
DK - distinct keys
LB/K - average number of leaf blocks per key
DB/K - average number of data blocks per key
CLUF - clustering factor
NDV - number of distinct values
Resp - response cost
Card - cardinality
Resc - resource cost
NL - nested loops (join)
SM - sort merge (join)
HA - hash (join)
CPUSPEED - CPU Speed 
IOTFRSPEED - I/O transfer speed
IOSEEKTIM - I/O seek time
SREADTIM - average single block read time
MREADTIM - average multiblock read time
MBRC - average multiblock read count
MAXTHR - maximum I/O system throughput
SLAVETHR - average slave I/O throughput
dmeth - distribution method
  1: no partitioning required
  2: value partitioned
  4: right is random (round-robin)
  128: left is random (round-robin)
  8: broadcast right and partition left
  16: broadcast left and partition right
  32: partition left using partitioning of right
  64: partition right using partitioning of left
  256: run the join in serial
  0: invalid distribution method
sel - selectivity
ptn - partition
AP - adaptive plans

Other areas of the trace file provide environment information that is not necessary for this discussion. Because the execution plan includes a NESTED LOOPS SEMI step the reason for this should be ascertained, since the original query had no join conditions. Looking through the trace file doesn’t reveal anything that looks unusual to the modestly trained eye until the section provided below is examined. Searching through the trace file for indications of join activity allowed the DBA to eventually ‘zero in’ on the offending optimizer step. Subquery unnesting isn’t usually an issue, but in this case it involved two similar nested subqueries that Oracle thought would perform better as joins. Unfortunately this optimizer decision ended up as the more costly alternative:


*****************************
Cost-Based Subquery Unnesting
*****************************
SU: Unnesting query blocks in query block SEL$1 (#1) that are valid to unnest.
Subquery removal for query block SEL$3 (#3)
RSW: Not valid for subquery removal SEL$3 (#3)
Subquery unchanged.
Subquery Unnesting on query block SEL$2 (#2)SU: Performing unnesting that does not require costing.
SU: Considering subquery unnest on query block SEL$2 (#2).
SU:   Checking validity of unnesting subquery SEL$3 (#3)
SU:   Passed validity checks.
SU:   Transforming ANY subquery to a join.
Registered qb: SEL$BE5C8E5F 0x755990f8 (SUBQUERY UNNEST SEL$2; SEL$3)

The query unnesting results in performance degradation because Oracle now must process a nested loop join; hinting the subqueries altered the nested loops path into two filter operations, speeding up the execution time considerably. Having this information allows the DBA to use the proper hint (no_unnest) on each of the nested subqueries to eliminate this behavior.
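Since the real statement can’t be shown, here is a hedged sketch of what such hinting might look like. The aliases DQ, WPV and MRE come from the trace file; the table names, column names and predicates are entirely hypothetical stand-ins:

```sql
-- Hypothetical tables and columns; only the aliases match the trace.
select dq.msg_id
from   dispatch_queue dq
where  dq.msg_id in (
         -- no_unnest asks the optimizer to leave this subquery
         -- in place rather than rewrite it as a join
         select /*+ no_unnest */ wpv.msg_id
         from   work_pending wpv
         where  wpv.recip_id in (
                  select /*+ no_unnest */ mre.recip_id
                  from   mail_recipients mre
                  where  mre.status = 'READY'
                )
       );
```

The hint is placed inside each subquery block it applies to; whether it improves a given statement has to be confirmed by comparing plans and run times before and after.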

Sometimes query tuning can require a considerable amount of time, especially when searching a trace file for a cause with little more than an execution plan and the fact that the plan doesn’t match how the query was constructed. Even DBAs experienced with the oddities of the optimizer may miss the cause when it’s buried in a section that doesn’t appear to apply at first glance.

No post on any aspect of query tuning can provide an absolute, foolproof method for improving performance. There are simply too many variables and too many optimizer possibilities to consider. Hopefully this post has provided a starting point for query and statement tuning; remember that each query is different and that performance tuning is statement-specific. Consider each query on its own; major improvements are not guaranteed, but sometimes even the smallest change can produce measurable results.

Only time will tell.

August 29, 2017

Parallel Reality

Filed under: Performance — dfitzjarrell @ 14:33

"Expect everything, I always say, and the unexpected never happens."
-- Norton Juster, The Phantom Tollbooth

The following question was recently posted in an Oracle forum:


hi Friends,
I see  this wait event latch: parallel query alloc buffer, when a job meant for doing some cleanup ran this query.
Why does this wait event come happen? , i searched google,MOS no exact hit for explanation of the exact same event.
Looking at query does it happen because of the incorrect use of parallel hint i.e. no object mentioned and also it
is not specified like parallel(5) .  It is not causing any big  issues, just want to know for my understanding sake.
 
Regd,
Sachin

There were also queries, results and parallel execution plans posted but those aren’t included here. The main issue with this question is the lack of information on the following event:


				latch: parallel query alloc buffer

Parallel query slaves and coordinators allocate message buffers; what may be surprising is the sheer number of these message buffers that can be allocated for a given parallel execution. There’s a formula that can be used to calculate the number of buffers required based upon the parallel degree calculated or specified; let’s call that value p. Given that information the buffers required would be calculated as:


				p(p+1)*3

for non-RAC databases and:


				p(p+1)*4

for RAC installations. Let’s go through some calculations for non-RAC systems to see how many buffers could be required for various degrees of parallelism. We’ll start with something simple and consider a query executed with a parallel degree of 4; using the first formula provided we get:


				4(4+1) * 3 -> 4(5) * 3 -> 20 * 3 -> 60

so, for a relatively small-scale parallel execution Oracle will need 60 message buffers for the parallel query coordinator and its slaves. Let’s consider a parallel degree of 16; since the degree is 4 times the size of the original query (4) one might expect that the number of buffers might be 240 but that would be considerably less than Oracle would require:


				16(16+1) * 3 -> 16(17) * 3 -> 272 * 3 -> 816

By running at a parallel degree of 16 Oracle now requires 816 message buffers, 13.6 times the number of buffers for the parallel degree 4 execution. Since it’s not uncommon with today’s servers let’s run one more calculation, this time at a parallel degree of 128:


				128(128+1) * 3 -> 128(129) * 3 -> 16512 * 3 -> 49536

Now Oracle is allocating (or attempting to, anyway) 49,536 message buffers. Given that number of buffers to allocate it’s not difficult to understand why such a latch wait could be in the top waits for a given database. One factor affecting the length of time it takes Oracle to allocate such a large number of message buffers is where the PX message pool (‘PX msg pool’) is located; the shared pool or the large pool are common areas where this pool can be found. To find out where, in your database configuration, the ‘PX msg pool’ is found the following query can be run:


break on pool skip 1
column name format a32
 
select * from v$sgastat where pool = 'large pool' or name like '%PX%' or name like '%para%' order by pool, name;

The results might look like these:


FNERBLE @ uulooloo > break on pool skip 1
FNERBLE @ uulooloo > column name format a32
FNERBLE @ uulooloo >
FNERBLE @ uulooloo > select * from v$sgastat where pool = 'large pool' or name like '%PX%' or name like '%para%' order by pool, name;

POOL         NAME                                  BYTES     CON_ID
------------ -------------------------------- ---------- ----------
large pool   PX msg pool                        15728640          0
             free memory                        17825792          0

shared pool  PX QC deq stats                        1696          0
             PX QC msg stats                        3080          0
             PX list of chunk lists                 1088          0
             PX msg pool struct                       56          0
             PX scan rate stats                     1080          0
             PX server deq stats                    1696          0
             PX server msg stats                    3080          0
             PX subheap                           314824          0

POOL         NAME                                  BYTES     CON_ID
------------ -------------------------------- ---------- ----------
shared pool  PX subheap desc                         256          0
             parallel kcbibr                          32          0
             parallel kcbibr dbwr bitv                16          0
             parallel_max_servers                  29440          0
             param hash values                     15872          0
             parameter blocks                      31744          0
             parameter handle                     153624          0
             parameter string values              290632          0
             parameter table block               1174584          0
             parameter text value                   9576          0
             parameter value memory                 1376          0

21 rows selected.

FNERBLE @ uulooloo >

In my database the ‘PX msg pool’ is found in the large pool, a desirable location since it’s probably easier to allocate such buffers there than in the shared pool; using the shared pool could cause the ‘latch: parallel query alloc buffer’ wait to have wait times that could put it as a top-10 wait in an AWR or Statspack report due to contention for available resources.

Looking at another database, this time using version 11.2.0.3, with that same query produces:


FNERBLE @ uulooloo > break on pool skip 1
FNERBLE @ uulooloo > 
FNERBLE @ uulooloo > column name format a32
FNERBLE @ uulooloo > select * from v$sgastat where pool = 'large pool' or name like '%PX%' or name like '%para%' order by pool, name;



POOL         NAME                                  BYTES
------------ -------------------------------- ----------
large pool   PX msg pool                         8192016
             free memory                        16973808

shared pool  PX QC deq stats                        1696
             PX QC msg stats                        2904
             PX list of chunk lists                 1632
             PX msg pool struct                       56
             PX server deq stats                    1696
             PX server msg stats                    2904
             PX subheap                            95592
             parallel kcbibr dbwr bitv                 8

POOL         NAME                                  BYTES
------------ -------------------------------- ----------
shared pool  parallel_max_servers                   5472
             param hash values                     11008
             parameter blocks                      22008
             parameter handle                     292400
             parameter string values              197856
             parameter table block               2179872
             parameter text value                   7328
             parameter value memory                  712


18 rows selected.

FNERBLE @ uulooloo >

Again we see the ‘PX msg pool’ located in the large pool, but don’t take that for granted as earlier releases may put that pool in the shared pool which could create long waits on the ‘latch: parallel query alloc buffer’ event.
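When only the location of the pool matters, a narrower version of the query above may be easier to read. This is a sketch that relies on the same V$SGASTAT columns shown in the output above:

```sql
-- One row is expected; the POOL column shows whether the message
-- pool lives in the large pool or the shared pool.
select pool, name, bytes
from   v$sgastat
where  name = 'PX msg pool';
```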

Let’s now go through some calculations for RAC systems to see how those numbers change. We’ll use the same parallel degrees found in the first set of calculations:


				4(4+1) * 4 -> 4(5) * 4 -> 20 * 4 -> 80

so, for a relatively small-scale parallel execution Oracle will need 80 message buffers for the parallel query coordinator and its slaves for a RAC configuration, 20 more than the non-RAC example. Let’s consider a parallel degree of 16:


				16(16+1) * 4 -> 16(17) * 4 -> 272 * 4 -> 1088

By running at a parallel degree of 16 in a RAC configuration Oracle now requires 1088 message buffers. As Exadata and ODA systems are more common now than they were a few years ago a parallel degree of 128 is not unheard of, even with Auto DOP. For such a query execution we find that Oracle will need:


				128(128+1) * 4 -> 128(129) * 4 -> 16512 * 4 -> 66048

Now Oracle is allocating 66,048 message buffers, a large number, indeed. On ‘lesser’ systems (read that as ‘commodity servers’) such a buffer allocation could easily and quickly generate long event wait times for the ‘parallel query alloc buffer’ latch; such long waits could still be possible on Oracle’s engineered systems if the overall utilization is high.
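The arithmetic for both configurations can be reproduced in one query. This is simply the p(p+1)*3 and p(p+1)*4 formulas from this post applied to the three degrees discussed; the column aliases are illustrative:

```sql
-- p is the degree of parallelism; 3 and 4 are the non-RAC and RAC
-- multipliers from the formulas given earlier.
select p               as degree,
       p * (p + 1) * 3 as non_rac_buffers,
       p * (p + 1) * 4 as rac_buffers
from  (select 4 as p from dual
       union all
       select 16 from dual
       union all
       select 128 from dual)
order by p;
```

For degrees 4, 16 and 128 this returns 60, 816 and 49536 buffers for non-RAC and 80, 1088 and 66048 for RAC. Because the formula is quadratic in p, doubling the degree roughly quadruples the buffer count.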

It may not be a top-10 wait in your database but it’s good to be aware of what that latch wait event represents and why it can possibly generate long wait times, especially with a parallel degree of 16 or higher, because the number of message buffers Oracle requires can skyrocket.

And that should be expected.

November 13, 2016

“If You Change This, Right Here …”

Filed under: Performance,stats — dfitzjarrell @ 10:47

"changes are so frightening."
-- Norton Juster, The Phantom Tollbooth

It isn’t unusual for Oracle to expand functionality with each new release; a short list includes AWR, ADDM, the SYSAUX tablespace, Flashback Query, Flashback Database and Edition-based Redefinition. Oracle 12.1 is no different as it provides an interesting performance enhancement, the in-memory column store, that can increase performance by reducing physical I/O load by making tables memory-resident (to simplify the description somewhat). In addition to the in-memory access to table data Oracle has introduced a new optimization called “Vector Transformation, including Vector Optimization” for in-memory operations. Even when this new optimization isn’t in use there are new code paths to take advantage of this technology. As a result execution plans can change; let’s look at an example originally provided by Jonathan Lewis that illustrates this.

(In-memory is fairly easy to configure. Set inmemory_size to a ‘reasonable’ non-zero value and then restart the database:


SQL> alter system set inmemory_size=500m scope=spfile;

System altered.

SQL>

For a complete reference to in-memory configuration read here.)
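After the restart it can be worth confirming that the setting took effect. A minimal sketch, assuming a 12.1.0.2 or later instance where the in-memory area is reported as its own SGA component:

```sql
-- The parameter value as the instance sees it.
select name, value
from   v$parameter
where  name = 'inmemory_size';

-- The in-memory column store reported as part of the SGA.
select name, value
from   v$sga
where  name = 'In-Memory Area';
```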

Two tables are created, neither one especially noteworthy, an index is created on each table and extended statistics are generated on the (n1,n2) column grouping:


SQL> 
SQL> create table t1
  2  nologging
  3  as
  4  with generator as (
  5  	     select  --+ materialize
  6  		     rownum id
  7  	     from dual
  8  	     connect by
  9  		     level
[ remainder of statement missing from the source ]

SQL> create table t2
  2  nologging
  3  as
  4  with generator as (
  5  	     select  --+ materialize
  6  		     rownum id
  7  	     from dual
  8  	     connect by
  9  		     level
[ remainder of statement, index creation and start of the statistics call missing from the source ]

  4  		     tabname	      => 'T1',
  5  		     method_opt       => 'for columns (n1,n2) size 1'
  6  	     );
  7  end;
  8  /
SQL> 
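For reference, a column-group extension like the one described can be created and then verified along these lines. This is a sketch and not the original script: the DBMS_STATS.GATHER_TABLE_STATS call and the ownname value are assumptions, while the table name and method_opt string are taken from the fragment above:

```sql
begin
    dbms_stats.gather_table_stats(
        ownname    => user,   -- assumption: T1 belongs to the current user
        tabname    => 'T1',
        method_opt => 'for columns (n1,n2) size 1'
    );
end;
/

-- The column group appears as a system-named extension on T1.
select extension_name, extension
from   user_stat_extensions
where  table_name = 'T1';
```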

Executing the following query and displaying the execution plan shows no surprises since neither table has been altered to be in-memory:


SQL> 
SQL> select
  2  	     /*+
  3  		     qb_name(main)
  4  	     */
  5  	     count(*)
  6  from    (
  7  	     select
  8  		     /*+ qb_name(inline) */
  9  		     distinct t1.v1, t2.v1
 10  	     from
 11  		     t1,t2
 12  	     where
 13  		     t1.n1 = 50
 14  	     and     t1.n2 = 50
 15  	     and     t2.rand = t1.id
 16  	     )
 17  ;

      9912                                                                                                                                            
SQL> 
SQL> select * from table(dbms_xplan.display_cursor);

SQL_ID  a7gb3ncf922zz, child number 0                                                                                                                 
-------------------------------------                                                                                                                 
select         /*+                 qb_name(main)         */                                                                                           
count(*) from    (         select                 /*+ qb_name(inline)                                                                                 
*/                 distinct t1.v1, t2.v1         from                                                                                                 
t1,t2         where                 t1.n1 = 50         and     t1.n2 =                                                                                
50         and     t2.rand = t1.id         )                                                                                                          
                                                                                                                                                      
Plan hash value: 1718706536                                                                                                                           
                                                                                                                                                      
-------------------------------------------------------------------------------------------------                                                     
| Id  | Operation                               | Name  | Rows  | Bytes | Cost (%CPU)| Time     |                                                     
-------------------------------------------------------------------------------------------------                                                     
|   0 | SELECT STATEMENT                        |       |       |       |  5127 (100)|          |                                                     
|   1 |  SORT AGGREGATE                         |       |     1 |       |            |          |                                                     
|   2 |   VIEW                                  |       | 10001 |       |  5127   (1)| 00:00:01 |                                                     
|   3 |    HASH UNIQUE                          |       | 10001 |   351K|  5127   (1)| 00:00:01 |                                                     
|*  4 |     HASH JOIN                           |       | 10001 |   351K|  5125   (1)| 00:00:01 |                                                     
|*  5 |      TABLE ACCESS BY INDEX ROWID BATCHED| T1    |   100 |  2100 |     3   (0)| 00:00:01 |                                                     
|*  6 |       INDEX RANGE SCAN                  | T1_N1 |   100 |       |     1   (0)| 00:00:01 |                                                     
|   7 |      TABLE ACCESS FULL                  | T2    |  1000K|    14M|  5113   (1)| 00:00:01 |                                                     
-------------------------------------------------------------------------------------------------                                                     
                                                                                                                                                      
Predicate Information (identified by operation id):                                                                                                   
---------------------------------------------------                                                                                                   
                                                                                                                                                      
   4 - access("T2"."RAND"="T1"."ID")                                                                                                                  
   5 - filter("T1"."N2"=50)                                                                                                                           
   6 - access("T1"."N1"=50)                                                                                                                           
                                                                                                                                                      
SQL> 

Having the extended statistics allowed the optimizer to estimate the correct cardinality between columns n1 and n2, providing a better initial plan. Now things will get a bit interesting; table T2 is altered and put in-memory:


SQL> 
SQL> alter table t2 inmemory;
SQL> 
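Whether the table is actually flagged for, and populated in, the column store can be checked with a sketch like the following, which assumes the 12.1.0.2 data dictionary:

```sql
-- INMEMORY shows ENABLED once the ALTER TABLE has been issued.
select table_name, inmemory, inmemory_priority
from   user_tables
where  table_name = 'T2';

-- Population status; with the default priority the segment is
-- populated only after it is first scanned, so this may return
-- no rows immediately after the ALTER TABLE.
select segment_name, populate_status, bytes_not_populated
from   v$im_segments
where  segment_name = 'T2';
```

Note that the optimizer considers the in-memory attribute when costing, which is why the plan can change as soon as the table is altered.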

The same query is executed again and the plan changes because of T2 now being in-memory, even though the new vector transformation has not been applied:


SQL> 
SQL> select
  2  	     /*+
  3  		     qb_name(main)
  4  	     */
  5  	     count(*)
  6  from    (
  7  	     select
  8  		     /*+ qb_name(inline) */
  9  		     distinct t1.v1, t2.v1
 10  	     from
 11  		     t1,t2
 12  	     where
 13  		     t1.n1 = 50
 14  	     and     t1.n2 = 50
 15  	     and     t2.rand = t1.id
 16  	     )
 17  ;

      9912                                                                                                                                            
SQL> 
SQL> select * from table(dbms_xplan.display_cursor);

SQL_ID  a7gb3ncf922zz, child number 0                                                                                                                 
-------------------------------------                                                                                                                 
select         /*+                 qb_name(main)         */                                                                                           
count(*) from    (         select                 /*+ qb_name(inline)                                                                                 
*/                 distinct t1.v1, t2.v1         from                                                                                                 
t1,t2         where                 t1.n1 = 50         and     t1.n2 =                                                                                
50         and     t2.rand = t1.id         )                                                                                                          
                                                                                                                                                      
Plan hash value: 106371239                                                                                                                            
                                                                                                                                                      
----------------------------------------------------------------------------------------------------                                                  
| Id  | Operation                                | Name    | Rows  | Bytes | Cost (%CPU)| Time     |                                                  
----------------------------------------------------------------------------------------------------                                                  
|   0 | SELECT STATEMENT                         |         |       |       |   223 (100)|          |                                                  
|   1 |  SORT AGGREGATE                          |         |     1 |       |            |          |                                                  
|   2 |   VIEW                                   |         | 10001 |       |   223  (15)| 00:00:01 |                                                  
|   3 |    HASH UNIQUE                           |         | 10001 |   351K|   223  (15)| 00:00:01 |                                                  
|*  4 |     HASH JOIN                            |         | 10001 |   351K|   222  (14)| 00:00:01 |                                                  
|   5 |      JOIN FILTER CREATE                  | :BF0000 |   100 |  2100 |     3   (0)| 00:00:01 |
|*  6 |       TABLE ACCESS BY INDEX ROWID BATCHED| T1      |   100 |  2100 |     3   (0)| 00:00:01 |                                                  
|*  7 |        INDEX RANGE SCAN                  | T1_N1   |   100 |       |     1   (0)| 00:00:01 |                                                  
|   8 |      JOIN FILTER USE                     | :BF0000 |  1000K|    14M|   209  (11)| 00:00:01 |
|*  9 |       TABLE ACCESS INMEMORY FULL         | T2      |  1000K|    14M|   209  (11)| 00:00:01 |                                                  
----------------------------------------------------------------------------------------------------                                                  
                                                                                                                                                      
Predicate Information (identified by operation id):                                                                                                   
---------------------------------------------------                                                                                                   
                                                                                                                                                      
   4 - access("T2"."RAND"="T1"."ID")                                                                                                                  
   6 - filter("T1"."N2"=50)                                                                                                                           
   7 - access("T1"."N1"=50)                                                                                                                           
   9 - inmemory(SYS_OP_BLOOM_FILTER(:BF0000,"T2"."RAND"))
       filter(SYS_OP_BLOOM_FILTER(:BF0000,"T2"."RAND"))
                                                                                                                                                      
SQL> 

The cost of the tablescan was reduced considerably, from 5113 to 209, but that isn’t the most thought-provoking part of the plan: a serial Bloom filter is used for the hash join because of the in-memory code path. With a table stored ‘normally’ (not in-memory) a Bloom filter would not be effective; with the in-memory option a Bloom filter can use enhancements such as simultaneous multiple probe access into table t2 (in this example), in addition to the benefits of avoiding disk access, disk buffer activity and row-by-row data retrieval. All of those avoidances reduce CPU time considerably when compared to a ‘regular’ table, one that is not in-memory.
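
To make the JOIN FILTER CREATE / JOIN FILTER USE pair in the plan a little more concrete, here is a minimal Python sketch of the general Bloom filter technique applied to a hash join. It is not Oracle’s implementation; the class, the function, the filter size and the sample data are all hypothetical and chosen only for illustration. The filter is built from the join keys of the small row source (t1 in the plan) and is then used to discard most non-matching rows of the large row source (t2) before the join itself is probed.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: num_hashes probes into an array of num_bits flags."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per flag, for readability

    def _positions(self, value):
        # Derive num_hashes positions from seeded SHA-256 digests of the value.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits[pos] = 1

    def might_contain(self, value):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(value))


def bloom_hash_join(build_rows, probe_rows):
    """Join (key, payload) rows, filtering the probe side with a Bloom filter."""
    bloom = BloomFilter()
    hash_table = {}
    for key, payload in build_rows:          # analogous to JOIN FILTER CREATE
        bloom.add(key)
        hash_table.setdefault(key, []).append(payload)
    # Analogous to JOIN FILTER USE: drop rows that cannot possibly join.
    survivors = [row for row in probe_rows if bloom.might_contain(row[0])]
    joined = [(b, p) for key, p in survivors for b in hash_table.get(key, [])]
    return survivors, joined


# Hypothetical data: 100 build keys, 10,000 probe rows over 1,000 key values.
t1_rows = [(i, f"t1-{i}") for i in range(100)]
t2_rows = [(i % 1000, f"t2-{i}") for i in range(10_000)]
survivors, joined = bloom_hash_join(t1_rows, t2_rows)
print(len(t2_rows), len(survivors), len(joined))
```

A Bloom filter never rejects a row that really joins, so the join result is unaffected; it may let a few non-matching rows through, and those are eliminated by the hash table lookup.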

The in-memory column store is a very nice addition to an already powerful database, made even better by new optimizer transformations that speed data access by providing execution paths not available for tables accessed from disk. ‘Experiment’ with the in-memory option to see what performance improvements it might provide; you might be pleasantly surprised at what you find.

Sometimes unexpected change is good.

January 8, 2016

“Say What?!?!?”

Filed under: General,Indexes,Performance — dfitzjarrell @ 16:04

"The only thing you can do easily is be wrong, and that's hardly worth the effort." 
Norton Juster, The Phantom Tollbooth

Oracle can lie to you. Not like a disreputable used-car salesman but more like the ‘little white lie’ sometimes told in order to hide less-than-desirable parts of the truth. And it’s not Oracle, really, it’s the optimizer, and it does it by reporting query plans that may not accurately reflect the execution path. Sometimes we can make Oracle tell lies that it doesn’t know it’s telling, by using features not everyone uses. Take, for example, NLS settings. There have been improvements in National Language Support in most Oracle releases, and those improvements can introduce behavior that causes Oracle to lie to you. Let’s look at an example where this is true.

Jonathan Lewis brought this example to light in a recent blog post where he displayed the plan using dbms_xplan. After reading the post I decided to add to it by using autotrace to generate the plan, just to see if the two plans matched. Let’s look at the modified example:


SQL> --
SQL> --  Setup NLS parameters
SQL> --
SQL> ALTER session SET nls_sort=binary_ci;

Session altered.

SQL> ALTER session SET nls_comp=linguistic;

Session altered.

SQL> 
SQL> --
SQL> -- Create table, populate table and
SQL> -- create index
SQL> --
SQL> CREATE TABLE log_data(
  2    account_id NUMBER,
  3    log_type NUMBER,
  4    sys_name VARCHAR2(30),
  5    log_time TIMESTAMP,
  6    msg varchar2(4000)
  7  )
  8  nologging
  9  ;

Table created.

SQL> 
SQL> insert /*+ append */ into log_data(
  2    account_id,
  3    log_type,
  4    sys_name,
  5    log_time,
  6    msg
  7  )
  8  select
  9  	     5,
 10  	     2,
 11  	     dbms_random.string('a',1),
 12  	     sysdate + dbms_random.value,
 13  	     rpad('x',200)
 14  from
 15  	     dual
 16  connect by
 17  	     level  user,
  4  		     tabname	      =>'LOG_DATA',
  5  		     method_opt       => 'for all columns size 1'
  6  	     );
  7  end;
  8  /

PL/SQL procedure successfully completed.

SQL> 

The setup is complete so let’s see what autotrace reports as the execution plan:


SQL> 
SQL> column msg format a20
SQL> 
SQL> --
SQL> -- Use autotrace to generate the plan
SQL> --
SQL> set autotrace on
SQL> 
SQL> SELECT
  2  	     *
  3  FROM
  4    (
  5  	 SELECT
  6  	     sys_name, log_time,  substr(msg,1,40) msg
  7  	 FROM log_data
  8  	 WHERE
  9  	   account_id=5
 10  	   AND log_type=2
 11  	   AND sys_name='a'
 12  	 ORDER BY
 13  	   log_time  desc
 14    )
 15  WHERE
 16    rownum <= 10
 17  ;

SYS_NAME                       LOG_TIME                                                                    MSG                                        
------------------------------ --------------------------------------------------------------------------- --------------------                       
A                              09-JAN-16 02.42.54.000000 PM                                                x                                          
a                              09-JAN-16 02.41.02.000000 PM                                                x                                          
a                              09-JAN-16 02.40.54.000000 PM                                                x                                          
a                              09-JAN-16 02.36.38.000000 PM                                                x                                          
A                              09-JAN-16 02.36.02.000000 PM                                                x                                          
a                              09-JAN-16 02.34.15.000000 PM                                                x                                          
a                              09-JAN-16 02.31.15.000000 PM                                                x                                          
A                              09-JAN-16 02.30.59.000000 PM                                                x                                          
a                              09-JAN-16 02.30.36.000000 PM                                                x                                          
a                              09-JAN-16 02.29.53.000000 PM                                                x                                          

10 rows selected.


Execution Plan
----------------------------------------------------------                                                                                            
Plan hash value: 1444908817                                                                                                                           
                                                                                                                                                      
--------------------------------------------------------------------------------------------                                                          
| Id  | Operation                       | Name     | Rows  | Bytes | Cost (%CPU)| Time     |                                                          
--------------------------------------------------------------------------------------------                                                          
|   0 | SELECT STATEMENT                |          |    10 |  1120 |    13   (0)| 00:00:01 |                                                          
|*  1 |  COUNT STOPKEY                  |          |       |       |            |          |                                                          
|   2 |   VIEW                          |          |    11 |  1232 |    13   (0)| 00:00:01 |                                                          
|*  3 |    SORT ORDER BY STOPKEY        |          |    11 |  2453 |    13   (0)| 00:00:01 |                                                          
|   4 |     TABLE ACCESS BY INDEX ROWID | LOG_DATA |  1000 |   217K|    13   (0)| 00:00:01 |                                                          
|*  5 |      INDEX RANGE SCAN DESCENDING| LOG_DATE |    11 |       |     2   (0)| 00:00:01 |                                                          
--------------------------------------------------------------------------------------------                                                          
                                                                                                                                                      
Predicate Information (identified by operation id):                                                                                                   
---------------------------------------------------                                                                                                   
                                                                                                                                                      
   1 - filter(ROWNUM<=10)                                                                                                                             
   3 - filter(ROWNUM 
SQL> set autotrace off
SQL> 

SQL> 
SQL> --
SQL> -- Now use dbms_xplan to generate the plan
SQL> --
SQL> SELECT /*+ gather_plan_statistics */
  2  	     *
  3  FROM
  4    (
  5  	 SELECT
  6  	     sys_name, log_time,  substr(msg,1,40) msg
  7  	 FROM log_data
  8  	 WHERE
  9  	   account_id=5
 10  	   AND log_type=2
 11  	   AND sys_name='a'
 12  	 ORDER BY
 13  	   log_time  desc
 14    )
 15  WHERE
 16    rownum 'allstats last'));

PLAN_TABLE_OUTPUT                                                                                                                                     
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  36qykj7j9tsrp, child number 0                                                                                                                 
-------------------------------------                                                                                                                 
SELECT /*+ gather_plan_statistics */         * FROM   (     SELECT                                                                                    
   sys_name, log_time,  substr(msg,1,40) msg     FROM log_data                                                                                        
WHERE       account_id=5       AND log_type=2       AND sys_name='a'                                                                                  
 ORDER BY       log_time  desc   ) WHERE   rownum <= 10                                                                                               
                                                                                                                                                      
Plan hash value: 1444908817                                                                                                                           
                                                                                                                                                      
---------------------------------------------------------------------------------------------------------------------------------                     
| Id  | Operation                       | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------                     
|   0 | SELECT STATEMENT                |          |      1 |        |     10 |00:00:00.01 |     979 |       |       |          |                     
|*  1 |  COUNT STOPKEY                  |          |      1 |        |     10 |00:00:00.01 |     979 |       |       |          |                     
|   2 |   VIEW                          |          |      1 |     11 |     10 |00:00:00.01 |     979 |       |       |          |                     
|*  3 |    SORT ORDER BY STOPKEY        |          |      1 |     11 |     10 |00:00:00.01 |     979 |  2048 |  2048 | 2048  (0)|                     
|   4 |     TABLE ACCESS BY INDEX ROWID | LOG_DATA |      1 |   1000 |    975 |00:00:00.01 |     979 |       |       |          |                     
|*  5 |      INDEX RANGE SCAN DESCENDING| LOG_DATE |      1 |     11 |    975 |00:00:00.01 |       5 |       |       |          |                     
---------------------------------------------------------------------------------------------------------------------------------                     
                                                                                                                                                      
Predicate Information (identified by operation id):                                                                                                   
---------------------------------------------------
                                                                                                                                                      
   1 - filter(ROWNUM<=10)                                                                                                                             
   3 - filter(ROWNUM 

Notice that both methods report the same execution plan; it’s the access predicates that differ, with autotrace reporting the index as the access point rather than the hidden column in the table. Historically dbms_xplan.display_cursor has been the source of ‘truth’, but in this case it may be autotrace that reports the actual predicates in use.
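
With nls_comp=linguistic and nls_sort=binary_ci the comparison sys_name='a' is evaluated on a case-insensitive sort key rather than on the raw column value, which is why the result set above contains both 'A' and 'a'. The Python sketch below is only a rough model of that idea. It assumes the BINARY_CI key can be imitated by lower-casing the text and appending a zero byte; the function names are hypothetical and nothing here is Oracle code.

```python
def binary_ci_sort_key(text):
    """Model of a case-insensitive binary sort key.

    Assumption: lower-cased bytes followed by a 0x00 terminator is a
    reasonable stand-in for the key used under nls_sort=binary_ci.
    """
    return text.lower().encode("utf-8") + b"\x00"


def linguistic_equals(column_value, literal):
    # Under a linguistic comparison both sides are compared by sort key,
    # not by their raw bytes.
    return binary_ci_sort_key(column_value) == binary_ci_sort_key(literal)


# Hypothetical column values, mixed case as produced by dbms_random.string('a',1).
sys_names = ["A", "a", "b", "B", "a"]
matches = [name for name in sys_names if linguistic_equals(name, "a")]
print(matches)
print(binary_ci_sort_key("A").hex())
```

Because the comparison is made on the key and not on the column itself, a plan has to express the predicate in terms of that derived value, which is where the two reporting tools can disagree.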

And that’s the truth.

July 28, 2015

It Isn’t The Index

Filed under: Indexes,Performance — dfitzjarrell @ 15:36

 


"If you want sense, you'll have to make it yourself."
- Norton Juster, The Phantom Tollbooth

One of the more confusing statistics in Oracle is, apparently, one called the clustering factor. Associated with an index (as evidenced by the columns in DBA_INDEXES) it’s actually dependent on the table data, more specifically the distance between ‘jumps’ for a given index key. Let’s define a ‘jump’ to be the number of blocks between rows containing the given index key, starting with the first block found containing that key. That’s confusing, but don’t despair; this will be explained in detail.

Oracle, in its infinite wisdom, computes a clustering factor for an index based upon the key value and the scattering of the rows containing that key across the table data. Let’s look at a simple non-unique index (the kind of index for which the clustering factor matters) and the associated table data. The example starts by building the table and loading the data:


SQL> 
SQL> create table clust_fact(
  2  	     id      number,
  3  	     nm      varchar2(40),
  4  	     dt      date,
  5  	     sq      number);

Table created.

SQL> 
SQL> begin
  2  	     for i in 1..1000000 loop
  3  		     insert into clust_fact(id,nm,dt,sq)
  4  		     values(mod(i,77), 'Record '||i, sysdate - mod(i, 77), i);
  5  	     end loop;
  6  
  7  	     commit;
  8  end;
  9  /

PL/SQL procedure successfully completed.

SQL> 

Next, build the index and generate the statistics:


SQL>
SQL> create index clust_fact_idx on clust_fact(id);

Index created.

SQL> 
SQL> exec dbms_stats.gather_table_stats(user, 'CLUST_FACT', cascade=>true);

PL/SQL procedure successfully completed.

SQL> 

Query the clustering factor computed by Oracle:


SQL> select index_name, clustering_factor
  2  from user_indexes
  3  where table_name = 'CLUST_FACT';

INDEX_NAME                     CLUSTERING_FACTOR                                                                                    
------------------------------ -----------------                                                                                    
CLUST_FACT_IDX                            370731                                                                                    

SQL> 

The clustering factor provides a representation of how far Oracle needs to ‘jump’, on average, between blocks containing the index key value. To understand this a bit better let’s look at the code Oracle actually uses to compute the clustering factor and explain what it does:


SQL> 
SQL> --
SQL> -- 1 block limit (Oracle default)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),1) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 

Oracle uses a function named sys_op_countchg, which takes two parameters: the first 15 bytes of the rowid and the number of blocks to traverse before the count increases. By default Oracle increments the clustering factor counter every time the block address changes. If three rows, for example, exist in the current block the counter does not increment; if the next three rows are in a different block the counter is incremented by 1 (Oracle doesn’t change blocks for the next two reads after the change). This keeps going throughout the table to compute the final clustering factor. Knowing the function used by Oracle and its parameters, it is possible to modify the call to accept larger values for the number of block changes before the clustering factor counter is incremented. To allow Oracle to consider 3 blocks ‘close enough’ to not change the counter the query is modified as follows:


SQL> 
SQL> --
SQL> -- 3 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),3) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 
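
The counting rule described above can be sketched in a few lines of Python. This is a simplified model, not the internals of sys_op_countchg: it assumes the function walks the rows in index order and remembers the last few distinct blocks it visited, and the function and parameter names are hypothetical.

```python
from collections import deque


def clustering_factor(blocks_in_index_order, cached_blocks=1):
    """Count block changes while visiting table rows in index order.

    blocks_in_index_order: the table block holding each row, listed in the
        order the index would visit the rows.
    cached_blocks: how many recently visited blocks still count as 'close'
        (a model of the second argument passed to sys_op_countchg).
    """
    recent = deque(maxlen=cached_blocks)  # oldest block drops off automatically
    changes = 0
    for block in blocks_in_index_order:
        if block in recent:
            continue              # still within range, no increment
        changes += 1              # a 'new' block: bump the counter
        recent.append(block)
    return changes


# Three rows in block 1, three in block 2, then the index bounces between them.
walk = [1, 1, 1, 2, 2, 2, 1, 2, 1, 2]
print(clustering_factor(walk, cached_blocks=1))  # prints 6
print(clustering_factor(walk, cached_blocks=3))  # prints 2
```

With a limit of 1 every return to a previously visited block counts as a change; with a limit of 3 both blocks stay ‘in range’ and only their first visits are counted.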

Given how the values were computed for the ID column, the clustering factor doesn’t change when we increase the number of block changes Oracle considers to be within range of the starting block address. The values tried here for this second parameter are 1, 3, 5, 9 and 16. Going through that list, it’s obvious that with the current data loading the clustering factor won’t improve no matter which value we choose:


SQL> 
SQL> --
SQL> -- 5 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),5) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 
SQL> --
SQL> -- 9 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),9) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),16) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 

Because the ordering of the table data affects the clustering factor, the table is reloaded with the data ordered by the ID column:


SQL> 
SQL> create table clust_tmp as select * from clust_fact order by id;

Table created.

SQL> 
SQL> truncate table clust_fact;

Table truncated.

SQL> 
SQL> insert into clust_fact select * from clust_tmp;

1000000 rows created.

SQL> 
SQL> commit;

Commit complete.

SQL> 
SQL> exec dbms_stats.gather_table_stats(user, 'CLUST_FACT', cascade=>true);

PL/SQL procedure successfully completed.

SQL> 

Querying the clustering factor reveals it has significantly decreased since the keys are much closer together:


SQL> 
SQL> select index_name, clustering_factor
  2  from user_indexes
  3  where table_name = 'CLUST_FACT';

INDEX_NAME                     CLUSTERING_FACTOR                                                                                    
------------------------------ -----------------                                                                                    
CLUST_FACT_IDX                              4889                                                                                    

SQL> 
SQL> --
SQL> -- 1 block limit (Oracle default)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),1) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4889                                                                                                                          

SQL> 
SQL> --
SQL> -- 3 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),3) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4888                                                                                                                          

SQL> 
SQL> --
SQL> -- 5 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),5) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4888                                                                                                                          

SQL> 
SQL> --
SQL> -- 9 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),9) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4885                                                                                                                          

SQL> 
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),16) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4881                                                                                                                          

SQL> 
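
The effect of row order on these numbers can be imitated with a small simulation. The Python sketch below is a model under stated assumptions, not a reproduction of the test: it uses 77,000 rows instead of 1,000,000, assumes exactly 200 rows per block, and fills blocks strictly in insert order. The counting function is the same simplified model of the block-change rule, and all names are hypothetical.

```python
from collections import deque

ROWS_PER_BLOCK = 200    # assumption: illustrative value, not measured
NUM_KEYS = 77           # mod(i,77) in the original load
NUM_ROWS = 77_000       # scaled down from the 1,000,000 rows in the example


def clustering_factor(blocks, cached_blocks):
    """Count visits to blocks that are not among the last cached_blocks seen."""
    recent = deque(maxlen=cached_blocks)
    changes = 0
    for block in blocks:
        if block not in recent:
            changes += 1
            recent.append(block)
    return changes


def blocks_in_index_order(keys_in_table_order):
    """Return the block of each row in the order an index on the key visits it.

    Row n of the table is assumed to sit in block n // ROWS_PER_BLOCK, and the
    index is assumed to visit rows sorted by (key, position in the table).
    """
    rows = sorted((key, pos) for pos, key in enumerate(keys_in_table_order))
    return [pos // ROWS_PER_BLOCK for _, pos in rows]


round_robin = [i % NUM_KEYS for i in range(1, NUM_ROWS + 1)]  # original load
ordered = sorted(round_robin)                                  # reload by ID

results = {}
for name, load in (("round_robin", round_robin), ("ordered", ordered)):
    walk = blocks_in_index_order(load)
    results[name] = {n: clustering_factor(walk, n) for n in (1, 3, 5, 9, 16)}
    print(name, results[name])
```

In this model the round-robin load touches every block once per key value, so moving to the next key always lands on a block far outside the remembered range and the limit makes no difference. The ordered load walks the blocks once from start to finish, so the count falls to the number of blocks. The simulated ordered load gives the same value for every limit, whereas the real test shows a small decrease, presumably because the real reload does not place every row in strictly ascending block order.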

It is worth noting that the clustering factor decreases slightly with larger values for the number of blocks Oracle can skip before updating the change count. The changes are not huge: the value differs by only 8 between Oracle’s default and a ‘blocks to skip’ setting of 16, the highest value tried here. Since the original load created the SQ values in sequence, the table is reloaded with the SQ values in order and a new concatenated index on the ID and SQ columns is created. This is done to show how the clustering factor behaves for other indexes when it’s ‘optimized’ for one index:


SQL> 
SQL> drop table clust_tmp purge;

Table dropped.

SQL> 
SQL> create table clust_tmp as select * from clust_fact order by sq;

Table created.

SQL> 
SQL> truncate table clust_fact;

Table truncated.

SQL> 
SQL> insert into clust_fact select * from clust_tmp;

1000000 rows created.

SQL> 
SQL> commit;

Commit complete.

SQL> 
SQL> create index clust_fact_idx2 on clust_fact(id, sq);

Index created.

SQL> 
SQL> exec dbms_stats.gather_table_stats(user, 'CLUST_FACT', cascade=>true);

PL/SQL procedure successfully completed.

SQL> 

Querying the clustering factor shows the value from the original table load again and, again, it doesn’t change even when the block skip parameter to the sys_op_countchg function is increased through the same list of values:


SQL> 
SQL> select index_name, clustering_factor
  2  from user_indexes
  3  where table_name = 'CLUST_FACT';

INDEX_NAME                     CLUSTERING_FACTOR                                                                                    
------------------------------ -----------------                                                                                    
CLUST_FACT_IDX2                           370731                                                                                    
CLUST_FACT_IDX                            370731                                                                                    

SQL> 
SQL> --
SQL> -- 1 block limit (Oracle default)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),1) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 
SQL> --
SQL> -- 3 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),3) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 
SQL> --
SQL> -- 5 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),5) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 
SQL> --
SQL> -- 9 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),9) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),16) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 
SQL> --
SQL> -- 1 block limit (Oracle default)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),1) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 
SQL> --
SQL> -- 3 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),3) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 
SQL> --
SQL> -- 5 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),5) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 
SQL> --
SQL> -- 9 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),9) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),16) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 

Ordering the data by ID shows that the clustering factor decreases for both indexes, since both include the ID column. The presence of the SQ column causes the second index’s clustering factor to be greater when the default parameters are used; interestingly, as the block skip count increases for the second index, its clustering factor drops to roughly that of the single-column index:


SQL> 
SQL> drop table clust_tmp purge;

Table dropped.

SQL> 
SQL> create table clust_tmp as select * from clust_fact order by id;

Table created.

SQL> 
SQL> truncate table clust_fact;

Table truncated.

SQL> 
SQL> insert into clust_fact select * from clust_tmp;

1000000 rows created.

SQL> 
SQL> commit;

Commit complete.

SQL> 
SQL> exec dbms_stats.gather_table_stats(user, 'CLUST_FACT', cascade=>true);

PL/SQL procedure successfully completed.

SQL> 
SQL> select index_name, clustering_factor
  2  from user_indexes
  3  where table_name = 'CLUST_FACT';

INDEX_NAME                     CLUSTERING_FACTOR                                                                                    
------------------------------ -----------------                                                                                    
CLUST_FACT_IDX2                            75994                                                                                    
CLUST_FACT_IDX                              4889                                                                                    

SQL> 
SQL> --
SQL> -- 1 block limit (Oracle default)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),1) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4889                                                                                                                          

SQL> 
SQL> --
SQL> -- 3 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),3) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4889                                                                                                                          

SQL> 
SQL> --
SQL> -- 5 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),5) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4889                                                                                                                          

SQL> 
SQL> --
SQL> -- 9 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),9) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4886                                                                                                                          

SQL> 
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),16) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4882                                                                                                                          

SQL> 
SQL> --
SQL> -- 1 block limit (Oracle default)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),1) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
     75994                                                                                                                          

SQL> 
SQL> --
SQL> -- 3 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),3) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4967                                                                                                                          

SQL> 
SQL> --
SQL> -- 5 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),5) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4861                                                                                                                          

SQL> 
SQL> --
SQL> -- 9 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),9) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4861                                                                                                                          

SQL> 
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),16) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4861                                                                                                                          

SQL> 

For multiple indexes on the same table, where no column is common between them, the clustering factor can skyrocket for some indexes when the table data is ordered to favor one particular index. That example wasn’t run and is left for you to investigate. Let’s do something similar and change the order of the columns in the second index and see how that affects the clustering factor:


SQL> select index_name, clustering_factor
  2  from user_indexes
  3  where table_name = 'CLUST_FACT';

INDEX_NAME                     CLUSTERING_FACTOR                                                                                    
------------------------------ -----------------                                                                                    
CLUST_FACT_IDX2                           999810                                                                                    
CLUST_FACT_IDX                              4889                                                                                    

SQL> 
SQL> --
SQL> -- 1 block limit (Oracle default)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),1) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4889                                                                                                                          

SQL> 
SQL> --
SQL> -- 3 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),3) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4889                                                                                                                          

SQL> 
SQL> --
SQL> -- 5 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),5) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4889                                                                                                                          

SQL> 
SQL> --
SQL> -- 9 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),9) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4886                                                                                                                          

SQL> 
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),16) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4882                                                                                                                          

SQL> 
SQL> --
SQL> -- 1 block limit (Oracle default)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),1) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    999810                                                                                                                          

SQL> 
SQL> --
SQL> -- 3 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),3) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    999810                                                                                                                          

SQL> 
SQL> --
SQL> -- 5 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),5) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    999810                                                                                                                          

SQL> 
SQL> --
SQL> -- 9 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),9) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    999810                                                                                                                          

SQL> 
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),16) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    999810                                                                                                                          

SQL> 

Even though the indexes share a common column, the column order matters; ordering the data by the key column of the single-column index caused the clustering factor for the second index to be much greater than it was when the leading column for the second index was the ID column. Note, too, that increasing the block skip count made no difference this time: the value stayed at 999810 for every setting.
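
For readers who want to try the no-common-column case mentioned earlier, a minimal sketch follows. It assumes the table is still ordered by ID and uses a hypothetical third index, CLUST_FACT_IDX3, on the SQ column alone; that index name is an illustration only and is not part of the original test, and no results are shown because this example was not run here:

```sql
--
-- Hypothetical index sharing no column with CLUST_FACT_IDX (which is on ID)
--
create index clust_fact_idx3 on clust_fact(sq);

exec dbms_stats.gather_table_stats(user, 'CLUST_FACT', cascade=>true);

--
-- Compare the clustering factor of all indexes on the table
--
select index_name, clustering_factor
from user_indexes
where table_name = 'CLUST_FACT'
order by index_name;

--
-- Remove the illustration index when finished
--
drop index clust_fact_idx3;
```

The same sys_op_countchg queries shown above could then be run with the index hint changed to the new index name to see whether the block skip count makes any difference.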

Although associated with an index, the clustering factor is affected by the table data so no series of index rebuilds will ‘fix’ a ‘bad’ value. The table data needs to be ordered to favor an index (or indexes if they contain the column used to order the table data). This, of course, improves performance for the positively affected indexes and, unfortunately, adversely affects other indexes created against that same table. Care must be taken when deciding whether or not to order the table data; ‘reasonable’ performance metrics must be decided upon before embarking on such a change, and it should be done in test or dev before considering it in production. It may be that overall ‘acceptable’ performance across all of the indexes is better than ‘outstanding’ performance for one index that causes ‘terrible’ performance for all others. It’s not a simple decision and needs to be carefully considered.
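
One common rule of thumb for judging whether a clustering factor is ‘good’ or ‘bad’ is to compare it with the table statistics: a value close to the number of table blocks suggests the table data is well ordered for that index, while a value close to the number of rows suggests the data is scattered. A sketch of that comparison, assuming statistics on CLUST_FACT and its indexes are current:

```sql
--
-- Clustering factor alongside the table's block and row counts
--
select i.index_name,
       i.clustering_factor,
       t.blocks   as table_blocks,
       t.num_rows as table_rows
from user_indexes i
     join user_tables t
       on t.table_name = i.table_name
where i.table_name = 'CLUST_FACT'
order by i.index_name;
```

Running this before and after reordering the table data gives a quick view of which indexes gained and which lost from the change.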

If you want to put into effect a clustering factor computed with a modified version of the query Oracle uses to generate this statistic, you will need to update the index statistics manually with one of the SET_INDEX_STATS procedures in the DBMS_STATS package:


SQL> 
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> 
SQL> column clf new_value cl_f
SQL> 
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),16) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4861                                                                                                                          

SQL> exec dbms_stats.set_index_stats(ownname => 'BING', indname => 'CLUST_FACT_IDX2', clstfct => &cl_f);

PL/SQL procedure successfully completed.

SQL> 

Using the new_value functionality in SQL*Plus saves the computed value into a reusable substitution variable that can be passed to procedures and functions that need it. The above code sets the clustering factor for the CLUST_FACT_IDX2 index to 4861, replacing the Oracle-calculated value of 75994. Again, care must be taken when undertaking such a change since you cannot predict the resulting performance change.
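
Because the result can’t be predicted, it may be worth saving the existing index statistics before overriding them so they can be put back if the change hurts. A minimal sketch, assuming a statistics table named CF_STATS_BKP (a hypothetical name chosen for this illustration) and the same BING schema; the export would be run before the SET_INDEX_STATS call and the import only if a rollback is needed:

```sql
--
-- Before the change: create a statistics table and save the current values
-- (CF_STATS_BKP is a hypothetical table name)
--
exec dbms_stats.create_stat_table(ownname => 'BING', stattab => 'CF_STATS_BKP');
exec dbms_stats.export_index_stats(ownname => 'BING', indname => 'CLUST_FACT_IDX2', stattab => 'CF_STATS_BKP');

--
-- After the change: confirm the value Oracle now reports
--
select index_name, clustering_factor
from user_indexes
where index_name = 'CLUST_FACT_IDX2';

--
-- If needed: restore the saved statistics
--
exec dbms_stats.import_index_stats(ownname => 'BING', indname => 'CLUST_FACT_IDX2', stattab => 'CF_STATS_BKP');
```

Keep in mind that the next statistics gather on the index will replace a manually set clustering factor with Oracle’s own calculation.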

Since this post was originally written Oracle has implemented an enhancement to DBMS_STATS that allows the DBA to tell Oracle how many table blocks can be expected to be cached. The SET_TABLE_PREFS, SET_SCHEMA_PREFS and SET_DATABASE_PREFS procedures have been modified to accept the TABLE_CACHED_BLOCKS parameter; this tells Oracle how many table blocks to assume are cached, so that revisiting one of those blocks does not increment the clustering factor when statistics are gathered. Looking at an example we see:


SQL> create table mytab(
  2  myid    number,
  3  mydata  varchar2(40),
  4  mydate  date)
  5  cache;

Table created.

SQL> 
SQL> begin
  2  	     for i in 1..1000000 loop
  3  		     insert into mytab
  4  		     values(trunc(9999*(dbms_random.value)), 'Line '||i, trunc(sysdate - mod(i, 4637)));
  5  	     end loop;
  6  end;
  7  /

PL/SQL procedure successfully completed.

SQL> 
SQL> commit;

Commit complete.

SQL> 
SQL> create index mytab_idx
  2  on mytab(myid, mydate);

Index created.

SQL> 
SQL> exec dbms_stats.gather_table_stats(ownname=>'BING', tabname=>'MYTAB', cascade=>true, estimate_percent=>null)

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_index_stats(ownname=>user, indname=>'MYTAB_IDX', estimate_percent=> null);

PL/SQL procedure successfully completed.

SQL> 
SQL> select table_name, index_name, clustering_factor
  2  from user_indexes
  3  where table_name = 'MYTAB';

TABLE_NAME                     INDEX_NAME                     CLUSTERING_FACTOR                                                     
------------------------------ ------------------------------ -----------------                                                     
MYTAB                          MYTAB_IDX                                 996238                                                     

SQL> 

With TABLE_CACHED_BLOCKS left at its default of 1 we see a really large clustering factor (due to the random nature of the MYID values generated). Let’s set TABLE_CACHED_BLOCKS to 47 for MYTAB and see what Oracle calculates for the clustering factor:


SQL> exec dbms_stats.set_table_prefs(ownname=>'BING', tabname=>'MYTAB', pname=>'TABLE_CACHED_BLOCKS', pvalue=>47);

PL/SQL procedure successfully completed.

SQL> 
SQL> commit;

Commit complete.

SQL> 
SQL> exec dbms_stats.gather_table_stats(ownname=>'BING', tabname=>'MYTAB', cascade=>true, estimate_percent=>null)

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_index_stats(ownname=>user, indname=>'MYTAB_IDX', estimate_percent=> null);

PL/SQL procedure successfully completed.

SQL> 
SQL> select table_name, index_name, clustering_factor
  2  from user_indexes
  3  where table_name = 'MYTAB';

TABLE_NAME                     INDEX_NAME                     CLUSTERING_FACTOR                                                     
------------------------------ ------------------------------ -----------------                                                     
MYTAB                          MYTAB_IDX                                 987551                                                     

SQL> 

The clustering factor has decreased somewhat (again it’s not a dramatic decrease due to the random nature of the MYID values). Let’s set the parameter to its maximum value (255) and see what Oracle provides as a clustering factor:


SQL> exec dbms_stats.set_table_prefs(ownname=>'BING', tabname=>'MYTAB', pname=>'TABLE_CACHED_BLOCKS', pvalue=>255);

PL/SQL procedure successfully completed.

SQL> 
SQL> commit;

Commit complete.

SQL> 
SQL> exec dbms_stats.gather_table_stats(ownname=>'BING', tabname=>'MYTAB', cascade=>true, estimate_percent=>null)

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_index_stats(ownname=>user, indname=>'MYTAB_IDX', estimate_percent=> null);

PL/SQL procedure successfully completed.

SQL> 
SQL> select table_name, index_name, clustering_factor
  2  from user_indexes
  3  where table_name = 'MYTAB';

TABLE_NAME                     INDEX_NAME                     CLUSTERING_FACTOR                                                     
------------------------------ ------------------------------ -----------------                                                     
MYTAB                          MYTAB_IDX                                 939451                                                     

SQL> 

It’s reduced again, although not dramatically, and it is less than it was originally. Setting this new preference through the DBMS_STATS.SET_TABLE_PREFS, DBMS_STATS.SET_SCHEMA_PREFS and DBMS_STATS.SET_DATABASE_PREFS procedures can potentially fix a pessimistic clustering factor and improve index selection and, in turn, improve query performance.
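To see why a larger TABLE_CACHED_BLOCKS value can only lower the number, here is a minimal sketch of the idea in Python. It is a simplified model and not Oracle's implementation: I am assuming the calculation walks the index in key order and charges a visit only when the table block is not among the last N distinct blocks seen, kept here as a least-recently-used list. The function name and the sample data are made up for the illustration.

```python
import random
from collections import OrderedDict


def clustering_factor(blocks_in_key_order, cached_blocks=1):
    """Count table block 'changes' while walking an index in key order.

    A visit costs nothing when the block is among the last `cached_blocks`
    distinct blocks seen (simplified LRU model, an assumption for this sketch).
    """
    recent = OrderedDict()  # block number -> None, most recently used last
    changes = 0
    for block in blocks_in_key_order:
        if block in recent:
            recent.move_to_end(block)       # already "cached": no charge
        else:
            changes += 1                    # a new block visit is charged
            recent[block] = None
            if len(recent) > cached_blocks:
                recent.popitem(last=False)  # forget the oldest block
    return changes


# Keys that bounce between two table blocks: every visit is charged when
# only one block is remembered, only two when both are remembered.
alternating = [1, 2] * 50
print(clustering_factor(alternating, cached_blocks=1))  # 100
print(clustering_factor(alternating, cached_blocks=2))  # 2

# Randomly scattered keys (like MYID): the count drops as N grows, but slowly.
rng = random.Random(2015)
scattered = [rng.randrange(2_000) for _ in range(20_000)]
for n in (1, 47, 255):
    print(n, clustering_factor(scattered, cached_blocks=n))
```

With data this scattered most visits still land on a block outside the remembered set, which is consistent with the modest reductions seen above.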

The clustering factor strongly influences which index Oracle will choose for a query against a given table. Changing the clustering factor will influence the CBO calculations, sometimes for the better, sometimes for the worse. Plan carefully and test, test, test before second-guessing Oracle.

Which just makes sense.

May 26, 2015

Stop That!

Filed under: General,Performance — dfitzjarrell @ 13:05

"It seems to me that almost everything is a waste of time."
- Norton Juster, The Phantom Tollbooth 

Jonathan Lewis has coined a term and its definition that, I believe, we’ll be using quite a bit:



Stoptimisation - the art of optimisation by not doing the things you don't need to do.

It’s a term we’ve needed for some time now, given the pendulum swing back to the days of yore (meaning pre-9i) with respect to database, application and query tuning. I’m seeing more and more posts in forums and newsgroups asking about tasks that I thought were long since relegated to the ‘gee, that’s really useless’ pile. Let’s look at a few of them and see why you probably shouldn’t be doing them.



Regular Index Rebuilds

Index rebuilds, for the most part, are busy work for a DBA. Yes, they can ‘clean house’ with respect to empty index blocks buried in the index tree but since the index will simply grow again to the size before the rebuild and the empty index blocks will again be scattered across the index tree I see no point in performing regularly scheduled index rebuilds. In fact, for non-partitioned indexes, I see no real reason to rebuild at all since it:



*	Locks the table and index until the rebuild is finished
*	Invalidates cursors using the associated table
*	Serves no useful purpose

Yes, there are times when local indexes on a partitioned table will need to be rebuilt, but those times are usually due to DDL against the partitioned table (dropping partitions, adding/splitting partitions, swapping partitions). The listed DDL actions can make local index partitions unusable, requiring them to be rebuilt. It’s not a ‘size’ thing, it’s not a ‘performance’ thing, it’s because the unusable index interferes with table access and thus production processes. What is troubling is the number of DBAs asking about regularly scheduled index rebuilds based on antiquated ‘criteria’ such as B-tree depth or index size. Even MOS has taken down its original document on when to rebuild an index and replaced it with a more responsible version that better addresses the topic. Still, there are DBAs firmly convinced that scheduling index rebuilds once a week or once a month through cron or DBMS_SCHEDULER must certainly improve the performance of queries and production processes. It doesn’t, and it really just inconveniences the end users while the index rebuilds are in progress. Since most application tables have more than one index in place, rebuilding indexes for a given table can take hours, and lock the table for the entire time, essentially shutting down production for the duration of the rebuilds. I can’t see how that improves performance.



Continually 'Tweaking' Database Parameters

There are DBAs who can’t seem to be satisfied with the performance of the database, and constantly hunt for ‘problem’ areas whether or not they actually affect performance. This is commonly referred to as Compulsive Tuning Disorder (henceforth referred to as CTD). CTD creates an endless cycle of ‘tune this, now tune that, now tune something else’, all based on the most minute deviations in wait statistics. CTD is one of the biggest time wasters for a DBA; it’s like Ponce De Leon’s search for the Fountain Of Youth. Not every wait statistic needs to be perfectly adjusted, and it’s impossible to do in the real world. It all boils down to what the DBA considers performance versus what the end-users consider performance. In the throes of CTD EVERYTHING is a potential ‘problem’ even if it doesn’t affect the end-user experience. On the end-user side a performance problem exists when processes take longer than they expect. Once the end-user problem has been addressed tuning should stop since further adjustments won’t provide any additional benefit. Wait statistics will never be ideal in a production system; multiple users accessing and modifying data will cause concurrency waits, no matter how small, and it’s useless to address such waits to get the response time perfect. CTD turns the most reliable of DBAs into a micro-manager of epic proportions and the efforts expended to ‘tweak’ parameters that have no bearing on the actual system performance are, honestly, useless.



Table Reorganization

As tables get larger and larger, queries accessing data can gradually take longer to complete, notably if the query relies on a full table scan to fetch that data. There are still some DBAs who are convinced that reorganizing the table data will go a long way in improving performance. One of those reorganizations involves sorting the table data to improve the clustering factor. Yes, the clustering factor for an index is calculated based on the ‘location’ of the table data with respect to the sorted index keys, but ‘improving’ the clustering factor for one index usually makes the clustering factor worse for every other index against that table. Another point to make is that these are heap tables, with no real data order established; inserted data can go anywhere there is room in any data block associated with that table. Sorting the data only lasts until the first insert afterwards, after which index keys again become scattered through the table data. Years ago vendors argued that sorting the data was the best way to improve performance, and ‘suggested’ that their product be used on a regular basis to maintain this wonderful sorted order. The problems with that idea are that the data can be sorted for one index built against that table, not all, and that the data will gradually return to its unsorted order as the end-users process data. This becomes, like CTD, a never-ending cycle of ‘sort, sort, sort, sort, sort’ all under the misguided notion that it’s making things better. It can’t, since every time the data is sorted the production system is rendered unavailable. If the end-users can’t work it doesn’t matter how fast the queries run AFTER the sort process has run. Unfortunately, once the DBA has installed such blinders he or she is unaware of the inconvenience such processes create. The end-users need to work, not wait for some DBA to sort a pile of data only to have it return to being an unsorted pile a day or two down the road.

The DBA should be working to create meaningful and lasting changes to improve performance; query tuning, plan stability and index analysis are worthwhile tasks that can produce tangible benefits that last far longer than the tasks it took to create them.
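The point about sorting helping one index at the expense of the others can be shown with a small simulation. This is only a sketch under stated assumptions: the table, its two columns (customer_id and order_seq) and the 100-rows-per-block figure are hypothetical, and the clustering factor is modelled in its classic form, as the number of table block changes seen while walking the rows in index key order.

```python
import random

ROWS_PER_BLOCK = 100  # assumed rows per table block, for illustration only


def clustering_factor(rows, col):
    """Walk the rows in index-key order for column `col` and count how often
    the table block changes. rows[i] is stored in block i // ROWS_PER_BLOCK."""
    key_order = sorted(range(len(rows)), key=lambda i: (rows[i][col], i))
    changes, last_block = 0, None
    for i in key_order:
        block = i // ROWS_PER_BLOCK
        if block != last_block:
            changes += 1
            last_block = block
    return changes


rng = random.Random(42)
# Hypothetical heap table: rows arrive in order_seq order, customer_id is random.
heap = [(rng.randrange(1_000), order_seq) for order_seq in range(10_000)]

before_cust = clustering_factor(heap, 0)
before_seq = clustering_factor(heap, 1)

# "Reorganize" the table by sorting its rows on customer_id.
reorganized = sorted(heap, key=lambda row: row[0])

after_cust = clustering_factor(reorganized, 0)
after_seq = clustering_factor(reorganized, 1)

print("customer_id index:", before_cust, "->", after_cust)
print("order_seq index:  ", before_seq, "->", after_seq)
```

The customer_id index ends up with a clustering factor equal to the number of blocks, while the order_seq index, which started out that way, is now scattered across the table.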

There are probably other time-wasters for the DBA that aren’t listed here; knowing what SHOULD be done versus what CAN be done is the hallmark of a great DBA. Looking again at the term introduced in this post should direct you toward useful tuning tasks that provide long-term benefit and that don’t require constant attention or repeated execution. Nothing is perfect, especially your database, so don’t try to make it so because you’ll only end up on The Endless Tuning Treadmill suffering from Compulsive Tuning Disorder. “Stoptimisation” is what the DBA needs to be concerned with, and that’s not doing the things you don’t need to do. There is plenty of REAL work for the DBA to accomplish day-to-day. Don’t get wrapped up in the minutiae; it’s simply not worth the effort.


"For there's always something to do to keep you from what you really should be doing, ..."
- Norton Juster, The Phantom Tollbooth 

Ain’t that the truth.

March 30, 2015

Sound Advice

Filed under: Performance — dfitzjarrell @ 09:33

"There is much worth noticing that often escapes the eye."
- Norton Juster, The Phantom Tollbooth

Oracle has provided views to assist in managing the SGA and PGA settings when you’re not using Automatic Memory Management (AMM) and a view to manage memory_target when you are using AMM. These views were designed to be fairly easy to interpret but it seems that there has been confusion of late in how to actually use the recommendations Oracle is presenting. Let’s fix that.

Starting with 10gR2 Oracle provided both V$SGA_TARGET_ADVICE and V$PGA_TARGET_ADVICE to assist the DBA in sizing the SGA and PGA based on how the database was using memory. The definitions of these views (as of 11.2.0.4) are shown below:


SQL> desc v$sga_target_advice
 Name                                                                     Null?    Type
 ------------------------------------------------------------------------ -------- -------------------------------------------------
 SGA_SIZE                                                                          NUMBER
 SGA_SIZE_FACTOR                                                                   NUMBER
 ESTD_DB_TIME                                                                      NUMBER
 ESTD_DB_TIME_FACTOR                                                               NUMBER
 ESTD_PHYSICAL_READS                                                               NUMBER

SQL> desc v$pga_target_advice
 Name                                                                     Null?    Type
 ------------------------------------------------------------------------ -------- -------------------------------------------------
 PGA_TARGET_FOR_ESTIMATE                                                           NUMBER
 PGA_TARGET_FACTOR                                                                 NUMBER
 ADVICE_STATUS                                                                     VARCHAR2(3)
 BYTES_PROCESSED                                                                   NUMBER
 ESTD_TIME                                                                         NUMBER
 ESTD_EXTRA_BYTES_RW                                                               NUMBER
 ESTD_PGA_CACHE_HIT_PERCENTAGE                                                     NUMBER
 ESTD_OVERALLOC_COUNT                                                              NUMBER

SQL>

Nothing unusual or esoteric here; however, it appears that interpreting the provided advice can be confusing. Looking first at the contents of V$SGA_TARGET_ADVICE, let’s see what Oracle reports:


SQL> select * From v$sga_target_advice
  2  /

  SGA_SIZE SGA_SIZE_FACTOR ESTD_DB_TIME ESTD_DB_TIME_FACTOR ESTD_PHYSICAL_READS
---------- --------------- ------------ ------------------- -------------------
       520              .5        65793              5.8208             2579079
       780             .75        23911              2.1155             2068005
      1040               1        11303                   1             1134459
      1300            1.25         9162               .8106              970643
      1560             1.5         9098               .8049              970643
      1820            1.75         9058               .8014              970643
      2080               2         9058               .8014              970643

7 rows selected.

SQL>

Based on resource usage since startup this view provides information on SGA memory settings that may improve performance if adopted. The key element in any of these views is the size factor, with a size factor of 1 being the current setting. Looking at the above output it’s possible that physical reads can be reduced a small amount if sga_target is increased from 1040 MB to 1300 MB; any further increase to sga_target will show no improvement. This is proven by the ESTD_PHYSICAL_READS column, the value of which stays constant for an SGA sized at 1300 MB or greater. In addition the ESTD_DB_TIME_FACTOR doesn’t improve much at greater SGA sizes indicating that further increases provide no real benefit. Ideally the goal is to find the setting providing substantial benefit and use it; the problem appears to be ‘where do I stop looking?’ With the above output we stopped looking for improvement at the first entry providing the lowest estimated physical reads. All entries below that provided no additional benefit.
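The ‘where do I stop looking?’ rule used above can be written down in a few lines. The sketch below is Python instead of SQL only so the logic is easy to follow; the rows are copied from the V$SGA_TARGET_ADVICE output shown earlier, and the function names are mine, not anything Oracle provides.

```python
# (sga_size_mb, sga_size_factor, estd_db_time, estd_physical_reads),
# copied from the V$SGA_TARGET_ADVICE output above
SGA_ADVICE = [
    (520, 0.5, 65793, 2579079),
    (780, 0.75, 23911, 2068005),
    (1040, 1.0, 11303, 1134459),
    (1300, 1.25, 9162, 970643),
    (1560, 1.5, 9098, 970643),
    (1820, 1.75, 9058, 970643),
    (2080, 2.0, 9058, 970643),
]


def current_size(rows):
    """The current setting is the row whose size factor is 1."""
    return next(size for size, factor, _, _ in rows if factor == 1.0)


def smallest_size_with_fewest_reads(rows):
    """First (smallest) SGA size at which estimated physical reads bottom out."""
    fewest = min(reads for _, _, _, reads in rows)
    return min(size for size, _, _, reads in rows if reads == fewest)


print(current_size(SGA_ADVICE))                     # 1040
print(smallest_size_with_fewest_reads(SGA_ADVICE))  # 1300
```

Physical reads are used as the stopping criterion here because that is the column the discussion above relies on; the same approach works with ESTD_DB_TIME if a tolerance is added for the small differences in the later rows.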

If sga_target and sga_max_size are set differently (to allow some degree of dynamic SGA management) it may not be necessary to restart the database since sga_target may be able to be set without adjusting sga_max_size. This view may, however, provide an SGA setting that requires both sga_target and sga_max_size be adjusted, and in that case the database will need to be restarted.

In a similar fashion the V$PGA_TARGET_ADVICE view can also be used to ‘fine tune’ overall PGA settings for the database:


SQL> select * From v$pga_target_advice;

PGA_TARGET_FOR_ESTIMATE PGA_TARGET_FACTOR ADV BYTES_PROCESSED  ESTD_TIME ESTD_EXTRA_BYTES_RW ESTD_PGA_CACHE_HIT_PERCENTAGE ESTD_OVERALLOC_COUNT
----------------------- ----------------- --- --------------- ---------- ------------------- ----------------------------- --------------------
               73400320              .125 ON       7.4905E+10   37173880          1395450880                            98                   88
              146800640               .25 ON       7.4905E+10   36716537           456747008                            99                    6
              293601280                .5 ON       7.4905E+10   36643991           307845120                           100                    0
              440401920               .75 ON       7.4905E+10   36494007                   0                           100                    0
              587202560                 1 ON       7.4905E+10   36494007                   0                           100                    0
              704643072               1.2 ON       7.4905E+10   36494007                   0                           100                    0
              822083584               1.4 ON       7.4905E+10   36494007                   0                           100                    0
              939524096               1.6 ON       7.4905E+10   36494007                   0                           100                    0
             1056964608               1.8 ON       7.4905E+10   36494007                   0                           100                    0
             1174405120                 2 ON       7.4905E+10   36494007                   0                           100                    0
             1761607680                 3 ON       7.4905E+10   36494007                   0                           100                    0

PGA_TARGET_FOR_ESTIMATE PGA_TARGET_FACTOR ADV BYTES_PROCESSED  ESTD_TIME ESTD_EXTRA_BYTES_RW ESTD_PGA_CACHE_HIT_PERCENTAGE ESTD_OVERALLOC_COUNT
----------------------- ----------------- --- --------------- ---------- ------------------- ----------------------------- --------------------
             2348810240                 4 ON       7.4905E+10   36494007                   0                           100                    0
             3523215360                 6 ON       7.4905E+10   36494007                   0                           100                    0
             4697620480                 8 ON       7.4905E+10   36494007                   0                           100                    0

14 rows selected.

SQL>

Again it’s the target factor that identifies the current setting; working toward larger target factors may show improvement in the PGA cache hit percentage, which means processes will see more optimized (one-pass) reads. Like the V$SGA_TARGET_ADVICE view, the method is to find a setting that provides improvement. In this example the PGA setting is already optimal (the cache hit percentage is 100 and 0 extra bytes will need to be read). Not every database will be so lucky; this view is provided to allow the DBA to make as few changes as possible to achieve improved performance and throughput, eliminating the need for trial and error PGA management. Because pga_aggregate_target is a dynamic parameter, a new value can be put in place with ALTER SYSTEM; no database restart is needed for this change.
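The check that the current PGA setting is already adequate can be sketched the same way. The rows below are a subset copied from the V$PGA_TARGET_ADVICE output above; the criteria (100 percent cache hit, no extra bytes read or written, no over-allocation) follow the discussion, while the function names are assumptions of mine.

```python
# (pga_target_bytes, pga_target_factor, estd_extra_bytes_rw,
#  estd_pga_cache_hit_percentage, estd_overalloc_count),
# a subset of the V$PGA_TARGET_ADVICE rows shown above
PGA_ADVICE = [
    (73400320, 0.125, 1395450880, 98, 88),
    (146800640, 0.25, 456747008, 99, 6),
    (293601280, 0.5, 307845120, 100, 0),
    (440401920, 0.75, 0, 100, 0),
    (587202560, 1.0, 0, 100, 0),
    (704643072, 1.2, 0, 100, 0),
    (1174405120, 2.0, 0, 100, 0),
]


def is_adequate(row):
    """No extra I/O, full cache hit percentage, no over-allocation."""
    _, _, extra_bytes, hit_pct, overalloc = row
    return extra_bytes == 0 and hit_pct == 100 and overalloc == 0


def current_row(rows):
    """The current setting is the row whose target factor is 1."""
    return next(r for r in rows if r[1] == 1.0)


def smallest_adequate_target(rows):
    """Smallest PGA target (bytes) that already meets all three criteria."""
    return min(r[0] for r in rows if is_adequate(r))


print(is_adequate(current_row(PGA_ADVICE)))              # True
print(smallest_adequate_target(PGA_ADVICE) // 1048576)   # 420 (MB)
```

The second figure is only what the estimates say for the workload seen since startup; it is not a recommendation to shrink the setting.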

To restate the methodology of using these two views:


1.	Find the row referencing the current setting; this will be the row where the size factor is equal to 1
2.	Scan the data for the entry just before the improvement estimates no longer change; this will be the setting
	Oracle has calculated to provide the most benefit
3.	Implement that setting, restarting the database, if necessary, to effect the change

In a similar fashion when using Automatic Memory Management, by setting memory_target and memory_max_target, the V$MEMORY_TARGET_ADVICE view can be used to tune the overall memory settings. Output from this view looks like this:


SQL> select * From v$memory_target_advice;

MEMORY_SIZE MEMORY_SIZE_FACTOR ESTD_DB_TIME ESTD_DB_TIME_FACTOR    VERSION     CON_ID
----------- ------------------ ------------ ------------------- ---------- ----------
        392                .25       143586              1.0063          0          0
        784                 .5       142687                   1          0          0
       1176                .75       142687                   1          0          0
       1568                  1       142687                   1          0          0
       1960               1.25       142687                   1          0          0
       2352                1.5       142687                   1          0          0
       2744               1.75       142687                   1          0          0
       3136                  2       142687                   1          0          0

8 rows selected.

SQL>

The same methodology is used here, and if memory_max_target is larger than the suggested memory_target value the setting can be changed without restarting the database. Like the SGA settings, if memory_max_target also needs to be adjusted this will require a database restart. Note that on Linux AMM cannot be used together with hugepages.
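The restart question comes down to one comparison: a target can be raised online only as far as its ceiling (sga_max_size for sga_target, memory_max_target for memory_target). A minimal sketch follows; the ceiling values are hypothetical and chosen only to show both outcomes, and the suggested target of 1300 MB is the one taken from the SGA advice discussed earlier.

```python
def restart_needed(suggested_target_mb, ceiling_mb):
    """True when the suggested target exceeds the ceiling, so the ceiling
    itself must be raised first, which takes a database restart."""
    return suggested_target_mb > ceiling_mb


# Suggested sga_target of 1300 MB with a hypothetical sga_max_size of 1536 MB:
print(restart_needed(1300, 1536))  # False, sga_target can be changed online

# Same suggestion with a hypothetical sga_max_size of 1040 MB:
print(restart_needed(1300, 1040))  # True, sga_max_size must be raised first
```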

Managing database memory settings doesn’t need to be a gruelling task of ‘try this, see if it works, it doesn’t so try another setting and repeat’. Using the provided memory advisors can make the DBA’s job much easier. Hopefully this has made that task easier, freeing the DBA to address more pressing issues.

Sometimes it’s good to take things under advisement.

March 12, 2015

Extra-cost Option?

Filed under: General,Performance — dfitzjarrell @ 09:14

"Just because you have a choice, it doesn't mean that any of them 'has' to be right."
- Norton Juster, The Phantom Tollbooth

In an Oracle forum that I frequent a user posted that he found cardinality values to be severely inflated when using an analytic function. This user posted an example that will be used here to demonstrate the problem. It will also be used to demonstrate another issue that can make this problem even more confusing, but the latter issue is one that can easily be fixed.

The example script contains the following code, which includes statements not executed by the user posting the problem (creating a new user to run the example from, using an additional statistics gathering method to verify the method wasn’t part of the cause and modifying the dynamic sampling setting), but which have no effect on the results generated:



connect bing/#############

spool wrong_card_ex.log
select * From v$version;

set echo on linesize 132

create table t1
as
select * From all_objects;

exec dbms_stats.gather_table_stats(user, 'T1')

EXPLAIN PLAN FOR
SELECT object_type
     , object_name
FROM (
	  SELECT object_type
	       , object_name
	  FROM   t1
)
WHERE object_name = 'DUAL';
SELECT * FROM TABLE(dbms_xplan.display);

set autotrace on

select object_type, object_name
from
(select object_type, object_name from t1)
where object_name = 'DUAL';

select object_type, object_name, object_rank
from
(select object_type, object_name, row_number() over (partition by object_type order by created) object_rank from t1)
where object_name = 'DUAL';

set autotrace off

connect bing/#############

set echo on linesize 132

alter session set optimizer_dynamic_sampling=11;

EXPLAIN PLAN FOR
SELECT object_type
     , object_name
     , object_rank
FROM (
	  SELECT object_type
	       , object_name
	       , ROW_NUMBER() OVER (PARTITION BY object_type ORDER BY created) object_rank
	  FROM   t1
)
WHERE object_name = 'DUAL';
SELECT * FROM TABLE(dbms_xplan.display);

set autotrace on

select object_type, object_name
from
(select object_type, object_name from t1)
where object_name = 'DUAL';

select object_type, object_name, object_rank
from
(select object_type, object_name, row_number() over (partition by object_type order by created) object_rank from t1)
where object_name = 'DUAL';

set autotrace off

exec dbms_stats.gather_schema_stats(user)

EXPLAIN PLAN FOR
SELECT object_type
     , object_name
FROM (
	  SELECT object_type
	       , object_name
	  FROM   t1
)
WHERE object_name = 'DUAL';
SELECT * FROM TABLE(dbms_xplan.display);

set autotrace on

select object_type, object_name
from
(select object_type, object_name from t1)
where object_name = 'DUAL';

select object_type, object_name, object_rank
from
(select object_type, object_name, row_number() over (partition by object_type order by created) object_rank from t1)
where object_name = 'DUAL';

set autotrace off

connect bing/#############

set echo on linesize 132

alter session set optimizer_dynamic_sampling=11;

EXPLAIN PLAN FOR
SELECT object_type
     , object_name
     , object_rank
FROM (
	  SELECT object_type
	       , object_name
	       , ROW_NUMBER() OVER (PARTITION BY object_type ORDER BY created) object_rank
	  FROM   t1
)
WHERE object_name = 'DUAL';
SELECT * FROM TABLE(dbms_xplan.display);

set autotrace on

select object_type, object_name
from
(select object_type, object_name from t1)
where object_name = 'DUAL';

select object_type, object_name, object_rank
from
(select object_type, object_name, row_number() over (partition by object_type order by created) object_rank from t1)
where object_name = 'DUAL';

set autotrace off

spool off

Nothing unusual there, really, other than the second run of the code minus the table creation. That was done to test whether gathering schema stats over gathering table stats made a difference in the cardinality numbers; it did not. Let’s look at a run of this script in an 11.2.0.4 database and see what cardinalities Oracle generates:


BANNER                                                                                                                              
--------------------------------------------------------------------------------                                                    
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production                                                        
PL/SQL Release 11.2.0.4.0 - Production                                                                                              
CORE	11.2.0.4.0	Production                                                                                                          
TNS for 64-bit Windows: Version 11.2.0.4.0 - Production                                                                             
NLSRTL Version 11.2.0.4.0 - Production                                                                                              

SQL> 
SQL> create table t1
  2  as
  3  select * From all_objects;

Table created.

SQL> 
SQL> exec dbms_stats.gather_table_stats(user, 'T1')

PL/SQL procedure successfully completed.

SQL> 
SQL> EXPLAIN PLAN FOR
  2  SELECT object_type
  3  	  , object_name
  4  FROM (
  5  	       SELECT object_type
  6  		    , object_name
  7  	       FROM   t1
  8  )
  9  WHERE object_name = 'DUAL';

Explained.

SQL> SELECT * FROM TABLE(dbms_xplan.display);

PLAN_TABLE_OUTPUT                                                                                                                   
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3617692013                                                                                                         
                                                                                                                                    
--------------------------------------------------------------------------                                                          
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |                                                          
--------------------------------------------------------------------------                                                          
|   0 | SELECT STATEMENT  |      |     2 |    72 |   274   (1)| 00:00:04 |                                                          
|*  1 |  TABLE ACCESS FULL| T1   |     2 |    72 |   274   (1)| 00:00:04 |                                                          
--------------------------------------------------------------------------                                                          
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 

PLAN_TABLE_OUTPUT                                                                                                                   
------------------------------------------------------------------------------------------------------------------------------------
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 

13 rows selected.

SQL> 
SQL> set autotrace on
SQL> 
SQL> select object_type, object_name
  2  from
  3  (select object_type, object_name from t1)
  4  where object_name = 'DUAL';

OBJECT_TYPE         OBJECT_NAME                                                                                                     
------------------- ------------------------------                                                                                  
TABLE               DUAL                                                                                                            
SYNONYM             DUAL                                                                                                            


Execution Plan
----------------------------------------------------------                                                                          
Plan hash value: 3617692013                                                                                                         
                                                                                                                                    
--------------------------------------------------------------------------                                                          
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |                                                          
--------------------------------------------------------------------------                                                          
|   0 | SELECT STATEMENT  |      |     2 |    72 |   274   (1)| 00:00:04 |                                                          
|*  1 |  TABLE ACCESS FULL| T1   |     2 |    72 |   274   (1)| 00:00:04 |                                                          
--------------------------------------------------------------------------                                                          
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 


Statistics
----------------------------------------------------------                                                                          
          1  recursive calls                                                                                                        
          0  db block gets                                                                                                          
        983  consistent gets                                                                                                        
          0  physical reads                                                                                                         
          0  redo size                                                                                                              
        674  bytes sent via SQL*Net to client                                                                                       
        499  bytes received via SQL*Net from client                                                                                 
          2  SQL*Net roundtrips to/from client                                                                                      
          0  sorts (memory)                                                                                                         
          0  sorts (disk)                                                                                                           
          2  rows processed                                                                                                         

SQL> 
SQL> select object_type, object_name, object_rank
  2  from
  3  (select object_type, object_name, row_number() over (partition by object_type order by created) object_rank from t1)
  4  where object_name = 'DUAL';

OBJECT_TYPE         OBJECT_NAME                    OBJECT_RANK                                                                      
------------------- ------------------------------ -----------                                                                      
SYNONYM             DUAL                                     1                                                                      
TABLE               DUAL                                    43                                                                      


Execution Plan
----------------------------------------------------------                                                                          
Plan hash value: 2273146475                                                                                                         
                                                                                                                                    
------------------------------------------------------------------------------------                                                
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |                                                
------------------------------------------------------------------------------------                                                
|   0 | SELECT STATEMENT    |      | 68395 |  2738K|       |  1042   (1)| 00:00:13 |                                                
|*  1 |  VIEW               |      | 68395 |  2738K|       |  1042   (1)| 00:00:13 |                                                
|   2 |   WINDOW SORT       |      | 68395 |  2938K|  3784K|  1042   (1)| 00:00:13 |                                                
|   3 |    TABLE ACCESS FULL| T1   | 68395 |  2938K|       |   274   (1)| 00:00:04 |                                                
------------------------------------------------------------------------------------                                                
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 


Statistics
----------------------------------------------------------                                                                          
          1  recursive calls                                                                                                        
          0  db block gets                                                                                                          
        982  consistent gets                                                                                                        
          0  physical reads                                                                                                         
          0  redo size                                                                                                              
        753  bytes sent via SQL*Net to client                                                                                       
        499  bytes received via SQL*Net from client                                                                                 
          2  SQL*Net roundtrips to/from client                                                                                      
          1  sorts (memory)                                                                                                         
          0  sorts (disk)                                                                                                           
          2  rows processed                                                                                                         

SQL> 
SQL> set autotrace off
SQL> 
SQL> connect bing/##########
Connected.
SQL> 
SQL> set echo on linesize 132
SQL> 
SQL> alter session set optimizer_dynamic_sampling=11;

Session altered.

SQL> 
SQL> EXPLAIN PLAN FOR
  2  SELECT object_type
  3  	  , object_name
  4  	  , object_rank
  5  FROM (
  6  	       SELECT object_type
  7  		    , object_name
  8  		    , ROW_NUMBER() OVER (PARTITION BY object_type ORDER BY created) object_rank
  9  	       FROM   t1
 10  )
 11  WHERE object_name = 'DUAL';

Explained.

SQL> SELECT * FROM TABLE(dbms_xplan.display);

PLAN_TABLE_OUTPUT                                                                                                                   
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2273146475                                                                                                         
                                                                                                                                    
------------------------------------------------------------------------------------                                                
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |                                                
------------------------------------------------------------------------------------                                                
|   0 | SELECT STATEMENT    |      | 68395 |  2738K|       |  1042   (1)| 00:00:13 |                                                
|*  1 |  VIEW               |      | 68395 |  2738K|       |  1042   (1)| 00:00:13 |                                                
|   2 |   WINDOW SORT       |      | 68395 |  2938K|  3784K|  1042   (1)| 00:00:13 |                                                
|   3 |    TABLE ACCESS FULL| T1   | 68395 |  2938K|       |   274   (1)| 00:00:04 |                                                
------------------------------------------------------------------------------------                                                
                                                                                                                                    

PLAN_TABLE_OUTPUT                                                                                                                   
------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            

19 rows selected.

SQL> 
SQL> set autotrace on
SQL> 
SQL> select object_type, object_name
  2  from
  3  (select object_type, object_name from t1)
  4  where object_name = 'DUAL';

OBJECT_TYPE         OBJECT_NAME                                                                                                     
------------------- ------------------------------                                                                                  
TABLE               DUAL                                                                                                            
SYNONYM             DUAL                                                                                                            


Execution Plan
----------------------------------------------------------                                                                          
Plan hash value: 3617692013                                                                                                         
                                                                                                                                    
--------------------------------------------------------------------------                                                          
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |                                                          
--------------------------------------------------------------------------                                                          
|   0 | SELECT STATEMENT  |      |     2 |    72 |   274   (1)| 00:00:04 |                                                          
|*  1 |  TABLE ACCESS FULL| T1   |     2 |    72 |   274   (1)| 00:00:04 |                                                          
--------------------------------------------------------------------------                                                          
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            


Statistics
----------------------------------------------------------                                                                          
         12  recursive calls                                                                                                        
          0  db block gets                                                                                                          
       1788  consistent gets                                                                                                        
          1  physical reads                                                                                                         
          0  redo size                                                                                                              
        674  bytes sent via SQL*Net to client                                                                                       
        500  bytes received via SQL*Net from client                                                                                 
          2  SQL*Net roundtrips to/from client                                                                                      
          0  sorts (memory)                                                                                                         
          0  sorts (disk)                                                                                                           
          2  rows processed                                                                                                         

SQL> 

So far, so good, as the estimated cardinality (2) accurately reflects the actual number of rows for DUAL in the ALL_OBJECTS view. Let’s add the ROW_NUMBER() analytic function and see what changes:


SQL> select object_type, object_name, object_rank
  2  from
  3  (select object_type, object_name, row_number() over (partition by object_type order by created) object_rank from t1)
  4  where object_name = 'DUAL';

OBJECT_TYPE         OBJECT_NAME                    OBJECT_RANK                                                                      
------------------- ------------------------------ -----------                                                                      
SYNONYM             DUAL                                     1                                                                      
TABLE               DUAL                                    43                                                                      


Execution Plan
----------------------------------------------------------                                                                          
Plan hash value: 2273146475                                                                                                         
                                                                                                                                    
------------------------------------------------------------------------------------                                                
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |                                                
------------------------------------------------------------------------------------                                                
|   0 | SELECT STATEMENT    |      | 68395 |  2738K|       |  1042   (1)| 00:00:13 |                                                
|*  1 |  VIEW               |      | 68395 |  2738K|       |  1042   (1)| 00:00:13 |                                                
|   2 |   WINDOW SORT       |      | 68395 |  2938K|  3784K|  1042   (1)| 00:00:13 |                                                
|   3 |    TABLE ACCESS FULL| T1   | 68395 |  2938K|       |   274   (1)| 00:00:04 |                                                
------------------------------------------------------------------------------------                                                
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            


Statistics
----------------------------------------------------------                                                                          
          7  recursive calls                                                                                                        
          0  db block gets                                                                                                          
        989  consistent gets                                                                                                        
          0  physical reads                                                                                                         
          0  redo size                                                                                                              
        753  bytes sent via SQL*Net to client                                                                                       
        500  bytes received via SQL*Net from client                                                                                 
          2  SQL*Net roundtrips to/from client                                                                                      
          1  sorts (memory)                                                                                                         
          0  sorts (disk)                                                                                                           
          2  rows processed                                                                                                         

SQL> 
SQL> set autotrace off
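
Autotrace reports the optimizer's estimates in the plan and the actual row count only in the statistics, so the two have to be compared by eye. One way to put them side by side, offered here as a sketch rather than as part of the original test run, is to execute the query with rowsource statistics enabled and then pull the plan from the cursor cache. This assumes autotrace is off, that the connected user can read the V$ views DBMS_XPLAN.DISPLAY_CURSOR relies on, and that serveroutput is disabled so the 'last statement' really is the query of interest:

```sql
-- Sketch only: assumes SELECT access on V$SESSION, V$SQL, V$SQL_PLAN and
-- V$SQL_PLAN_STATISTICS_ALL for the connected user.
-- serveroutput must be off, otherwise the last cursor seen by DISPLAY_CURSOR
-- is the DBMS_OUTPUT call SQL*Plus issues, not the query.
set serveroutput off

-- The hint collects actual row counts for each plan step in this execution.
select /*+ gather_plan_statistics */ object_type, object_name, object_rank
from
(select object_type, object_name, row_number() over (partition by object_type order by created) object_rank from t1)
where object_name = 'DUAL';

-- NULL, NULL means 'the last statement executed in this session';
-- ALLSTATS LAST shows E-Rows (estimated) next to A-Rows (actual).
select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
```

In the resulting plan the E-Rows and A-Rows columns for the VIEW step are the ones to compare; a large gap between them points at the step where the estimate goes wrong.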

Interestingly (to put it mildly), the estimated cardinality has jumped from 2 to 68,395, over 30,000 times its original value, even though the query still returns just two rows. Maybe it’s the way the stats were gathered, so another run is made, this time gathering stats for the entire schema (fortunately it contains only this one table):


SQL> 
SQL> exec dbms_stats.gather_schema_stats(user)

PL/SQL procedure successfully completed.

SQL> 
SQL> EXPLAIN PLAN FOR
  2  SELECT object_type
  3  	  , object_name
  4  FROM (
  5  	       SELECT object_type
  6  		    , object_name
  7  	       FROM   t1
  8  )
  9  WHERE object_name = 'DUAL';

Explained.

SQL> SELECT * FROM TABLE(dbms_xplan.display);

PLAN_TABLE_OUTPUT                                                                                                                   
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3617692013                                                                                                         
                                                                                                                                    
--------------------------------------------------------------------------                                                          
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |                                                          
--------------------------------------------------------------------------                                                          
|   0 | SELECT STATEMENT  |      |     2 |    72 |   274   (1)| 00:00:04 |                                                          
|*  1 |  TABLE ACCESS FULL| T1   |     2 |    72 |   274   (1)| 00:00:04 |                                                          
--------------------------------------------------------------------------                                                          
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 

PLAN_TABLE_OUTPUT                                                                                                                   
------------------------------------------------------------------------------------------------------------------------------------
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            

17 rows selected.

SQL> 
SQL> set autotrace on
SQL> 
SQL> select object_type, object_name
  2  from
  3  (select object_type, object_name from t1)
  4  where object_name = 'DUAL';

OBJECT_TYPE         OBJECT_NAME                                                                                                     
------------------- ------------------------------                                                                                  
TABLE               DUAL                                                                                                            
SYNONYM             DUAL                                                                                                            


Execution Plan
----------------------------------------------------------                                                                          
Plan hash value: 3617692013                                                                                                         
                                                                                                                                    
--------------------------------------------------------------------------                                                          
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |                                                          
--------------------------------------------------------------------------                                                          
|   0 | SELECT STATEMENT  |      |     2 |    72 |   274   (1)| 00:00:04 |                                                          
|*  1 |  TABLE ACCESS FULL| T1   |     2 |    72 |   274   (1)| 00:00:04 |                                                          
--------------------------------------------------------------------------                                                          
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            


Statistics
----------------------------------------------------------                                                                          
          0  recursive calls                                                                                                        
          0  db block gets                                                                                                          
        983  consistent gets                                                                                                        
          0  physical reads                                                                                                         
          0  redo size                                                                                                              
        674  bytes sent via SQL*Net to client                                                                                       
        500  bytes received via SQL*Net from client                                                                                 
          2  SQL*Net roundtrips to/from client                                                                                      
          0  sorts (memory)                                                                                                         
          0  sorts (disk)                                                                                                           
          2  rows processed                                                                                                         

SQL> 
SQL> select object_type, object_name, object_rank
  2  from
  3  (select object_type, object_name, row_number() over (partition by object_type order by created) object_rank from t1)
  4  where object_name = 'DUAL';

OBJECT_TYPE         OBJECT_NAME                    OBJECT_RANK                                                                      
------------------- ------------------------------ -----------                                                                      
SYNONYM             DUAL                                     1                                                                      
TABLE               DUAL                                    43                                                                      


Execution Plan
----------------------------------------------------------                                                                          
Plan hash value: 2273146475                                                                                                         
                                                                                                                                    
------------------------------------------------------------------------------------                                                
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |                                                
------------------------------------------------------------------------------------                                                
|   0 | SELECT STATEMENT    |      | 68395 |  2738K|       |  1042   (1)| 00:00:13 |                                                
|*  1 |  VIEW               |      | 68395 |  2738K|       |  1042   (1)| 00:00:13 |                                                
|   2 |   WINDOW SORT       |      | 68395 |  2938K|  3784K|  1042   (1)| 00:00:13 |                                                
|   3 |    TABLE ACCESS FULL| T1   | 68395 |  2938K|       |   274   (1)| 00:00:04 |                                                
------------------------------------------------------------------------------------                                                
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            


Statistics
----------------------------------------------------------                                                                          
          0  recursive calls                                                                                                        
          0  db block gets                                                                                                          
        982  consistent gets                                                                                                        
          0  physical reads                                                                                                         
          0  redo size                                                                                                              
        753  bytes sent via SQL*Net to client                                                                                       
        500  bytes received via SQL*Net from client                                                                                 
          2  SQL*Net roundtrips to/from client                                                                                      
          1  sorts (memory)                                                                                                         
          0  sorts (disk)                                                                                                           
          2  rows processed                                                                                                         

SQL> 
SQL> set autotrace off
SQL> 
SQL> connect bing/##########
Connected.
SQL> 
SQL> set echo on linesize 132
SQL> 
SQL> alter session set optimizer_dynamic_sampling=11;

Session altered.

SQL> 
SQL> EXPLAIN PLAN FOR
  2  SELECT object_type
  3  	  , object_name
  4  	  , object_rank
  5  FROM (
  6  	       SELECT object_type
  7  		    , object_name
  8  		    , ROW_NUMBER() OVER (PARTITION BY object_type ORDER BY created) object_rank
  9  	       FROM   t1
 10  )
 11  WHERE object_name = 'DUAL';

Explained.

SQL> SELECT * FROM TABLE(dbms_xplan.display);

PLAN_TABLE_OUTPUT                                                                                                                   
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2273146475                                                                                                         
                                                                                                                                    
------------------------------------------------------------------------------------                                                
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |                                                
------------------------------------------------------------------------------------                                                
|   0 | SELECT STATEMENT    |      | 68395 |  2738K|       |  1042   (1)| 00:00:13 |                                                
|*  1 |  VIEW               |      | 68395 |  2738K|       |  1042   (1)| 00:00:13 |                                                
|   2 |   WINDOW SORT       |      | 68395 |  2938K|  3784K|  1042   (1)| 00:00:13 |                                                
|   3 |    TABLE ACCESS FULL| T1   | 68395 |  2938K|       |   274   (1)| 00:00:04 |                                                
------------------------------------------------------------------------------------                                                
                                                                                                                                    

PLAN_TABLE_OUTPUT                                                                                                                   
------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            

19 rows selected.

SQL> 
SQL> set autotrace on
SQL> 
SQL> select object_type, object_name
  2  from
  3  (select object_type, object_name from t1)
  4  where object_name = 'DUAL';

OBJECT_TYPE         OBJECT_NAME                                                                                                     
------------------- ------------------------------                                                                                  
TABLE               DUAL                                                                                                            
SYNONYM             DUAL                                                                                                            


Execution Plan
----------------------------------------------------------                                                                          
Plan hash value: 3617692013                                                                                                         
                                                                                                                                    
--------------------------------------------------------------------------                                                          
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |                                                          
--------------------------------------------------------------------------                                                          
|   0 | SELECT STATEMENT  |      |     2 |    72 |   274   (1)| 00:00:04 |                                                          
|*  1 |  TABLE ACCESS FULL| T1   |     2 |    72 |   274   (1)| 00:00:04 |                                                          
--------------------------------------------------------------------------                                                          
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            


Statistics
----------------------------------------------------------                                                                          
          0  recursive calls                                                                                                        
          0  db block gets                                                                                                          
        983  consistent gets                                                                                                        
          0  physical reads                                                                                                         
          0  redo size                                                                                                              
        674  bytes sent via SQL*Net to client                                                                                       
        499  bytes received via SQL*Net from client                                                                                 
          2  SQL*Net roundtrips to/from client                                                                                      
          0  sorts (memory)                                                                                                         
          0  sorts (disk)                                                                                                           
          2  rows processed                                                                                                         

SQL> 
SQL> select object_type, object_name, object_rank
  2  from
  3  (select object_type, object_name, row_number() over (partition by object_type order by created) object_rank from t1)
  4  where object_name = 'DUAL';

OBJECT_TYPE         OBJECT_NAME                    OBJECT_RANK                                                                      
------------------- ------------------------------ -----------                                                                      
SYNONYM             DUAL                                     1                                                                      
TABLE               DUAL                                    43                                                                      


Execution Plan
----------------------------------------------------------                                                                          
Plan hash value: 2273146475                                                                                                         
                                                                                                                                    
------------------------------------------------------------------------------------                                                
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |                                                
------------------------------------------------------------------------------------                                                
|   0 | SELECT STATEMENT    |      | 68395 |  2738K|       |  1042   (1)| 00:00:13 |                                                
|*  1 |  VIEW               |      | 68395 |  2738K|       |  1042   (1)| 00:00:13 |                                                
|   2 |   WINDOW SORT       |      | 68395 |  2938K|  3784K|  1042   (1)| 00:00:13 |                                                
|   3 |    TABLE ACCESS FULL| T1   | 68395 |  2938K|       |   274   (1)| 00:00:04 |                                                
------------------------------------------------------------------------------------                                                
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            


Statistics
----------------------------------------------------------                                                                          
          0  recursive calls                                                                                                        
          0  db block gets                                                                                                          
        982  consistent gets                                                                                                        
          0  physical reads                                                                                                         
          0  redo size                                                                                                              
        753  bytes sent via SQL*Net to client                                                                                       
        499  bytes received via SQL*Net from client                                                                                 
          2  SQL*Net roundtrips to/from client                                                                                      
          1  sorts (memory)                                                                                                         
          0  sorts (disk)                                                                                                           
          2  rows processed                                                                                                         

SQL> 
SQL> set autotrace off
SQL> 
SQL> spool off

In both attempts (gathering table stats and gathering schema stats) the dynamic sampling setting was changed from 2 (the default for this database) to 11, and both autotrace and dbms_xplan were used to generate the execution plans. To clear out the environment set by autotrace, the user reconnected before running any further SQL statements. The generated plans were the same with either method, so autotrace had no ‘adverse’ effect on the outcome. Oracle has recently classified this behavior as a bug:


Bug 20591362 - WRONG CARDINALITY WHEN ANALYTICS FUNCTION IS USED

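which can be found on MOS. In a nutshell, the CBO ignores the selectivity of the WHERE clause when an analytic function is introduced into a query and, as a result, computes an incorrect cardinality: the estimate on the filtered VIEW step stays at the full row count of the table instead of dropping to the handful of rows the predicate actually returns.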
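
One way to see the mis-estimate directly is to put the optimizer's estimated row counts next to the actual ones. The following is a minimal sketch, assuming the T1 table built in this post, autotrace turned off, and a session that can read the V$ views that dbms_xplan.display_cursor relies on. The gather_plan_statistics hint and the 'ALLSTATS LAST' format are standard, but the numbers on your system will differ from the plans shown here.

```sql
-- Sketch only: assumes table T1 from this post (a copy of ALL_OBJECTS)
-- and SELECT access to V$SQL, V$SQL_PLAN and V$SQL_PLAN_STATISTICS_ALL.

-- With serveroutput on, SQL*Plus issues a dbms_output call after each
-- statement, and display_cursor would report on that call instead.
set serveroutput off
set linesize 132

-- The hint asks Oracle to collect row-source execution statistics
-- for this one statement.
select /*+ gather_plan_statistics */
       object_type, object_name, object_rank
from  (select object_type, object_name,
              row_number() over (partition by object_type order by created) object_rank
       from   t1)
where  object_name = 'DUAL';

-- NULL, NULL means "the last statement executed by this session";
-- ALLSTATS LAST adds A-Rows (actual) beside E-Rows (estimated).
select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
```

In the resulting plan, a VIEW line whose E-Rows matches the full table row count while A-Rows is 2 would be consistent with the behavior described in the bug.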

On a side note, if your 11.2.0.x installation does not include Oracle JVM you can run afoul of even stranger numbers:


BANNER                                                                                                                              
--------------------------------------------------------------------------------                                                    
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production                                                        
PL/SQL Release 11.2.0.4.0 - Production                                                                                              
CORE	11.2.0.4.0	Production                                                                                                          
TNS for 64-bit Windows: Version 11.2.0.4.0 - Production                                                                             
NLSRTL Version 11.2.0.4.0 - Production                                                                                              

SQL> 
SQL> create table t1
  2  as
  3  select * From all_objects;

Table created.

SQL> 
SQL> exec dbms_stats.gather_table_stats(user, 'T1')

PL/SQL procedure successfully completed.

SQL> 
SQL> EXPLAIN PLAN FOR
  2  SELECT object_type
  3  	  , object_name
  4  FROM (
  5  	       SELECT object_type
  6  		    , object_name
  7  	       FROM   t1
  8  )
  9  WHERE object_name = 'DUAL';

Explained.

SQL> SELECT * FROM TABLE(dbms_xplan.display);

PLAN_TABLE_OUTPUT                                                                                                                   
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3617692013                                                                                                         
                                                                                                                                    
--------------------------------------------------------------------------                                                          
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |                                                          
--------------------------------------------------------------------------                                                          
|   0 | SELECT STATEMENT  |      |     1 |    26 |    56   (0)| 00:00:01 |                                                          
|*  1 |  TABLE ACCESS FULL| T1   |     1 |    26 |    56   (0)| 00:00:01 |                                                          
--------------------------------------------------------------------------                                                          
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 

PLAN_TABLE_OUTPUT                                                                                                                   
------------------------------------------------------------------------------------------------------------------------------------
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 

13 rows selected.

SQL> 
SQL> set autotrace on
SQL> 
SQL> select object_type, object_name
  2  from
  3  (select object_type, object_name from t1)
  4  where object_name = 'DUAL';

OBJECT_TYPE         OBJECT_NAME                                                                                                     
------------------- ------------------------------                                                                                  
TABLE               DUAL                                                                                                            
SYNONYM             DUAL                                                                                                            


Execution Plan
----------------------------------------------------------                                                                          
Plan hash value: 3617692013                                                                                                         
                                                                                                                                    
--------------------------------------------------------------------------                                                          
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |                                                          
--------------------------------------------------------------------------                                                          
|   0 | SELECT STATEMENT  |      |     1 |    26 |    56   (0)| 00:00:01 |                                                          
|*  1 |  TABLE ACCESS FULL| T1   |     1 |    26 |    56   (0)| 00:00:01 |                                                          
--------------------------------------------------------------------------                                                          
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 


Statistics
----------------------------------------------------------                                                                          
          1  recursive calls                                                                                                        
          0  db block gets                                                                                                          
        191  consistent gets                                                                                                        
          0  physical reads                                                                                                         
          0  redo size                                                                                                              
        674  bytes sent via SQL*Net to client                                                                                       
        500  bytes received via SQL*Net from client                                                                                 
          2  SQL*Net roundtrips to/from client                                                                                      
          0  sorts (memory)                                                                                                         
          0  sorts (disk)                                                                                                           
          2  rows processed                                                                                                         

SQL> 
SQL> select object_type, object_name, object_rank
  2  from
  3  (select object_type, object_name, row_number() over (partition by object_type order by created) object_rank from t1)
  4  where object_name = 'DUAL';

OBJECT_TYPE         OBJECT_NAME                    OBJECT_RANK                                                                      
------------------- ------------------------------ -----------                                                                      
SYNONYM             DUAL                                     1                                                                      
TABLE               DUAL                                    43                                                                      


Execution Plan
----------------------------------------------------------                                                                          
Plan hash value: 2273146475                                                                                                         
                                                                                                                                    
------------------------------------------------------------------------------------                                                
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |                                                
------------------------------------------------------------------------------------                                                
|   0 | SELECT STATEMENT    |      | 14281 |   571K|       |   190   (2)| 00:00:03 |                                                
|*  1 |  VIEW               |      | 14281 |   571K|       |   190   (2)| 00:00:03 |                                                
|   2 |   WINDOW SORT       |      | 14281 |   474K|   632K|   190   (2)| 00:00:03 |                                                
|   3 |    TABLE ACCESS FULL| T1   | 14281 |   474K|       |    56   (0)| 00:00:01 |                                                
------------------------------------------------------------------------------------                                                
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 


Statistics
----------------------------------------------------------                                                                          
          1  recursive calls                                                                                                        
          0  db block gets                                                                                                          
        190  consistent gets                                                                                                        
          0  physical reads                                                                                                         
          0  redo size                                                                                                              
        753  bytes sent via SQL*Net to client                                                                                       
        500  bytes received via SQL*Net from client                                                                                 
          2  SQL*Net roundtrips to/from client                                                                                      
          1  sorts (memory)                                                                                                         
          0  sorts (disk)                                                                                                           
          2  rows processed                                                                                                         

SQL> 
SQL> set autotrace off
SQL> 
SQL> connect bing/##########
Connected.
SQL> 
SQL> set echo on linesize 132
SQL> 
SQL> alter session set optimizer_dynamic_sampling=11;

Session altered.

SQL> 
SQL> EXPLAIN PLAN FOR
  2  SELECT object_type
  3  	  , object_name
  4  	  , object_rank
  5  FROM (
  6  	       SELECT object_type
  7  		    , object_name
  8  		    , ROW_NUMBER() OVER (PARTITION BY object_type ORDER BY created) object_rank
  9  	       FROM   t1
 10  )
 11  WHERE object_name = 'DUAL';

Explained.

SQL> SELECT * FROM TABLE(dbms_xplan.display);

PLAN_TABLE_OUTPUT                                                                                                                   
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2273146475                                                                                                         
                                                                                                                                    
------------------------------------------------------------------------------------                                                
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |                                                
------------------------------------------------------------------------------------                                                
|   0 | SELECT STATEMENT    |      | 14281 |   571K|       |   190   (2)| 00:00:03 |                                                
|*  1 |  VIEW               |      | 14281 |   571K|       |   190   (2)| 00:00:03 |                                                
|   2 |   WINDOW SORT       |      | 14281 |   474K|   632K|   190   (2)| 00:00:03 |                                                
|   3 |    TABLE ACCESS FULL| T1   | 14281 |   474K|       |    56   (0)| 00:00:01 |                                                
------------------------------------------------------------------------------------                                                
                                                                                                                                    

PLAN_TABLE_OUTPUT                                                                                                                   
------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            

19 rows selected.

SQL> 
SQL> set autotrace on
SQL> 
SQL> select object_type, object_name
  2  from
  3  (select object_type, object_name from t1)
  4  where object_name = 'DUAL';

OBJECT_TYPE         OBJECT_NAME                                                                                                     
------------------- ------------------------------                                                                                  
TABLE               DUAL                                                                                                            
SYNONYM             DUAL                                                                                                            


Execution Plan
----------------------------------------------------------                                                                          
Plan hash value: 3617692013                                                                                                         
                                                                                                                                    
--------------------------------------------------------------------------                                                          
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |                                                          
--------------------------------------------------------------------------                                                          
|   0 | SELECT STATEMENT  |      |     1 |    26 |    56   (0)| 00:00:01 |                                                          
|*  1 |  TABLE ACCESS FULL| T1   |     1 |    26 |    56   (0)| 00:00:01 |                                                          
--------------------------------------------------------------------------                                                          
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            


Statistics
----------------------------------------------------------                                                                          
         11  recursive calls                                                                                                        
          0  db block gets                                                                                                          
        385  consistent gets                                                                                                        
          0  physical reads                                                                                                         
          0  redo size                                                                                                              
        674  bytes sent via SQL*Net to client                                                                                       
        499  bytes received via SQL*Net from client                                                                                 
          2  SQL*Net roundtrips to/from client                                                                                      
          0  sorts (memory)                                                                                                         
          0  sorts (disk)                                                                                                           
          2  rows processed                                                                                                         

SQL> 
SQL> select object_type, object_name, object_rank
  2  from
  3  (select object_type, object_name, row_number() over (partition by object_type order by created) object_rank from t1)
  4  where object_name = 'DUAL';

OBJECT_TYPE         OBJECT_NAME                    OBJECT_RANK                                                                      
------------------- ------------------------------ -----------                                                                      
SYNONYM             DUAL                                     1                                                                      
TABLE               DUAL                                    43                                                                      


Execution Plan
----------------------------------------------------------                                                                          
Plan hash value: 2273146475                                                                                                         
                                                                                                                                    
------------------------------------------------------------------------------------                                                
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |                                                
------------------------------------------------------------------------------------                                                
|   0 | SELECT STATEMENT    |      | 14281 |   571K|       |   190   (2)| 00:00:03 |                                                
|*  1 |  VIEW               |      | 14281 |   571K|       |   190   (2)| 00:00:03 |                                                
|   2 |   WINDOW SORT       |      | 14281 |   474K|   632K|   190   (2)| 00:00:03 |                                                
|   3 |    TABLE ACCESS FULL| T1   | 14281 |   474K|       |    56   (0)| 00:00:01 |                                                
------------------------------------------------------------------------------------                                                
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            


Statistics
----------------------------------------------------------                                                                          
          7  recursive calls                                                                                                        
          0  db block gets                                                                                                          
        196  consistent gets                                                                                                        
          0  physical reads                                                                                                         
          0  redo size                                                                                                              
        753  bytes sent via SQL*Net to client                                                                                       
        499  bytes received via SQL*Net from client                                                                                 
          2  SQL*Net roundtrips to/from client                                                                                      
          1  sorts (memory)                                                                                                         
          0  sorts (disk)                                                                                                           
          2  rows processed                                                                                                         

SQL> 
SQL> set autotrace off
SQL> 
SQL> exec dbms_stats.gather_schema_stats(user)

PL/SQL procedure successfully completed.

SQL> 
SQL> EXPLAIN PLAN FOR
  2  SELECT object_type
  3  	  , object_name
  4  FROM (
  5  	       SELECT object_type
  6  		    , object_name
  7  	       FROM   t1
  8  )
  9  WHERE object_name = 'DUAL';

Explained.

SQL> SELECT * FROM TABLE(dbms_xplan.display);

PLAN_TABLE_OUTPUT                                                                                                                   
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3617692013                                                                                                         
                                                                                                                                    
--------------------------------------------------------------------------                                                          
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |                                                          
--------------------------------------------------------------------------                                                          
|   0 | SELECT STATEMENT  |      |     1 |    26 |    56   (0)| 00:00:01 |                                                          
|*  1 |  TABLE ACCESS FULL| T1   |     1 |    26 |    56   (0)| 00:00:01 |                                                          
--------------------------------------------------------------------------                                                          
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 

PLAN_TABLE_OUTPUT                                                                                                                   
------------------------------------------------------------------------------------------------------------------------------------
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            

17 rows selected.

SQL> 
SQL> set autotrace on
SQL> 
SQL> select object_type, object_name
  2  from
  3  (select object_type, object_name from t1)
  4  where object_name = 'DUAL';

OBJECT_TYPE         OBJECT_NAME                                                                                                     
------------------- ------------------------------                                                                                  
TABLE               DUAL                                                                                                            
SYNONYM             DUAL                                                                                                            


Execution Plan
----------------------------------------------------------                                                                          
Plan hash value: 3617692013                                                                                                         
                                                                                                                                    
--------------------------------------------------------------------------                                                          
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |                                                          
--------------------------------------------------------------------------                                                          
|   0 | SELECT STATEMENT  |      |     1 |    26 |    56   (0)| 00:00:01 |                                                          
|*  1 |  TABLE ACCESS FULL| T1   |     1 |    26 |    56   (0)| 00:00:01 |                                                          
--------------------------------------------------------------------------                                                          
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            


Statistics
----------------------------------------------------------                                                                          
          0  recursive calls                                                                                                        
          0  db block gets                                                                                                          
        191  consistent gets                                                                                                        
          0  physical reads                                                                                                         
          0  redo size                                                                                                              
        674  bytes sent via SQL*Net to client                                                                                       
        499  bytes received via SQL*Net from client                                                                                 
          2  SQL*Net roundtrips to/from client                                                                                      
          0  sorts (memory)                                                                                                         
          0  sorts (disk)                                                                                                           
          2  rows processed                                                                                                         

SQL> 
SQL> select object_type, object_name, object_rank
  2  from
  3  (select object_type, object_name, row_number() over (partition by object_type order by created) object_rank from t1)
  4  where object_name = 'DUAL';

OBJECT_TYPE         OBJECT_NAME                    OBJECT_RANK                                                                      
------------------- ------------------------------ -----------                                                                      
SYNONYM             DUAL                                     1                                                                      
TABLE               DUAL                                    43                                                                      


Execution Plan
----------------------------------------------------------                                                                          
Plan hash value: 2273146475                                                                                                         
                                                                                                                                    
------------------------------------------------------------------------------------                                                
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |                                                
------------------------------------------------------------------------------------                                                
|   0 | SELECT STATEMENT    |      | 14281 |   571K|       |   190   (2)| 00:00:03 |                                                
|*  1 |  VIEW               |      | 14281 |   571K|       |   190   (2)| 00:00:03 |                                                
|   2 |   WINDOW SORT       |      | 14281 |   474K|   632K|   190   (2)| 00:00:03 |                                                
|   3 |    TABLE ACCESS FULL| T1   | 14281 |   474K|       |    56   (0)| 00:00:01 |                                                
------------------------------------------------------------------------------------                                                
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            


Statistics
----------------------------------------------------------                                                                          
          0  recursive calls                                                                                                        
          0  db block gets                                                                                                          
        190  consistent gets                                                                                                        
          0  physical reads                                                                                                         
          0  redo size                                                                                                              
        753  bytes sent via SQL*Net to client                                                                                       
        499  bytes received via SQL*Net from client                                                                                 
          2  SQL*Net roundtrips to/from client                                                                                      
          1  sorts (memory)                                                                                                         
          0  sorts (disk)                                                                                                           
          2  rows processed                                                                                                         

SQL> 
SQL> set autotrace off
SQL> 
SQL> connect bing/##########
Connected.
SQL> 
SQL> set echo on linesize 132
SQL> 
SQL> alter session set optimizer_dynamic_sampling=11;

Session altered.

SQL> 
SQL> EXPLAIN PLAN FOR
  2  SELECT object_type
  3  	  , object_name
  4  	  , object_rank
  5  FROM (
  6  	       SELECT object_type
  7  		    , object_name
  8  		    , ROW_NUMBER() OVER (PARTITION BY object_type ORDER BY created) object_rank
  9  	       FROM   t1
 10  )
 11  WHERE object_name = 'DUAL';

Explained.

SQL> SELECT * FROM TABLE(dbms_xplan.display);

PLAN_TABLE_OUTPUT                                                                                                                   
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2273146475                                                                                                         
                                                                                                                                    
------------------------------------------------------------------------------------                                                
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |                                                
------------------------------------------------------------------------------------                                                
|   0 | SELECT STATEMENT    |      | 14281 |   571K|       |   190   (2)| 00:00:03 |                                                
|*  1 |  VIEW               |      | 14281 |   571K|       |   190   (2)| 00:00:03 |                                                
|   2 |   WINDOW SORT       |      | 14281 |   474K|   632K|   190   (2)| 00:00:03 |                                                
|   3 |    TABLE ACCESS FULL| T1   | 14281 |   474K|       |    56   (0)| 00:00:01 |                                                
------------------------------------------------------------------------------------                                                
                                                                                                                                    

PLAN_TABLE_OUTPUT                                                                                                                   
------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            

19 rows selected.

SQL> 
SQL> set autotrace on
SQL> 
SQL> select object_type, object_name
  2  from
  3  (select object_type, object_name from t1)
  4  where object_name = 'DUAL';

OBJECT_TYPE         OBJECT_NAME                                                                                                     
------------------- ------------------------------                                                                                  
TABLE               DUAL                                                                                                            
SYNONYM             DUAL                                                                                                            


Execution Plan
----------------------------------------------------------                                                                          
Plan hash value: 3617692013                                                                                                         
                                                                                                                                    
--------------------------------------------------------------------------                                                          
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |                                                          
--------------------------------------------------------------------------                                                          
|   0 | SELECT STATEMENT  |      |     1 |    26 |    56   (0)| 00:00:01 |                                                          
|*  1 |  TABLE ACCESS FULL| T1   |     1 |    26 |    56   (0)| 00:00:01 |                                                          
--------------------------------------------------------------------------                                                          
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            


Statistics
----------------------------------------------------------                                                                          
          0  recursive calls                                                                                                        
          0  db block gets                                                                                                          
        191  consistent gets                                                                                                        
          0  physical reads                                                                                                         
          0  redo size                                                                                                              
        674  bytes sent via SQL*Net to client                                                                                       
        500  bytes received via SQL*Net from client                                                                                 
          2  SQL*Net roundtrips to/from client                                                                                      
          0  sorts (memory)                                                                                                         
          0  sorts (disk)                                                                                                           
          2  rows processed                                                                                                         

SQL> 
SQL> select object_type, object_name, object_rank
  2  from
  3  (select object_type, object_name, row_number() over (partition by object_type order by created) object_rank from t1)
  4  where object_name = 'DUAL';

OBJECT_TYPE         OBJECT_NAME                    OBJECT_RANK                                                                      
------------------- ------------------------------ -----------                                                                      
SYNONYM             DUAL                                     1                                                                      
TABLE               DUAL                                    43                                                                      


Execution Plan
----------------------------------------------------------                                                                          
Plan hash value: 2273146475                                                                                                         
                                                                                                                                    
------------------------------------------------------------------------------------                                                
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |                                                
------------------------------------------------------------------------------------                                                
|   0 | SELECT STATEMENT    |      | 14281 |   571K|       |   190   (2)| 00:00:03 |                                                
|*  1 |  VIEW               |      | 14281 |   571K|       |   190   (2)| 00:00:03 |                                                
|   2 |   WINDOW SORT       |      | 14281 |   474K|   632K|   190   (2)| 00:00:03 |                                                
|   3 |    TABLE ACCESS FULL| T1   | 14281 |   474K|       |    56   (0)| 00:00:01 |                                                
------------------------------------------------------------------------------------                                                
                                                                                                                                    
Predicate Information (identified by operation id):                                                                                 
---------------------------------------------------                                                                                 
                                                                                                                                    
   1 - filter("OBJECT_NAME"='DUAL')                                                                                                 
                                                                                                                                    
Note                                                                                                                                
-----                                                                                                                               
   - dynamic sampling used for this statement (level=11)                                                                            


Statistics
----------------------------------------------------------                                                                          
          0  recursive calls                                                                                                        
          0  db block gets                                                                                                          
        190  consistent gets                                                                                                        
          0  physical reads                                                                                                         
          0  redo size                                                                                                              
        753  bytes sent via SQL*Net to client                                                                                       
        500  bytes received via SQL*Net from client                                                                                 
          2  SQL*Net roundtrips to/from client                                                                                      
          1  sorts (memory)                                                                                                         
          0  sorts (disk)                                                                                                           
          2  rows processed                                                                                                         

SQL> 
SQL> set autotrace off
SQL> 
SQL> spool off

Notice that the number of rows reported in the execution plans for the non-analytic query is half of what it is supposed to be, and the inflated number of rows is roughly one-fourth of the previously reported value. In the absence of the JVM the data dictionary is ‘short’ by around 50,000 rows, which can skew the selectivity down to around 1 for the example query. Installing the JVM, using dbca, fixes that issue.

Until this bug is fixed by a one-off patch, pay close attention to the row estimates produced for queries using analytic functions; at the least this little ‘inconvenience’ could make troubleshooting performance issues difficult. You may need to eliminate the analytic function to reveal a better estimate of the number of rows returned so you’re not chasing an issue that doesn’t really exist.

Numbers never lie, until they do.

February 23, 2015

Merge, Right?

Filed under: General,Performance — dfitzjarrell @ 15:33

"There are no wrong roads to anywhere."
 - Norton Juster, The Phantom Tollbooth 

Oracle can do some, well, strange things when fixing issues caused by the optimizer. For 10g releases up to 10.2.0.x Oracle chose to silently ignore a message and eliminate an outer join on the MERGE statement under certain conditions. Let’s examine this and see exactly what Oracle implements.

Occasionally in a 10046 trace file generated from 10g releases prior to 10.2.0.x and processed by the tkprof utility you may see something similar to this:


SELECT 1 FROM DUAL WHERE <some condition here>
 
Error encountered: ORA-00904

where <some condition here> is the ‘ON’ condition from a MERGE statement executed by that session. That seems odd, I know, and Oracle silently ignores that ORA-00904 condition as it is never displayed when executing the MERGE. Why would Oracle do this? Looking at the 10053 trace file generated from the following example:


SQL> merge into t1_merge t1
  2          using (select 1 merged_key, lpad('y', 21, 'y') xtra  from dual) t2
  3          on (t1.merge_key = t2.merged_key)
  4  when matched then update set t1.xtra_stuff = t2.xtra
  5  when not matched then insert (merge_key, more_txt, xtra_stuff) values (t2.merged_key, t2.merged_key, t2.xtra);

1 row merged.

SQL>

we see the following:


PARSE ERROR #5:len=36 dep=1 uid=60 oct=3 lid=60 tim=1266258229533447 err=904
SELECT 1 FROM DUAL WHERE t1.merge_key = t2.merged_key
CLOSE #5:c=0,e=4,dep=1,type=0,tim=1266258229533578
Registered qb: MRG$1 0x84379c (PARSER)
---------------------
QUERY BLOCK SIGNATURE
---------------------
...
====================== END SQL Statement Dump ======================
=====================
PARSING IN CURSOR #1 len=302 dep=0 uid=1158 oct=189 lid=1158 tim=33189168479100 hv=1190985157 ad='1ec71da8'
merge into t1_merge t1
        using (select 1 merged_key, lpad('y', 21, 'y') xtra  from dual) t2
        on (t1.merge_key = t2.merged_key)
when matched then update set t1.xtra_stuff = t2.xtra
when not matched then insert (merge_key, more_txt, xtra_stuff) values (t2.merged_key, t2.merged_key, t2.xtra)
END OF STMT
PARSE #1:c=40000,e=51569,p=4,cr=29,cu=0,mis=1,r=0,dep=0,og=2,tim=33189168479094
BINDS #1:
EXEC #1:c=10000,e=558,p=0,cr=7,cu=3,mis=0,r=1,dep=0,og=2,tim=33189168479778
WAIT #1: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=2135111 tim=33189168479862
WAIT #1: nam='SQL*Net message from client' ela= 419 driver id=1650815232 #bytes=1 p3=0 obj#=2135111 tim=33189168480324
STAT #1 id=1 cnt=2 pid=0 pos=1 obj=0 op='MERGE  T1_MERGE (cr=7 pr=0 pw=0 time=425 us)'
STAT #1 id=2 cnt=1 pid=1 pos=1 obj=0 op='VIEW  (cr=7 pr=0 pw=0 time=178 us)'
STAT #1 id=3 cnt=1 pid=2 pos=1 obj=0 op='NESTED LOOPS OUTER (cr=7 pr=0 pw=0 time=175 us)'
STAT #1 id=4 cnt=1 pid=3 pos=1 obj=0 op='FAST DUAL  (cr=0 pr=0 pw=0 time=1 us)'
STAT #1 id=5 cnt=1 pid=3 pos=2 obj=2135111 op='TABLE ACCESS FULL T1_MERGE (cr=7 pr=0 pw=0 time=170 us)'

Notice there is no outer join in the plan steps. Additionally if we look at the outline data we see:


  /*+
    BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('10.2.0.4')
      FIRST_ROWS(10)
      OUTLINE_LEAF(@"SEL$5428C7F1")
      MERGE(@"SEL$2")
      MERGE(@"SEL$3")
      OUTLINE_LEAF(@"SEL$4")
      OUTLINE_LEAF(@"MRG$1")
      OUTLINE(@"SEL$1")
      OUTLINE(@"SEL$2")
      NO_ACCESS(@"MRG$1" "from$_subquery$_008"@"MRG$1")
      NO_ACCESS(@"MRG$1" "T2"@"MRG$1")
      FULL(@"MRG$1" "T1"@"MRG$1")
      LEADING(@"MRG$1" "from$_subquery$_008"@"MRG$1" "T2"@"MRG$1" "T1"@"MRG$1")
      USE_MERGE_CARTESIAN(@"MRG$1" "T2"@"MRG$1")
      USE_MERGE_CARTESIAN(@"MRG$1" "T1"@"MRG$1")
    END_OUTLINE_DATA
  */

The outline data in these cases shows the outer join isn’t used as Oracle re-writes the query into this:


merge into t1_merge t1
using (select 1 merged_key, lpad('y', 21, 'y') xtra from dual) t2
on (1 = 0)
when matched then update set t1.xtra_stuff = t2.xtra
when not matched then insert (merge_key, more_txt, xtra_stuff) values (t2.merged_key, t2.merged_key, t2.xtra);

Interestingly, the reason for this behavior is available in the online documentation:


To insert all of the source rows into the table, you can use a constant filter predicate in the ON clause condition. An example of a 
constant filter predicate is ON (0=1). Oracle Database recognizes such a predicate and makes an unconditional insert of all source 
rows into the table. This approach is different from omitting the merge_update_clause. In that case, the database still must perform 
a join. With constant filter predicate, no join is performed.

This appears to be fixed in 10.2.0.x and later releases as the ORA-00904 error does not occur in either of the trace files. MOS document id 244055.1 lists this modification as an ‘enhancement’ in 10g. The document also states that the bug requiring this work-around is fixed in 10.1.0.1.0. Even though 10g in all its forms is an old release there still may be sites using it.

In 11.2.0.4 this same construct generates no ORA-00904 error, nor does the ‘dummy’ query appear. Notice that the join is restored as evidenced by the execution plan:


----- Current SQL Statement for this session (sql_id=5zv0cdsdp0bf4) -----
merge into t1_merge t1
	using (select 1 merged_key, lpad('y', 21, 'y') xtra  from dual) t2
	on (t1.merge_key = t2.merged_key)
when matched then update set t1.xtra_stuff = t2.xtra
when not matched then insert (merge_key, more_txt, xtra_stuff) values (t2.merged_key, t2.merged_key, t2.xtra)
sql_text_length=289
sql=merge into t1_merge t1
	using (select 1 merged_key, lpad('y', 21, 'y') xtra  from dual) t2
	on (t1.merge_key = t2.merged_key)
when matched then update set t1.xtra_stuff = t2.xtra
when not matched then insert (merge_key, more_txt, xtra_stuff) values (t2.mer
sql=ged_key, t2.merged_key, t2.xtra)
----- Explain Plan Dump -----
----- Plan Table -----
 
============
Plan Table
============
----------------------------------------+-----------------------------------+
| Id  | Operation             | Name    | Rows  | Bytes | Cost  | Time      |
----------------------------------------+-----------------------------------+
| 0   | MERGE STATEMENT       |         |       |       |     6 |           |
| 1   |  MERGE                | T1_MERGE|       |       |       |           |
| 2   |   VIEW                |         |       |       |       |           |
| 3   |    NESTED LOOPS OUTER |         |     1 |    31 |     6 |  00:00:01 |
| 4   |     TABLE ACCESS FULL | DUAL    |     1 |     2 |     2 |  00:00:01 |
| 5   |     TABLE ACCESS FULL | T1_MERGE|     1 |    29 |     4 |  00:00:01 |
----------------------------------------+-----------------------------------+
Predicate Information:
----------------------
5 - filter("T1"."MERGE_KEY"=CASE  WHEN (ROWID IS NOT NULL) THEN 1 ELSE 1 END )

The outline data where this is fixed shows the join is used:


  /*+
    BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('10.2.0.4')
      DB_VERSION('11.2.0.4')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$5428C7F1")
      MERGE(@"SEL$2")
      MERGE(@"SEL$3")
      OUTLINE_LEAF(@"SEL$4")
      OUTLINE_LEAF(@"MRG$1")
      OUTLINE(@"SEL$1")
      OUTLINE(@"SEL$2")
      OUTLINE(@"SEL$3")
      NO_ACCESS(@"MRG$1" "from$_subquery$_008"@"MRG$1")
      NO_ACCESS(@"MRG$1" "T2"@"MRG$1")
      FULL(@"MRG$1" "T1"@"MRG$1")
      LEADING(@"MRG$1" "from$_subquery$_008"@"MRG$1" "T2"@"MRG$1" "T1"@"MRG$1")
      USE_MERGE_CARTESIAN(@"MRG$1" "T2"@"MRG$1")
      USE_MERGE_CARTESIAN(@"MRG$1" "T1"@"MRG$1")
      PQ_DISTRIBUTE(@"MRG$1" "T2"@"MRG$1" NONE NONE)
      PQ_DISTRIBUTE(@"MRG$1" "T1"@"MRG$1" NONE NONE)
      FULL(@"SEL$5428C7F1" "DUAL"@"SEL$3")
      FULL(@"SEL$5428C7F1" "T1"@"SEL$2")
      LEADING(@"SEL$5428C7F1" "DUAL"@"SEL$3" "T1"@"SEL$2")
      USE_NL(@"SEL$5428C7F1" "T1"@"SEL$2")
    END_OUTLINE_DATA
  */

Oracle has, apparently, seen the ‘error of its ways’ with this ‘enhancement’ and re-thought how MERGE should be executed. Of course you may be stuck running an older version of 10g and could be affected by this ‘enhancement’. Even migrating to 10.2.0.4 can eliminate this strange query from your MERGE implementations. Know that this isn’t really a performance problem, but it can be disconcerting to see such essentially useless query text in trace files you may generate. Knowing where it comes from hopefully will make it easier to assess performance issues requiring 10046 and 10053 trace files.
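
For anyone wanting to check a raw trace file for this ‘dummy’ query, a small script can help. What follows is a minimal sketch in Python, not an Oracle-supplied tool; the function name find_parse_errors is made up for illustration, and it assumes the failing SQL text sits on the line immediately after the PARSE ERROR line, as it does in the excerpt shown earlier.

```python
import re

# Matches lines such as:
# PARSE ERROR #5:len=36 dep=1 uid=60 oct=3 lid=60 tim=1266258229533447 err=904
PARSE_ERROR = re.compile(r"^PARSE ERROR #(?P<cursor>\d+):(?P<fields>.*)$")


def find_parse_errors(trace_lines):
    """Return one dict per PARSE ERROR entry found in the trace lines.

    Assumption: the SQL text that failed to parse is on the next line,
    which is how it appears in the excerpt from the post.
    """
    lines = [line.rstrip("\n") for line in trace_lines]
    errors = []
    for index, line in enumerate(lines):
        match = PARSE_ERROR.match(line)
        if not match:
            continue
        # Split "len=36 dep=1 ..." into a name -> value dictionary
        fields = dict(pair.split("=", 1) for pair in match.group("fields").split())
        sql_text = lines[index + 1] if index + 1 < len(lines) else ""
        errors.append({
            "cursor": int(match.group("cursor")),
            "dep": int(fields["dep"]),      # recursive depth; 1 here
            "err": int(fields["err"]),      # Oracle error number
            "sql": sql_text,
        })
    return errors


# Lines copied from the trace excerpt in this post
sample = [
    "PARSE ERROR #5:len=36 dep=1 uid=60 oct=3 lid=60 tim=1266258229533447 err=904",
    "SELECT 1 FROM DUAL WHERE t1.merge_key = t2.merged_key",
    "CLOSE #5:c=0,e=4,dep=1,type=0,tim=1266258229533578",
]

found = find_parse_errors(sample)
for entry in found:
    print(f"ORA-{entry['err']:05d} at dep={entry['dep']}: {entry['sql']}")
```

Run against the sample lines this reports the ORA-00904 raised at recursive depth 1 along with the ‘SELECT 1 FROM DUAL’ text, which makes it easy to see that the error belongs to Oracle’s internal check rather than to the MERGE itself.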

It really helps to pay close attention when merging.

February 3, 2015

How Can I Compress Thee

Filed under: Exadata,General,Performance — dfitzjarrell @ 10:26

“You can swim all day in the Sea of Knowledge and not get wet.” 
― Norton Juster, The Phantom Tollbooth 

In previous posts compression options have been discussed, and now it’s time to see how Oracle performs basic compression. It isn’t really compression, it’s de-duplication, but it does result in space savings for data that won’t be modified after it’s ‘compressed’. Let’s look at how Oracle saves space with your data.

Oracle de-duplicates the data by finding common strings, tokenizing them and using the token identifier in the string to reduce the row length. So, what does that mean? Looking at an example might help; a table is built and populated as follows:


--
-- Create and populate the table
--
create table comptst(
	tstcol1	varchar2(4),
	tstcol2 varchar2(6),
	tstcol3	number(8,2),
	tstcol4	varchar2(10));

insert into comptst
values ('ZZXZ', 'bbddff', 199.44, 'PENDING');

insert into comptst
values ('ZZXZ', 'ghijkl', 43.08, 'PENDING');

insert into comptst
values ('ZZXZ', 'bbddff', 881.02, 'PENDING');

insert into comptst
values ('ZZXZ', 'bbddff', 54.97, 'PENDING');

commit;

insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;
insert into comptst select * From comptst;

commit;
--
-- Compress the table with BASIC compression
--
alter table comptst compress basic;
alter table comptst move;

[The table was compressed after the data was inserted which required two steps, the first to set the compression level and the second, a table move in place, to actually compress the data. Had the table been built as compressed and direct path inserts used the data would have been compressed without further action.] Since the initial 4-row insert was re-inserted multiple times there is a lot of duplication in the data, and since Oracle de-duplicates rows to produce the effect of compression there should be a lot of data in a block dump indicating this. There is, and the first piece of that data is the following line:


  perm_9ir2[4]={ 0 2 3 1 }

Oracle builds a token table for each data block; this provides a reference for each data string that occurs more than once in the block. Additionally Oracle can re-arrange the column values in that token table so that multiple column values can be turned into a single token and, thus, a single reference. The line shown above indicates how the table columns map to the stored column positions for this block; in this case table column 0 is stored at position 0, table column 1 at position 2, table column 2 at position 3 and table column 3 at position 1. Let’s look at the unique data that was inserted:


('ZZXZ', 'bbddff', 199.44, 'PENDING');
('ZZXZ', 'ghijkl', 43.08, 'PENDING');
('ZZXZ', 'bbddff', 881.02, 'PENDING');
('ZZXZ', 'bbddff', 54.97, 'PENDING');
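
To make the perm_9ir2 line a little more concrete, here is a minimal Python sketch that applies the permutation to one of those rows. It reads each entry of the array as ‘table column i is stored at position perm[i]’; that reading is an interpretation, chosen because it agrees with the column order seen in the row dump further down, and the variable names are purely illustrative.

```python
# Permutation copied from the block dump: perm_9ir2[4]={ 0 2 3 1 }
perm = [0, 2, 3, 1]

# One of the four unique rows, in table column order
# (tstcol1, tstcol2, tstcol3, tstcol4); the number is kept as text
table_row = ["ZZXZ", "bbddff", "54.97", "PENDING"]

# Assumed reading: table column i lands in stored position perm[i]
stored_row = [None] * len(table_row)
for table_col, stored_pos in enumerate(perm):
    stored_row[stored_pos] = table_row[table_col]

print(stored_row)  # ['ZZXZ', 'PENDING', 'bbddff', '54.97']
```

With that reading the two values common to every row, ‘ZZXZ’ and ‘PENDING’, end up next to each other at the front of the stored row.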

Since these data rows are duplicated in each block every column is a potential compression token. Two values occur in every row, ‘ZZXZ’ and ‘PENDING’, so it should be expected that tokens for those values will be found in each of the compressed data rows. As mentioned previously Oracle builds a token table in each block, so there are two tables in this block: the first, starting at offset 0, is the token table with 7 rows, and the second, starting at offset 7, is the actual table data with 721 rows:


0x24:pti[0]	nrow=7		offs=0
0x28:pti[1]	nrow=721	offs=7

Oracle is clever with this implementation of compression and can create a token that includes a data value and another token, from the same token table, to reduce that row length even further. The examples provided here won’t be demonstrating that but know that it is possible. Now let’s look at the first row in this block for the data table:


tab 1, row 0, @0x1f31
tl: 5 fb: --H-FL-- lb: 0x0  cc: 4
col  0: [ 4]  5a 5a 58 5a
col  1: [ 7]  50 45 4e 44 49 4e 47
col  2: [ 6]  62 62 64 64 66 66
col  3: [ 3]  c1 37 62
bindmp: 2c 00 01 04 02
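
The hex values in that dump can be turned back into readable data to confirm which of the inserted rows this is. The Python sketch below is only an illustration: the helper names are invented, the character columns are assumed to be plain ASCII, and the number decoder handles positive values only (first byte minus 193 as the base-100 exponent, each following byte minus 1 as a base-100 digit).

```python
from decimal import Decimal


def decode_text(hex_bytes):
    """Decode a character column, assuming single-byte ASCII data."""
    return bytes.fromhex(hex_bytes).decode("ascii")


def decode_positive_number(hex_bytes):
    """Decode a positive Oracle NUMBER from its internal byte form.

    Sketch only: negative numbers and zero use a different layout
    and are not handled here.
    """
    raw = bytes.fromhex(hex_bytes)
    exponent = raw[0] - 193          # base-100 exponent of the first digit
    value = Decimal(0)
    for position, byte in enumerate(raw[1:]):
        digit = byte - 1             # each byte holds one base-100 digit, plus 1
        value += Decimal(digit) * Decimal(100) ** (exponent - position)
    return value


# Column values copied from the row dump above
decoded = [
    decode_text("5a 5a 58 5a"),
    decode_text("50 45 4e 44 49 4e 47"),
    decode_text("62 62 64 64 66 66"),
    decode_positive_number("c1 37 62"),
]
print(decoded)
```

The decoded values are ZZXZ, PENDING, bbddff and 54.97, which is the fourth of the unique rows inserted, with the columns in the re-arranged order.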

The actual column lengths are supplied between the square brackets for each column; the total length should be the sum of those values plus 7 bytes: 4 for the column lengths, one for the lock byte, one for the flag byte and one for the column count. The column values and their length bytes account for 24 bytes, and the three remaining bytes bring the uncompressed total to 27; the block dump provides a different total length of 5, as reported by the tl entry. There is a line at the end of the row dump labeled bindmp (a binary dump of the row contents) revealing the actual contents of those 5 bytes. As expected there is the flag byte (0x2c, matching the fb entry), the lock byte (0x00), the number of columns at this location (0x01) and two bytes representing the token, reporting that 4 columns are in this token and that the reference row in the token table is row 2. So, let’s look at table 0, row 2:


tab 0, row 2, @0x1f5c
tl: 10 fb: --H-FL-- lb: 0x0  cc: 4
col  0: [ 4]  5a 5a 58 5a
col  1: [ 7]  50 45 4e 44 49 4e 47
col  2: [ 6]  62 62 64 64 66 66
col  3: [ 3]  c1 37 62
bindmp: 00 b3 04 04 05 06 cb c1 37 62

It looks almost like the data row, but the total token length is 10 bytes. Looking at the bindmp the first two bytes indicate this token is used 179 times in this block, the third byte indicates that 4 columns are in this token, the two bytes after that report that the first two columns are also tokens, 0x04 and 0x05. Going back to the token table we see that those tokens are:


tab 0, row 4, @0x1f66
tl: 7 fb: --H-FL-- lb: 0x0  cc: 1
col  0: [ 4]  5a 5a 58 5a
bindmp: 00 04 cc 5a 5a 58 5a
tab 0, row 5, @0x1f76
tl: 10 fb: --H-FL-- lb: 0x0  cc: 1
col  0: [ 7]  50 45 4e 44 49 4e 47
bindmp: 00 04 cf 50 45 4e 44 49 4e 47

These are single-column tokens, and each is used 4 times in this block. This is how Oracle reduced the row length from 24 bytes to 5 to save block space. Working through the block dump it’s now possible to re-construct the 24 bytes of data the row originally contained even though it now is only 5 bytes in length.
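
The arithmetic and the token lookups described above can be checked with a few lines of Python. This is a rough sketch built only from the bindmp values shown in this post; the helper names are invented, and the byte that follows the two-byte usage count in the single-column tokens is not described here, so the sketch simply skips it and decodes the rest as ASCII.

```python
# bindmp values copied from the block dump in this post
data_row_bindmp = "2c 00 01 04 02"
token_bindmps = {
    2: "00 b3 04 04 05 06 cb c1 37 62",
    4: "00 04 cc 5a 5a 58 5a",
    5: "00 04 cf 50 45 4e 44 49 4e 47",
}


def usage_count(bindmp):
    """First two bytes of a token row: how often the token is used."""
    return int.from_bytes(bytes.fromhex(bindmp)[:2], "big")


def single_column_value(bindmp):
    """Value held by a single-column token.

    Assumption: skip the 2-byte usage count and the one byte after it
    (not explained in the post); the remainder is the column data.
    """
    return bytes.fromhex(bindmp)[3:].decode("ascii")


# The compressed data row is 5 bytes and its last byte names the token row
row_bytes = bytes.fromhex(data_row_bindmp)
compressed_length = len(row_bytes)
referenced_token = row_bytes[-1]

# Uncompressed size from the column lengths in the row dump
column_lengths = [4, 7, 6, 3]
data_and_length_bytes = sum(column_lengths) + len(column_lengths)
uncompressed_total = data_and_length_bytes + 3   # flag, lock, column count

print(compressed_length, referenced_token)            # 5 2
print(data_and_length_bytes, uncompressed_total)      # 24 27
print({row: usage_count(b) for row, b in token_bindmps.items()})
print(single_column_value(token_bindmps[4]), single_column_value(token_bindmps[5]))
```

The 24 bytes quoted above are the column values plus their length bytes; adding the flag byte, lock byte and column count gives 27 bytes for the uncompressed row against 5 for the compressed one.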


"The way you see things depends a great deal on where you look at them from."
 - Norton Juster, The Phantom Tollbooth 

We see that Oracle doesn’t actually compress data, it replaces duplicate values with tokens and, through those tokens, reconstructs the data at query time by using the row directory and the actual row pieces in each block. Depending on the select list some tokens won’t be accessed if that data isn’t required. Of course all of this re-constructing can be expensive at the CPU level, and for full table scans of large tables performance can be an issue, especially with the “cache buffers chains” latch because Oracle is performing fewer “consistent gets – examination”. This is because Oracle has to pin blocks for a longer period due to the reconstruction. On the plus side the number of physical reads can decrease since the data is in a smaller number of blocks and can stay in the cache longer. Using basic compression is a trade-off between size and performance, and for extremely large tables or in cases where the compression savings are quite large (meaning the data is compressed more) queries may become CPU-intensive rather than I/O-intensive. The good and the bad need to be weighed carefully when making the decision to use compression; choose wisely. Space is relatively inexpensive when compared to end-user satisfaction. The DBA’s idea of performance and the end-users’ idea of performance use different criteria, and it’s really the end-users’ idea that should take precedence.

Anyone up for a swim?
