It Isn’t The Index

July 28, 2015

It Isn’t The Index

Filed under: Indexes,Performance — dfitzjarrell @ 15:36


"If you want sense, you'll have to make it yourself."
- Norton Juster, The Phantom Tollbooth

One of the more confusing statistics in Oracle is, apparently, one called the clustering factor. Associated with an index (as evidenced by the columns in DBA_INDEXES) it’s actually dependent on the table data, more specifically the distance between ‘jumps’ for a given index key. Let’s define a ‘jump’ to be the number of blocks between rows containing the given index key starting with the first block found containing that key. That’s confusing but don’t despair, this will be explained in detail.

Oracle, in its infinite wisdom, computes a clustering factor for an index based upon the key value and the scattering of the rows containing that key across the table data. Looking at a simple example let’s look at a simple non-unique index (indexes for which the clustering factor matters) and the associated table data. The example starts by building the table and loading the data:


SQL> 
SQL> create table clust_fact(
  2  	     id      number,
  3  	     nm      varchar2(40),
  4  	     dt      date,
  5  	     sq      number);

Table created.

SQL> 
SQL> begin
  2  	     for i in 1..1000000 loop
  3  		     insert into clust_fact(id,nm,dt,sq)
  4  		     values(mod(i,77), 'Record '||i, sysdate - mod(i, 77), i);
  5  	     end loop;
  6  
  7  	     commit;
  8  end;
  9  /

PL/SQL procedure successfully completed.

SQL>

Next, build the index and generate the statistics:


SQL>
SQL> create index clust_fact_idx on clust_fact(id);

Index created.

SQL> 
SQL> exec dbms_stats.gather_table_stats(user, 'CLUST_FACT', cascade=>true);

PL/SQL procedure successfully completed.

SQL>

Query the clustering factor computed by Oracle:


SQL> select index_name, clustering_factor
  2  from user_indexes
  3  where table_name = 'CLUST_FACT';

INDEX_NAME                     CLUSTERING_FACTOR                                                                                    
------------------------------ -----------------                                                                                    
CLUST_FACT_IDX                            370731                                                                                    

SQL>

The clustering factor provides a representation of how far Oracle needs to ‘jump’, on average, between blocks containing the index key value. To understand this a bit better let’s look at the code Oracle actually uses to compute the clustering factor and explain what it does:


SQL> 
SQL> --
SQL> -- 1 block limit (Oracle default)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),1) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL>

Oracle uses a function named sys_op_countchg, which takes two parameters, the first 15 bytes of the rowid and the number of blocks to traverse to trigger the count to increase; by default Oracle chooses to increment the clustering factor counter every time the block address changes. If three rows, for example, exist in the current block the counter does not increment; if the next three rows are in a different block the counter is incremented by 1 (Oracle doesn’t change blocks for the next two reads after the change). This keeps going througout the table to compute the final clustering factor. Knowing the function used by Oracle and its parameters it is possible to modify the call to accept larger values for the number of block changes before the clustering factor counter is incremented. To allow Oracle to consider 3 blocks ‘close enough’ to not change the counter the query is modified as follows:


SQL> 
SQL> --
SQL> -- 3 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),3) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL>

Given how the values were computed for the ID column the clustering factor doesn’t change when we increase the block change count Oracle considers to be within range of the starting block address. Possible values for this second parameter are 1, 3, 5, 9 and 16. Going through the list of possible values it’s obvious with the current data loading the clustering factor won’t improve no matter which value we choose:


SQL> 
SQL> --
SQL> -- 5 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),5) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 
SQL> --
SQL> -- 9 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),9) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),16) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL>

Since how the table data is ordered affects the clustering factor the table is reloaded with data ordered using the ID column:


SQL> 
SQL> create table clust_tmp as select * from clust_fact order by id;

Table created.

SQL> 
SQL> truncate table clust_fact;

Table truncated.

SQL> 
SQL> insert into clust_fact select * from clust_tmp;

1000000 rows created.

SQL> 
SQL> commit;

Commit complete.

SQL> 
SQL> exec dbms_stats.gather_table_stats(user, 'CLUST_FACT', cascade=>true);

PL/SQL procedure successfully completed.

SQL>

Querying the clustering factor reveals it has significantly decreased since the keys are much closer together:


SQL> 
SQL> select index_name, clustering_factor
  2  from user_indexes
  3  where table_name = 'CLUST_FACT';

INDEX_NAME                     CLUSTERING_FACTOR                                                                                    
------------------------------ -----------------                                                                                    
CLUST_FACT_IDX                              4889                                                                                    

SQL> 
SQL> --
SQL> -- 1 block limit (Oracle default)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),1) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4889                                                                                                                          

SQL> 
SQL> --
SQL> -- 3 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),3) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4888                                                                                                                          

SQL> 
SQL> --
SQL> -- 5 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),5) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4888                                                                                                                          

SQL> 
SQL> --
SQL> -- 9 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),9) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4885                                                                                                                          

SQL> 
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),16) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4881                                                                                                                          

SQL>

It is worth noting that the clustering factor slightly decreases with larger values for the number of blocks Oracle can skip before updating the change count. The changes are not huge, and only differ by 8 from the value Oracle calculates by default to setting the ‘blocks to skip’ parameter to 16, its highest possible value. Since the original load created the SQ values in sequence the table is reloaded with the SQ values in order and a new concatenated index using both the ID and SQ columns is created. This is done to show how the clustering factor behaves for other indexes when it’s ‘optimized’ for one index:


SQL> 
SQL> drop table clust_tmp purge;

Table dropped.

SQL> 
SQL> create table clust_tmp as select * from clust_fact order by sq;

Table created.

SQL> 
SQL> truncate table clust_fact;

Table truncated.

SQL> 
SQL> insert into clust_fact select * from clust_tmp;

1000000 rows created.

SQL> 
SQL> commit;

Commit complete.

SQL> 
SQL> create index clust_fact_idx2 on clust_fact(id, sq);

Index created.

SQL> 
SQL> exec dbms_stats.gather_table_stats(user, 'CLUST_FACT', cascade=>true);

PL/SQL procedure successfully completed.

SQL>

Querying the clustering factor the value for the original table load again appears and, again, doesn’t change even when the block skip parameter to the sys_op_countchg function is increased through the available values:


SQL> 
SQL> select index_name, clustering_factor
  2  from user_indexes
  3  where table_name = 'CLUST_FACT';

INDEX_NAME                     CLUSTERING_FACTOR                                                                                    
------------------------------ -----------------                                                                                    
CLUST_FACT_IDX2                           370731                                                                                    
CLUST_FACT_IDX                            370731                                                                                    

SQL> 
SQL> --
SQL> -- 1 block limit (Oracle default)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),1) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 
SQL> --
SQL> -- 3 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),3) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 
SQL> --
SQL> -- 5 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),5) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 
SQL> --
SQL> -- 9 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),9) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),16) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 
SQL> --
SQL> -- 1 block limit (Oracle default)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),1) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 
SQL> --
SQL> -- 3 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),3) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 
SQL> --
SQL> -- 5 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),5) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 
SQL> --
SQL> -- 9 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),9) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL> 
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),16) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    370731                                                                                                                          

SQL>

Ordering the data by ID shows that for both indexes the clustering factor decreases, since the indexes both include the ID column. The presence of the SQ column causes the second index clustering factor to be greater when the default parameters are used; interestingly as the block skip count increases for the second index the clustering factor decreases to that of the single-column index:


SQL> 
SQL> drop table clust_tmp purge;

Table dropped.

SQL> 
SQL> create table clust_tmp as select * from clust_fact order by id;

Table created.

SQL> 
SQL> truncate table clust_fact;

Table truncated.

SQL> 
SQL> insert into clust_fact select * from clust_tmp;

1000000 rows created.

SQL> 
SQL> commit;

Commit complete.

SQL> 
SQL> exec dbms_stats.gather_table_stats(user, 'CLUST_FACT', cascade=>true);

PL/SQL procedure successfully completed.

SQL> 
SQL> select index_name, clustering_factor
  2  from user_indexes
  3  where table_name = 'CLUST_FACT';

INDEX_NAME                     CLUSTERING_FACTOR                                                                                    
------------------------------ -----------------                                                                                    
CLUST_FACT_IDX2                            75994                                                                                    
CLUST_FACT_IDX                              4889                                                                                    

SQL> 
SQL> --
SQL> -- 1 block limit (Oracle default)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),1) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4889                                                                                                                          

SQL> 
SQL> --
SQL> -- 3 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),3) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4889                                                                                                                          

SQL> 
SQL> --
SQL> -- 5 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),5) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4889                                                                                                                          

SQL> 
SQL> --
SQL> -- 9 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),9) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4886                                                                                                                          

SQL> 
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),16) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4882                                                                                                                          

SQL> 
SQL> --
SQL> -- 1 block limit (Oracle default)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),1) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
     75994                                                                                                                          

SQL> 
SQL> --
SQL> -- 3 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),3) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4967                                                                                                                          

SQL> 
SQL> --
SQL> -- 5 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),5) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4861                                                                                                                          

SQL> 
SQL> --
SQL> -- 9 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),9) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4861                                                                                                                          

SQL> 
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),16) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4861                                                                                                                          

SQL>

For multiple indexes on the same table, where no column is common between them, the clustering factor can skyrocket for some indexes when the table data is ordered to favor one particular index. That example wasn’t run and is left for you to investigate. Let’s do something similar and change the order of the columns in the second index and see how that affects the clustering factor:


SQL%gt; select index_name, clustering_factor
  2  from user_indexes
  3  where table_name = 'CLUST_FACT';

INDEX_NAME                     CLUSTERING_FACTOR                                                                                    
------------------------------ -----------------                                                                                    
CLUST_FACT_IDX2                           999810                                                                                    
CLUST_FACT_IDX                              4889                                                                                    

SQL> 
SQL> --
SQL> -- 1 block limit (Oracle default)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),1) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4889                                                                                                                          

SQL> 
SQL> --
SQL> -- 3 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),3) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4889                                                                                                                          

SQL> 
SQL> --
SQL> -- 5 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),5) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4889                                                                                                                          

SQL> 
SQL> --
SQL> -- 9 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),9) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4886                                                                                                                          

SQL> 
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX") */
  4    sys_op_countchg(substrb(t.rowid,1,15),16) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4882                                                                                                                          

SQL> 
SQL> --
SQL> -- 1 block limit (Oracle default)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),1) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    999810                                                                                                                          

SQL> 
SQL> --
SQL> -- 3 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),3) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    999810                                                                                                                          

SQL> 
SQL> --
SQL> -- 5 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),5) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    999810                                                                                                                          

SQL> 
SQL> --
SQL> -- 9 block limit
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),9) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    999810                                                                                                                          

SQL> 
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),16) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
    999810                                                                                                                          

SQL>

Even though there is a common column between the indexes the column order matters; ordering the data by the key column for the single-column index caused the clustering factor for the second index to be much greater than it was when the leading column for the second index was the ID column.

Although associated with an index, the clustering factor is affected by the table data so no series of index rebuilds will ‘fix’ a ‘bad’ value. The table data needs to be ordered to favor an index (or indexes if they contain the column used to order the table data). This, of course, improves performance for the positively affected indexes and, unfortunately, adversely affects other indexes created against that same table. Care must be taken when deciding whether or not to order the table data; ‘reasonable’ performance metrics must be decided upon before embarking on such a change, and it should be done in test or dev before considering it in production. It may be that overall ‘acceptable’ performance across all of the indexes is better than ‘outstanding’ performance for one index that causes ‘terrible’ performance for all others. It’s not a simple decision and needs to be carefully considered.

If you want to implement the modified clustering factor computed by modifying the query Oracle uses to generate this statistic you will need to manually update the index statistics using the DBMS_STATS package using one of the SET_INDEX_STATS procedures:


SQL> 
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> 
SQL> column clf new_value cl_f
SQL> 
SQL> select /*+  no_parallel_index(t, "CLUST_FACT_IDX2")  dbms_stats
  2    cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
  3    no_substrb_pad  no_expand index(t,"CLUST_FACT_IDX2") */
  4    sys_op_countchg(substrb(t.rowid,1,15),16) as clf
  5  from
  6   "BING"."CLUST_FACT" t where "ID" is not null
  7  and "SQ" is not null;

       CLF                                                                                                                          
----------                                                                                                                          
      4861                                                                                                                          

SQL> exec dbms_stats.set_index_stats(ownname => 'BING', indname => 'CLUST_FACT_IDX2', clstfct => &cl_f);

PL/SQL procedure successfully completed.

SQL>

Using the new_value functionality in SQL*Plus saves the computed value into a reusable variable to pass to procedures and functions that need it. The above code would set the clustering factor for the CLUST_FACT_IDX2 to 4861, replacing the Oracle-calculated value of 75994. Again, care must be taken when undertaking such a change since you cannot predict the resulting performance change.

Since this post was originally written Oracle has implemented an enhancement to DBMS_STATS to allow the DBA to tell Oracle how many table blocks can expect to be cached. The SET_TABLE_PREFS, SET_SCHEMA_PREFS and SET_DATABASE_PREFS procedures have been modified to accept the TABLE_CACHED_BLOCKS parameter; this tells Oracle how many index blocks will be cached so these can be ignored when incrementing the clustering factor when statistics are gathered. Looking at an example we see:


SQL> create table mytab(
  2  myid    number,
  3  mydata  varchar2(40),
  4  mydate  date)
  5  cache;

Table created.

SQL> 
SQL> begin
  2  	     for i in 1..1000000 loop
  3  		     insert into mytab
  4  		     values(trunc(9999*(dbms_random.value)), 'Line '||i, trunc(sysdate - mod(i, 4637)));
  5  	     end loop;
  6  end;
  7  /

PL/SQL procedure successfully completed.

SQL> 
SQL> commit;

Commit complete.

SQL> 
SQL> create index mytab_idx
  2  on mytab(myid, mydate);

Index created.

SQL> 
SQL> exec dbms_stats.gather_table_stats(ownname=>'BING', tabname=>'MYTAB', cascade=>true, estimate_percent=>null)

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_index_stats(ownname=>user, indname=>'MYTAB_IDX', estimate_percent=> null);

PL/SQL procedure successfully completed.

SQL> 
SQL> select table_name, index_name, clustering_factor
  2  from user_indexes
  3  where table_name = 'MYTAB';

TABLE_NAME                     INDEX_NAME                     CLUSTERING_FACTOR                                                     
------------------------------ ------------------------------ -----------------                                                     
MYTAB                          MYTAB_IDX                                 996238                                                     

SQL>

For no adjustment to the default TABLE_CACHED_BLOCKS parameter we see a really large clustering factor (and this is due to the random nature of the MYID values generated). Let’s set TABLE_CACHED_BLOCKS to 47 for MYTAB and see what Oracle calculates for the clustering factor:


SQL> exec dbms_stats.set_table_prefs(ownname=>'BING', tabname=>'MYTAB', pname=>'TABLE_CACHED_BLOCKS', pvalue=>47);

PL/SQL procedure successfully completed.

SQL> 
SQL> commit;

Commit complete.

SQL> 
SQL> exec dbms_stats.gather_table_stats(ownname=>'BING', tabname=>'MYTAB', cascade=>true, estimate_percent=>null)

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_index_stats(ownname=>user, indname=>'MYTAB_IDX', estimate_percent=> null);

PL/SQL procedure successfully completed.

SQL> 
SQL> select table_name, index_name, clustering_factor
  2  from user_indexes
  3  where table_name = 'MYTAB';

TABLE_NAME                     INDEX_NAME                     CLUSTERING_FACTOR                                                     
------------------------------ ------------------------------ -----------------                                                     
MYTAB                          MYTAB_IDX                                 987551                                                     

SQL>

The clustering factor has decreased somewhat (again it’s not a dramatic decrease due to the random nature of the MYID values). Let’s set the parameter to its maximum value (255) and see what Oracle provides as a clustering factor:


SQL> exec dbms_stats.set_table_prefs(ownname=>'BING', tabname=>'MYTAB', pname=>'TABLE_CACHED_BLOCKS', pvalue=>255);

PL/SQL procedure successfully completed.

SQL> 
SQL> commit;

Commit complete.

SQL> 
SQL> exec dbms_stats.gather_table_stats(ownname=>'BING', tabname=>'MYTAB', cascade=>true, estimate_percent=>null)

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_index_stats(ownname=>user, indname=>'MYTAB_IDX', estimate_percent=> null);

PL/SQL procedure successfully completed.

SQL> 
SQL> select table_name, index_name, clustering_factor
  2  from user_indexes
  3  where table_name = 'MYTAB';

TABLE_NAME                     INDEX_NAME                     CLUSTERING_FACTOR                                                     
------------------------------ ------------------------------ -----------------                                                     
MYTAB                          MYTAB_IDX                                 939451                                                     

SQL>

It’s reduced again, although not dramatically, but it is less than it was originally. Using this new parameter to DBMS_STATS.SET_TABLE_PREFS, DBMS_STATS.SET_SCHEMA_PREFS and DBMS_STATS.SET_DATABASE_PREFS procedures can potentially fix a pessimistic clustering factor and improve index selection and, in turn, improve query performance.

The clustering factor determines which index Oracle will choose for a query against a given table. Changing the clustering factor will influence the CBO calculations sometimes for the better, sometimes for the worse. Plan carefully and test, test, test before second-guessing Oracle.

Which just makes sense.