"If you want sense, you'll have to make it yourself."
- Norton Juster, The Phantom Tollbooth
One of the more confusing statistics in Oracle is, apparently, one called the clustering factor. Associated with an index (as evidenced by the columns in DBA_INDEXES) it’s actually dependent on the table data, more specifically the distance between ‘jumps’ for a given index key. Let’s define a ‘jump’ to be the number of blocks between rows containing the given index key starting with the first block found containing that key. That’s confusing but don’t despair, this will be explained in detail.
Oracle, in its infinite wisdom, computes a clustering factor for an index based upon the key value and the scattering of the rows containing that key across the table data. Looking at a simple example let’s look at a simple non-unique index (indexes for which the clustering factor matters) and the associated table data. The example starts by building the table and loading the data:
SQL>
SQL> create table clust_fact(
2 id number,
3 nm varchar2(40),
4 dt date,
5 sq number);
Table created.
SQL>
SQL> begin
2 for i in 1..1000000 loop
3 insert into clust_fact(id,nm,dt,sq)
4 values(mod(i,77), 'Record '||i, sysdate - mod(i, 77), i);
5 end loop;
6
7 commit;
8 end;
9 /
PL/SQL procedure successfully completed.
SQL>
Next, build the index and generate the statistics:
SQL>
SQL> create index clust_fact_idx on clust_fact(id);
Index created.
SQL>
SQL> exec dbms_stats.gather_table_stats(user, 'CLUST_FACT', cascade=>true);
PL/SQL procedure successfully completed.
SQL>
Query the clustering factor computed by Oracle:
SQL> select index_name, clustering_factor
2 from user_indexes
3 where table_name = 'CLUST_FACT';
INDEX_NAME CLUSTERING_FACTOR
------------------------------ -----------------
CLUST_FACT_IDX 370731
SQL>
The clustering factor provides a representation of how far Oracle needs to ‘jump’, on average, between blocks containing the index key value. To understand this a bit better let’s look at the code Oracle actually uses to compute the clustering factor and explain what it does:
SQL>
SQL> --
SQL> -- 1 block limit (Oracle default)
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX") */
4 sys_op_countchg(substrb(t.rowid,1,15),1) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null;
CLF
----------
370731
SQL>
Oracle uses a function named sys_op_countchg, which takes two parameters, the first 15 bytes of the rowid and the number of blocks to traverse to trigger the count to increase; by default Oracle chooses to increment the clustering factor counter every time the block address changes. If three rows, for example, exist in the current block the counter does not increment; if the next three rows are in a different block the counter is incremented by 1 (Oracle doesn’t change blocks for the next two reads after the change). This keeps going througout the table to compute the final clustering factor. Knowing the function used by Oracle and its parameters it is possible to modify the call to accept larger values for the number of block changes before the clustering factor counter is incremented. To allow Oracle to consider 3 blocks ‘close enough’ to not change the counter the query is modified as follows:
SQL>
SQL> --
SQL> -- 3 block limit
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX") */
4 sys_op_countchg(substrb(t.rowid,1,15),3) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null;
CLF
----------
370731
SQL>
Given how the values were computed for the ID column the clustering factor doesn’t change when we increase the block change count Oracle considers to be within range of the starting block address. Possible values for this second parameter are 1, 3, 5, 9 and 16. Going through the list of possible values it’s obvious with the current data loading the clustering factor won’t improve no matter which value we choose:
SQL>
SQL> --
SQL> -- 5 block limit
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX") */
4 sys_op_countchg(substrb(t.rowid,1,15),5) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null;
CLF
----------
370731
SQL>
SQL> --
SQL> -- 9 block limit
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX") */
4 sys_op_countchg(substrb(t.rowid,1,15),9) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null;
CLF
----------
370731
SQL>
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX") */
4 sys_op_countchg(substrb(t.rowid,1,15),16) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null;
CLF
----------
370731
SQL>
Since how the table data is ordered affects the clustering factor the table is reloaded with data ordered using the ID column:
SQL>
SQL> create table clust_tmp as select * from clust_fact order by id;
Table created.
SQL>
SQL> truncate table clust_fact;
Table truncated.
SQL>
SQL> insert into clust_fact select * from clust_tmp;
1000000 rows created.
SQL>
SQL> commit;
Commit complete.
SQL>
SQL> exec dbms_stats.gather_table_stats(user, 'CLUST_FACT', cascade=>true);
PL/SQL procedure successfully completed.
SQL>
Querying the clustering factor reveals it has significantly decreased since the keys are much closer together:
SQL>
SQL> select index_name, clustering_factor
2 from user_indexes
3 where table_name = 'CLUST_FACT';
INDEX_NAME CLUSTERING_FACTOR
------------------------------ -----------------
CLUST_FACT_IDX 4889
SQL>
SQL> --
SQL> -- 1 block limit (Oracle default)
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX") */
4 sys_op_countchg(substrb(t.rowid,1,15),1) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null;
CLF
----------
4889
SQL>
SQL> --
SQL> -- 3 block limit
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX") */
4 sys_op_countchg(substrb(t.rowid,1,15),3) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null;
CLF
----------
4888
SQL>
SQL> --
SQL> -- 5 block limit
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX") */
4 sys_op_countchg(substrb(t.rowid,1,15),5) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null;
CLF
----------
4888
SQL>
SQL> --
SQL> -- 9 block limit
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX") */
4 sys_op_countchg(substrb(t.rowid,1,15),9) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null;
CLF
----------
4885
SQL>
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX") */
4 sys_op_countchg(substrb(t.rowid,1,15),16) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null;
CLF
----------
4881
SQL>
It is worth noting that the clustering factor slightly decreases with larger values for the number of blocks Oracle can skip before updating the change count. The changes are not huge, and only differ by 8 from the value Oracle calculates by default to setting the ‘blocks to skip’ parameter to 16, its highest possible value. Since the original load created the SQ values in sequence the table is reloaded with the SQ values in order and a new concatenated index using both the ID and SQ columns is created. This is done to show how the clustering factor behaves for other indexes when it’s ‘optimized’ for one index:
SQL>
SQL> drop table clust_tmp purge;
Table dropped.
SQL>
SQL> create table clust_tmp as select * from clust_fact order by sq;
Table created.
SQL>
SQL> truncate table clust_fact;
Table truncated.
SQL>
SQL> insert into clust_fact select * from clust_tmp;
1000000 rows created.
SQL>
SQL> commit;
Commit complete.
SQL>
SQL> create index clust_fact_idx2 on clust_fact(id, sq);
Index created.
SQL>
SQL> exec dbms_stats.gather_table_stats(user, 'CLUST_FACT', cascade=>true);
PL/SQL procedure successfully completed.
SQL>
Querying the clustering factor the value for the original table load again appears and, again, doesn’t change even when the block skip parameter to the sys_op_countchg function is increased through the available values:
SQL>
SQL> select index_name, clustering_factor
2 from user_indexes
3 where table_name = 'CLUST_FACT';
INDEX_NAME CLUSTERING_FACTOR
------------------------------ -----------------
CLUST_FACT_IDX2 370731
CLUST_FACT_IDX 370731
SQL>
SQL> --
SQL> -- 1 block limit (Oracle default)
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX") */
4 sys_op_countchg(substrb(t.rowid,1,15),1) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null;
CLF
----------
370731
SQL>
SQL> --
SQL> -- 3 block limit
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX") */
4 sys_op_countchg(substrb(t.rowid,1,15),3) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null;
CLF
----------
370731
SQL>
SQL> --
SQL> -- 5 block limit
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX") */
4 sys_op_countchg(substrb(t.rowid,1,15),5) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null;
CLF
----------
370731
SQL>
SQL> --
SQL> -- 9 block limit
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX") */
4 sys_op_countchg(substrb(t.rowid,1,15),9) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null;
CLF
----------
370731
SQL>
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX") */
4 sys_op_countchg(substrb(t.rowid,1,15),16) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null;
CLF
----------
370731
SQL>
SQL> --
SQL> -- 1 block limit (Oracle default)
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX2") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX2") */
4 sys_op_countchg(substrb(t.rowid,1,15),1) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null
7 and "SQ" is not null;
CLF
----------
370731
SQL>
SQL> --
SQL> -- 3 block limit
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX2") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX2") */
4 sys_op_countchg(substrb(t.rowid,1,15),3) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null
7 and "SQ" is not null;
CLF
----------
370731
SQL>
SQL> --
SQL> -- 5 block limit
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX2") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX2") */
4 sys_op_countchg(substrb(t.rowid,1,15),5) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null
7 and "SQ" is not null;
CLF
----------
370731
SQL>
SQL> --
SQL> -- 9 block limit
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX2") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX2") */
4 sys_op_countchg(substrb(t.rowid,1,15),9) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null
7 and "SQ" is not null;
CLF
----------
370731
SQL>
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX2") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX2") */
4 sys_op_countchg(substrb(t.rowid,1,15),16) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null
7 and "SQ" is not null;
CLF
----------
370731
SQL>
Ordering the data by ID shows that for both indexes the clustering factor decreases, since the indexes both include the ID column. The presence of the SQ column causes the second index clustering factor to be greater when the default parameters are used; interestingly as the block skip count increases for the second index the clustering factor decreases to that of the single-column index:
SQL>
SQL> drop table clust_tmp purge;
Table dropped.
SQL>
SQL> create table clust_tmp as select * from clust_fact order by id;
Table created.
SQL>
SQL> truncate table clust_fact;
Table truncated.
SQL>
SQL> insert into clust_fact select * from clust_tmp;
1000000 rows created.
SQL>
SQL> commit;
Commit complete.
SQL>
SQL> exec dbms_stats.gather_table_stats(user, 'CLUST_FACT', cascade=>true);
PL/SQL procedure successfully completed.
SQL>
SQL> select index_name, clustering_factor
2 from user_indexes
3 where table_name = 'CLUST_FACT';
INDEX_NAME CLUSTERING_FACTOR
------------------------------ -----------------
CLUST_FACT_IDX2 75994
CLUST_FACT_IDX 4889
SQL>
SQL> --
SQL> -- 1 block limit (Oracle default)
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX") */
4 sys_op_countchg(substrb(t.rowid,1,15),1) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null;
CLF
----------
4889
SQL>
SQL> --
SQL> -- 3 block limit
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX") */
4 sys_op_countchg(substrb(t.rowid,1,15),3) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null;
CLF
----------
4889
SQL>
SQL> --
SQL> -- 5 block limit
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX") */
4 sys_op_countchg(substrb(t.rowid,1,15),5) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null;
CLF
----------
4889
SQL>
SQL> --
SQL> -- 9 block limit
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX") */
4 sys_op_countchg(substrb(t.rowid,1,15),9) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null;
CLF
----------
4886
SQL>
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX") */
4 sys_op_countchg(substrb(t.rowid,1,15),16) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null;
CLF
----------
4882
SQL>
SQL> --
SQL> -- 1 block limit (Oracle default)
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX2") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX2") */
4 sys_op_countchg(substrb(t.rowid,1,15),1) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null
7 and "SQ" is not null;
CLF
----------
75994
SQL>
SQL> --
SQL> -- 3 block limit
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX2") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX2") */
4 sys_op_countchg(substrb(t.rowid,1,15),3) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null
7 and "SQ" is not null;
CLF
----------
4967
SQL>
SQL> --
SQL> -- 5 block limit
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX2") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX2") */
4 sys_op_countchg(substrb(t.rowid,1,15),5) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null
7 and "SQ" is not null;
CLF
----------
4861
SQL>
SQL> --
SQL> -- 9 block limit
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX2") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX2") */
4 sys_op_countchg(substrb(t.rowid,1,15),9) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null
7 and "SQ" is not null;
CLF
----------
4861
SQL>
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX2") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX2") */
4 sys_op_countchg(substrb(t.rowid,1,15),16) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null
7 and "SQ" is not null;
CLF
----------
4861
SQL>
For multiple indexes on the same table, where no column is common between them, the clustering factor can skyrocket for some indexes when the table data is ordered to favor one particular index. That example wasn’t run and is left for you to investigate. Let’s do something similar and change the order of the columns in the second index and see how that affects the clustering factor:
SQL%gt; select index_name, clustering_factor
2 from user_indexes
3 where table_name = 'CLUST_FACT';
INDEX_NAME CLUSTERING_FACTOR
------------------------------ -----------------
CLUST_FACT_IDX2 999810
CLUST_FACT_IDX 4889
SQL>
SQL> --
SQL> -- 1 block limit (Oracle default)
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX") */
4 sys_op_countchg(substrb(t.rowid,1,15),1) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null;
CLF
----------
4889
SQL>
SQL> --
SQL> -- 3 block limit
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX") */
4 sys_op_countchg(substrb(t.rowid,1,15),3) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null;
CLF
----------
4889
SQL>
SQL> --
SQL> -- 5 block limit
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX") */
4 sys_op_countchg(substrb(t.rowid,1,15),5) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null;
CLF
----------
4889
SQL>
SQL> --
SQL> -- 9 block limit
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX") */
4 sys_op_countchg(substrb(t.rowid,1,15),9) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null;
CLF
----------
4886
SQL>
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX") */
4 sys_op_countchg(substrb(t.rowid,1,15),16) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null;
CLF
----------
4882
SQL>
SQL> --
SQL> -- 1 block limit (Oracle default)
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX2") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX2") */
4 sys_op_countchg(substrb(t.rowid,1,15),1) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null
7 and "SQ" is not null;
CLF
----------
999810
SQL>
SQL> --
SQL> -- 3 block limit
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX2") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX2") */
4 sys_op_countchg(substrb(t.rowid,1,15),3) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null
7 and "SQ" is not null;
CLF
----------
999810
SQL>
SQL> --
SQL> -- 5 block limit
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX2") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX2") */
4 sys_op_countchg(substrb(t.rowid,1,15),5) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null
7 and "SQ" is not null;
CLF
----------
999810
SQL>
SQL> --
SQL> -- 9 block limit
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX2") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX2") */
4 sys_op_countchg(substrb(t.rowid,1,15),9) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null
7 and "SQ" is not null;
CLF
----------
999810
SQL>
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX2") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX2") */
4 sys_op_countchg(substrb(t.rowid,1,15),16) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null
7 and "SQ" is not null;
CLF
----------
999810
SQL>
Even though there is a common column between the indexes the column order matters; ordering the data by the key column for the single-column index caused the clustering factor for the second index to be much greater than it was when the leading column for the second index was the ID column.
Although associated with an index, the clustering factor is affected by the table data so no series of index rebuilds will ‘fix’ a ‘bad’ value. The table data needs to be ordered to favor an index (or indexes if they contain the column used to order the table data). This, of course, improves performance for the positively affected indexes and, unfortunately, adversely affects other indexes created against that same table. Care must be taken when deciding whether or not to order the table data; ‘reasonable’ performance metrics must be decided upon before embarking on such a change, and it should be done in test or dev before considering it in production. It may be that overall ‘acceptable’ performance across all of the indexes is better than ‘outstanding’ performance for one index that causes ‘terrible’ performance for all others. It’s not a simple decision and needs to be carefully considered.
If you want to implement the modified clustering factor computed by modifying the query Oracle uses to generate this statistic you will need to manually update the index statistics using the DBMS_STATS package using one of the SET_INDEX_STATS procedures:
SQL>
SQL> --
SQL> -- 16 block limit (ASM configuration)
SQL> --
SQL>
SQL> column clf new_value cl_f
SQL>
SQL> select /*+ no_parallel_index(t, "CLUST_FACT_IDX2") dbms_stats
2 cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
3 no_substrb_pad no_expand index(t,"CLUST_FACT_IDX2") */
4 sys_op_countchg(substrb(t.rowid,1,15),16) as clf
5 from
6 "BING"."CLUST_FACT" t where "ID" is not null
7 and "SQ" is not null;
CLF
----------
4861
SQL> exec dbms_stats.set_index_stats(ownname => 'BING', indname => 'CLUST_FACT_IDX2', clstfct => &cl_f);
PL/SQL procedure successfully completed.
SQL>
Using the new_value functionality in SQL*Plus saves the computed value into a reusable variable to pass to procedures and functions that need it. The above code would set the clustering factor for the CLUST_FACT_IDX2 to 4861, replacing the Oracle-calculated value of 75994. Again, care must be taken when undertaking such a change since you cannot predict the resulting performance change.
Since this post was originally written Oracle has implemented an enhancement to DBMS_STATS to allow the DBA to tell Oracle how many table blocks can expect to be cached. The SET_TABLE_PREFS, SET_SCHEMA_PREFS and SET_DATABASE_PREFS procedures have been modified to accept the TABLE_CACHED_BLOCKS parameter; this tells Oracle how many index blocks will be cached so these can be ignored when incrementing the clustering factor when statistics are gathered. Looking at an example we see:
SQL> create table mytab(
2 myid number,
3 mydata varchar2(40),
4 mydate date)
5 cache;
Table created.
SQL>
SQL> begin
2 for i in 1..1000000 loop
3 insert into mytab
4 values(trunc(9999*(dbms_random.value)), 'Line '||i, trunc(sysdate - mod(i, 4637)));
5 end loop;
6 end;
7 /
PL/SQL procedure successfully completed.
SQL>
SQL> commit;
Commit complete.
SQL>
SQL> create index mytab_idx
2 on mytab(myid, mydate);
Index created.
SQL>
SQL> exec dbms_stats.gather_table_stats(ownname=>'BING', tabname=>'MYTAB', cascade=>true, estimate_percent=>null)
PL/SQL procedure successfully completed.
SQL> exec dbms_stats.gather_index_stats(ownname=>user, indname=>'MYTAB_IDX', estimate_percent=> null);
PL/SQL procedure successfully completed.
SQL>
SQL> select table_name, index_name, clustering_factor
2 from user_indexes
3 where table_name = 'MYTAB';
TABLE_NAME INDEX_NAME CLUSTERING_FACTOR
------------------------------ ------------------------------ -----------------
MYTAB MYTAB_IDX 996238
SQL>
For no adjustment to the default TABLE_CACHED_BLOCKS parameter we see a really large clustering factor (and this is due to the random nature of the MYID values generated). Let’s set TABLE_CACHED_BLOCKS to 47 for MYTAB and see what Oracle calculates for the clustering factor:
SQL> exec dbms_stats.set_table_prefs(ownname=>'BING', tabname=>'MYTAB', pname=>'TABLE_CACHED_BLOCKS', pvalue=>47);
PL/SQL procedure successfully completed.
SQL>
SQL> commit;
Commit complete.
SQL>
SQL> exec dbms_stats.gather_table_stats(ownname=>'BING', tabname=>'MYTAB', cascade=>true, estimate_percent=>null)
PL/SQL procedure successfully completed.
SQL> exec dbms_stats.gather_index_stats(ownname=>user, indname=>'MYTAB_IDX', estimate_percent=> null);
PL/SQL procedure successfully completed.
SQL>
SQL> select table_name, index_name, clustering_factor
2 from user_indexes
3 where table_name = 'MYTAB';
TABLE_NAME INDEX_NAME CLUSTERING_FACTOR
------------------------------ ------------------------------ -----------------
MYTAB MYTAB_IDX 987551
SQL>
The clustering factor has decreased somewhat (again it’s not a dramatic decrease due to the random nature of the MYID values). Let’s set the parameter to its maximum value (255) and see what Oracle provides as a clustering factor:
SQL> exec dbms_stats.set_table_prefs(ownname=>'BING', tabname=>'MYTAB', pname=>'TABLE_CACHED_BLOCKS', pvalue=>255);
PL/SQL procedure successfully completed.
SQL>
SQL> commit;
Commit complete.
SQL>
SQL> exec dbms_stats.gather_table_stats(ownname=>'BING', tabname=>'MYTAB', cascade=>true, estimate_percent=>null)
PL/SQL procedure successfully completed.
SQL> exec dbms_stats.gather_index_stats(ownname=>user, indname=>'MYTAB_IDX', estimate_percent=> null);
PL/SQL procedure successfully completed.
SQL>
SQL> select table_name, index_name, clustering_factor
2 from user_indexes
3 where table_name = 'MYTAB';
TABLE_NAME INDEX_NAME CLUSTERING_FACTOR
------------------------------ ------------------------------ -----------------
MYTAB MYTAB_IDX 939451
SQL>
It’s reduced again, although not dramatically, but it is less than it was originally. Using this new parameter to DBMS_STATS.SET_TABLE_PREFS, DBMS_STATS.SET_SCHEMA_PREFS and DBMS_STATS.SET_DATABASE_PREFS procedures can potentially fix a pessimistic clustering factor and improve index selection and, in turn, improve query performance.
The clustering factor determines which index Oracle will choose for a query against a given table. Changing the clustering factor will influence the CBO calculations sometimes for the better, sometimes for the worse. Plan carefully and test, test, test before second-guessing Oracle.
Which just makes sense.
[…] https://dfitzjarrell.wordpress.com/2015/07/28/it-isnt-the-index/ […]
Pingback by CLUSTERING FACTOR DEMYSTIFIED PART - I - ORACLE IN ACTION — July 28, 2015 @ 22:03
[…] https://dfitzjarrell.wordpress.com/2015/07/28/it-isnt-the-index/ ——————————————————————————————————– […]
Pingback by understanding Clustering factor | venkat4005 — November 11, 2015 @ 23:54