Oracle Tips and Tricks — David Fitzjarrell

December 11, 2018

“The Best-laid Plans…”

Filed under: General — dfitzjarrell @ 08:14

"Expect everything, I always say, and the unexpected never happens."
-- Norton Juster, The Phantom Tollbooth

Oracle version 12 introduced the hybrid histogram as a performance improvement — such histograms are based on some ‘rules’ set forth in version 12.1. Those ‘rules’ are:

        1) a value should not be found in more than one bucket
        2) the bucket size is allowed to be extended in order to contain all instances of the same distinct value
        3) adjusted bucket size cannot be less than the original size (not applicable at either end of the data set)
        4) the original number of buckets should not be reduced

This type of histogram was a major improvement over previous histograms (height-balanced and frequency) in terms of how a query plan is generated. There was, however, a bug with this type of histogram involving cardinality estimates (Bug 25994960) and a patch was supplied as a corrective measure. Since the bug was not a ‘show stopper’ many 12.2 databases remain unpatched; a backport patch for 12.1 also exists, again not widely applied. With Oracle 18 (specifically 18.3) this issue has been addressed which leads to the following situation between unpatched 12.2 and 12.1 databases and the 18.3 version. The following script based on work by Jonathan Lewis illustrates the problem.

The dbms_stats.gather_table_stats procedure is used to generate the hybrid histogram by specifying that 13 ‘buckets’ will be created. It’s not the number of buckets that is the issue, it’s the endpoints of those buckets that change between the various versions. The script used is posted below:


drop table hist_test purge;
 
execute dbms_random.seed(0)
 
create table hist_test(
        my_id           number(8,0),
        id_mod_20       number(6,0),
        id_mod_30       number(6,0),
        id_mod_50       number(6,0),
        my_rand_id      number(6,0)
)
;
 
insert into hist_test
with datasource as (
        select
                rownum my_id
        from dual
        connect by
                level  'HIST_TEST',
                method_opt       => 'for all columns size 1 for columns my_rand_id size 13'
        );
end;
/

The script above generates 22 distinct values to base the histogram on. The query below reports on those values and on the histogram information Oracle has generated:


select
        my_rand_id, count(*)
from
        hist_test
group by
        my_rand_id
order by
        my_rand_id
;
 
select
        endpoint_value                                                            value,
        endpoint_number,
        endpoint_number - lag(endpoint_number,1,0) over(order by endpoint_number) bucket_size,
        endpoint_repeat_count
from
        user_tab_histograms
where
        table_name  = 'HIST_TEST'
and     column_name = 'MY_RAND_ID'
order by
        endpoint_value
;

Since the data sets are the same between versions the data set will be reported only once:


MY_RAND_ID   COUNT(*)
---------- ----------
         1          1
         8          3
         9          1
        10          5
        11          4
        12          8
        13         14
        14          9
        15         11
        16         22
        17         34
        18         31
        19         36
        20         57
        21         44
        22         45
        23         72
        24         70
        25         87
        26        109
        27         96
        28         41
 
22 rows selected.
 

As mentioned at the beginning of the article it’s the endpoints that change. Let’s look at the histogram in an unpatched 12.2 database:


     VALUE ENDPOINT_NUMBER BUCKET_SIZE ENDPOINT_REPEAT_COUNT
---------- --------------- ----------- ---------------------
         1               1           1                     1
        15              56          55                    11
        17             112          56                    34
        18             143          31                    31
        19             179          36                    36
        20             236          57                    57
        21             280          44                    44
        22             325          45                    45
        23             397          72                    72
        24             467          70                    70
        25             554          87                    87
        26             663         109                   109
        28             800         137                    41
 
13 rows selected.
 

Oracle generates the histogram based on the estimated cardinalities and notice that, in 12.2, Oracle takes every value from 17 through 26 for the histogram. Moving to 18.3, with the same script, the histogram data is slightly different:


     VALUE ENDPOINT_NUMBER BUCKET_SIZE ENDPOINT_REPEAT_COUNT
---------- --------------- ----------- ---------------------
         1               1           1                     1
        15              56          55                    11
        17             112          56                    34
        19             179          67                    36
        20             236          57                    57
        21             280          44                    44
        22             325          45                    45
        23             397          72                    72
        24             467          70                    70
        25             554          87                    87
        26             663         109                   109
        27             759          96                    96
        28             800          41                    41
 
13 rows selected.

Because of the patch the value 18 is now missing from the histogram, replaced with the value 27, a result of correcting the cardinality estimates generated to create the histogram. It will probably be a rare occurrence for such a change to affect an execution plan, but stranger things have been known to happen. Over the years the optimizer has had its share of mishaps with cardinality estimates, and this one appears to be minor in nature. It can be confusing, though, to upgrade to 18.3 and find that a hybrid histogram has changed unexpectedly. It may be a rare occurrence to actually check histogram data without a performance issue at hand so this could easily be overlooked in a database upgrade. It is nice, though, to be aware such changes could occur so that if someone complains that a query that ran ‘fine’ in 12.2 is running a bit … ‘off’ … in 18.3 it can be explained.

Even if it wasn’t expected.

Advertisements

Leave a Comment »

No comments yet.

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Create a free website or blog at WordPress.com.

%d bloggers like this: