Oracle Tips and Tricks — David Fitzjarrell

December 28, 2018

Cloning Around

Filed under: General — dfitzjarrell @ 17:41

"We never choose which words to use, for as long as they mean what
they mean to mean, we don’t care if they make sense or nonsense." 
― Norton Juster, The Phantom Tollbooth

Sometimes it’s necessary to clone an existing ORACLE_HOME; one case for this is when the business requires a new ORACLE_HOME when CPU patchsets are applied. With some operating systems cloning a home is simiply a matter of creating a suitable archive (tar, pax, cpio) and extracting the contents to the destination directory. Some operating systems, however, aren’t as well-mannered. Linux is one such operating system and cloning an ORACLE_HOME requires the use the Universal Installer to set things ‘right’ once the copy has completed. Unfortunately this cloning process tends to have issues of its own; one such issue is the setting of the ddl_lock_timeout parameter. Let’s look at the process, the error and how to get around it.

Presuming the business does require a new home to patch it to the latest CPU so the original home can be preserved let’s proceed with the cloning process. This imaginary database will be named GLORBLE (reminding us that this is not an real database) and the original database home will be /u01/zapp/oracle/product/12.2.0/glorble. The April 2018 CPU will be applied so the cloned home will be /u01/app/oracle/product/12.2.0/glorble_apr2018, to make it obvious that it contains software patched with the April 2018 CPU.

The destination directory is created and the original database home files are copied to it from an archive — pax, tar, cpio, gzip can all be used to create such an archive. Presuming this will be a repeated task the clone command can be saved as a script (possibly named clone_home.sh):


$ORACLE_HOME/perl/bin/perl $ORACLE_HOME/clone/bin/clone.pl ORACLE_BASE="/u01/app/oracle/"
ORACLE_HOME="/u01/app/oracle/product/12.2.0/glorble_apr2018" 
OSDBA_GROUP=dba OSOPER_GROUP=oper -defaultHomeName

For readability the command has been split into three lines; it must be a single line in the script or it won’t execute.

The environment must be set to ‘point’ to the home being cloned, then the script can be executed. Prodigous output is generated that won’t be provided here; the ‘problems’ begin after the home is cloned and the root.sh script has been run. Once cloned utilities like sqlplus will perform generally as expected until it becomes desirable or necessary to set a dynamic init parameter, like ddl_lock_timeout, when an unexpected ORA-00068 error is thrown, even when the parameter value is within the accepted range:


SQL> alter session set ddl_lock_timeout = 5;
ERROR:
ORA-00068: invalid value "" for parameter ddl_lock_timeout, must be
between 0 and 1000000

The clue appears to be in the value reported by Oracle; something in the linking step of the cloning process apparently failed. What that may be is not clear and Oracle Support has no document providing any explanation other than an upgrade process was not completed. Fortunately this home will be patched in the next step, which requires the most recent release of OPatch (patch number p6880880) which simply needs to be unzipped into the cloned home. Using the current OPatch the cloned home is processed:


$ cd 
$ $ORACLE_HOME/OPatch/opatch apply
...

As if by magic the ORA-00068 errors disappear. In reality the patch process applies software updates amd re-links the software in a manner different than the Perl script used to clone the home, which executes the setup command in $ORACLE_HOME/oui/bin. It isn’t clear WHAT is different between the two processes but no errors are thrown after patching is completed.

It’s truly an oddity; Oracle supplied tools and utiliities are expected to work and produce usable results. In this case something was missed in the overall cloning process and for those cloning an existing ORACLE_HOME for another database (which isn’t necessary but isn’t wrong) because they are running Oracle on an Oracle VM built from a template and have no software distribution to use this could be a major problem.

Take care when using this cloning process as things may not be as they seem on the surface. Many bug fixes have gone into the Perl script Oracle provides yet it appears there is at least one that hasn’t been addressed (possibly because it hasn’t been reported). This has been reported to Oracle support so hopefully an answer is forthcoming. In the interim if a home is cloned it should be patched to apply the latest security patches and to avoid the ORA-00068 surprises.

And that just makes sense.

Advertisements

December 11, 2018

“The Best-laid Plans…”

Filed under: General — dfitzjarrell @ 08:14

"Expect everything, I always say, and the unexpected never happens."
-- Norton Juster, The Phantom Tollbooth

Oracle version 12 introduced the hybrid histogram as a performance improvement — such histograms are based on some ‘rules’ set forth in version 12.1. Those ‘rules’ are:

        1) a value should not be found in more than one bucket
        2) the bucket size is allowed to be extended in order to contain all instances of the same distinct value
        3) adjusted bucket size cannot be less than the original size (not applicable at either end of the data set)
        4) the original number of buckets should not be reduced

This type of histogram was a major improvement over previous histograms (height-balanced and frequency) in terms of how a query plan is generated. There was, however, a bug with this type of histogram involving cardinality estimates (Bug 25994960) and a patch was supplied as a corrective measure. Since the bug was not a ‘show stopper’ many 12.2 databases remain unpatched; a backport patch for 12.1 also exists, again not widely applied. With Oracle 18 (specifically 18.3) this issue has been addressed which leads to the following situation between unpatched 12.2 and 12.1 databases and the 18.3 version. The following script based on work by Jonathan Lewis illustrates the problem.

The dbms_stats.gather_table_stats procedure is used to generate the hybrid histogram by specifying that 13 ‘buckets’ will be created. It’s not the number of buckets that is the issue, it’s the endpoints of those buckets that change between the various versions. The script used is posted below:


drop table hist_test purge;
 
execute dbms_random.seed(0)
 
create table hist_test(
        my_id           number(8,0),
        id_mod_20       number(6,0),
        id_mod_30       number(6,0),
        id_mod_50       number(6,0),
        my_rand_id      number(6,0)
)
;
 
insert into hist_test
with datasource as (
        select
                rownum my_id
        from dual
        connect by
                level  'HIST_TEST',
                method_opt       => 'for all columns size 1 for columns my_rand_id size 13'
        );
end;
/

The script above generates 22 distinct values to base the histogram on. The query below reports on those values and on the histogram information Oracle has generated:


select
        my_rand_id, count(*)
from
        hist_test
group by
        my_rand_id
order by
        my_rand_id
;
 
select
        endpoint_value                                                            value,
        endpoint_number,
        endpoint_number - lag(endpoint_number,1,0) over(order by endpoint_number) bucket_size,
        endpoint_repeat_count
from
        user_tab_histograms
where
        table_name  = 'HIST_TEST'
and     column_name = 'MY_RAND_ID'
order by
        endpoint_value
;

Since the data sets are the same between versions the data set will be reported only once:


MY_RAND_ID   COUNT(*)
---------- ----------
         1          1
         8          3
         9          1
        10          5
        11          4
        12          8
        13         14
        14          9
        15         11
        16         22
        17         34
        18         31
        19         36
        20         57
        21         44
        22         45
        23         72
        24         70
        25         87
        26        109
        27         96
        28         41
 
22 rows selected.
 

As mentioned at the beginning of the article it’s the endpoints that change. Let’s look at the histogram in an unpatched 12.2 database:


     VALUE ENDPOINT_NUMBER BUCKET_SIZE ENDPOINT_REPEAT_COUNT
---------- --------------- ----------- ---------------------
         1               1           1                     1
        15              56          55                    11
        17             112          56                    34
        18             143          31                    31
        19             179          36                    36
        20             236          57                    57
        21             280          44                    44
        22             325          45                    45
        23             397          72                    72
        24             467          70                    70
        25             554          87                    87
        26             663         109                   109
        28             800         137                    41
 
13 rows selected.
 

Oracle generates the histogram based on the estimated cardinalities and notice that, in 12.2, Oracle takes every value from 17 through 26 for the histogram. Moving to 18.3, with the same script, the histogram data is slightly different:


     VALUE ENDPOINT_NUMBER BUCKET_SIZE ENDPOINT_REPEAT_COUNT
---------- --------------- ----------- ---------------------
         1               1           1                     1
        15              56          55                    11
        17             112          56                    34
        19             179          67                    36
        20             236          57                    57
        21             280          44                    44
        22             325          45                    45
        23             397          72                    72
        24             467          70                    70
        25             554          87                    87
        26             663         109                   109
        27             759          96                    96
        28             800          41                    41
 
13 rows selected.

Because of the patch the value 18 is now missing from the histogram, replaced with the value 27, a result of correcting the cardinality estimates generated to create the histogram. It will probably be a rare occurrence for such a change to affect an execution plan, but stranger things have been known to happen. Over the years the optimizer has had its share of mishaps with cardinality estimates, and this one appears to be minor in nature. It can be confusing, though, to upgrade to 18.3 and find that a hybrid histogram has changed unexpectedly. It may be a rare occurrence to actually check histogram data without a performance issue at hand so this could easily be overlooked in a database upgrade. It is nice, though, to be aware such changes could occur so that if someone complains that a query that ran ‘fine’ in 12.2 is running a bit … ‘off’ … in 18.3 it can be explained.

Even if it wasn’t expected.

Blog at WordPress.com.