Oracle Tips and Tricks — David Fitzjarrell

August 18, 2021

“Now You See Me…”

Filed under: General — dfitzjarrell @ 15:56

Oracle offers the possibility of encrypted TNS connections through the TCPS protocol. This requires the use of SSL, and as a result, certificates to verify both the client and the server. Creating such a configuration is not difficult; the rest of this article will provide the steps and commands to successfully complete this task.

An SSL configuration requires, on the Oracle side, a wallet, similar to TDE. A server-side wallet is required to run a listener using the TCPS protocol, and the orapki utility is used to create it. A client-side wallet also needs to be created; if using the Instant Client this can be done in a ’round-about’ fashion. The example shown here involves two wallets, with the client side not sending any certificate information; the wallet on the client side is present only to allow Oracle to negotiate connections.

Creating a server-side wallet with orapki isn’t difficult; a directory will be created to contain the server-side wallet files:

mkdir -p /my/tns

Next the wallet is created with orapki; auto-login is required so that the listener and database can open the wallet without being prompted for the password:

orapki wallet create -wallet "/my/tns" -pwd [password] -auto_login_local

Next the self-signed certificate for the server is created. To ensure a proper configuration the common name (CN) matches the TNS service name. The -keysize parameter is the size, in bits, of the key generated for the certificate and the -validity parameter specifies, in days, the time period for which the certificate is valid. The following command is used:

orapki wallet add -wallet "/my/tns" -pwd [password] -dn "CN=mydb" -keysize 1024 -self_signed -validity 3650

It’s good to see that the wallet is valid and that it contains what is expected. The following command provides that information:

orapki wallet display -wallet "/my/tns"

If the client side will have a wallet the server certificate needs to be exported:

orapki wallet export -wallet "/my/tns" -pwd [password] -dn "CN=mydb" -cert /tmp/mydb-certificate.crt

Verify the contents of the exported file:

cat /tmp/mydb-certificate.crt

For a full client installation, the client-side wallet and certificate can be created on the client machine using the same commands, substituting the local wallet directory. For Instant Client installations a certificate is still needed, but it can be created on the database server and copied to the client location. Creating a client-side wallet in /my/tns/client with a certificate named simon, the steps would be:

mkdir -p /my/tns/client
orapki wallet create -wallet "/my/tns/client" -pwd [password] -auto_login_local
orapki wallet add -wallet "/my/tns/client" -pwd [password] -dn "CN=simon" -keysize 1024 -self_signed -validity 3650
orapki wallet display -wallet "/my/tns/client"
orapki wallet export -wallet "/my/tns/client" -pwd [password] -dn "CN=simon" -cert /tmp/simon-certificate.crt
cat /tmp/simon-certificate.crt

For instant client installations the wallet just created needs to be copied to the client machine; the directory does not need to match the directory where the wallet was created on the server, but it does need to be correctly specified in the local sqlnet.ora file. Once this is completed move on to the next step, discussed below.
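
As a rough sketch of that copy step, with illustrative host names and paths only, something along these lines would work on Linux (note that a wallet intended to be moved between machines may need to be created with -auto_login rather than -auto_login_local, since a local auto-login wallet is tied to the host that created it):

# on the database server -- bundle the client wallet directory
tar -czf /tmp/client_wallet.tar.gz -C /my/tns client

# copy the bundle to the client machine (host name is an example)
scp /tmp/client_wallet.tar.gz oracle@client-host:/tmp

# on the client machine -- unpack into the directory sqlnet.ora will reference
mkdir -p /my/tns
tar -xzf /tmp/client_wallet.tar.gz -C /my/tns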

Both certificates are ready to be shared. The client-side wallet needs the server certificate added to it and the following command completes that step:

orapki wallet add -wallet /my/tns/client -pwd [password] -trusted_cert -cert /tmp/mydb-certificate.crt

The server certificate is named for the database the client will connect to. Display the contents of the client-side wallet:

orapki wallet display -wallet "/my/tns/client" -pwd [password]

The output for this example is:

Oracle PKI Tool : Version 12.2.0.1.0
Copyright (c) 2004, 2016, Oracle and/or its affiliates. All rights reserved.

Requested Certificates:
User Certificates:
Subject:        CN=simon
Trusted Certificates:
Subject:        CN=mydb
Subject:        CN=simon

The client-side wallet is ready for use. Adding the client certificate to the server wallet uses basically the same command:

orapki wallet add -wallet /my/tns -pwd [password] -trusted_cert -cert /tmp/simon-certificate.crt

Displaying the contents of the server-side wallet should provide similar output:

Oracle PKI Tool : Version 12.2.0.1.0
Copyright (c) 2004, 2016, Oracle and/or its affiliates. All rights reserved.

Requested Certificates:
User Certificates:
Subject:        CN=mydb
Trusted Certificates:
Subject:        CN=mydb
Subject:        CN=simon

The server-side sqlnet.ora file now needs the following additions:

WALLET_LOCATION =
   (SOURCE =
     (METHOD = FILE)
     (METHOD_DATA =
       (DIRECTORY = /my/tns)
     )
   )

SQLNET.AUTHENTICATION_SERVICES = (TCPS,NTS,BEQ)
SSL_CLIENT_AUTHENTICATION = FALSE
SSL_CIPHER_SUITES = (SSL_RSA_WITH_AES_256_CBC_SHA, SSL_RSA_WITH_3DES_EDE_CBC_SHA)

(SSL_CLIENT_AUTHENTICATION is set to FALSE when client-side information is not required. When two-way authentication is required then this parameter must be set to TRUE. The remainder of this article presumes that one-way authentication is in use. Make the necessary parameter changes should two-way authentication be required.)
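
For two-way authentication the same flag is simply set to TRUE in the server-side sqlnet.ora, the client-side sqlnet.ora and the listener.ora shown below; the client certificate already added to the server wallet is what the server then verifies. A minimal sketch of the changed line:

SSL_CLIENT_AUTHENTICATION = TRUE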

These additions provide the wallet location, the available authentication services and SSL cipher suites to use. The TCPS protocol must be specifically listed to allow connections to a listener using that protocol. The listener.ora file also needs to be modified by including the following additions:

SSL_CLIENT_AUTHENTICATION = FALSE

WALLET_LOCATION =
  (SOURCE =
    (METHOD = FILE)
    (METHOD_DATA =
      (DIRECTORY = /my/tns)
    )
  )

The listener configuration then needs the TCPS protocol running on a separate port:

(ADDRESS = (PROTOCOL = TCPS)(HOST = mydbserver.youbetcha.com)(PORT = 2921))

This can be done in a single listener.ora file or in the listener.ora file for a dedicated listener installation. The complete listener.ora file for the dedicated listener installation for this example is shown below:

SSL_CLIENT_AUTHENTICATION = FALSE

WALLET_LOCATION =
  (SOURCE =
    (METHOD = FILE)
    (METHOD_DATA =
      (DIRECTORY = /my/tns)
    )
  )

LISTENER_ENCPT =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCPS)(HOST = mydbserver.youbetcha.com)(PORT = 2921))
    )
 )

INBOUND_CONNECT_TIMEOUT_LISTENER_ENCPT = 120
CONNECT_TIMEOUT_LISTENER_ENCPT = 120

SID_LIST_LISTENER_ENCPT =
  (SID_LIST =
    (SID_DESC=
      (SID_NAME=PLSExtProc)
      (ORACLE_HOME=/u04/oracle/listener_encpt)
      (PROGRAM=extproc)
      (ENVS = "EXTPROC_DLLS=ANY,LD_LIBRARY_PATH=/u04/oracle/listener_encpt/lib")
   )
    (SID_DESC =
      (ORACLE_HOME = /u04/app/oracle/snord)
      (SID_NAME = snord)
      (ENVS="LD_LIBRARY_PATH=/u04/oracle/snord/lib")
    )
  )

In this example the database in question is already registered with a listener on another port using the standard TCP protocol, so local_listener cannot be changed; the database is instead configured in this listener as a static registration. The listener can now be started and its status checked.
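
Assuming the dedicated listener name shown in the configuration above, the start and status commands would be similar to:

lsnrctl start listener_encpt
lsnrctl status listener_encpt

The status output for this example: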

LSNRCTL for *******: Version 12.2.0.1.0 - Production on 20-SEP-2018 09:41:44

Copyright (c) 1991, 2016, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCPS)(HOST=mydbserver.youbetcha.com)(PORT=2921)))
STATUS of the LISTENER
------------------------
Alias                     listener_encpt
Version                   TNSLSNR for ******** Version 12.2.0.1.0 - Production
Start Date                20-SEP-2018 08:26:12
Uptime                    0 days 1 hr. 15 min. 32 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u04/oracle/listener_encpt/network/admin/listener.ora
Listener Log File         /u04/oracle/diag/tnslsnr/mydbserver/listener_encpt/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcps)(HOST=mydbserver.youbetcha.com)(PORT=2921)))
Services Summary...
Service "PLSExtProc" has 1 instance(s).
  Instance "PLSExtProc", status UNKNOWN, has 1 handler(s) for this service...
Service "snord" has 1 instance(s).
  Instance "snord", status UNKNOWN, has 1 handler(s) for this service...
The command completed successfully

With the listener successfully started the next step is to get the client side configured. The client-side sqlnet.ora file will need to contain the directory location using the following example as a guide:

WALLET_LOCATION =
   (SOURCE =
     (METHOD = FILE)
     (METHOD_DATA =
       (DIRECTORY = /my/tns/wallet)
     )
   )
 
SQLNET.AUTHENTICATION_SERVICES = (TCPS,NTS,BEQ)
SSL_CLIENT_AUTHENTICATION = FALSE
SSL_CIPHER_SUITES = (SSL_RSA_WITH_AES_256_CBC_SHA, SSL_RSA_WITH_3DES_EDE_CBC_SHA)

The client tnsnames.ora file will also need an entry for the encrypted database connection:

mydb_ssl           = (DESCRIPTION=(ADDRESS= (PROTOCOL=tcps) (PORT=2921)  (HOST=mydbserver.youbetcha.com)) (CONNECT_DATA=(SID=snord)))

Obviously, the port for the encrypted listener will need to be opened in any intervening firewalls prior to testing the configuration.
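
On a Linux server running firewalld, for example, that might look like the following (adjust for whatever firewall is actually in use; the port matches the listener configuration above):

firewall-cmd --permanent --add-port=2921/tcp
firewall-cmd --reload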

At this point all of the necessary steps have been completed and testing can begin. The client should successfully connect through the encrypted port and behave as any basic TNS connection would:

$ sqlplus encrypt_test@mydb_ssl

SQL*Plus: Release 12.2.0.1.0 Production on Wed Sep 26 15:54:35 2018

Copyright (c) 1982, 2016, Oracle.  All rights reserved.

Enter password:
Last Successful login time: Wed Sep 26 2018 13:07:50 -06:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

ENCRYPT_TEST @ mydb_ssl >

Verifying that the connection is utilizing the TCPS protocol is simply a matter of executing the following queries:

--
-- A NULL network_service_banner row indicates TCPS;
-- all other protocols will display their banner
--
select NETWORK_SERVICE_BANNER
from v$session_connect_info
where SID = sys_context('USERENV','SID')
union
select SYS_CONTEXT('USERENV','NETWORK_PROTOCOL') from dual
order by 1 nulls first
/

column program format a45
set linesize 150

select *
from
(select sid,program,
  case when program not like 'oracle@% (P%)' then
  (select max(case
when NETWORK_SERVICE_BANNER like '%TCP/IP%'
      then 'TCP'
when NETWORK_SERVICE_BANNER like '%Bequeath%'
      then 'BEQUEATH'
when NETWORK_SERVICE_BANNER like '%IPC%'
      then 'IPC'
when NETWORK_SERVICE_BANNER like '%SDP%'
      then 'SDP'
when NETWORK_SERVICE_BANNER like '%NAMED P%'
      then 'Named pipe'
when NETWORK_SERVICE_BANNER is null
      then 'TCPS' end)
    from V$SESSION_CONNECT_INFO i
    where i.sid=s.sid) end protocol
  from v$session s)
where protocol is not null
/

set linesize 80

The TCPS protocol can be included in a single listener configuration; however, if managing the encrypted traffic requires a listener shutdown, that would also stop the TCP endpoint and prevent any remote connections to the affected database. It is recommended that a second, dedicated listener be configured to handle the encrypted traffic so that management and monitoring are simplified and, should problems arise, the encrypted listener can be shut down safely.

With the apparently high incidence of breaches into databases of any vendor, setting up encrypted traffic may be the best solution to curb or eliminate intrusions. With one-way authentication the client wallet simply needs to exist; certificate information from that wallet does not need to be imported into the server-side wallet. With two-way authentication, any client connections from systems whose certificates are not registered in the server wallet will be disconnected. Of course, if the port isn’t open to the public no client connections can get through at all, which should be the first step in securing enterprise database systems. The port number should be one that isn’t widely used or published (thus 2921 is NOT a good choice), although that won’t stop creative and dedicated hackers, as they will likely resort to port scans. The netstat utility can reveal the ports Oracle databases are listening on; however, the TCPS protocol is not identified as such for encrypted listener configurations. Attempting a plain TCP connection against a port configured for TCPS produces this response:

ERROR:
ORA-12547: TNS:lost contact

Since the TCPS protocol is not likely to be reported by port scanners, having the port number of a listener configured for encryption does the hacker no good. This is where encrypting the database network traffic can possibly prevent data loss. Remember, too, that this is only one aspect of a robust database security configuration; encrypted connections can only do so much to deter attackers.

Another form of encryption is available with Oracle, called native encryption. This type uses settings in the sqlnet.ora file on both the client and the server and does NOT use the TCPS protocol. It is also less robust since, depending on the settings chosen, connections can succeed whether or not the traffic is encrypted. Configuring native encryption uses the following sqlnet.ora settings on the server:

sqlnet.encryption_server=required
sqlnet.encryption_types_server=(AES256)
sqlnet.crypto_checksum_server=required
sqlnet.crypto_checksum_types_server=(SHA1)

On the client side:

sqlnet.encryption_client=required
sqlnet.encryption_types_client=(AES256)
sqlnet.crypto_checksum_client=required
sqlnet.crypto_checksum_types_client=(SHA1)

The key setting is sqlnet.encryption_[client|server]; the combination of client and server values determines whether native encryption is used. The allowable values for both client and server are:

ACCEPTED
REJECTED
REQUESTED
REQUIRED

Valid encryption types are:

RC4_256
RC4_128
RC4_56
RC4_40
AES256
AES192
AES128
3DES168
3DES112
DES
DES40

Valid checksum types are:

MD5
SHA1

The following combinations of settings for sqlnet.encryption_[client|server] disable native encryption:

Client Setting            Server Setting                Encryption and Data Negotiation
========================= ============================= ===============================
REJECTED                  REJECTED                      OFF
ACCEPTED                  REJECTED                      OFF
REQUESTED                 REJECTED                      OFF
ACCEPTED                  ACCEPTED                      OFF (DEFAULT VALUE)
REJECTED                  ACCEPTED                      OFF
REJECTED                  REQUESTED                     OFF

The following setting combinations cause connections to be rejected:

Client Setting            Server Setting                Encryption and Data Negotiation
========================= ============================= ===============================
REJECTED                  REQUIRED                      Connection fails
REQUIRED                  REJECTED                      Connection fails

All other combinations enable native encryption. Since connections can succeed without encryption enabled this is best used as a secondary encryption method, on top of using an SSL configuration to enable the TCPS protocol. It can be used in place of TCPS, but it isn’t as robust and can allow unencrypted connections to the database. Think carefully about using this strategy as the sole method of encryption.
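
To confirm what was actually negotiated for a given session, the same V$SESSION_CONNECT_INFO view used earlier can be queried; banner rows naming an algorithm (AES256, for example) indicate that native encryption and/or checksumming is in effect for the connection:

select network_service_banner
from v$session_connect_info
where sid = sys_context('USERENV','SID');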

Encrypting database network traffic may not be the simplest option to deploy but it may just give the DBA (and the CEO) a better night’s sleep knowing that the network packets are unreadable to the outside world. It’s worth keeping this in mind when designing and implementing sensitive database systems.

August 10, 2021

“It’s Missing Something…”

Filed under: General — dfitzjarrell @ 13:45

"Is this the place where numbers are made?" asked Milo as the car lurched
again, and this time the Dodecahedron sailed off down the mountainside, head over
heels and grunt over grimace, until he landed sad side up at what looked like the
entrance to a cave.
"They're not made," he replied, as if nothing had happened. "You have to dig
for them. Don't you know anything at all about numbers?"

-- Norton Juster, "The Phantom Tollbooth"


For several years the FORALL loop, in conjunction with the BULK COLLECT operation, has allowed PL/SQL developers to more efficiently process data by eliminating the row-by-row fetch cycle in favor of a bulk load into memory. Such a change can speed processing up considerably, but it can also throw a ‘monkey wrench’ into the works if the LIMIT doesn’t evenly divide the row count. What can be more frustrating is that the same code works sometimes and fails sometimes, all without modifying a single character. Let’s look at why this can happen and what can be done to fix it.

BULK COLLECT does two things, basically: it fetches ‘chunks’ of data into one or more collections of records, and it keeps track of the remaining data, in a rudimentary way, by watching for ‘no data found’ to signal the end of the result set. Unfortunately the fetch that raises that ‘no data found’ condition also sets %NOTFOUND for the cursor, and when the row count doesn’t divide evenly by the LIMIT selected that last fetch is a partial fetch. The partial fetch does populate the record variable or variables, so the data is there at the time it is fetched; the issue is that when a FORALL/BULK COLLECT loop with a LIMIT set exits on cursor%NOTFOUND, the rows from that final partial fetch are never processed and are lost when the cursor is closed. Let’s diagram (in a way) that behavior:

Fetch 1, limit of 12, result set count of 37, loop still runs:

‘A’
‘B’
‘C’
‘D’
‘E’
‘F’
‘G’
‘H’
‘I’
‘J’
‘K’
‘L’

25 rows remain.  Fetch 2, same conditions, loop still runs:

‘M’
‘N’
‘O’
‘P’
‘Q’
‘R’
‘S’
‘T’
‘U’
‘V’
‘W’
‘X’

13 rows remain.  Fetch 3, same conditions, loop still runs:

‘Y’
‘Z’
‘AA’
‘AB’
‘AC’
‘AD’
‘AE’
‘AF’
‘AG’
‘AH’
‘AI’
‘AJ’

1 row remains.  Fetch 4, same conditions, loop looks for %NOTFOUND and terminates before last row can be processed:

‘AK’
[NODATAFOUND]
If that’s still confusing, the example below should clear things up. Let’s take the ubiquitous EMP table, with 14 rows of data, and try to copy it to a table named EMP_TEST, an exact copy of EMP in definition but completely empty:


SQL> 
SQL> --
SQL> -- Create empty table just like EMP
SQL> --
SQL> -- We'll populate this with a FORALL loop
SQL> -- using BULK COLLECT and LIMIT
SQL> --
SQL> -- Well, we'll try to but the first attempt will fail
SQL> -- to completely populate the table
SQL> --
SQL> create table emp_test as select * From emp where 0=1;

Table created.

SQL> 
Let’s now set up an example that will definitely fail to populate EMP_TEST; we’ll exit the loop with the %NOTFOUND condition and a LIMIT that will leave records behind:


SQL> 
SQL> --
SQL> -- Populate the table
SQL> --
SQL> -- Exit the loop in the usual way
SQL> --
SQL> -- Given the LIMIT value this won't get
SQL> -- all of the records inserted
SQL> --
SQL> declare
  2       type empcoltyp is table of emp%rowtype;
  3       emp_c empcoltyp;
  4  
  5       cursor get_emp_data is
  6       select * from emp;
  7  
  8  begin
  9       open get_emp_data;
 10       loop
 11       fetch get_emp_data bulk collect into emp_c limit 9;
 12       exit when get_emp_data%notfound;
 13  
 14       for i in 1..emp_c.count loop
 15             insert into emp_test (empno, ename, sal)
 16             values (emp_c(i).empno, emp_c(i).ename, emp_c(i).sal);
 17       end loop;
 18  
 19       end loop;
 20  
 21       commit;
 22  
 23  end;
 24  /

PL/SQL procedure successfully completed.

SQL> 
SQL> --
SQL> -- Verify that the data was inserted
SQL> --
SQL> -- The count from emp_test should match the count
SQL> -- from emp
SQL> --
SQL> -- It doesn't
SQL> --
SQL> select count(*) from emp;

  COUNT(*)
----------
        14

SQL> select count(*) from emp_test;

  COUNT(*)
----------
         9

SQL> 
SQL> --
SQL> -- Let's see what's missing
SQL> --
SQL> select * From emp where empno not in (select empno from emp_test);

     EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
      7844 TURNER     SALESMAN        7698 08-SEP-81       1500          0         30
      7934 MILLER     CLERK           7782 23-JAN-82       1300                    10
      7902 FORD       ANALYST         7566 03-DEC-81       3000                    20
      7876 ADAMS      CLERK           7788 12-JAN-83       1100                    20
      7900 JAMES      CLERK           7698 03-DEC-81        950                    30

SQL> 
SQL> --
SQL> -- Seems that we are 5 rows short
SQL> --
SQL> -- Unfortunately NOTFOUND is set on the incomplete fetch
SQL> -- from emp (the fetch that returned less than LIMIT)
SQL> -- so the loop exited before processing the remaining data
SQL> --
SQL> -- Ooooops
SQL> --
SQL> 

Had there been 18 rows, or had the LIMIT been 7, the insert would have completed, but that would have provided a false sense of security: the next FORALL/BULK COLLECT/LIMIT process could fail simply because the row source wasn’t an even multiple of the LIMIT. The proper fix is to use an attribute of the collection itself, COUNT, and exit when it returns 0, indicating that all of the fetched records have been processed. It requires one more pass through the loop, but that doesn’t add any appreciable time to the execution. Rewriting the process to check the collection count, rather than %NOTFOUND, populates the EMP_TEST table completely:


SQL> 
SQL> --
SQL> -- Let's try this again
SQL> --
SQL> truncate table emp_test;

Table truncated.

SQL> 
SQL> --
SQL> -- Change the exit strategy to look for a 0 count
SQL> -- for the populated variable
SQL> --
SQL> -- This will get all of the rows from EMP into EMP_TEST
SQL> --
SQL> declare
  2       type empcoltyp is table of emp%rowtype;
  3       emp_c empcoltyp;
  4  
  5       cursor get_emp_data is
  6       select * from emp;
  7  
  8  begin
  9       open get_emp_data;
 10       loop
 11       fetch get_emp_data bulk collect into emp_c limit 9;
 12       exit when emp_c.count = 0;
 13  
 14       for i in 1..emp_c.count loop
 15             insert into emp_test (empno, ename, sal)
 16             values (emp_c(i).empno, emp_c(i).ename, emp_c(i).sal);
 17       end loop;
 18  
 19       end loop;
 20  
 21       commit;
 22  
 23  end;
 24  /

PL/SQL procedure successfully completed.

SQL> 
SQL> --
SQL> -- Check counts to ensure all data was inserted
SQL> --
SQL> select count(*) from emp;

  COUNT(*)
----------
        14

SQL> select count(*) from emp_test;

  COUNT(*)
----------
        14

SQL> 
SQL> --
SQL> -- Let's see what's missing
SQL> --
SQL> -- This time there won't be any rows returned
SQL> --
SQL> select * From emp where empno not in (select empno from emp_test);

no rows selected

SQL> 
It’s a problem that can be frustrating to troubleshoot without knowing how a LIMIT-based BULK COLLECT fetch sets the cursor status. Not all loops behave the same way: a plain cursor FOR loop fetches records one at a time, so when %NOTFOUND is set the end of the data really has been reached. A BULK COLLECT/LIMIT fetch sets %NOTFOUND when it cannot return the full LIMIT, and that can happen on a partial fetch (where the fetch count is less than the set LIMIT value). Exiting on %NOTFOUND closes the cursor and ‘loses’ the remaining fetched rows, leaving the result set incompletely processed. And, since this behavior depends on both the LIMIT and the total number of rows in the result set, the same code can work just as often as it fails.
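
Since the discussion is framed in terms of FORALL, it’s worth noting that the same collection-count exit works when the row-by-row insert loop is replaced with a true FORALL bulk insert; a minimal sketch, using the same cursor, collection type and LIMIT as the examples above:

declare
     type empcoltyp is table of emp%rowtype;
     emp_c empcoltyp;

     cursor get_emp_data is
     select * from emp;
begin
     open get_emp_data;
     loop
          fetch get_emp_data bulk collect into emp_c limit 9;
          exit when emp_c.count = 0;   -- collection count, not %NOTFOUND

          -- bulk-bind the entire collection rather than inserting row by row
          forall i in 1..emp_c.count
               insert into emp_test values emp_c(i);
     end loop;
     close get_emp_data;

     commit;
end;
/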

It’s been said that knowledge can be a dangerous thing, but not in the database world since more knowledge means better understanding. Understanding how a BULK COLLECT fetch sets %NOTFOUND when a LIMIT is set can make a huge difference in how such a loop exit is written. And that knowledge can help ensure that records don’t go mysteriously ‘missing’.

It all starts with numbers.

 

July 6, 2021

“Your Transaction Will Be Delayed… “

Filed under: disaster recovery,General — dfitzjarrell @ 12:10

“Now will you tell me where we are?” asked Tock as he looked around the
desolate island.
“To be sure,” said Canby; “you’re on the Island of Conclusions. Make yourself
at home. You’re apt to be here for some time.”
“But how did we get here?” asked Milo, who was still a bit puzzled by being
there at all.
“You jumped, of course,” explained Canby. “That’s the way most everyone gets
here. It’s really quite simple: every time you decide something without having a
good reason, you jump to Conclusions whether you like it or not. It’s such an easy
trip to make that I’ve been here hundreds of times.”
“But this is such an unpleasant-looking place,” Milo remarked.
“Yes, that’s true,” admitted Canby; “it does look much better from a distance.”
— Norton Juster, The Phantom Tollbooth 



Standby databases have been both bane and boon to DBAs. The boon is the ability to quickly recover from a primary database failure by failing over to the standby, essentially allowing the business to “pick up where they left off” when the unthinkable happens. The bane, many times, is errant transactions which compromise the production data on the primary side; given that in most standby or Data Guard installations the lag between primary and standby is no more than the time it takes to switch logs, this can be far too short an interval for the DBA to respond and prevent the responsible insert/update/delete from being applied at the standby site. Yes, such a scenario can be a rare occurrence, but it’s not beyond the realm of possibility. Compound that with customers using a standby database as a near-real-time reporting database and the situation gets more complicated. What’s a DBA to do? Let’s examine this scenario and see what possible options the DBA has at his/her disposal.

Standby databases, whether or not they are managed via Data Guard, have been key components in most Disaster Recovery configurations. All data flowing into or out of the primary is systematically replicated to one or more standby databases by configuring standby redo logs (version 11 and later) or redo transport (all versions). Let’s look at that last statement in more detail. When standby redo logs were introduced, the behavior of the standard ‘alter database recover managed standby database disconnect from session;’ statement changed: Oracle automatically uses the standby redo logs for transaction information. Yes, the old reliable archived redo logs are still transported to the standby, but they aren’t the main source of information. The archived redo logs CAN still be used, if desired, by adding the ‘using archived logfile’ directive to the ‘alter database’ statement, and this option opens the door to setting a delay on the redo apply process so that problem transactions can be caught before they are applied to the standby; problem averted.

Redo log apply delay is configured in the spfile parameter log_archive_dest_N (N being a number from 1 up to 31, depending on the version of Oracle) using, yes, you guessed it, the DELAY attribute, which takes an integer value in minutes. For example, a possible configuration for a standby database located in Kansas City would be:


log_archive_dest_4='SERVICE=stbyKC, DELAY=1440'
log_archive_dest_state_4=ENABLE


Standby number 4, in Kansas City, will now lag the primary database by 24 hours when archived logfiles are the source of truth. The delay can be set to any number of minutes, from 1 to whatever is desired; had the delay been set to 2880 there would be 2 days of lag between primary and standby. Yes, 2880 minutes is a bit extreme, but it’s not an invalid value for the parameter. Remember that corrupted transactions can occur at any time of the day or night, and because of that I have seen the DELAY set to 1440 on more than one occasion. Take, as an example, a situation where transactions hit the primary database 24/7 and at 3 AM on Tuesday, December 7th, the production data is rendered unusable due to a corrupted feed from Klottsylvania. Additionally, Perba Kleppneckle, the company’s intrepid DBA, wisely set a 24-hour delay between primary and standby. She now has time to research when the transaction was applied so the current standby can be recovered to a time just before the data-corrupting transaction would be processed. Failing over to the standby is now possible since Perba was able to prevent the corruption that rendered the primary unusable. Yes, the roles are reversed, but the former primary becomes the standby once it’s recreated from the new primary, and the corrupted data is no longer an issue.
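
In broad strokes, and with a purely hypothetical timestamp, stopping the delayed apply and rolling the standby forward to just before the bad transaction might look like the following; the exact commands should be verified against the documentation for the Oracle version in use (and against the Data Guard broker, if one is configured):

-- stop the delayed managed recovery
alter database recover managed standby database cancel;

-- apply archived redo only up to a point just before the corruption arrived
recover automatic standby database until time '2021-12-07:02:45:00';

-- make the standby the new primary and open it
-- (depending on the version a restart may be required before opening)
alter database activate physical standby database;
alter database open;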

One fly in this ointment becomes apparent if the standby is running as an Active Data Guard database. Such configurations are normally found when the standby is used as an active reporting database, and many times that use requires the standby to remain in lock step with the primary (the finance and investment industries are good examples); delays would render the reports essentially useless. But don’t despair: even the finance and investment industries can benefit from using an apply delay, configured on a second or third standby database not used for reporting. Having more than one standby available can provide at least one source of truth where data corruption can be minimized or eliminated, giving the enterprise a recovery path that doesn’t jeopardize reporting and one that can restore the other standbys should disaster strike. It’s likely that the current reports are run from a scheduler (on Windows, or via cron on Unix/Linux), providing scripts that can be called at will to regenerate reports as necessary. Fail over, rebuild the remaining standby databases and re-run the problem reports. Since the old standby is now the new primary, the archived logs which contain the suspect transactions are no longer valid and the danger of accidentally applying them is averted.

Can multiple standby databases have different delay settings? Of course; the example above involved a standby with a delay of 1440 minutes, and another standby could just as easily run with a delay of 0 minutes. Let’s go back to our wisely cautious DBA, Perba Kleppneckle, and see how she configured three standby databases for various uses. Perba works for the investment firm Alhami Investments, LLC, and the company requires the use of standby databases to feed data to various parts of the organization. Brokers need current data to generate investment reports for clients; sales can use data up to 24 hours old since they are building historical performance reports; and the portfolio managers have asked for data no more than two hours old to track futures. Perba has configured and built the three standbys in separate physical locations (Topeka, KS, Reno, NV and Baltimore, MD) to isolate them from each other and from the primary, located in San Diego, CA. Standby #1 is in Topeka and is configured without any delay; this is for the brokers. Standby #2 is in Reno, configured with a 2-hour delay for the portfolio managers; standby #3 is in Baltimore, configured with a 24-hour delay for the sales team. Looking at the log_archive_dest_N parameters Perba implemented we see:


log_archive_dest_2='SERVICE=stbyone'
log_archive_dest_state_2=enable
log_archive_dest_3='SERVICE=stbytwo, DELAY=120'
log_archive_dest_state_3=enable
log_archive_dest_4='SERVICE=stbythree, DELAY=1440'
log_archive_dest_state_4=enable


Now that Perba has the configuration parameters set the various standbys can be started.  Standby #1 uses the standby redo logs so it can be started with the following command:


alter database recover managed standby database disconnect from session;


Standbys #2 and #3 will need to use the archive logs rather than the standby redo logs; Perba modifies her startup command:


alter database recover managed standby database using archived logfile disconnect from session;


which allows Oracle to use the configured delay to buffer the recovery should a data-corrupting transaction, or set of transactions, be applied to the primary. Note that in this configuration both the primary and standby #1 will be affected by data-corrupting input; having the remaining standbys use different delay settings allows for faster data recovery and a shorter recovery time (the two-hour-delay standby requires less redo to reach the point just before the corruption hits, bringing the restored primary online in less time and starting the standby rebuild process sooner). Of course the standby log destination configuration will change since Reno will be the new primary database; San Diego will now become standby #2. Once the standby reconfiguration is complete, the applications that used Reno as a standby will need to be reconfigured to use San Diego, a simple change taking minutes to complete since the tnsnames.ora file only needs to swap the connection information between San Diego and Reno. The service names can remain the same since no location names are used to define those services.
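
A sketch of that tnsnames.ora change, with purely hypothetical host names, is shown below; only the HOST value moves, while the alias and service definition stay put:

stbytwo =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = reno-dbhost.alhami.com)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = stbytwo))
  )

becomes

stbytwo =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = sandiego-dbhost.alhami.com)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = stbytwo))
  )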

Delaying standby recovery may not be standard practice, but for mission-critical databases maybe it should be. The ability to prevent data corruption from spreading across the standby landscape can save the business time and money by providing a means to recover from such corruption in a short amount of time, putting the company back on track quickly and reliably and reducing the amount of duplicated work needed to re-enter current transactions. The key part of Disaster Recovery isn’t the disaster, it’s the recovery, and the smoother that process goes the better it is for the enterprise. Some thought and investigation beforehand can better prepare the environment, and the DBA, to respond to that which the DBA hopes never happens.

Because jumping to conclusions can land you somewhere you’d rather not be.

 

December 24, 2020

“Abort!!! Abort!!!”

Filed under: General — dfitzjarrell @ 15:22

"For years she ruled as a wise and beloved monarch, each morning at sunrise
releasing the day's new sounds, to be borne by the winds throughout the kingdom,
and each night at moonset gathering in the old sounds, to be catalogued and filed in
the vast storage vaults below."
-- Norton Juster,"The Phantom Tollbooth"

In the early days of Oracle database administration using the ABORT option to the SHUTDOWN command was met with many comments:

“Insane!”

“Irresponsible!”

“Foolhardy!”

“Stupid!”

“It’s your funeral…”

“Better get your CV updated.”

Such exclamations (and their accompanying expletives) were due, in part, to not understanding that Oracle is smart enough to recognize an aborted shutdown and initiate the steps necessary to recover any and all available committed transactions, ensuring a consistent database by the time it’s opened. We have come far since then and, for the most part, no longer fear a “shutdown abort”. Until the 11.2 and later releases, that is. 11.2.0.1 introduced the EMON (event monitor) process, which uses the SYS.REG$ table to monitor event queues. Unfortunately a “shutdown abort” can leave the REG$ table in a state unusable by the EMON process, resulting in the process failing to start at database startup. Let’s look into that to get a better understanding of why this can occur.

Oracle is written to perform a number of “housekeeping” tasks when a shutdown is normal, immediate or transactional; one set of these tasks cleans out various queue tables so that at the next startup no orphan queues exist. During a normal or transactional shutdown Oracle will wait for pending transactions to complete before terminating the associated user processes. A shutdown immediate terminates user processes “immediately”, with networked connections taking the most time to stop cleanly. In all three cases Oracle then terminates any advanced queuing processes, removing associated entries from the tables those processes use. By the time the DBA sees “Oracle shut down” all of the maintenance tasks related to process management are complete and nothing is left “hanging”; the next startup is clean and smooth and, barring any physical problems (disk, memory, CPU), the startup proceeds in an orderly fashion and the database is ready for use.

One of these housekeeping tasks is to clean up the queue tables, removing entries generated by running processes that have been registered for notification. EMON is the process whereby registered events are monitored; changes are recorded, generating notifications. SYS.REG$ contains the information EMON uses to connect to and monitor events:


SQL> desc reg$
 Name                                                                                Null?    Type
 ----------------------------------------------------------------------------------- -------- --------------------------------------------------------
 REG_ID                                                                                       NUMBER
 SUBSCRIPTION_NAME                                                                   NOT NULL VARCHAR2(128)
 LOCATION_NAME                                                                       NOT NULL VARCHAR2(256)
 USER#                                                                               NOT NULL NUMBER
 USER_CONTEXT                                                                                 RAW(128)
 CONTEXT_SIZE                                                                                 NUMBER
 NAMESPACE                                                                                    NUMBER
 PRESENTATION                                                                                 NUMBER
 VERSION                                                                                      NUMBER
 STATUS                                                                                       NUMBER
 ANY_CONTEXT                                                                                  ANYDATA
 CONTEXT_TYPE                                                                                 NUMBER
 QOSFLAGS                                                                                     NUMBER
 PAYLOAD_CALLBACK                                                                             VARCHAR2(4000)
 TIMEOUT                                                                                      TIMESTAMP(6)
 REG_TIME                                                                                     TIMESTAMP(6) WITH TIME ZONE
 NTFN_GROUPING_CLASS                                                                          NUMBER
 NTFN_GROUPING_VALUE                                                                          NUMBER
 NTFN_GROUPING_TYPE                                                                           NUMBER
 NTFN_GROUPING_START_TIME                                                                     TIMESTAMP(6) WITH TIME ZONE
 NTFN_GROUPING_REPEAT_COUNT                                                                   NUMBER
 GROUPING_INST_ID                                                                             NUMBER
 STATE                                                                                        NUMBER
 SESSION_KEY                                                                                  VARCHAR2(1024)

SQL> 

Because of this dependency orphan entries in SYS.REG$ will cause EMON startup to fail, which is why, in most shutdown scenarios, Oracle “cleans house” to remove any entries that could create issues if they are left behind. Notice that says “in most shutdown scenarios”; there is one shutdown option that doesn’t follow those rules.

With a “shutdown abort” the situation is different: processes are terminated abruptly, without the benefit of any pre-shutdown maintenance. Up until version 11.2 this really wasn’t an issue, as the background processes started and the resulting “mess” left by the “shutdown abort” could be addressed by performing those same cleanup tasks at the next database startup. Enter the EMON process: it reads the SYS.REG$ table for queues and queuing processes to discover which processes to monitor. A “shutdown abort” won’t clear old queue entries from SYS.REG$ and, as a result, the EMON process can fail to start, leaving notice of the failure in the alert log. This really doesn’t affect mounting and opening the database, but it is a problem for the DBA to reckon with since notifications will be essentially disabled. If this isn’t quite clear, think of it as though the process moved, didn’t leave a forwarding address and tore down the old location; mail can’t be delivered because nothing exists where the process used to live.

Finding orphan entries in SYS.REG$ is fairly straightforward, as they will not have a location_name found in SYS.LOC$:


select subscription_name, reg_id, count(*) from sys.reg$
where subscription_name like 'CHNF%' and
location_name not in (select location_name from sys.loc$) group by reg_id, subscription_name;

This will provide a list of orphaned registrations that should be able to be deleted from SYS.REG$:


SUBSCRIPTION_NAME                                                                                                                    REG_ID   COUNT(*)
-------------------------------------------------------------------------------------------------------------------------------- ---------- ----------
CHNF130232                                                                                                                           130232          1
CHNF130219                                                                                                                           130219          1
CHNF130513                                                                                                                           130513          1
CHNF130228                                                                                                                           130228          1
CHNF130220                                                                                                                           130220          1
CHNF130514                                                                                                                           130514          1
CHNF130504                                                                                                                           130504          1
CHNF130215                                                                                                                           130215          1
CHNF130510                                                                                                                           130510          1
CHNF130509                                                                                                                           130509          1
CHNF130227                                                                                                                           130227          1
CHNF130507                                                                                                                           130507          1
CHNF130218                                                                                                                           130218          1
CHNF130230                                                                                                                           130230          1
CHNF130229                                                                                                                           130229          1
CHNF130231                                                                                                                           130231          1
CHNF130506                                                                                                                           130506          1
CHNF130505                                                                                                                           130505          1
CHNF130503                                                                                                                           130503          1
CHNF130217                                                                                                                           130217          1
CHNF130512                                                                                                                           130512          1
CHNF130508                                                                                                                           130508          1
CHNF130511                                                                                                                           130511          1
CHNF130216                                                                                                                           130216          1

Modifying the query can let Oracle write the delete statements for you:


select 'delete from sys.reg$ where reg_id = '||reg_id||';'
from sys.reg$
where subscription_name like 'CHNF%' and
location_name not in (select location_name from sys.loc$) group by reg_id, subscription_name;

The query results can be spooled to a file for execution; the output of the above query is shown below:


delete from sys.reg$ where reg_id = 130217;
delete from sys.reg$ where reg_id = 130512;
delete from sys.reg$ where reg_id = 130509;
delete from sys.reg$ where reg_id = 130506;
delete from sys.reg$ where reg_id = 130513;
delete from sys.reg$ where reg_id = 130510;
delete from sys.reg$ where reg_id = 130232;
delete from sys.reg$ where reg_id = 130229;
delete from sys.reg$ where reg_id = 130228;
delete from sys.reg$ where reg_id = 130508;
delete from sys.reg$ where reg_id = 130503;
delete from sys.reg$ where reg_id = 130219;
delete from sys.reg$ where reg_id = 130231;
delete from sys.reg$ where reg_id = 130504;
delete from sys.reg$ where reg_id = 130511;
delete from sys.reg$ where reg_id = 130230;
delete from sys.reg$ where reg_id = 130507;
delete from sys.reg$ where reg_id = 130216;
delete from sys.reg$ where reg_id = 130514;
delete from sys.reg$ where reg_id = 130227;
delete from sys.reg$ where reg_id = 130505;
delete from sys.reg$ where reg_id = 130220;
delete from sys.reg$ where reg_id = 130218;
delete from sys.reg$ where reg_id = 130215;

Once these entries are gone the database can be shut down and restarted; the EMON process will then start successfully, enabling registered notifications and eliminating the EMON errors previously found in the alert log.

Just remember to file this away for future use.

Digging Through The Archives

Filed under: General — dfitzjarrell @ 10:46

"Nothing can possibly go wrong now," cried the Humbug happily, and as soon
as he'd said it he leaped from the car, as if stuck by a pin, and sailed all the way to
the little island.
-- Norton Juster,"The Phantom Tollbooth"

Starting with Oracle version 6 the database offered the possibility of “hot”, or online, database backups. This relied, in part, on the ability of Oracle to archive, or copy, redo data into a separate location to be read later for recovery purposes. [Oracle also introduced the redo change vector in this same release, a key part of the ability to perform “hot” backups. A discussion of the redo change vector will have to wait for another post.] To provide that functionality the database must be running in archivelog mode, and until recently it was also required that the log_archive_start parameter be set to true; Oracle now sets this automatically if the database is running in archivelog mode. [Of course such parameters as log_archive_dest_n need to be set to let Oracle know where to write the archived redo.] What may not be common knowledge is how much of the configured size of the redo log is actually copied during archiving. Let’s look into that and see where it leads.

Redo logs are created with a fixed size which cannot be altered afterwards; there is no ‘alter redo log … resize …’ command, so to increase the size of the redo logs it is necessary to create larger logs and then delete the smaller logs once they have been archived. Left to its own devices Oracle will write redo data to the current log until it is either completely filled or the next redo record won’t fit, at which point Oracle executes a log switch, which begins the process of archiving the current redo records. [For this post the redo log size will be 4 GB, using a block size of 512 bytes, producing 8,388,608 blocks.] One would therefore expect the archivelogs to be very close in size to the redo logs from which they were generated, but that may not always be true.
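
As a hedged sketch of that replace-rather-than-resize process (group numbers, file paths and sizes below are examples only, sized to match the 4 GB logs used in this post):

-- add new, larger groups
alter database add logfile group 4 ('/u02/oradata/mydb/redo04.log') size 4g;
alter database add logfile group 5 ('/u02/oradata/mydb/redo05.log') size 4g;
alter database add logfile group 6 ('/u02/oradata/mydb/redo06.log') size 4g;

-- cycle out of the old groups, then drop each one once it is INACTIVE and archived
alter system switch logfile;
alter system checkpoint;
alter database drop logfile group 1;
alter database drop logfile group 2;
alter database drop logfile group 3;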

When Oracle copies redo records it only copies what has actually been written, which may not completely fill the redo log. Given the impossible situation of unlimited storage space, Oracle would only switch logs when they are full and the size of the archive would be very close to the size of the redo log. Of course that can’t happen, so it’s necessary to back up the archivelog destination on a regular basis to provide both recoverability and space. Such processes invariably execute a manual log switch prior to backing up the contents of the archivelog destination, making smaller archivelogs a reality. On top of that, only the current redo records in a given redo log are copied; the entire file is not simply duplicated with an operating-system-level command, so even the largest archivelog will likely be smaller than the redo log it was generated from. Depending on the frequency of the archivelog backups some archives may be exceptionally small in comparison.
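
A quick way to see this is to compare the blocks actually written to each archivelog against the online redo log size; a sketch using V$ARCHIVED_LOG and V$LOG, limited to the last day of activity:

select al.sequence#,
       round(al.blocks*al.block_size/1048576) arch_mb,
       round(l.bytes/1048576) redo_mb
from v$archived_log al,
     (select max(bytes) bytes from v$log) l
where al.first_time >= sysdate - 1
order by al.sequence#;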

This brings us to the topic of redo log switch time. Since this is the real world many things can affect archivelog generation — frequency of backups, size of the redo records, transactional volume and how archive_lag_target is set all affect the frequency and size of the archived logs. In an ideal world the switch time would only depend upon the redo log size and the transaction volume; only ‘full’ logs would be archived. That, of course, isn’t true so the log switch time will vary due to the conditions stated at the beginning of this section. Reporting that time is fairly straightforward, given some limitations on the data used to calculate the value. Without any maintenance activity on the redo logs (backups, resizing) any range of days will provide a reasonably accurate switch time indicative of the size of the redo log and the database activity experienced during the day. Looking at a basic script to calculate such a value the query is not a complicated one:


select round(avg(60/count(*)),0) "Log switch time - minutes"
from v$log_history
where first_time >= sysdate - 30
group by to_Char(first_time, 'DD-MON-RRRR HH24')

The output is the average redo log switch time, in minutes, over a period of 30 days. Recommended switch times range from 10 to 15 minutes but some systems may not be able to satisfy that recommendation — transaction volume and file size restrictions may prevent sizing the redo logs to provide that log switch frequency. So the DBA may need to work within such limitations to provide the best compromise between log size and switch frequency.
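
When the redo logs cannot be sized to hit that window on their own, archive_lag_target can put an upper bound on the switch interval; a sketch forcing a switch at least every 15 minutes (the value is in seconds):

alter system set archive_lag_target=900 scope=both;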

Looking at log switch history can provide a window into the overall level of activity; the following query can be used to generate a table of hourly log switches (I take no credit for writing it, although I cannot remember where it may have come from):


set verify off pagesize 60
col date format a12
col h0 form 999
col h1 form 999
col h2 form 999
col h3 form 999
col h4 form 999
col h5 form 999
col h6 form 999
col h7 form 999
col h8 form 999
col h9 form 999
col h10 form 999
col h11 form 999
col h12 form 999
col h13 form 999
col h14 form 999
col h15 form 999
col h16 form 999
col h17 form 999
col h18 form 999
col h19 form 999
col h20 form 999
col h21 form 999
col h22 form 999
col h23 form 999
col "Day" format a4
col "Total" format 99999

set linesize 145

SELECT trunc(first_time) "Date",
to_char(first_time, 'Dy') "Day",
    count(*) "Total",
    SUM(decode(to_char(first_time, 'hh24'),'00',1,0)) "h0",
    SUM(decode(to_char(first_time, 'hh24'),'01',1,0)) "h1",
    SUM(decode(to_char(first_time, 'hh24'),'02',1,0)) "h2",
    SUM(decode(to_char(first_time, 'hh24'),'03',1,0)) "h3",
    SUM(decode(to_char(first_time, 'hh24'),'04',1,0)) "h4",
    SUM(decode(to_char(first_time, 'hh24'),'05',1,0)) "h5",
    SUM(decode(to_char(first_time, 'hh24'),'06',1,0)) "h6",
    SUM(decode(to_char(first_time, 'hh24'),'07',1,0)) "h7",
    SUM(decode(to_char(first_time, 'hh24'),'08',1,0)) "h8",
    SUM(decode(to_char(first_time, 'hh24'),'09',1,0)) "h9",
    SUM(decode(to_char(first_time, 'hh24'),'10',1,0)) "h10",
    SUM(decode(to_char(first_time, 'hh24'),'11',1,0)) "h11",
    SUM(decode(to_char(first_time, 'hh24'),'12',1,0)) "h12",
    SUM(decode(to_char(first_time, 'hh24'),'13',1,0)) "h13",
    SUM(decode(to_char(first_time, 'hh24'),'14',1,0)) "h14",
    SUM(decode(to_char(first_time, 'hh24'),'15',1,0)) "h15",
    SUM(decode(to_char(first_time, 'hh24'),'16',1,0)) "h16",
    SUM(decode(to_char(first_time, 'hh24'),'17',1,0)) "h17",
    SUM(decode(to_char(first_time, 'hh24'),'18',1,0)) "h18",
    SUM(decode(to_char(first_time, 'hh24'),'19',1,0)) "h19",
    SUM(decode(to_char(first_time, 'hh24'),'20',1,0)) "h20",
    SUM(decode(to_char(first_time, 'hh24'),'21',1,0)) "h21",
    SUM(decode(to_char(first_time, 'hh24'),'22',1,0)) "h22",
    SUM(decode(to_char(first_time, 'hh24'),'23',1,0)) "h23"
FROM    V$log_history
where first_time > trunc(sysdate - &&1)
group by trunc(first_time), to_char(first_time, 'Dy')
Order by 1;

Providing the number of days allows for generating output ranging from 1 day to the limit of the data found in V$LOG_HISTORY; using 30 days as an example the script generates the following table:


Date         Day   Total   h0   h1   h2   h3   h4   h5   h6   h7   h8   h9  h10  h11  h12  h13  h14  h15  h16  h17  h18  h19  h20  h21  h22  h23
------------ ---- ------ ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
09-NOV-20    Mon     566   21    22   20   28   20   27   25   26   25   24   25   23   26   25   26   25   23   24   22   22   22   23   19   23
10-NOV-20    Tue     520   24    25   25   23   22   24   26   24   23   22   21   21   22   22   21   21   20   20   20   19   21   20   15   19
11-NOV-20    Wed     505   26    20   18   24   20   24   24   21   23   24   22   22   23   22   21   21   19   18   19   19   18   21   17   19
12-NOV-20    Thu     535   20    22   19   24   31   31   25   24   25   22   22   24   23   22   23   21   20   21   19   18   20   20   20   19
13-NOV-20    Fri     550   19    21   17   24   26   25   26   22   24   22   23   27   25   25   28   27   25   24   22   19   19   21   20   19
14-NOV-20    Sat     505   19    21   19   23   26   22   24   23   24   21   22   21   22   20   22   19   21   20   19   19   19   20   20   19
15-NOV-20    Sun     536   21    22   21   29   53   25   25   22   21   20   20   21   21   20   20   21   19   20   18   20   19   20   19   19
16-NOV-20    Mon     537   20    22   21   30   21   24   27   25   25   24   27   31   21   23   21   22   21   21   20   20   18   19   16   18
17-NOV-20    Tue     605   21    20   19   22   22   34   25   25   23   27   28   31   28   30   29   29   28   26   25   20   20   25   24   24
18-NOV-20    Wed     587   25    19   23   32   29   24   25   23   24   24   23   21   21   21   22   23   26   33   31   21   20   27   24   26
19-NOV-20    Thu     520   24    19   20   25   27   24   27   24   24   26   25   21   22   20   22   21   20   18   18   19   19   20   18   17
20-NOV-20    Fri     511   19    20   19   25   25   22   26   25   23   24   22   22   21   22   22   20   20   19   19   19   19   20   19   19
21-NOV-20    Sat     524   19    22   19   24   27   22   26   25   25   22   23   23   22   22   21   20   20   21   20   21   21   20   19   20
22-NOV-20    Sun     542   18    21   18   25   58   25   28   21   22   22   21   21   20   21   20   20   22   21   20   20   20   20   20   18
23-NOV-20    Mon     522   19    22   20   24   28   23   25   24   24   22   25   22   21   22   21   22   23   21   19   19   19   19   20   18
24-NOV-20    Tue     519   18    22   18   24   26   24   25   23   26   22   22   22   22   23   24   22   20   21   19   20   20   19   19   18
25-NOV-20    Wed     521   19    19   21   26   25   24   26   22   24   24   22   23   21   22   23   21   20   22   20   21   20   20   18   18
26-NOV-20    Thu     484   19    22   20   21   20   22   24   19   22   21   20   21   19   20   19   20   19   19   19   20   20   20   19   19
27-NOV-20    Fri     519   20    21   21   27   31   22   24   24   23   21   21   23   21   20   21   21   22   20   19   19   20   20   19   19
28-NOV-20    Sat     521   19    21   19   26   31   23   27   23   23   23   21   22   22   21   21   20   20   22   19   19   20   21   19   19
29-NOV-20    Sun     535   21    21   19   23   60   22   25   22   21   21   19   20   19   22   20   21   20   19   20   20   21   20   19   20
30-NOV-20    Mon     592   19    19   20   22   27   23   28   24   25   21   24   25   22   28   29   29   28   26   30   20   19   28   28   28
01-DEC-20    Tue     535   19    18   19   27   33   24   28   25   26   25   24   22   22   23   23   21   22   20   19   21   18   19   18   19
02-DEC-20    Wed     529   18    21   18   27   33   25   25   23   23   22   22   23   21   23   23   22   21   20   20   21   20   20   19   19
03-DEC-20    Thu     559   20    22   21   29   33   24   27   27   24   25   26   25   22   25   25   23   22   20   20   21   20   20   19   19
04-DEC-20    Fri     529   18    21   20   24   31   26   27   25   24   23   23   24   19   22   24   22   20   19   19   20   19   20   19   20
05-DEC-20    Sat     520   19    21   19   23   29   25   25   24   23   23   21   21   21   21   21   21   22   20   21   21   20   21   19   19
06-DEC-20    Sun     531   20    20   19   20   59   26   24   23   23   20   21   19   20   21   20   19   19   21   18   19   19   22   18   21
07-DEC-20    Mon     525   20    22   20   24   27   24   23   22   26   21   24   24   22   23   23   21   21   21   20   19   21   20   19   18
08-DEC-20    Tue     552   19    23   20   25   26   24   26   24   24   22   22   24   20   23   23   23   20   20   19   21   20   19   17   48
09-DEC-20    Wed      98   10    10    9    8   10   10    9    8    9    9    6    0    0    0    0    0    0    0    0    0    0    0    0    0

Using such a query can help spot redo logs that need to be made larger to accommodate the transaction volume so that log switch frequency can be reduced; remember that these totals will be affected by archivelog backup frequency and the value of archive_lag_target.
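
When resizing is under consideration it helps to check the current redo log configuration first; a simple query against V$LOG (a sketch, not part of the original script) reports the size, member count and status of each group:

select group#, thread#, bytes/1048576 size_mb, members, status
from   v$log
order by group#, thread#;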

In tandem with the log switch frequency is the actual time required to archive the current redo log; this should be a fairly quick operation barring any storage or configuration issues. The archive write time can be calculated from data found in V$ARCHIVED_LOG, and write times for individual archives as well as an overall average can be computed. To calculate the write time for individual archivelogs:


select (completion_time - next_time)*86400 from v$archived_log;
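
An overall average, mentioned above, can be computed with a similar statement; a minimal sketch restricted to the last 30 days (the window is an arbitrary choice):

select round(avg((completion_time - next_time)*86400), 2) avg_write_seconds
from   v$archived_log
where  first_time > trunc(sysdate - 30);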

NEXT_TIME is the time recorded for the first entry in the next redo log or redo log group to be written to; it matches the FIRST_TIME entry for the next log record in the view. COMPLETION_TIME is the date/time when the actual data copy to the archivelog finishes. NEXT_TIME is therefore the date/time of the log switch; switching logs triggers the write mechanism to the archivelog destination so it’s also the start time of the redo data copy. Depending on the size of the redo data copied the copy time can fluctuate, and that may be confusing to those who expect the archivelogs to be the size of the redo logs. V$ARCHIVED_LOG also records the blocks copied so it’s easy to determine the size of each archived copy of a redo log:


connect / as sysdba
set linesize 180 pagesize 100 trimspool on
column name format a60
column pct_arch format 990.90
column sysdate new_value sys_dt noprint

select sysdate from dual;

alter session set nls_date_format = 'DD-MON-RRRR HH24:MI:SS';

with rblocks as (
select distinct bytes/blocksize rbl from v$log
)
select name, first_time, next_time, blocks, round((blocks/rbl)*100, 2) pct_arch, completion_time, (completion_time - next_time)*86400 write_time
from v$archived_log, rblocks
where first_time >= trunc(sysdate)

spool archlog_rpt_&sys_dt..log
/
spool off

undefine 1

A portion of the report generated is shown below:


NAME                                 FIRST_TIME       NEXT_TIME        BLOCKS PCT_ARCH COMPLETION_TIME  WRITE_TIME
------------------------------------------------------------ -------------------- -------------------- ---------- -------- -------------------- ----------
                                 11-DEC-2020 00:00:22 11-DEC-2020 00:01:08    7395689    88.16 11-DEC-2020 00:01:17      9
                                 11-DEC-2020 00:01:08 11-DEC-2020 00:02:06    7642144    91.10 11-DEC-2020 00:02:14      8
                                 11-DEC-2020 00:02:06 11-DEC-2020 00:03:27    7426709    88.53 11-DEC-2020 00:03:38     11
                                 11-DEC-2020 00:03:27 11-DEC-2020 00:15:24    3594910    42.85 11-DEC-2020 00:15:29      5
                                 11-DEC-2020 00:15:24 11-DEC-2020 00:30:18      98727     1.18 11-DEC-2020 00:30:18      0
                                 11-DEC-2020 00:30:18 11-DEC-2020 00:45:16      91018     1.09 11-DEC-2020 00:45:16      0
                                 11-DEC-2020 00:45:16 11-DEC-2020 00:45:50        699     0.01 11-DEC-2020 00:45:50      0
                                 11-DEC-2020 00:45:50 11-DEC-2020 01:00:41    5510730    65.69 11-DEC-2020 01:00:50      9
                                 11-DEC-2020 01:00:41 11-DEC-2020 01:00:54    1382449    16.48 11-DEC-2020 01:00:57      3
                                 11-DEC-2020 01:00:54 11-DEC-2020 01:01:41    8097166    96.53 11-DEC-2020 01:01:55     14
                                 11-DEC-2020 01:01:41 11-DEC-2020 01:02:44    7558944    90.11 11-DEC-2020 01:02:55     11
                                 11-DEC-2020 01:02:44 11-DEC-2020 01:04:02    7429076    88.56 11-DEC-2020 01:04:15     13
                                 11-DEC-2020 01:04:02 11-DEC-2020 01:15:26    2023945    24.13 11-DEC-2020 01:15:29      3
                                 11-DEC-2020 01:15:26 11-DEC-2020 01:30:18    1779189    21.21 11-DEC-2020 01:30:21      3
                                 11-DEC-2020 01:30:18 11-DEC-2020 01:45:14    1732877    20.66 11-DEC-2020 01:45:16      2
                                 11-DEC-2020 01:45:14 11-DEC-2020 01:45:40      39409     0.47 11-DEC-2020 01:45:41      1
                                 11-DEC-2020 01:45:40 11-DEC-2020 02:00:24    4620762    55.08 11-DEC-2020 02:00:32      8
                                 11-DEC-2020 02:00:24 11-DEC-2020 02:01:22    7640768    91.09 11-DEC-2020 02:01:32     10
                                 11-DEC-2020 02:01:22 11-DEC-2020 02:02:25    7465826    89.00 11-DEC-2020 02:02:33      8
                                 11-DEC-2020 02:02:25 11-DEC-2020 02:03:58    7328193    87.36 11-DEC-2020 02:04:07      9
                                 11-DEC-2020 02:03:58 11-DEC-2020 02:15:16    1618757    19.30 11-DEC-2020 02:15:19      3
                                 11-DEC-2020 02:15:16 11-DEC-2020 02:30:14     786809     9.38 11-DEC-2020 02:30:15      1
                                 11-DEC-2020 02:30:14 11-DEC-2020 02:30:16         86     0.00 11-DEC-2020 02:30:16      0
                                 11-DEC-2020 02:30:16 11-DEC-2020 02:45:16     775505     9.24 11-DEC-2020 02:45:17      1
                                 11-DEC-2020 02:45:16 11-DEC-2020 02:45:39      36017     0.43 11-DEC-2020 02:45:39      0
                                 11-DEC-2020 02:45:39 11-DEC-2020 03:00:29    4719624    56.26 11-DEC-2020 03:00:40     11
                                 11-DEC-2020 03:00:29 11-DEC-2020 03:01:20    7616245    90.79 11-DEC-2020 03:01:31     11
                                 11-DEC-2020 03:01:20 11-DEC-2020 03:02:11    7445839    88.76 11-DEC-2020 03:02:19      8
...

The BLOCKS column provides the number of redo blocks copied from the redo log to the archive; notice that it is less than the total number of blocks for the redo log (8388608). [The NAME is blank because the archivelog has been removed from the system as a result of a scheduled archivelog backup.] As mentioned previously, the redo record size governs how full the redo log will become before a “standard” log switch occurs; very rarely will the archivelog contain the total number of blocks allocated to the redo log, as that is simply the nature of redo records. As long as the archived block count is reasonably close to the total number of blocks in the redo log there should be no reason for concern. If archive_lag_target is set then log switches should occur regularly at the specified interval; this does not prevent redo log switches from occurring “naturally”, and the report above illustrates that for the activity at the top of each hour, where the counts come very close to 90+% of the total redo blocks allocated. Log switches can also be triggered by other events, such as a privileged user issuing a manual switch after adding redo logs, or by RMAN archivelog backups, so the blocks written can sometimes be a smaller number. The log switch history query provided above reports the number of log switches per hour, so hourly comparisons can be made across several days’ data; an increase in an hourly number could indicate manual intervention and provide a starting point for investigation. A report similar to the one shown above should cause no concern.

As long as you don’t jump to conclusions.

September 21, 2020

Take Control

Filed under: General — dfitzjarrell @ 16:42

"For years she ruled as a wise and beloved monarch, each morning at sunrise
releasing the day's new sounds, to be borne by the winds throughout the kingdom,
and each night at moonset gathering in the old sounds, to be catalogued and filed in
the vast storage vaults below."
-- Norton Juster,"The Phantom Tollbooth"

A fairly common task for many DBAs is to back up the controlfile, either as a binary copy or as a script intended to recreate said controlfile (or controlfiles) should the need arise. The task is not complicated, but some may wonder why creating such a backup is necessary. Let’s look at what this task does and why it probably should be a regular occurrence.

The controlfile is the heart and soul of the Oracle database; without it the database cannot be mounted and subsequently opened. It contains the following sections:


DATABASE
CKPT PROGRESS
REDO THREAD
REDO LOG
DATAFILE
FILENAME
TABLESPACE
TEMPORARY FILENAME
RMAN CONFIGURATION
LOG HISTORY
OFFLINE RANGE
ARCHIVED LOG
BACKUP SET
BACKUP PIECE
BACKUP DATAFILE
BACKUP REDOLOG
DATAFILE COPY
BACKUP CORRUPTION
COPY CORRUPTION
DELETED OBJECT
PROXY COPY
BACKUP SPFILE
DATABASE INCARNATION
FLASHBACK LOG
RECOVERY DESTINATION
INSTANCE SPACE RESERVATION
REMOVABLE RECOVERY FILES
RMAN STATUS
THREAD INSTANCE NAME MAPPING
MTTR
DATAFILE HISTORY
STANDBY DATABASE MATRIX
GUARANTEED RESTORE POINT
RESTORE POINT
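
These section names correspond to the TYPE column of V$CONTROLFILE_RECORD_SECTION; a quick query lists them along with how many record slots each section is using:

select type, records_total, records_used
from   v$controlfile_record_section
order by type;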

Since it contains a wealth of information the ability to replace it is of extreme importance to the DBA. The choice is not whether to back up the controlfile but which form the backup should take. Binary copies are convenient, as they only need be copied to the correct locations to restore service. They can be large, however, and on systems where data files are exceptionally large space could be an issue. The second option, backing up to trace, provides a way to replace the existing controlfile “in place”, and such backups consume very little storage. They also consume very little processing time, making this option very favorable for nightly O/S scheduler jobs that create a current backup.
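
Either form is produced with a single statement; a minimal sketch (the binary destination path is only an example):

alter database backup controlfile to '/backup/ctl/mydb_control.bkp' reuse;

alter database backup controlfile to trace;

The TRACE form writes the CREATE CONTROLFILE script to a trace file, which is where the tracefile_identifier parameter, discussed below, becomes useful.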

Two options present themselves from within a controlfile backup script: a NORESETLOGS option, for when the controlfile becomes corrupted but the redo logs are still available to bring the controlfile replacement current, and a RESETLOGS option, for when the redo is also a victim, rendering it unavailable. The contents of this script are to be found in the directory defined by the background_dump_dest parameter; of course, since it’s a ‘standard’ trace file the naming will make it neither obvious nor easy to find. Enter the tracefile_identifier parameter.

Oracle provides this parameter to allow the DBA to insert identifiable text into the tracefile name; it is exceptionally easy to set at the session level:


alter session set tracefile_identifier='CFILE';

Essentially any valid text can be used; however, it might be prudent to set the text to the database name, in capital letters, or, as shown above, to some text marking the trace as one containing the controlfile script. Because the file initially resides in the background_dump_dest it’s a good idea to either move or copy the file to a different, “safe” destination:


/bin/cp /my/oracle/background/dest/*CFILE* /my/oracle/cfile/scripts/${ORACLE_SID}_controlfile.sql

This, of course, presumes only one tracefile exists with CFILE in its name; this can be remedied by writing a short script to perform the deed, and include any ‘housekeeping’ necessary to ensure success. An example of the relevant code is provided below:


# Copy the controlfile trace to a safe location
# (ERRFILE, NME and MAILLIST are presumed to be defined earlier in the full script)
/bin/cp /my/oracle/background/dest/*CFILE* /my/oracle/cfile/scripts/${ORACLE_SID}_controlfile.sql
status=$?
if [ $status -ne 0 ]
then
        # Report the failure and stop
        /bin/echo "File copy, line 110, failed -- `/bin/date`" > $ERRFILE
        /bin/mailx -s "${NME} refresh error - `/bin/date`" $MAILLIST < $ERRFILE
        exit -3
else
        # Housekeeping: remove the source trace so the next run finds only one CFILE trace
        /bin/rm -f /my/oracle/background/dest/*CFILE*
fi

[The entire script is not provided as that is left as an exercise for the DBA.] So why do this on a daily basis? Databases are not static and changes such as datafile and tablespace additions will make prior controlfile backups obsolete. Also putting this on a schedule keeps the DBA from having to create a controlfile backup at the moment one is needed, reducing the stress the DBA is already experiencing from having an unexpected controlfile failure.

The DBA can go one step further and create two scripts, one for when the current redo is available and one for when it isn’t. The second case is the easiest to plan for as the RESETLOGS option is always option #2:


/bin/cp /my/oracle/cfile/scripts/${ORACLE_SID}_controlfile.sql /my/oracle/cfile/scripts/${ORACLE_SID}_resetlogs_controlfile.sql
LNE=`grep -n "#2" /my/oracle/cfile/scripts/${ORACLE_SID}_controlfile.sql | /bin/awk -F":" '{print $1}'`

/bin/vi /my/oracle/cfile/scripts/${ORACLE_SID}_resetlogs_controlfile.sql <<EOF
:1,${LNE}d
:wq!
EOF

Creating the NORESETLOGS script is almost as easy:


/bin/cp /my/oracle/cfile/scripts/${ORACLE_SID}_controlfile.sql /my/oracle/cfile/scripts/${ORACLE_SID}_noresetlogs_controlfile.sql
LNE=`grep -n "#2" /my/oracle/cfile/scripts/${ORACLE_SID}_controlfile.sql | /bin/awk -F":" '{print $1}'`
END=`wc -l /my/oracle/cfile/scripts/${ORACLE_SID}_controlfile.sql | /bin/awk '{print $1}'`

/bin/vi /my/oracle/cfile/scripts/${ORACLE_SID}_noresetlogs_controlfile.sql <<EOF
:${LNE},${END}d
:wq!
EOF

This will leave some commented lines at the end of the CREATE CONTROLFILE command but that should not be an issue. Now the DBA is positioned to replace the controlfile at a moment’s notice; all that will be necessary is to start the database in NOMOUNT mode and execute the script of choice (or necessity), as sketched below.
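
When the time comes the recovery is then a short exercise; a minimal sketch, with the path and SID below serving only as examples:

connect / as sysdba
startup nomount
-- note: the trace-generated script may itself contain a STARTUP NOMOUNT line,
-- which can be removed or safely ignored at this point
@/my/oracle/cfile/scripts/orcl_noresetlogs_controlfile.sql

The NORESETLOGS script typically ends by recovering and opening the database; the RESETLOGS variant is run the same way when the online redo is no longer available.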

Although it should be a very rare occurrence controlfile corruption can happen (or, at the very worst, controlfiles can be deleted by determined hackers in an attempt to sabotage an installation). Just like the Boy Scouts it’s best for a DBA to ‘be prepared’, and having existing controlfile backups, in either form, is part of that preparation. It’s far better to be ready for a situation that never arrives than to be caught, possibly scrambling, to bring the database back to life by having to create a replacement controlfile ‘on the fly’.

And that is good information to file away for later use.

September 17, 2020

That Will Cost You

Filed under: General — dfitzjarrell @ 15:56

"Nothing can possibly go wrong now," cried the Humbug happily, and as soon
as he'd said it he leaped from the car, as if stuck by a pin, and sailed all the way to
the little island.
-- Norton Juster,"The Phantom Tollbooth"

Recently Oracle support posted the following at MOS:


Bug 30324542 : WRONG RESULT WITH "COST BASED OR EXPANSION"

An example was posted with the results and, indeed, the results from the transformed query were wrong as they were ordered by the second column in descending order. This was interesting because a hidden parameter, _optimizer_cbqt_or_expansion, was using the default setting of ‘on’; setting this to ‘off’ essentially ‘cured’ the issue and returned correctly sorted results. The example was run in Oracle 19.8.0.0.0 and is a simple one: create a table, insert a handful of rows, create an index and execute the suspect query with the parameter on, then off. The results are provided below.

This begins by creating a two-column table and populating it with four rows:


SQL>
SQL> create table test(a char(1),b char(1));

Table created.

SQL> insert into test values('a','1');

1 row created.

SQL> insert into test values('a','2');

1 row created.

SQL> insert into test values('b','1');

1 row created.

SQL> insert into test values('b','2');

1 row created.

SQL> commit;

Commit complete.

An index is also created:


SQL> create index test_idx on test(a,b);

Index created.

SQL>

So far, so good. Now the interesting part — Oracle uses cost-based OR expansion to process the original query, with less than stellar results:


SQL> set autotrace on
SQL> select distinct a,b from test
  2  where (a=' ' and b>=' ') or (a>' ') order by a,b;


A B
- -
a 2
b 2
a 1
b 1


Execution Plan
----------------------------------------------------------
Plan hash value: 2529375370

---------------------------------------------------------------------------------------
| Id  | Operation           | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |                 |     5 |    30 |     2  (50)| 00:00:01 |
|   1 |  HASH UNIQUE        |                 |     5 |    30 |     2  (50)| 00:00:01 |
|   2 |   VIEW              | VW_ORE_A06CF4B6 |     5 |    30 |     1   (0)| 00:00:01 |
|   3 |    UNION-ALL        |                 |       |       |            |          |
|*  4 |     INDEX RANGE SCAN| TEST_IDX        |     1 |     6 |     0   (0)| 00:00:01 |
|*  5 |     INDEX RANGE SCAN| TEST_IDX        |     4 |    24 |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("A"=' ' AND "B">=' ' AND "B" IS NOT NULL)
   5 - access("A">' ' AND "A" IS NOT NULL)

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          2  consistent gets
          0  physical reads
          0  redo size
        686  bytes sent via SQL*Net to client
        440  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          4  rows processed
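
For reference, one way to capture the optimizer (10053) trace for the problem query is sketched below; the tracefile_identifier value is an arbitrary choice:

alter session set tracefile_identifier = 'ORE_10053';
alter session set events '10053 trace name context forever, level 1';

select distinct a,b from test
where (a=' ' and b>=' ') or (a>' ') order by a,b;

alter session set events '10053 trace name context off';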

During the run the 10053 event was set in this fashion to provide an optimizer trace. From that trace it was found that the original query had undergone an interesting transformation; the transformed text, edited here to be more easily readable, was executed and produced the same results as the original query when _optimizer_cbqt_or_expansion was set to ‘on’:


SQL> SELECT DISTINCT "VW_ORE_A06CF4B6"."ITEM_1" "A","VW_ORE_A06CF4B6"."ITEM_2" "B"
  2  FROM
  3  ( (SELECT "TEST"."A" "ITEM_1","TEST"."B" "ITEM_2" FROM "U1"."TEST" "TEST" WHERE "TEST"."A"=' ' AND "TEST"."B">=' ')
  4  UNION ALL
  5  (SELECT "TEST"."A" "ITEM_1","TEST"."B" "ITEM_2" FROM "U1"."TEST" "TEST" WHERE "TEST"."A">' ')) "VW_ORE_A06CF4B6";

A B
- -
a 2
b 2
a 1
b 1


Execution Plan
----------------------------------------------------------
Plan hash value: 4084848649

--------------------------------------------------------------------------------
| Id  | Operation           | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |          |     5 |    30 |     2  (50)| 00:00:01 |
|   1 |  HASH UNIQUE        |          |     5 |    30 |     2  (50)| 00:00:01 |
|   2 |   VIEW              |          |     5 |    30 |     1   (0)| 00:00:01 |
|   3 |    UNION-ALL        |          |       |       |            |          |
|*  4 |     INDEX RANGE SCAN| TEST_IDX |     1 |     6 |     0   (0)| 00:00:01 |
|*  5 |     INDEX RANGE SCAN| TEST_IDX |     4 |    24 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("TEST"."A"=' ' AND "TEST"."B">=' ' AND "TEST"."B" IS NOT
              NULL)
   5 - access("TEST"."A">' ' AND "TEST"."A" IS NOT NULL)

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)


Statistics
----------------------------------------------------------
         13  recursive calls
          0  db block gets
          6  consistent gets
          0  physical reads
          0  redo size
        686  bytes sent via SQL*Net to client
        691  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          4  rows processed

What appears odd is that the results are not wrong, per se, simply not in the desired order. By adding an ORDER BY clause to the transformed query all is again right with the world:


SQL> SELECT DISTINCT "VW_ORE_A06CF4B6"."ITEM_1" "A","VW_ORE_A06CF4B6"."ITEM_2" "B"
  2  FROM
  3  ( (SELECT "TEST"."A" "ITEM_1","TEST"."B" "ITEM_2" FROM "U1"."TEST" "TEST" WHERE "TEST"."A"=' ' AND "TEST"."B">=' ')
  4  UNION ALL
  5  (SELECT "TEST"."A" "ITEM_1","TEST"."B" "ITEM_2" FROM "U1"."TEST" "TEST" WHERE "TEST"."A">' ')) "VW_ORE_A06CF4B6"
  6  ORDER BY A,B;

A B
- -
a 1
a 2
b 1
b 2


Execution Plan
----------------------------------------------------------
Plan hash value: 1587190138

--------------------------------------------------------------------------------
| Id  | Operation           | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |          |     5 |    30 |     3  (67)| 00:00:01 |
|   1 |  SORT UNIQUE        |          |     5 |    30 |     2  (50)| 00:00:01 |
|   2 |   VIEW              |          |     5 |    30 |     1   (0)| 00:00:01 |
|   3 |    UNION-ALL        |          |       |       |            |          |
|*  4 |     INDEX RANGE SCAN| TEST_IDX |     1 |     6 |     0   (0)| 00:00:01 |
|*  5 |     INDEX RANGE SCAN| TEST_IDX |     4 |    24 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("TEST"."A"=' ' AND "TEST"."B">=' ' AND "TEST"."B" IS NOT
              NULL)
   5 - access("TEST"."A">' ' AND "TEST"."A" IS NOT NULL)

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)


Statistics
----------------------------------------------------------
         13  recursive calls
          0  db block gets
          6  consistent gets
          0  physical reads
          0  redo size
        686  bytes sent via SQL*Net to client
        704  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          4  rows processed

SQL>

It would appear to be a fairly simple case of an incomplete query transformation; the ORDER BY clause is missing from the transformed query, so the rows come back in whatever order the HASH UNIQUE step emits them rather than in the requested sort order. Notice that when the ORDER BY is included the HASH UNIQUE becomes a SORT UNIQUE; in the query not undergoing the transformation the final step is a SORT UNIQUE NOSORT, due to the index access:


SQL> select distinct a,b from test
  2  where (a=' ' and b>=' ') or (a>' ') order by a,b;

A B
- -
a 1
a 2
b 1
b 2

Execution Plan
----------------------------------------------------------
Plan hash value: 4069179713

-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |     4 |    24 |     2  (50)| 00:00:01 |
|   1 |  SORT UNIQUE NOSORT|          |     4 |    24 |     2  (50)| 00:00:01 |
|*  2 |   INDEX FULL SCAN  | TEST_IDX |     4 |    24 |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A">' ' OR "A"=' ' AND "B">=' ')

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)


Statistics
----------------------------------------------------------
         21  recursive calls
          0  db block gets
          8  consistent gets
          0  physical reads
          0  redo size
        686  bytes sent via SQL*Net to client
        440  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          4  rows processed

Of course the ORDER BY already exists in the original query, so it can’t simply be added to fix the errant output; the omission occurs inside the transformation. Until Oracle gets around to correcting this the best option is to simply turn the feature off:


alter session set "_optimizer_cbqt_or_expansion"=off;

A database engine is a complicated piece of software and, as with all complex programs, fixing bugs or providing new features in one section of the code can result in bugs surfacing in other parts. Even though valiant attempts are made to test such changes it’s highly likely that not every case will be discovered, so errant results can be produced. The issue has been reported to Oracle Support and it is reproducible, so chances are good it will be fixed in an upcoming release.

Presuming, of course, nothing else goes wrong.

September 15, 2020

A Blooming Union

Filed under: General — dfitzjarrell @ 12:29

"What a silly system." The boy laughed. "Then your head keeps changing its
height and you always see things in a different way? Why, when you're fifteen
things won't look at all the way they did when you were ten, and at twenty
everything will change again."
-- Norton Juster,"The Phantom Tollbooth"

Oracle functionality and features can change with each release of the database. Introduced in 10.2, Bloom filters have expanded in scope over time and can now find themselves used in UNION ALL queries. This can make large rowsource operations more efficient by ‘weeding out’ unwanted rows before the final result set is constructed. Let’s take a closer look at that.

There may be some who wonder what Bloom filters are, so a bit of explanation may be helpful. Named after Burton Howard Bloom, who came up with the concept in 1970, a Bloom filter is an efficient data structure used to quickly determine whether an element has a high probability of being a member of a given set. It’s based on a bit array that allows for rapid searches and returns one of two results: either the element is probably in the set (which can produce false positives) or the element is definitely not in the set. The filter cannot produce false negatives, and the incidence of false positives is relatively rare. Another advantage of Bloom filters is their small size relative to other data structures used for similar purposes (self-balancing binary search trees, hash tables, or linked lists). The possibility of false positives necessitates the addition of another filter to eliminate them from the results, yet such a filter doesn’t add appreciably to the processing time and, therefore, goes relatively unnoticed.
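
For readers curious about the mechanics, the anonymous PL/SQL block below is a toy, concept-only sketch of a Bloom filter (it is not how Oracle implements its internal filters); the bit count, the number of hash functions and the salting scheme are arbitrary choices:

set serveroutput on

declare
    type bit_tab is table of pls_integer index by pls_integer;
    bits  bit_tab;                        -- sparse stand-in for the bit array
    m     constant pls_integer := 1024;   -- number of bit positions
    k     constant pls_integer := 3;      -- number of hash functions

    procedure add_member(p in varchar2) is
    begin
        for i in 1 .. k loop
            -- salting the value with the hash number simulates k independent hash functions
            bits(dbms_utility.get_hash_value(i || '#' || p, 1, m)) := 1;
        end loop;
    end;

    function check_member(p in varchar2) return varchar2 is
    begin
        for i in 1 .. k loop
            if not bits.exists(dbms_utility.get_hash_value(i || '#' || p, 1, m)) then
                return 'definitely absent';   -- Bloom filters produce no false negatives
            end if;
        end loop;
        return 'probably present';            -- false positives remain possible
    end;
begin
    add_member('SCHEDULE');
    add_member('TABLE');
    dbms_output.put_line('SCHEDULE -> ' || check_member('SCHEDULE'));
    dbms_output.put_line('SYNONYM  -> ' || check_member('SYNONYM'));
end;
/

Running it reports ‘probably present’ for the values that were added and ‘definitely absent’ for the value that was not (barring an unlucky false positive).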

Let’s use an example modeled after one provided by Jonathan Lewis; slight changes in the text were made but the overall process is how Jonathan defined it. It was run in a 19c database with the optimizer_features_enable session parameter modified to reflect the release behavior of 12.2.0.1. A basic data set is created using ALL_OBJECTS:


SQL> 
SQL> create table tab_one as select * from all_objects where rownum <= 50000;

Table created.

SQL> create table tab_two as select tab_one.* from tab_one, (select rownum n1 from dual connect by level <= 4);

Table created.

SQL> create table tab_three as select tab_one.* from tab_one, (select rownum n1 from dual connect by level <= 4);

Table created.

SQL> 
SQL> begin
  2  	     dbms_stats.gather_table_stats(
  3  		     ownname	 => null,
  4  		     tabname	 => 'TAB_ONE',
  5  		     method_opt  => 'for all columns size 1 for columns object_type size 254'
  6  	     );
  7  end;
  8  /

PL/SQL procedure successfully completed.

SQL> 

A simple join between TAB_ONE and TAB_TWO is executed, producing the expected Bloom filter:


SQL> 
SQL> doc
DOC>
DOC>Optimizer features 12.2.0.1
DOC>
DOC>Bloom filter created and used
DOC>
DOC>#
SQL> 
SQL> alter session set optimizer_features_enable='12.2.0.1';

Session altered.

SQL> 
SQL> select
  2  	     /*+
  3  		     parallel(2)
  4  		     gather_plan_statistics
  5  	     */
  6  	     tab_one.object_name, vtab_one.object_name
  7  from
  8  	     tab_one,
  9  	     tab_two vtab_one
 10  where
 11  	     tab_one.object_type = 'SCHEDULE'
 12  and     vtab_one.object_id = tab_one.object_id
 13  /

OBJECT_NAME                         OBJECT_NAME
----------------------------------- -----------------------------------
DAILY_PURGE_SCHEDULE                DAILY_PURGE_SCHEDULE
FILE_WATCHER_SCHEDULE               FILE_WATCHER_SCHEDULE
PMO_DEFERRED_GIDX_MAINT_SCHED       PMO_DEFERRED_GIDX_MAINT_SCHED
BSLN_MAINTAIN_STATS_SCHED           BSLN_MAINTAIN_STATS_SCHED
DAILY_PURGE_SCHEDULE                DAILY_PURGE_SCHEDULE
FILE_WATCHER_SCHEDULE               FILE_WATCHER_SCHEDULE
PMO_DEFERRED_GIDX_MAINT_SCHED       PMO_DEFERRED_GIDX_MAINT_SCHED
BSLN_MAINTAIN_STATS_SCHED           BSLN_MAINTAIN_STATS_SCHED
DAILY_PURGE_SCHEDULE                DAILY_PURGE_SCHEDULE
FILE_WATCHER_SCHEDULE               FILE_WATCHER_SCHEDULE
PMO_DEFERRED_GIDX_MAINT_SCHED       PMO_DEFERRED_GIDX_MAINT_SCHED
BSLN_MAINTAIN_STATS_SCHED           BSLN_MAINTAIN_STATS_SCHED
DAILY_PURGE_SCHEDULE                DAILY_PURGE_SCHEDULE
FILE_WATCHER_SCHEDULE               FILE_WATCHER_SCHEDULE
PMO_DEFERRED_GIDX_MAINT_SCHED       PMO_DEFERRED_GIDX_MAINT_SCHED
BSLN_MAINTAIN_STATS_SCHED           BSLN_MAINTAIN_STATS_SCHED

16 rows selected.

SQL> 
SQL> select * From table(dbms_xplan.display_cursor( format=>'allstats last'));

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------
SQL_ID  663n8j3g0g63u, child number 0
-------------------------------------
select         /*+                 parallel(2)
gather_plan_statistics         */         tab1.object_name,
vtab1.object_name from         tab1,         tab2 vtab1 where
tab1.object_type = 'SCHEDULE' and     vtab1.object_id = tab1.object_id

Plan hash value: 1427754421

------------------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |          |      1 |        |     16 |00:00:00.07 |       5 |       |       |          |
|   1 |  PX COORDINATOR        |          |      1 |        |     16 |00:00:00.07 |       5 | 73728 | 73728 |          |
|   2 |   PX SEND QC (RANDOM)  | :TQ10000 |      0 |     16 |      0 |00:00:00.01 |       0 |       |       |          |
|*  3 |    HASH JOIN           |          |      0 |     16 |      0 |00:00:00.01 |       0 |  1250K|  1250K|  566K (0)|
|   4 |     JOIN FILTER CREATE | :BF0000  |      0 |      4 |      0 |00:00:00.01 |       0 |       |       |          |
|*  5 |      TABLE ACCESS FULL | TAB1     |      0 |      4 |      0 |00:00:00.01 |       0 |       |       |          |
|   6 |     JOIN FILTER USE    | :BF0000  |      0 |    200K|      0 |00:00:00.01 |       0 |       |       |          |
|   7 |      PX BLOCK ITERATOR |          |      0 |    200K|      0 |00:00:00.01 |       0 |       |       |          |
|*  8 |       TABLE ACCESS FULL| TAB2     |      0 |    200K|      0 |00:00:00.01 |       0 |       |       |          |
------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("VTAB1"."OBJECT_ID"="TAB1"."OBJECT_ID")
   5 - filter("TAB1"."OBJECT_TYPE"='SCHEDULE')
   8 - access(:Z>=:Z AND :Z<=:Z)
       filter(SYS_OP_BLOOM_FILTER(:BF0000,"VTAB1"."OBJECT_ID"))

Note
-----
   - Degree of Parallelism is 2 because of hint


35 rows selected.

SQL> 

Let’s now alter the query by adding a UNION ALL construct between TAB_TWO and TAB_THREE, still using the optimizer features set for 12.2.0.1; notice the Bloom filter is now missing, with the result that all 400,000 rows end up being processed:


SQL> 
SQL> doc
DOC>
DOC>Bloom filter not created -- 12.2.0.1 parallel
DOC>
DOC>#
SQL> select
  2  	     /*+
  3  		     parallel(2)
  4  		     gather_plan_statistics
  5  	     */
  6  	     tab_one.object_name, vtab_one.object_name
  7  from
  8  	     tab_one,
  9  	     (select * from tab_two union all select * from tab_three) vtab_one
 10  where
 11  	     tab_one.object_type = 'SCHEDULE'
 12  and     vtab_one.object_id = tab_one.object_id
 13  /

OBJECT_NAME                         OBJECT_NAME
----------------------------------- -----------------------------------
DAILY_PURGE_SCHEDULE                DAILY_PURGE_SCHEDULE
FILE_WATCHER_SCHEDULE               FILE_WATCHER_SCHEDULE
PMO_DEFERRED_GIDX_MAINT_SCHED       PMO_DEFERRED_GIDX_MAINT_SCHED
BSLN_MAINTAIN_STATS_SCHED           BSLN_MAINTAIN_STATS_SCHED
DAILY_PURGE_SCHEDULE                DAILY_PURGE_SCHEDULE
FILE_WATCHER_SCHEDULE               FILE_WATCHER_SCHEDULE
PMO_DEFERRED_GIDX_MAINT_SCHED       PMO_DEFERRED_GIDX_MAINT_SCHED
BSLN_MAINTAIN_STATS_SCHED           BSLN_MAINTAIN_STATS_SCHED
DAILY_PURGE_SCHEDULE                DAILY_PURGE_SCHEDULE
FILE_WATCHER_SCHEDULE               FILE_WATCHER_SCHEDULE
PMO_DEFERRED_GIDX_MAINT_SCHED       PMO_DEFERRED_GIDX_MAINT_SCHED
BSLN_MAINTAIN_STATS_SCHED           BSLN_MAINTAIN_STATS_SCHED
DAILY_PURGE_SCHEDULE                DAILY_PURGE_SCHEDULE
FILE_WATCHER_SCHEDULE               FILE_WATCHER_SCHEDULE
PMO_DEFERRED_GIDX_MAINT_SCHED       PMO_DEFERRED_GIDX_MAINT_SCHED
BSLN_MAINTAIN_STATS_SCHED           BSLN_MAINTAIN_STATS_SCHED
DAILY_PURGE_SCHEDULE                DAILY_PURGE_SCHEDULE
FILE_WATCHER_SCHEDULE               FILE_WATCHER_SCHEDULE
PMO_DEFERRED_GIDX_MAINT_SCHED       PMO_DEFERRED_GIDX_MAINT_SCHED
BSLN_MAINTAIN_STATS_SCHED           BSLN_MAINTAIN_STATS_SCHED
DAILY_PURGE_SCHEDULE                DAILY_PURGE_SCHEDULE
FILE_WATCHER_SCHEDULE               FILE_WATCHER_SCHEDULE
PMO_DEFERRED_GIDX_MAINT_SCHED       PMO_DEFERRED_GIDX_MAINT_SCHED
BSLN_MAINTAIN_STATS_SCHED           BSLN_MAINTAIN_STATS_SCHED
DAILY_PURGE_SCHEDULE                DAILY_PURGE_SCHEDULE
FILE_WATCHER_SCHEDULE               FILE_WATCHER_SCHEDULE
PMO_DEFERRED_GIDX_MAINT_SCHED       PMO_DEFERRED_GIDX_MAINT_SCHED
BSLN_MAINTAIN_STATS_SCHED           BSLN_MAINTAIN_STATS_SCHED
DAILY_PURGE_SCHEDULE                DAILY_PURGE_SCHEDULE
FILE_WATCHER_SCHEDULE               FILE_WATCHER_SCHEDULE
PMO_DEFERRED_GIDX_MAINT_SCHED       PMO_DEFERRED_GIDX_MAINT_SCHED
BSLN_MAINTAIN_STATS_SCHED           BSLN_MAINTAIN_STATS_SCHED

32 rows selected.

SQL> 
SQL> select * From table(dbms_xplan.display_cursor( format=>'allstats last'));

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
SQL_ID  08q5f2nq822uj, child number 0
-------------------------------------
select         /*+                 parallel(2)
gather_plan_statistics         */         tab1.object_name,
vtab1.object_name from         tab1,         (select * from tab2 union
all select * from tab3) vtab1 where         tab1.object_type =
'SCHEDULE' and     vtab1.object_id = tab1.object_id

Plan hash value: 4060012981

-------------------------------------------------------------------------------------------------------------------------
| Id  | Operation               | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |          |      1 |        |     32 |00:00:00.19 |      10 |       |       |          |
|   1 |  PX COORDINATOR         |          |      1 |        |     32 |00:00:00.19 |      10 | 73728 | 73728 |          |
|   2 |   PX SEND QC (RANDOM)   | :TQ10000 |      0 |     32 |      0 |00:00:00.01 |       0 |       |       |          |
|*  3 |    HASH JOIN            |          |      0 |     32 |      0 |00:00:00.01 |       0 |  1250K|  1250K|  693K (0)|
|*  4 |     TABLE ACCESS FULL   | TAB1     |      0 |      4 |      0 |00:00:00.01 |       0 |       |       |          |
|   5 |     VIEW                |          |      0 |    400K|      0 |00:00:00.01 |       0 |       |       |          |
|   6 |      UNION-ALL          |          |      0 |        |      0 |00:00:00.01 |       0 |       |       |          |
|   7 |       PX BLOCK ITERATOR |          |      0 |    200K|      0 |00:00:00.01 |       0 |       |       |          |
|*  8 |        TABLE ACCESS FULL| TAB2     |      0 |    200K|      0 |00:00:00.01 |       0 |       |       |          |
|   9 |       PX BLOCK ITERATOR |          |      0 |    200K|      0 |00:00:00.01 |       0 |       |       |          |
|* 10 |        TABLE ACCESS FULL| TAB3     |      0 |    200K|      0 |00:00:00.01 |       0 |       |       |          |
-------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("VTAB1"."OBJECT_ID"="TAB1"."OBJECT_ID")
   4 - filter("TAB1"."OBJECT_TYPE"='SCHEDULE')
   8 - access(:Z>=:Z AND :Z<=:Z)
  10 - access(:Z>=:Z AND :Z<=:Z)

Note
-----
   - Degree of Parallelism is 2 because of hint


38 rows selected

SQL> 

Let’s take this one step further and try to get Oracle, by force, to generate a plan using a Bloom filter. The hints used will, indeed, generate a Bloom filter, but not quite the way we’d want it created:


SQL> 
SQL> doc
DOC>
DOC>Optimizer features 19.1.0
DOC>
DOC>Bloom filter created and used
DOC>
DOC>#
SQL> 
SQL> select
  2  	     /*+
  3  		     parallel(2)
  4  		     gather_plan_statistics
  5  		     leading(@sel$1 tab_one@sel$1 vtab_one@sel$1)
  6  		     use_hash(@sel$1 vtab_one@sel$1)
  7  		     pq_distribute(@sel$1 vtab_one@sel$1 hash hash)
  8  	     */
  9  	     tab_one.object_name, vtab_one.object_name
 10  from
 11  	     tab_one,
 12  	     (select * from tab_two union all select * from tab_three) vtab_one
 13  where
 14  	     tab_one.object_type = 'SCHEDULE'
 15  and     vtab_one.object_id = tab_one.object_id
 16  /

OBJECT_NAME                         OBJECT_NAME
----------------------------------- -----------------------------------
DAILY_PURGE_SCHEDULE                DAILY_PURGE_SCHEDULE
FILE_WATCHER_SCHEDULE               FILE_WATCHER_SCHEDULE
PMO_DEFERRED_GIDX_MAINT_SCHED       PMO_DEFERRED_GIDX_MAINT_SCHED
BSLN_MAINTAIN_STATS_SCHED           BSLN_MAINTAIN_STATS_SCHED
DAILY_PURGE_SCHEDULE                DAILY_PURGE_SCHEDULE
FILE_WATCHER_SCHEDULE               FILE_WATCHER_SCHEDULE
PMO_DEFERRED_GIDX_MAINT_SCHED       PMO_DEFERRED_GIDX_MAINT_SCHED
BSLN_MAINTAIN_STATS_SCHED           BSLN_MAINTAIN_STATS_SCHED
DAILY_PURGE_SCHEDULE                DAILY_PURGE_SCHEDULE
FILE_WATCHER_SCHEDULE               FILE_WATCHER_SCHEDULE
PMO_DEFERRED_GIDX_MAINT_SCHED       PMO_DEFERRED_GIDX_MAINT_SCHED
BSLN_MAINTAIN_STATS_SCHED           BSLN_MAINTAIN_STATS_SCHED
DAILY_PURGE_SCHEDULE                DAILY_PURGE_SCHEDULE
FILE_WATCHER_SCHEDULE               FILE_WATCHER_SCHEDULE
PMO_DEFERRED_GIDX_MAINT_SCHED       PMO_DEFERRED_GIDX_MAINT_SCHED
BSLN_MAINTAIN_STATS_SCHED           BSLN_MAINTAIN_STATS_SCHED
DAILY_PURGE_SCHEDULE                DAILY_PURGE_SCHEDULE
FILE_WATCHER_SCHEDULE               FILE_WATCHER_SCHEDULE
PMO_DEFERRED_GIDX_MAINT_SCHED       PMO_DEFERRED_GIDX_MAINT_SCHED
BSLN_MAINTAIN_STATS_SCHED           BSLN_MAINTAIN_STATS_SCHED
DAILY_PURGE_SCHEDULE                DAILY_PURGE_SCHEDULE
FILE_WATCHER_SCHEDULE               FILE_WATCHER_SCHEDULE
PMO_DEFERRED_GIDX_MAINT_SCHED       PMO_DEFERRED_GIDX_MAINT_SCHED
BSLN_MAINTAIN_STATS_SCHED           BSLN_MAINTAIN_STATS_SCHED
DAILY_PURGE_SCHEDULE                DAILY_PURGE_SCHEDULE
FILE_WATCHER_SCHEDULE               FILE_WATCHER_SCHEDULE
PMO_DEFERRED_GIDX_MAINT_SCHED       PMO_DEFERRED_GIDX_MAINT_SCHED
BSLN_MAINTAIN_STATS_SCHED           BSLN_MAINTAIN_STATS_SCHED
DAILY_PURGE_SCHEDULE                DAILY_PURGE_SCHEDULE
FILE_WATCHER_SCHEDULE               FILE_WATCHER_SCHEDULE
PMO_DEFERRED_GIDX_MAINT_SCHED       PMO_DEFERRED_GIDX_MAINT_SCHED
BSLN_MAINTAIN_STATS_SCHED           BSLN_MAINTAIN_STATS_SCHED

32 rows selected.

SQL> 
SQL> select * From table(dbms_xplan.display_cursor( format=>'allstats last'));

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
SQL_ID  71c28r1gch210, child number 0
-------------------------------------
select         /*+                 parallel(2)
gather_plan_statistics                 leading(@sel$1 tab_one@sel$1
vtab_one@sel$1)                 use_hash(@sel$1 vtab_one@sel$1)
  pq_distribute(@sel$1 vtab_one@sel$1 hash hash)         */
tab_one.object_name, vtab_one.object_name from         tab_one,         (select
* from tab_two union all select * from tab_three) vtab_one where
tab_one.object_type = 'SCHEDULE' and     vtab_one.object_id = tab_one.object_id

Plan hash value: 994466787

-----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |          |      1 |        |     32 |00:00:00.15 |      15 |       |       |          |
|   1 |  PX COORDINATOR             |          |      1 |        |     32 |00:00:00.15 |      15 | 73728 | 73728 |          |
|   2 |   PX SEND QC (RANDOM)       | :TQ10002 |      0 |     32 |      0 |00:00:00.01 |       0 |       |       |          |
|*  3 |    HASH JOIN BUFFERED       |          |      0 |     32 |      0 |00:00:00.01 |       0 |  2290K|  2082K|  911K (0)|
|   4 |     JOIN FILTER CREATE      | :BF0000  |      0 |      4 |      0 |00:00:00.01 |       0 |       |       |          |
|   5 |      PX RECEIVE             |          |      0 |      4 |      0 |00:00:00.01 |       0 |       |       |          |
|   6 |       PX SEND HYBRID HASH   | :TQ10000 |      0 |      4 |      0 |00:00:00.01 |       0 |       |       |          |
|   7 |        STATISTICS COLLECTOR |          |      0 |        |      0 |00:00:00.01 |       0 |       |       |          |
|   8 |         PX BLOCK ITERATOR   |          |      0 |      4 |      0 |00:00:00.01 |       0 |       |       |          |
|*  9 |          TABLE ACCESS FULL  | TAB1     |      0 |      4 |      0 |00:00:00.01 |       0 |       |       |          |
|  10 |     PX RECEIVE              |          |      0 |    400K|      0 |00:00:00.01 |       0 |       |       |          |
|  11 |      PX SEND HYBRID HASH    | :TQ10001 |      0 |    400K|      0 |00:00:00.01 |       0 |       |       |          |
|  12 |       JOIN FILTER USE       | :BF0000  |      0 |    400K|      0 |00:00:00.01 |       0 |       |       |          |
|  13 |        VIEW                 |          |      0 |    400K|      0 |00:00:00.01 |       0 |       |       |          |
|  14 |         UNION-ALL           |          |      0 |        |      0 |00:00:00.01 |       0 |       |       |          |
|  15 |          PX BLOCK ITERATOR  |          |      0 |    200K|      0 |00:00:00.01 |       0 |       |       |          |
|* 16 |           TABLE ACCESS FULL | TAB2     |      0 |    200K|      0 |00:00:00.01 |       0 |       |       |          |
|  17 |          PX BLOCK ITERATOR  |          |      0 |    200K|      0 |00:00:00.01 |       0 |       |       |          |
|* 18 |           TABLE ACCESS FULL | TAB3     |      0 |    200K|      0 |00:00:00.01 |       0 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("VTAB1"."OBJECT_ID"="TAB1"."OBJECT_ID")
   9 - access(:Z>=:Z AND :Z<=:Z)
       filter("TAB1"."OBJECT_TYPE"='SCHEDULE')
  16 - access(:Z>=:Z AND :Z<=:Z)
  18 - access(:Z>=:Z AND :Z<=:Z)

Note
-----
   - Degree of Parallelism is 2 because of hint


49 rows selected.

SQL> 

Unfortunately the filter was not used until after the view was generated, resulting in all 400,000 rows being processed by the UNION ALL. Since this is 19c, the hints at lines 5, 6, and 7 have been removed and the query run again, this time using the UNION ALL enhancement implemented by Oracle and producing a different execution plan:


SQL> select
  2          /*+
  3                  parallel(2)
  4                  gather_plan_statistics
  5          */
  6          tab1.object_name, vtab1.object_name
  7  from
  8          tab1,
  9          (select * from tab2 union all select * from tab3) vtab1
 10  where
 11          tab1.object_type = 'SCHEDULE'
 12  and        vtab1.object_id = tab1.object_id
 13  /

OBJECT_NAME                         OBJECT_NAME
----------------------------------- -----------------------------------
DAILY_PURGE_SCHEDULE                DAILY_PURGE_SCHEDULE
FILE_WATCHER_SCHEDULE               FILE_WATCHER_SCHEDULE
PMO_DEFERRED_GIDX_MAINT_SCHED       PMO_DEFERRED_GIDX_MAINT_SCHED
BSLN_MAINTAIN_STATS_SCHED           BSLN_MAINTAIN_STATS_SCHED
DAILY_PURGE_SCHEDULE                DAILY_PURGE_SCHEDULE
FILE_WATCHER_SCHEDULE               FILE_WATCHER_SCHEDULE
PMO_DEFERRED_GIDX_MAINT_SCHED       PMO_DEFERRED_GIDX_MAINT_SCHED
BSLN_MAINTAIN_STATS_SCHED           BSLN_MAINTAIN_STATS_SCHED
DAILY_PURGE_SCHEDULE                DAILY_PURGE_SCHEDULE
FILE_WATCHER_SCHEDULE               FILE_WATCHER_SCHEDULE
PMO_DEFERRED_GIDX_MAINT_SCHED       PMO_DEFERRED_GIDX_MAINT_SCHED
BSLN_MAINTAIN_STATS_SCHED           BSLN_MAINTAIN_STATS_SCHED
DAILY_PURGE_SCHEDULE                DAILY_PURGE_SCHEDULE
FILE_WATCHER_SCHEDULE               FILE_WATCHER_SCHEDULE
PMO_DEFERRED_GIDX_MAINT_SCHED       PMO_DEFERRED_GIDX_MAINT_SCHED
BSLN_MAINTAIN_STATS_SCHED           BSLN_MAINTAIN_STATS_SCHED
DAILY_PURGE_SCHEDULE                DAILY_PURGE_SCHEDULE
FILE_WATCHER_SCHEDULE               FILE_WATCHER_SCHEDULE
PMO_DEFERRED_GIDX_MAINT_SCHED       PMO_DEFERRED_GIDX_MAINT_SCHED
BSLN_MAINTAIN_STATS_SCHED           BSLN_MAINTAIN_STATS_SCHED
DAILY_PURGE_SCHEDULE                DAILY_PURGE_SCHEDULE
FILE_WATCHER_SCHEDULE               FILE_WATCHER_SCHEDULE
PMO_DEFERRED_GIDX_MAINT_SCHED       PMO_DEFERRED_GIDX_MAINT_SCHED
BSLN_MAINTAIN_STATS_SCHED           BSLN_MAINTAIN_STATS_SCHED
DAILY_PURGE_SCHEDULE                DAILY_PURGE_SCHEDULE
FILE_WATCHER_SCHEDULE               FILE_WATCHER_SCHEDULE
PMO_DEFERRED_GIDX_MAINT_SCHED       PMO_DEFERRED_GIDX_MAINT_SCHED
BSLN_MAINTAIN_STATS_SCHED           BSLN_MAINTAIN_STATS_SCHED
DAILY_PURGE_SCHEDULE                DAILY_PURGE_SCHEDULE
FILE_WATCHER_SCHEDULE               FILE_WATCHER_SCHEDULE
PMO_DEFERRED_GIDX_MAINT_SCHED       PMO_DEFERRED_GIDX_MAINT_SCHED
BSLN_MAINTAIN_STATS_SCHED           BSLN_MAINTAIN_STATS_SCHED

32 rows selected.

SQL>
SQL> select * From table(dbms_xplan.display_cursor( format=>'allstats last'));

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
SQL_ID  08q5f2nq822uj, child number 1
-------------------------------------
select         /*+                 parallel(2)
gather_plan_statistics         */         tab1.object_name,
vtab1.object_name from         tab1,         (select * from tab2 union
all select * from tab3) vtab1 where         tab1.object_type =
'SCHEDULE' and     vtab1.object_id = tab1.object_id

Plan hash value: 2580303290

--------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |          |      1 |        |     32 |00:00:00.09 |      10 |       |       |          |
|   1 |  PX COORDINATOR          |          |      1 |        |     32 |00:00:00.09 |      10 | 73728 | 73728 |          |
|   2 |   PX SEND QC (RANDOM)    | :TQ10000 |      0 |     32 |      0 |00:00:00.01 |       0 |       |       |          |
|*  3 |    HASH JOIN             |          |      0 |     32 |      0 |00:00:00.01 |       0 |  1250K|  1250K|  553K (0)|
|   4 |     JOIN FILTER CREATE   | :BF0000  |      0 |      4 |      0 |00:00:00.01 |       0 |       |       |          |
|*  5 |      TABLE ACCESS FULL   | TAB1     |      0 |      4 |      0 |00:00:00.01 |       0 |       |       |          |
|   6 |     VIEW                 |          |      0 |    400K|      0 |00:00:00.01 |       0 |       |       |          |
|   7 |      UNION-ALL           |          |      0 |        |      0 |00:00:00.01 |       0 |       |       |          |
|   8 |       JOIN FILTER USE    | :BF0000  |      0 |    200K|      0 |00:00:00.01 |       0 |       |       |          |
|   9 |        PX BLOCK ITERATOR |          |      0 |    200K|      0 |00:00:00.01 |       0 |       |       |          |
|* 10 |         TABLE ACCESS FULL| TAB2     |      0 |    200K|      0 |00:00:00.01 |       0 |       |       |          |
|  11 |       JOIN FILTER USE    | :BF0000  |      0 |    200K|      0 |00:00:00.01 |       0 |       |       |          |
|  12 |        PX BLOCK ITERATOR |          |      0 |    200K|      0 |00:00:00.01 |       0 |       |       |          |
|* 13 |         TABLE ACCESS FULL| TAB3     |      0 |    200K|      0 |00:00:00.01 |       0 |       |       |          |
--------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("VTAB1"."OBJECT_ID"="TAB1"."OBJECT_ID")
   5 - filter("TAB1"."OBJECT_TYPE"='SCHEDULE')
  10 - access(:Z>=:Z AND :Z<=:Z)
       filter(SYS_OP_BLOOM_FILTER(:BF0000,"TAB2"."OBJECT_ID"))
  13 - access(:Z>=:Z AND :Z<=:Z)
       filter(SYS_OP_BLOOM_FILTER(:BF0000,"TAB3"."OBJECT_ID"))

Note
-----
   - Degree of Parallelism is 2 because of hint


43 rows selected.

SQL>

The predicate information for steps 10 and 13 now shows the Bloom filter being applied as the tables are scanned to build the view, rather than filtering the view after it has been built, which is the desired result.

New features are always exciting, and their evolution can provide performance improvements in how those features run or in how they can be applied. Even older technologies can receive an update, elevating their place in the pantheon of Oracle technology, though the benefit may not immediately be apparent.

Which isn’t such a silly system, if you think about it.

September 10, 2020

Going My Way?

Filed under: General — dfitzjarrell @ 17:03

"There are no wrong roads to anywhere."
-- Norton Juster, The Phantom Tollbooth 

Since version 12 Oracle has changed the way EXTPROC is configured, and it may be confusing to some. The documentation provides two methods to set up EXTPROC functionality, one including the listener (as older versions required), but that method isn’t like the original as it involves both the listener.ora and tnsnames.ora files. Compound that with the “new way” to configure the service and it may leave a DBA wondering what, exactly, needs to be done. Let’s clear that up.

In versions 12 and later a new configuration file, extproc.ora, located under $ORACLE_HOME/hs/admin, provides the location for the default configuration settings. At installation this file contains the following text:


# 
# extproc.ora is used by extproc in the default Oracle configuration.
#
# This is a sample extproc init file that contains a name-value(s) pair which
# is same as the value of ENVS parameter in listener.ora file.
#
# Syntax: SET name=value (environment variable name and value)
# 
# When specifying the EXTPROC_DLLS environment variable to restrict the DLLs 
# that extproc is allowed to load, you may set EXTPROC_DLLS to one of the 
# following values:
# 
# * ONLY (maximum security) 
#
#   When EXTPROC_DLLS=ONLY:DLL[:DLL], only the specified DLL(s) can be loaded.
# 
#   Syntax: SET EXTPROC_DLLS=ONLY:DLL[:DLL]
#
# * NULL (the default value)
#
#   When EXTPROC_DLLS is not set, only the DLL(s) in $ORACLE_HOME/bin and 
#   ORACLE_HOME/lib can be loaded.
#
# * Colon-seperated list of the DLLs
#
#   When EXTPROC_DLLS=DLL[:DLL], the specified DLL(s) and the DLLs in 
#   $ORACLE_HOME/bin and ORACLE_HOME/lib can be loaded.
# 
#   Syntax: SET EXTPROC_DLLS=DLL:DLL
#
# * ANY 
#   When EXTPROC_DLLS=ANY, DLL checking is disabled.
# 
#   Syntax: SET EXTPROC_DLLS=ANY
#
#
# To turn extproc tracing on, set TRACE_LEVEL=ON (default is OFF).
#
#   Syntax: TRACE_LEVEL=ON
#

The text is really self-explanatory, providing instructions and parameter setting examples to make it fairly simple to get extproc functioning. The most basic of configurations involves a single setting:


SET EXTPROC_DLLS=ANY

which allows Oracle to use any libraries created in the database. Specific locations can be set using the following syntax:


SET EXTPROC_DLLS=DLL:DLL

where DLL represents any DLL or shared library, including the full path, located on the database server. By default the DLLs under $ORACLE_HOME/bin and $ORACLE_HOME/lib can also be loaded, even when they are not explicitly listed. An example would be:


SET EXTPROC_DLLS=/my/lib/shared1.so:/my/lib/shared2.so:/your/lib/shared3.so

Entries are separated with colons; no restrictions are listed other than the overall line length, but if there is a large number of libraries to configure it may be best to use the ANY setting to avoid any potential issues. LD_LIBRARY_PATH can be set, if desired, but it isn’t necessary as the configuration file applies to the current $ORACLE_HOME. [In installations where the listener has a separate home, setting LD_LIBRARY_PATH may be required.] A configuration file for a working installation is shown below:


# 
# extproc.ora is used by extproc in the default Oracle configuration.
#
# This is a sample extproc init file that contains a name-value(s) pair which
# is same as the value of ENVS parameter in listener.ora file.
#
# Syntax: SET name=value (environment variable name and value)
# 
# When specifying the EXTPROC_DLLS environment variable to restrict the DLLs 
# that extproc is allowed to load, you may set EXTPROC_DLLS to one of the 
# following values:
# 
# * ONLY (maximum security) 
#
#   When EXTPROC_DLLS=ONLY:DLL[:DLL], only the specified DLL(s) can be loaded.
# 
#   Syntax: SET EXTPROC_DLLS=ONLY:DLL[:DLL]
#
# * NULL (the default value)
#
#   When EXTPROC_DLLS is not set, only the DLL(s) in $ORACLE_HOME/bin and 
#   ORACLE_HOME/lib can be loaded.
#
# * Colon-seperated list of the DLLs
#
#   When EXTPROC_DLLS=DLL[:DLL], the specified DLL(s) and the DLLs in 
#   $ORACLE_HOME/bin and ORACLE_HOME/lib can be loaded.
# 
#   Syntax: SET EXTPROC_DLLS=DLL:DLL
#
# * ANY 
#   When EXTPROC_DLLS=ANY, DLL checking is disabled.
# 
#   Syntax: SET EXTPROC_DLLS=ANY
#
#
# To turn extproc tracing on, set TRACE_LEVEL=ON (default is OFF).
#
#   Syntax: TRACE_LEVEL=ON
#
SET EXTPROC_DLLS=ANY

At this point nothing further needs to be done; EXTPROC is now configured.
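A quick smoke test from SQL*Plus confirms the setup. The following is a minimal sketch, not part of the configuration above; the library path /my/lib/shared1.so (borrowed from the earlier example) and the exported C function name myfunc are assumptions for illustration only:


-- Library object pointing at a shared library permitted by EXTPROC_DLLS
-- (the path and the exported C function name below are hypothetical).
CREATE OR REPLACE LIBRARY demo_lib AS '/my/lib/shared1.so';
/

-- Call specification mapping a PL/SQL function to the external C function.
CREATE OR REPLACE FUNCTION demo_extfunc (p_val IN PLS_INTEGER)
RETURN PLS_INTEGER
AS LANGUAGE C
   LIBRARY demo_lib
   NAME "myfunc"
   PARAMETERS (p_val INT, RETURN INT);
/

-- If extproc is configured correctly this call spawns the agent and returns
-- whatever myfunc returns; errors such as ORA-28575 (unable to open RPC
-- connection to external procedure agent) point back at the extproc or
-- listener configuration rather than the library itself.
SELECT demo_extfunc(42) FROM dual;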

For those who aren’t comfortable with the new configuration file, the listener.ora can still be configured for EXTPROC; this configuration now involves the tnsnames.ora file as well. To have the listener spawn the extproc process the following entries need to be added to the listener.ora file:


EXTPLSNR=
  (DESCRIPTION=
    (ADDRESS_LIST=
      (ADDRESS=(PROTOCOL=ipc)(KEY=extp))))

SID_LIST_EXTPLSNR=
  (SID_LIST=
    (SID_DESC=
      (SID_NAME=ep_agt1)
      (ORACLE_HOME=/u01/app/oracbase/product/12.2.0/dbhome_1)
      (ENVS="EXTPROC_DLLS=ANY LD_LIBRARY_PATH=/home/oracle/lib")
      (ORACLE_SID=extp)
      (PROGRAM= extproc)))

The tnsnames.ora file also needs changes, a departure from the pre-12 configuration that involved only the EXTPROC-specific listener settings; the required entry is provided below:


extproc_agent=
  (DESCRIPTION=
    (ADDRESS=(PROTOCOL=ipc)(KEY=extp))
    (CONNECT_DATA=
      (PRESENTATION=RO)
      (SID=ep_agt1)))

Yet another change requires that the new EXTPROC listener, EXTPLSNR, be started:


$  lsnrctl start extplsnr

The usual listener output should be seen:


Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=extp)))
STATUS of the LISTENER
------------------------
Alias                     EXTPLSNR
Version                   TNSLSNR for Linux: Version 19.0.0.0.0 - Production
Start Date                09-SEP-2020 15:41:44
Uptime                    1 days 0 hr. 39 min. 17 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /my/oracle/home/network/admin/listener.ora
Listener Log File         /my/oracle/diag/tnslsnr/myserver/extplsnr/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=extp)))
Services Summary...
Service "ep_agt1" has 1 instance(s).
  Instance "ep_agt1", status UNKNOWN, has 1 handler(s) for this service...
The command completed successfully

At this point the listener will spawn the EXTPROC agent and external procedures will run as expected. [If processes using these external procedures access endpoints on remote servers it will be necessary to implement the proper firewall rules to permit connections between the servers.] Both methods CAN be configured at the same time; however, the default configuration using the extproc.ora file takes precedence, making the listener configuration unnecessary. If the listener is preferred, simply restore extproc.ora to its original state and the listener configuration will take over.
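When the listener-based agent is used with a non-default alias such as extproc_agent, PL/SQL objects can also be pointed at it explicitly through a database link. The following is a sketch only, assuming the tnsnames.ora entry shown earlier; the link and library names are hypothetical and the library path is reused from the earlier example:


-- Database link that resolves to the extproc_agent alias in tnsnames.ora;
-- an agent link needs no credentials, only the connect descriptor.
CREATE DATABASE LINK extproc_agent_link USING 'extproc_agent';

-- Library tied to that agent; external calls through it are serviced by the
-- agent spawned from the EXTPLSNR listener rather than the default extproc.
CREATE OR REPLACE LIBRARY demo_lib2 AS '/my/lib/shared1.so' AGENT 'extproc_agent_link';
/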

EXTPROC isn’t just the configuration discussed here; it also encompasses the external shared libraries and the procedures/functions written to use them. The external libraries are operating system dependent, so shared libraries built on AIX cannot be used on a Linux system, for example. Such libraries will need to be rebuilt from the source code; note that different operating systems can employ different compilers, and not all compilers provide the same inherent functions. As an example, HP-UX provides an ltoa() function native to its compiler and Linux does not. When dealing with such inconsistencies it may be necessary to modify the original code to produce a working library. That discussion is beyond the scope of this post; it is mentioned to help troubleshoot EXTPROC installations migrated from one O/S to another. Ensure that all of the source code for the shared libraries is available and migrated to the new home, or external procedure calls may fail.

EXTPROC in version 12 and later isn’t rocket science, but it can take a little time and patience to get up and running. The effort will most likely be worth it, opening a new realm of possibilities in data processing by allowing other avenues of data entry and data manipulation, all from within an Oracle database. As an example, an external library call can invoke a shell script to transfer files between two servers and report the success or failure of that process. That’s, well, cool.

And that makes this another road to … anywhere.

September 8, 2020

Downhill Run

Filed under: General — dfitzjarrell @ 14:35
 
"For there's always something to do to keep you from what you really should be doing, ..." 
-- Norton Juster, The Phantom Tollbooth 

Database upgrades can be … interesting … at times, but never more so than when an upgrade needs to be undone. Many times such tasks occur in development or test environments; however, it is possible to complete an upgrade successfully in development and then run across issues in production that necessitate a downgrade. Smaller shops may be forced to upgrade in place due to physical space limitations or licensing restrictions that prevent a second database from being created in a new software home. Oracle Support reports that under such conditions a downgrade can generate ORA-12899 errors not experienced before the upgrade. Let’s look at why and how that can happen.

Under ideal conditions a database upgrade would involve creating a new software home for the desired release. A full backup of the database (using RMAN) would be taken to preserve the current state and version; the current database would be shut down cleanly and brought up under the new software home, at which time the proper upgrade scripts would be executed. Notice that this process preserves the original database unmolested; should a downgrade be required the current database would be shut down, the previous home would be used and an RMAN restore would ensue. The database should now be back at the previous version, no worse for wear, presuming, of course, the issues requiring the downgrade were discovered during the post-upgrade testing phase. A simple scenario, to be sure, but it can eliminate a lot of ‘headaches’ with upgrades.

As mentioned previously, Oracle Support reports that a downgrade from 12.2.0.1 to 11.2.0.3 throws “ORA-12899: value too large for column” errors; the AUD$ and FGA_LOG$ tables are named in the complete error text. These tables were modified by the catdwgrd.sql script in an attempt to restore the original settings. In 11.2.0.x and earlier releases the CLIENT_ID column in AUD$ is declared as VARCHAR2(64); in 12.2.0.x that definition changed to VARCHAR2(128). The FGA_LOG$ table also defines CLIENT_ID as VARCHAR2(64). The catdwgrd.sql script performs several actions relative to this column definition change; the statements are shown below:


--File rdbms/admin/dsec.bsq

-- lrg 14680081
-- determine AUD$ schema before altering it.
DECLARE
  uname VARCHAR2(30);
BEGIN
  select usr.name into uname
  from sys.tab$ tab, sys.obj$ obj, sys.user$ usr
  where obj.name ='AUD$' AND obj.obj# =tab.obj# AND usr.user# = obj.owner#;

  execute immediate
    'alter table ' || dbms_assert.enquote_name(uname) ||
    '.aud$ modify clientid varchar2(30)';
END;
/
 


alter table fga_log$ modify clientid varchar2(30);

Notice that the script supplied by Oracle sets these two columns to VARCHAR2(30) when the data dictionary is downgraded; since these columns were never declared as VARCHAR2(30) in any recent release of Oracle, the change is confusing to say the least. The conditions surrounding the need to run catdwgrd.sql are not provided, only that the script was executed on downgrade and afterwards the ORA-12899 errors appeared. Upgrades and downgrades I have been involved with did not encounter such issues, mainly because there were sufficient space and resources to keep both Oracle home installations on the same server, and sufficient testing was completed to catch problems that would necessitate a downgrade without requiring data recovery (changes made in testing could be discarded, making the downgrade as simple as shutting down the database, changing the Oracle home, restoring the backup created just prior to the upgrade and opening the database). Be that as it may, the script was run at the reporting client site and produced the aforementioned errors, prompting the client to file a Service Request with Oracle Support.
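Should the errors appear after a downgrade, the affected definitions are easy to verify and, because widening a VARCHAR2 column is always allowed, easy to restore. The following is a minimal sketch, run as SYS, assuming the VARCHAR2(64) definition cited above is the desired end state and that AUD$ is still owned by SYS (the script above shows how to determine the actual owner); it simply reverses what the downgrade script did:


-- Check what the downgrade script left behind; the base-table column is
-- named CLIENTID in the script shown above, so both spellings are checked.
SELECT owner, table_name, column_name, data_length
  FROM dba_tab_columns
 WHERE table_name IN ('AUD$', 'FGA_LOG$')
   AND column_name IN ('CLIENTID', 'CLIENT_ID');

-- Widen the columns back to the 11.2.0.x definition; widening a VARCHAR2
-- column never fails for existing rows, so no data changes are required.
ALTER TABLE sys.aud$ MODIFY (clientid VARCHAR2(64));
ALTER TABLE sys.fga_log$ MODIFY (clientid VARCHAR2(64));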

Having this knowledge can only be beneficial: a situation may arise where a downgrade requires executing the Oracle-supplied downgrade scripts, and knowing about this behavior makes short work of resolving the issue (an issue, IMHO, that should not exist in the first place). We cannot all be perfect.

As with any DBA activity that requires more effort than simply logging into the terminal, planning is required and preparedness is a must; even the most bizarre and remote of possibilities must be on the DBA’s “radar” so that if the unthinkable does happen there are planned steps to address it, and the DBA is not caught between the proverbial rock and a hard place. It’s always better to expect the worst and plan accordingly than to be caught wanting should disaster strike.

Which is, of course, what the DBA should be doing.
