Oracle Tips and Tricks — David Fitzjarrell

March 28, 2017

Finding Your Way

Filed under: General — dfitzjarrell @ 08:00

"Whether or not you find your own way, you're bound to find some way. If you happen to find my way, please return it,
as it was lost years ago. I imagine by now it's quite rusty."
-- Norton Juster, The Phantom Tollbooth

Oracle has provided access to its wait interface for several releases and with each new release it expands the range of wait information available, so much so that it’s hard to not find something to examine. Disk reads, logical reads, sort activity, table scans all vie for the attention of the DBA. Of course examination leads to investigation which leads, inevitably, to tuning, even when there is nothing to tune. Such constant twiddling and tweaking is known as Compulsive Tuning Disorder, or CTD. Unfortunately the more ways Oracle provides to interrogate the wait interface the more the DBA can fall victim to CTD. To help reduce the urge to tune a few questions need to be asked regarding the so-called ‘problem area’. Let’s dig in and ask those questions.

First, and foremost, is the following question:

“What problem are you trying to solve?”

If you can’t answer that question then there really isn’t a reason to tune anything; you’ll never know when you’re done and the task will go on and on and on and on and on … ad infinitum, ad nauseam, with no progress to report and no end in sight, another DBA sucked into the rabbit hole of CTD. One thing will lead to another and another and another as you find more areas to ‘tune’ based on the wait interface data and blog posts and articles clearly telling you something needs to be fixed. In most cases nothing could be further from the truth.

Next comes the problem of misleading or misunderstood numbers, mainly in reference to data reads and writes. I’ve seen some DBAs try to tune the database to reduce logical reads — it’s usually newer DBAs who see the large values for logical reads and conclude there is a problem. The issue isn’t the available data, it’s isolating one small aspect of the entire performance picture and tuning that to the exclusion of everything else. Large volumes of logical reads aren’t necessarily a problem unless the buffer cache is being reloaded over short periods of time, which would be accompanied by large volumes of physical reads. In cases such as this the physical reads would be the telling factor, and those MAY be cause for concern. It depends upon the system; an OLTP system grinding through entire tables to get a single row most likely has a problem, whereas a data warehouse churning through that same volume of data would be behaving normally. Taking the logical reads in isolation from the physical reads can muddy the water and obscure the real issue: a buffer cache that may now be too small for the workload. A configuration that once was more than sufficient can, over time, become a performance bottleneck as more and more users are added to the database. A database is a changing entity and it needs to be tended, like a garden, if it’s going to grow.
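For those who want numbers rather than impressions before declaring a problem, the cumulative read statistics are simple to check. A minimal sketch (the values are totals since instance startup, so sampling twice over an interval tells you far more than a single look):


select name, value
  from v$sysstat
 where name in ('session logical reads',
                'physical reads',
                'physical reads cache');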

The DBA needs to listen to the users since they will be the first to complain when something isn’t right and needs attention. Performance is time and, for business, time is money, and when tasks, over time, take longer and longer to complete less work is getting done. The DBA shouldn’t need to hunt for things to do; continually tuning to get that last microsecond of performance is really wasted effort — if no one but the DBA is going to notice the ‘improvement’ it’s not worth pursuing.

Not all tuning is bad or wasted effort but the DBA needs to have a clear goal in mind and a path to follow that addresses issues and brings them to some sort of resolution, even if it’s only a temporary fix until a permanent solution can be implemented. It does no good to constantly pick apart the database to find problems to solve, especially when the users aren’t complaining.

When something is wrong the DBA will hear about it; that’s the time to step into action and start problem solving. The DBA doesn’t need to go looking for problems, they’ll show up all by themselves. And if he or she isn’t constantly twiddling with this or tweaking that the real issues can be dealt with when they happen. Then the users will stop complaining and peace and joy will reign supreme. Okay, so peace and joy won’t necessarily cover the land but the users will stop complaining, at least for a while, and there will be benefit seen from the effort expended.

CTD is thankless, relentless and never-ending, so don’t get caught up in wanting to fix everything; it can’t be done and there are some things that are, most likely, not worth the effort spent given the small return that investment will generate. It’s not enough to know when to stop, the DBA also needs to know when NOT to start; if there is no clear destination to the journey it’s best to not begin. There is plenty to do without making work out of nothing.

Go your own way, just don’t get lost.


March 13, 2017

It’s Private

Filed under: General — dfitzjarrell @ 10:35

“The only thing you can do easily is be wrong, and that's hardly worth the effort.” 
― Norton Juster, The Phantom Tollbooth

Oracle provides two parameters that affect the PGA that look very similar but operate very differently. One of these parameters is the well-known pga_max_size and the other is a hidden parameter, _pga_max_size. Let’s look at both and see how one can be very effective while the other can create problems with respect to PGA memory management.

DBAs know pga_max_size from extensive documentation from Oracle Corporation and from numerous Oracle professionals writing blog posts about it. It’s a common parameter to set to restrict the overall size of the PGA in releases 11.2 and later. It’s available if Automatic Memory Management (AMM) is not in use; databases running on Linux and using hugepages fall into this group since the combination of AMM and hugepages is not supported. Hugepages are available for IPC (Inter-Process Communication) shared memory; this is the ‘standard’ shared memory model (dating to UNIX System V) allowing multiple processes to access the same shared memory segment. There is also another form of shared memory segment, the memory-mapped file, and currently such segments are not supported by hugepages. Oracle, on Linux, gives you a choice of using hugepages or memory-mapped files, and you implement that choice by electing to use (or not use) Automatic Memory Management (AMM). Using Automatic Shared Memory Management (ASMM) allows the DBA to set such parameters as sga_target, sga_max_size, pga_aggregate_target and pga_max_size and have some control over how those memory areas are sized.
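A quick way to review the PGA-related settings currently in force is a look at V$PARAMETER; a small sketch:


select name, value, isdefault
  from v$parameter
 where name like 'pga%'
 order by name;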

Using pga_max_size is a simple task:


SQL> alter system set pga_max_size=2G;

System altered.

SQL>

Now Oracle will do its best to limit the overall PGA size to the requested value but remember this is a targeted max size, not an absolute. It is more restrictive than pga_aggregate_target, meaning it’s less likely to be exceeded.
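Because it’s a target rather than a hard wall it’s worth comparing the setting against what Oracle has actually allocated, and V$PGASTAT reports exactly that (the values below are in bytes). A small sketch:


select name, round(value/1024/1024) mb
  from v$pgastat
 where name in ('aggregate PGA target parameter',
                'total PGA allocated',
                'maximum PGA allocated');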

On to its sister parameter, _pga_max_size. This parameter regulates the size of the PGA memory allocated to a single process. Oracle sets this using calculations based on pga_aggregate_target and pga_max_size and, since it is an ‘undocumented’ parameter, it should NOT be changed at the whim of the DBA. Setting this to any value prevents Oracle from setting it based on its standard calculations and can seriously impact database performance and memory usage. If, for example, the DBA does this:


SQL> alter system set "_pga_max_size"=2G;

System altered.

SQL>

Oracle is now capable of allocating up to 2 GB of PGA to each and every process started after that change has taken place. On an exceptionally active and busy system, with parallel processing enabled, each process can have up to 2 GB of RAM in its PGA. Since many systems still don’t have terabytes of RAM installed such allocations can bring the database, and the server, to a grinding halt, throwing ORA-04030 errors in the process. This, of course, is not what the DBA intended but it is what the DBA enabled by altering the _pga_max_size parameter. Unfortunately this parameter (_pga_max_size) is still being written about in blogs that provide unvalidated ‘information’ to the Oracle community.
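Rather than guessing at the value Oracle computed for _pga_max_size, it can be inspected directly; the usual x$ query, which must be run as SYS, is shown here strictly as a sketch for looking, not changing:


select i.ksppinm name, v.ksppstvl value
  from x$ksppi i, x$ksppcv v
 where i.indx = v.indx
   and i.ksppinm = '_pga_max_size';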

Knowledge is power; unfortunately unverified ‘information’ is taken as knowledge (especially given the common misconception that ‘if it’s on the Internet it MUST be true’) by those who don’t apply critical thinking to what they read. I know of DBAs who set _pga_max_size to match the pga_max_size parameter and found, to their dismay, that their actions seriously impacted production systems in a negative way. Sometimes in the database world prolific authors are taken as experts and their words looked upon as gospel. Unfortunately prolific doesn’t necessarily mean reliable.

It’s always best to test what others tell you before assuming the advice given to you is right.

March 1, 2017

Return To Sender

Filed under: General — dfitzjarrell @ 16:06

"The most important reason for going from one place to another is to see what's in between."
-- Norton Juster, The Phantom Tollbooth

Recently in an Oracle forum a question resurfaced regarding enabling row movement for tables. The posted question, from five years ago, asked if row movement was safe and if there could be any ‘undesired impact on application data’. The answer to the first part of that question is ‘yes’ (it’s safe because Oracle, except under extreme conditions, won’t lose your data) and the answer to the second part is also ‘yes’. That may be confusing so let’s look at what the end result could be.

Data rows are uniquely identified by a construct known far and wide as the ROWID. ROWIDs contain a wealth of information as to the location of a given row; the file number, the block number and row number are all encoded in this curious value. Updates can change pretty much everything in a row except the ROWID and primary key values (and, yes, there’s a way to change PK values but it involves deleting and inserting the row — Oracle does this when, for some bizarre reason known only to the user making the change, a PK value is updated). The ONLY way to change a ROWID value is to physically move the row, which is what enabling row movement will allow. This is undesirable for the reasons listed below:


	* Applications coded to store ROWID values can fail as the data that was once in Location A is now in Location B.
	* Indexes will become invalid or unusable, requiring that they be rebuilt.
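For the curious, the pieces encoded in a ROWID can be pulled apart with the DBMS_ROWID package; a small sketch (the EMP table is purely illustrative):


select rowid,
       dbms_rowid.rowid_relative_fno(rowid) file_no,
       dbms_rowid.rowid_block_number(rowid) block_no,
       dbms_rowid.rowid_row_number(rowid)   row_no
  from emp
 where rownum = 1;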

Storing ROWID values in application tables isn’t the wisest of choices a developer can make. Exporting from the source and importing into a new destination will automatically cause those stored ROWID values to be useless. Cloning the database via RMAN will do the same thing since ROWID values are unique only within the database where they are generated; they do not transport across servers or platforms. Consider two imaginary countries, Blaggoflerp and Snormaflop. Each is unique in geography so that locations found in Blaggoflerp are not found in Snormaflop, with the reverse also being true. If the traveler has but one map, of Blaggoflerp, and tries to use that to navigate Snormaflop our traveler will become hopelessly lost and confused. Enable row movement on a table where indexes are present, an application stores ROWIDs for easy data access, or both and Oracle starts singing that old Elvis Presley hit, written by Winfield Scott, “Return To Sender”:


Return to sender, address unknown.
No such person, no such zone.

Don’t misunderstand, the data is STILL in the table, it’s just moved from its original location and left no forwarding address. It’s possible that new data now occupies the space where that trusty old row used to live, so the application doesn’t break but it does return unexpected results because the values that were once at that location are no longer there. And any indexes that referenced that row’s original ROWID are now invalidated, making them useless until manual intervention is employed to rebuild them.


"Since you got here by not thinking, it seems reasonable to expect that, in order to get out, you must start thinking."
-- Norton Juster, The Phantom Tollbooth

Maybe it’s not that the DBA didn’t think about the act before he or she did it, it might simply be that he or she didn’t think far enough ahead about the results of such an act to make a reasonable choice. Changes to a database can affect downstream processing; failing to consider the ripple effect of such changes can be disastrous, indeed. It isn’t enough in this day and age to consider the database as a ‘lone ranger’; many systems can depend on a single database and essentially haphazard changes can stop them in their tracks.

There may be times when enabling row movement is necessary; changing a partitioning key is one of them. Granted, making such changes on a partitioned table will be part of a larger outage window where the local and global indexes can be maintained, so the impact will be positive, not negative. Absent such tasks (ones where row movement would be necessary) it’s not recommended to enable row movement as it will certainly break things, especially things no one was expecting because of a lack of knowledge of the affected systems.
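As a sketch of that partition-key case (the table and values are hypothetical), updating the key without row movement fails, and succeeding with it means the row receives a new ROWID:


create table sales_demo
  ( id     number,
    region varchar2(4) )
partition by list (region)
  ( partition p_east values ('EAST'),
    partition p_west values ('WEST') );

insert into sales_demo values (1, 'EAST');
commit;

update sales_demo set region = 'WEST' where id = 1;
-- fails: ORA-14402: updating partition key column would cause a partition change

alter table sales_demo enable row movement;

update sales_demo set region = 'WEST' where id = 1;
-- succeeds, and the row now has a different ROWID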

It’s not always good to go travelling.

January 27, 2017

“Back It Up!!!”

Filed under: General — dfitzjarrell @ 14:21

“Expectations is the place you must always go to before you get to where you're going.” 
― Norton Juster, The Phantom Tollbooth   

In a recent post in the Oracle General database forum the following question was asked:


Hi,

I have 3 schema's of almost equal sizes, off which only two are creating the backup's (Hot backup). One of the schema's is not creating 
any backup, by not creating I mean, the backup file that is generated size is too small than the other two files.

The size of the other two backup files is almost 20 GB while the third one is only 54 Bytes !!!

Below are the commands I am using for backup,

alter tablespace SCHEMA begin backup; ( DB level )
tar cpvf - SCHEMA_TS | compress -f > backup.tar.Z ( OS level )

The DB (Oracle Database 11g) is in Archive log mode and no error is being thrown while running the above commands.

Could you please help me in solving this issue.

Any reference related to this would also be of great help.

Thanks in advance !!

There are issues with that sequence of statements; the first is calling it a ‘backup’. It’s highly likely that, after the tablespace files are restored, the recovery will fail and the database will be left in an unusable state. The obvious omission is the archivelogs: nowhere in that command sequence is any statement using tar to copy the archivelogs generated before, during and after the ‘backup’. Apparently the entire script was not posted to the thread, so additional steps that script might execute were not available to view. Since no recovery testing was reported (if such a script exists its contents were not presented) it’s very possible that this ‘backup’ is taken on faith, and unfortunately faith isn’t going to be of much help here.
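For comparison, a user-managed hot backup that stands a chance of being recoverable looks more like the sketch below (the tablespace name and steps are illustrative); the archivelog handling is the piece missing from the posted script:


alter tablespace users begin backup;
-- tar every datafile belonging to the tablespace here
alter tablespace users end backup;
alter system archive log current;
-- now tar the archived logs covering the backup window as well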

Yet another problem is the lack of any query to determine the actual datafiles associated with the given tablespace; a ‘backup’ missing that important information means that not all required datafiles will be copied, making the tablespace incomplete and Oracle unable to recover it. This again leads to a down database with no hope of opening.
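Finding the datafiles belonging to a tablespace is a one-line query; a sketch using the poster’s tablespace name:


select file_name
  from dba_data_files
 where tablespace_name = 'SCHEMA';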

It was suggested several times in the thread that the poster stop using this ‘backup’ and move to RMAN to create dependable, reliable and recoverable backups. Why this method was in use was explained with this post:


I am new to Oracle DB and the guy who worked before me, wrote a (backup) script where he just created a tar of the table space files.

which leads one to wonder how this DBA thought he or she would restore these tablespaces to a useful and usable state should the time come. The poster added:


I want to use RMAN but thought of solving this issuse first (worst case scenario) and then create a new backup script using RMAN.

Honestly this is not the problem that needs to be solved; the problem is generating a reliable backup, and RMAN has been proven time and again as the tool for that job. Further discussion led to the realization that not all files were being sent to tar, which explained the size discrepancy but didn’t truly address the recoverability issue. Anyone can take a so-called ‘backup’ using any number of tools and operating system utilities; it’s restoring and recovering from those ‘backups’ that tells the tale of success or failure, and failure in restoring and recovering a production database isn’t an option.
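The RMAN replacement doesn’t need to be elaborate to be a vast improvement; a minimal sketch that backs up the database and the archivelogs together:


$ rman target /

RMAN> backup database plus archivelog;
RMAN> list backup summary;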

Sometimes you don’t get what you expect.

December 21, 2016

“Do You Have A Key?”

Filed under: General — dfitzjarrell @ 11:27

“Don't you know anything at all about numbers?"
"Well, I don't think they're very important," snapped Milo, too embarrassed to admit the truth.
"NOT IMPORTANT!" roared the Dodecahedron, turning red with fury. "Could you have tea for two without the two — or three blind mice 
without the three? Would there be four corners of the earth if there weren't a four? And how would you sail the seven seas without a seven?"
"All I meant was—" began Milo, but the Dodecahedron, overcome with emotion and shouting furiously, carried right on.
"If you had high hopes, how would you know how high they were? And did you know that narrow escapes come in all different widths? 
Would you travel the whole wide world without ever knowing how wide it was? And how could you do anything at long last," he concluded, 
waving his arms over his head, "without knowing how long the last was? Why, numbers are the most beautiful and valuable things in the world. 
Just follow me and I'll show you." He turned on his heel and stalked off into the cave.” 
-- Norton Juster, The Phantom Tollbooth

How best to generate a primary key is a discussion that never seems to get an answer suitable to everyone. Sequences are a popular choice, as are other forms of generated or “artificial” keys. In addition Oracle provides a function named sys_guid() that generates unique identifiers that can be used for a number of purposes, one of which is, as one might expect, as a generated primary key. Discussions occasionally arise in the Oracle forums over the merits and issues of using such global unique identifiers as primary keys; one current discussion asked if using sys_guid() was faster or slower than using a sequence, and on a Linux system generating sys_guid() values was illustrated as being faster. The example code, slightly modified, is presented here having been run on Oracle 12.1.0.2 on Windows. Both 12c configurations were used (standard and container) to see if any differences appeared, and since both runs provided similar results only the non-container results are shown. Timing was set on to record and display the elapsed time for each set of serial tests; parallel tests were also completed, and log table entries report the elapsed time for those runs. Separate tables were used for each set of tests; the results that were produced are shown below. The test begins with a table and a sequence created with the default settings:


SQL> -- default 20 cache
SQL> create sequence seq1;

Sequence created.

SQL>
SQL> create table t_seq
  2  	 ( id	  number(9) primary key
  3  	 , filler varchar2(1000)
  4  	 );

Table created.

SQL>

The next step inserts 999,999 records into the t_seq table; the execution plan and run statistics are shown below:


SQL> insert into t_seq
  2    select seq1.nextval ,'sdfsf' from dual connect by level < 1000000;

999999 rows created.

Elapsed: 00:00:10.19

Execution Plan
----------------------------------------------------------
Plan hash value: 3365622274

--------------------------------------------------------------------------------
| Id  | Operation                      | Name  | Rows  | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | INSERT STATEMENT               |       |     1 |     2   (0)| 00:00:01 |
|   1 |  LOAD TABLE CONVENTIONAL       | T_SEQ |       |            |          |
|   2 |   SEQUENCE                     | SEQ1  |       |            |          |
|*  3 |    CONNECT BY WITHOUT FILTERING|       |       |            |          |
|   4 |     FAST DUAL                  |       |     1 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter(LEVEL<1000000)


Statistics
----------------------------------------------------------
      50149  recursive calls
     227827  db block gets
      59085  consistent gets
          0  physical reads
  113671864  redo size
        855  bytes sent via SQL*Net to client
        893  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          5  sorts (memory)
          0  sorts (disk)
     999999  rows processed

SQL>
SQL> drop table t_seq;

Table dropped.

SQL> drop sequence seq1;

Sequence dropped.

SQL>

The amount of redo generated is large, a result of using the sequence. Another sequence test was executed using a sequence created with a larger cache value. Before each run the user was re-connected to reset the session statistics:


SQL> connect bing/!@#!@#
Connected.
SQL> create sequence seq1 cache 10000;

Sequence created.

SQL>
SQL> create table t_seq
  2  	 ( id	  number(9) primary key
  3  	 , filler varchar2(1000)
  4  	 );

Table created.

SQL>

The same insert statement was executed using the sequence having the larger cache; the execution plan and session statistics are shown below:


SQL> insert into t_seq
  2    select seq1.nextval ,'sdfsf' from dual connect by level < 1000000;

999999 rows created.

Elapsed: 00:00:05.24

Execution Plan
----------------------------------------------------------
Plan hash value: 3365622274

--------------------------------------------------------------------------------
| Id  | Operation                      | Name  | Rows  | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | INSERT STATEMENT               |       |     1 |     2   (0)| 00:00:01 |
|   1 |  LOAD TABLE CONVENTIONAL       | T_SEQ |       |            |          |
|   2 |   SEQUENCE                     | SEQ1  |       |            |          |
|*  3 |    CONNECT BY WITHOUT FILTERING|       |       |            |          |
|   4 |     FAST DUAL                  |       |     1 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter(LEVEL<1000000)


Statistics
----------------------------------------------------------
        249  recursive calls
      77911  db block gets
       9188  consistent gets
          1  physical reads
   79744836  redo size
        854  bytes sent via SQL*Net to client
        893  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          5  sorts (memory)
          0  sorts (disk)
     999999  rows processed

SQL>

Using the larger sequence cache reduced the redo size by 33,927,028 bytes, which cut the execution time roughly in half. On to the sys_guid() part of the serial testing, with a new table created and a new connection established:


SQL> connect bing/!@#!@#
Connected.
SQL> create table t_raw
  2  	 ( id	  raw(16) primary key
  3  	 , filler varchar2(1000)
  4  	 );

Table created.

SQL>
SQL> insert into t_raw
  2    select sys_guid(),'sdfsf' from dual connect by level < 1000000;

999999 rows created.

Elapsed: 00:00:54.15

Execution Plan
----------------------------------------------------------
Plan hash value: 1236776825

-------------------------------------------------------------------------------
| Id  | Operation                     | Name  | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | INSERT STATEMENT              |       |     1 |     2   (0)| 00:00:01 |
|   1 |  LOAD TABLE CONVENTIONAL      | T_RAW |       |            |          |
|*  2 |   CONNECT BY WITHOUT FILTERING|       |       |            |          |
|   3 |    FAST DUAL                  |       |     1 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(LEVEL<1000000)


Statistics
----------------------------------------------------------
       1442  recursive calls
    2956342  db block gets
      23736  consistent gets
         13  physical reads
  375573628  redo size
        854  bytes sent via SQL*Net to client
        890  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
         34  sorts (memory)
          0  sorts (disk)
     999999  rows processed

SQL>

A benefit of using the sequence with the larger cache size is the redo size is less than half that of using the sys_guid() call. Additionally, on Windows, the execution took almost 11 times longer than using a sequence with a large cache value. Returning to the default cache size for the sequence a PL/SQL loop is used to return the generated values of the sequence; since serveroutput is not turned on the time to return the values isn’t included in the execution time (and as a result the values aren’t displayed). The elapsed time to run the block is found at the end of the execution as well as the redo statistics for the session:


SQL> connect bing/!@#!@#
Connected.
SQL> create sequence seq1 ;

Sequence created.

SQL>
SQL> set timing on
SQL> declare
  2    x number(38);
  3    function sf return number is
  4    begin
  5  	 return seq1.nextval;
  6    end;
  7  begin
  8    for i in 1..100000 loop
  9  	 x := sf;
 10    end loop;
 11  end;
 12  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:03.23
SQL>
SQL> select m.statistic#, n.name, m.value
  2  from v$mystat m, v$statname n
  3  where m.statistic# = n.statistic#
  4  and n.name like '%redo%'
  5  and m.value > 0;

STATISTIC# NAME                                                                  VALUE
---------- ---------------------------------------------------------------- ----------
       257 redo entries                                                          10014
       258 redo size                                                           3620676
       302 redo subscn max counts                                                    1
       307 redo synch time (usec)                                                  251
       308 redo synch time overhead (usec)                                         553
       309 redo synch time overhead count (  2ms)                                    2
       314 redo synch writes                                                         3
       323 redo write info find                                                      2

8 rows selected.

Elapsed: 00:00:00.00

SQL>

The reported redo size is smaller, but there were no inserts into a table performed in this test. A similar test was run using a sequence with a cache value of 10000:


SQL> drop sequence seq1;

Sequence dropped.

SQL> connect bing/!@#!@#
Connected.
SQL> create sequence seq1 cache 10000;

Sequence created.

SQL>
SQL> set timing on
SQL> declare
  2    x number(38);
  3    function sf return number is
  4    begin
  5  	 return seq1.nextval;
  6    end;
  7  begin
  8    for i in 1..100000 loop
  9  	 x := sf;
 10    end loop;
 11  end;
 12  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:02.19
SQL>
SQL> select m.statistic#, n.name, m.value
  2  from v$mystat m, v$statname n
  3  where m.statistic# = n.statistic#
  4  and n.name like '%redo%'
  5  and m.value > 0;

STATISTIC# NAME                                                                  VALUE
---------- ---------------------------------------------------------------- ----------
       257 redo entries                                                             34
       258 redo size                                                             11940
       307 redo synch time (usec)                                                  110
       308 redo synch time overhead (usec)                                      303802
       309 redo synch time overhead count (  2ms)                                    1
       313 redo synch time overhead count (inf)                                      1
       314 redo synch writes                                                         3
       323 redo write info find                                                      2

8 rows selected.

Elapsed: 00:00:00.00
SQL>

As shown in a prior test the redo statistics show a smaller redo size for the larger cache. On to the sys_guid() test:


SQL> connect bing/!@#!@#
Connected.
SQL> declare
  2    x raw(16);
  3    function sf return varchar2 is
  4    begin
  5  	 return sys_guid();
  6    end;
  7  begin
  8    for i in 1..100000 loop
  9  	 x := sf;
 10    end loop;
 11  end;
 12  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:04.70
SQL>
SQL> select m.statistic#, n.name, m.value
  2  from v$mystat m, v$statname n
  3  where m.statistic# = n.statistic#
  4  and n.name like '%redo%'
  5  and m.value > 0;

STATISTIC# NAME                                                                  VALUE
---------- ---------------------------------------------------------------- ----------
       257 redo entries                                                              6
       258 redo size                                                              1476
       302 redo subscn max counts                                                    1
       307 redo synch time (usec)                                                    1
       308 redo synch time overhead (usec)                                         575
       309 redo synch time overhead count (  2ms)                                    1
       314 redo synch writes                                                         1
       323 redo write info find                                                      1

8 rows selected.

Elapsed: 00:00:00.00
SQL>

Now, absent the insert, the redo generation is much less. The execution time, however, is at least twice as long as it was for the large-cache sequence. So on a Windows-based Oracle installation using a sequence takes much less time than using the sys_guid() calls. Parallel execution may be faster and the results may be reversed so further testing is necessary. These tests begin with the sequence created with the default cache value but starting at 100000000 (to more accurately reflect real production conditions) and multiple tables created with the CACHE option, which places the blocks in the most-recently used area of the cache to speed access and delay aging:


SQL> -- SEQUENCE MULTI SESSION TEST
SQL> drop table t_seq;

Table dropped.

Elapsed: 00:00:00.02
SQL> drop table tmp1;

Table dropped.

Elapsed: 00:00:00.01
SQL> drop table tmp2;

Table dropped.

Elapsed: 00:00:00.01
SQL> drop table tmp3;

Table dropped.

Elapsed: 00:00:00.01
SQL> drop table tmp4;

Table dropped.

Elapsed: 00:00:00.01
SQL>
SQL> create table tmp1 cache as select 1 dummy from dual connect by level < 1000000;

Table created.

SQL> create table tmp2 cache as select 1 dummy from tmp1;

Table created.

Elapsed: 00:00:00.54
SQL> create table tmp3 cache as select 1 dummy from tmp1;

Table created.

Elapsed: 00:00:00.61
SQL> create table tmp4 cache as select 1 dummy from tmp1;

Table created.

Elapsed: 00:00:00.49
SQL>
SQL> drop sequence seq1 ;

Sequence dropped.

Elapsed: 00:00:00.00
SQL> create sequence seq1 start with 100000000 ;

Sequence created.

Elapsed: 00:00:00.00
SQL>
SQL> create table t_seq
  2  	 ( id	  number(9) primary key
  3  	 , filler varchar2(1000)
  4  	 );

Table created.

Elapsed: 00:00:00.01
SQL>
SQL> alter system switch logfile;

System altered.

Elapsed: 00:00:01.29
SQL> alter system checkpoint;

System altered.

Elapsed: 00:00:00.20
SQL> alter system flush buffer_cache;

System altered.

Elapsed: 00:00:00.13
SQL>
SQL> select /*+ full(tmp1) */ count(*) from tmp1;

  COUNT(*)
----------
    999999

Elapsed: 00:00:00.05
SQL> select /*+ full(tmp2) */ count(*) from tmp2;

  COUNT(*)
----------
    999999

Elapsed: 00:00:00.04
SQL> select /*+ full(tmp3) */ count(*) from tmp3;

  COUNT(*)
----------
    999999

Elapsed: 00:00:00.04
SQL> select /*+ full(tmp4) */ count(*) from tmp4;

  COUNT(*)
----------
    999999

Elapsed: 00:00:00.04
SQL>
SQL> drop table tmp_log;

Table dropped.

Elapsed: 00:00:00.04
SQL> create table tmp_log(mydata varchar2(4000), optime timestamp);

Table created.

Elapsed: 00:00:00.01
SQL>
SQL> create or replace PROCEDURE    sp_log(p varchar2) as
  2    PRAGMA AUTONOMOUS_TRANSACTION;
  3  begin
  4    insert into tmp_log values (p , systimestamp);
  5    commit;
  6  end;
  7  /

Procedure created.

Elapsed: 00:00:00.03
SQL>
SQL> show errors
No errors.
SQL>
SQL> create or replace procedure sp_test_seq(p number) as
  2  begin
  3    sp_log('JOB ' || p || ' BEGIN');
  4
  5    if p = 1 then
  6  	 insert  into t_seq
  7  	   select seq1.nextval ,'sdfsf' from tmp1;
  8    elsif p = 2 then
  9  	 insert into t_seq
 10  	   select seq1.nextval ,'sdfsf' from tmp2;
 11    elsif p = 3 then
 12  	 insert  into t_seq
 13  	   select seq1.nextval ,'sdfsf' from tmp3;
 14    elsif p = 4 then
 15  	 insert into t_seq
 16  	   select seq1.nextval ,'sdfsf' from tmp4;
 17    end if;
 18    commit;
 19
 20    sp_log('JOB ' || p || ' END');
 21  end;
 22  /

Procedure created.

Elapsed: 00:00:00.02
SQL>
SQL> show errors
No errors.
SQL>
SQL> declare
  2    x_time date := sysdate + 1/1440;
  3  begin
  4
  5
  6    dbms_scheduler.create_job(job_name => 'TEST_SEQ1',
  7  				 job_type => 'PLSQL_BLOCK',
  8  				 job_action => 'begin sp_test_seq(1); end;',
  9  				 enabled=> true,
 10  				 start_date=> x_time
 11  			       );
 12    dbms_scheduler.create_job(job_name => 'TEST_SEQ2',
 13  				 job_type => 'PLSQL_BLOCK',
 14  				 job_action => 'begin sp_test_seq(2); end;',
 15  				 enabled=> true,
 16  				 start_date=> x_time
 17  			       );
 18    dbms_scheduler.create_job(job_name => 'TEST_SEQ3',
 19  				 job_type => 'PLSQL_BLOCK',
 20  				 job_action => 'begin sp_test_seq(3); end;',
 21  				 enabled=> true,
 22  				 start_date=> x_time
 23  			       );
 24    dbms_scheduler.create_job(job_name => 'TEST_SEQ4',
 25  				 job_type => 'PLSQL_BLOCK',
 26  				 job_action => 'begin sp_test_seq(4); end;',
 27  				 enabled=> true,
 28  				 start_date=> x_time
 29  			       );
 30  end;
 31  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.02
SQL>
SQL> select job_name, start_date from user_scheduler_jobs where job_name like 'TEST%';

JOB_NAME     START_DATE
------------ ---------------------------------------------------------------------------
TEST_SEQ1    27-NOV-16 01.46.47.000000 PM -07:00
TEST_SEQ2    27-NOV-16 01.46.47.000000 PM -07:00
TEST_SEQ3    27-NOV-16 01.46.47.000000 PM -07:00
TEST_SEQ4    27-NOV-16 01.46.47.000000 PM -07:00

Elapsed: 00:00:00.00
SQL>
SQL> exec dbms_lock.sleep(120)

PL/SQL procedure successfully completed.

Elapsed: 00:02:00.00
SQL>
SQL> select * from
  2  (select mydata, optime, lead(optime) over (order by mydata) optime_end, lead(optime) over (order by mydata) - optime elapsed
  3  from tmp_log)
  4  where mydata like '%BEGIN%'
  5  /

MYDATA          OPTIME                       OPTIME_END                   ELAPSED
--------------- ---------------------------- ---------------------------- ----------------------------
JOB 1 BEGIN     27-NOV-16 01.46.50.233000 PM 27-NOV-16 01.47.28.113000 PM +000000000 00:00:37.880000
JOB 2 BEGIN     27-NOV-16 01.46.50.234000 PM 27-NOV-16 01.47.27.904000 PM +000000000 00:00:37.670000
JOB 3 BEGIN     27-NOV-16 01.46.50.235000 PM 27-NOV-16 01.47.28.169000 PM +000000000 00:00:37.934000
JOB 4 BEGIN     27-NOV-16 01.46.50.244000 PM 27-NOV-16 01.47.28.121000 PM +000000000 00:00:37.877000

Elapsed: 00:00:00.00
SQL>

Parallel execution takes longer on Windows, possibly because of the underlying architecture, but each of the four concurrent inserts consumed about 38 seconds. To see if concurrent processes using sys_guid() might end up faster the test was set up again, this time using that call:


SQL> -- SYS_GUID MULTI SESSION TEST
SQL> drop table t_raw;

Table dropped.

Elapsed: 00:00:00.01
SQL> drop table tmp1;

Table dropped.

Elapsed: 00:00:00.01
SQL> drop table tmp2;

Table dropped.

Elapsed: 00:00:00.01
SQL> drop table tmp3;

Table dropped.

Elapsed: 00:00:00.01
SQL> drop table tmp4;

Table dropped.

Elapsed: 00:00:00.01
SQL>
SQL> create table tmp1 cache as select 1 dummy from dual connect by level < 1000000;

Table created.

SQL> create table tmp2 cache as select 1 dummy from tmp1;

Table created.

Elapsed: 00:00:00.62
SQL> create table tmp3 cache as select 1 dummy from tmp1;

Table created.

Elapsed: 00:00:00.57
SQL> create table tmp4 cache as select 1 dummy from tmp1;

Table created.

Elapsed: 00:00:00.48
SQL>
SQL> create table t_raw
  2  	 ( id	  raw(16) primary key
  3  	 , filler varchar2(1000)
  4  	 );

Table created.

Elapsed: 00:00:00.01
SQL>
SQL> alter system switch logfile;

System altered.

Elapsed: 00:00:03.02
SQL> alter system checkpoint;

System altered.

Elapsed: 00:00:00.17
SQL> alter system flush buffer_cache;

System altered.

Elapsed: 00:00:00.34
SQL>
SQL> select /*+ full(tmp1) */ count(*) from tmp1; -- to make sure table is in buffer_cache
  2  select /*+ full(tmp2) */ count(*) from tmp2; -- to make sure table is in buffer_cache
  3  select /*+ full(tmp3) */ count(*) from tmp3; -- to make sure table is in buffer_cache
  4  select /*+ full(tmp4) */ count(*) from tmp4; -- to make sure table is in buffer_cache
  5
SQL> drop table tmp_log;

Table dropped.

Elapsed: 00:00:00.03
SQL> create table tmp_log(mydata varchar2(4000), optime timestamp);

Table created.

Elapsed: 00:00:00.01
SQL>
SQL> create or replace PROCEDURE    sp_log(p varchar2) as
  2    PRAGMA AUTONOMOUS_TRANSACTION;
  3  begin
  4    insert into tmp_log values (p , systimestamp);
  5    commit;
  6  end;
  7  /

Procedure created.

Elapsed: 00:00:00.03
SQL>
SQL> show errors
No errors.
SQL>
SQL> create or replace procedure sp_test_guid(p number) as
  2  begin
  3    sp_log('JOB ' || p || ' BEGIN');
  4
  5    if p = 1 then
  6  	 insert  into t_raw
  7  	   select sys_guid() ,'sdfsf' from tmp1;
  8    elsif p = 2 then
  9  	 insert  into t_raw
 10  	   select sys_guid() ,'sdfsf' from tmp2;
 11    elsif p = 3 then
 12  	 insert  into t_raw
 13  	   select sys_guid() ,'sdfsf' from tmp3;
 14    elsif p = 4 then
 15  	 insert into t_raw
 16  	   select sys_guid() ,'sdfsf' from tmp4;
 17    end if;
 18    commit;
 19
 20    sp_log('JOB ' || p || ' END');
 21  end;
 22  /

Procedure created.

Elapsed: 00:00:00.02
SQL>
SQL> show errors
No errors.
SQL>
SQL> declare
  2    x_time date := sysdate + 1/1440;
  3  begin
  4
  5    dbms_scheduler.create_job(job_name => 'TEST_GUID1',
  6  				 job_type => 'PLSQL_BLOCK',
  7  				 job_action => 'begin sp_test_guid(1); end;',
  8  				 enabled=> true,
  9  				 start_date=> x_time
 10  			       );
 11    dbms_scheduler.create_job(job_name => 'TEST_GUID2',
 12  				 job_type => 'PLSQL_BLOCK',
 13  				 job_action => 'begin sp_test_guid(2); end;',
 14  				 enabled=> true,
 15  				 start_date=> x_time
 16  			       );
 17    dbms_scheduler.create_job(job_name => 'TEST_GUID3',
 18  				 job_type => 'PLSQL_BLOCK',
 19  				 job_action => 'begin sp_test_guid(3); end;',
 20  				 enabled=> true,
 21  				 start_date=> x_time
 22  			       );
 23    dbms_scheduler.create_job(job_name => 'TEST_GUID4',
 24  				 job_type => 'PLSQL_BLOCK',
 25  				 job_action => 'begin sp_test_guid(4); end;',
 26  				 enabled=> true,
 27  				 start_date=> x_time
 28  			       );
 29  end;
 30  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.04
SQL>
SQL> select job_name, start_date from user_scheduler_jobs where job_name like 'TEST%';

JOB_NAME     START_DATE
------------ ---------------------------------------------------------------------------
TEST_GUID1   27-NOV-16 01.48.53.000000 PM -07:00
TEST_GUID2   27-NOV-16 01.48.53.000000 PM -07:00
TEST_GUID3   27-NOV-16 01.48.53.000000 PM -07:00
TEST_GUID4   27-NOV-16 01.48.53.000000 PM -07:00

Elapsed: 00:00:00.00
SQL>
SQL> exec dbms_lock.sleep(180);

PL/SQL procedure successfully completed.

Elapsed: 00:03:00.00
SQL>
SQL> select * from
  2  (select mydata, optime, lead(optime) over (order by mydata) optime_end, lead(optime) over (order by mydata) - optime elapsed
  3  from tmp_log)
  4  where mydata like '%BEGIN%'
  5  /

MYDATA          OPTIME                       OPTIME_END                   ELAPSED
--------------- ---------------------------- ---------------------------- ----------------------------
JOB 1 BEGIN     27-NOV-16 01.48.54.228000 PM 27-NOV-16 01.50.49.312000 PM +000000000 00:01:55.084000
JOB 2 BEGIN     27-NOV-16 01.48.54.236000 PM 27-NOV-16 01.50.46.200000 PM +000000000 00:01:51.964000
JOB 3 BEGIN     27-NOV-16 01.48.54.245000 PM 27-NOV-16 01.50.47.742000 PM +000000000 00:01:53.497000
JOB 4 BEGIN     27-NOV-16 01.48.54.267000 PM 27-NOV-16 01.50.48.966000 PM +000000000 00:01:54.699000

Elapsed: 00:00:00.00
SQL>
SQL> set timing off echo off linesize 80 trimspool off

Table dropped.


Table dropped.


Sequence dropped.

Generating sys_guid() values is not faster when run in parallel in a Windows environment; each process ran almost two minutes before completing, roughly three times longer than the parallel sequence executions and twice as long as the serial sys_guid() run.

The redo size using the sys_guid() call and insert statements was consistent regardless of the operating system used (Windows or Linux) and was larger than that generated when using a sequence and insert statements; absent any DML the sys_guid() call generated far less redo. Sequence cache size can affect redo generation, as a larger cache generates smaller amounts of redo, and the redo generation affects the execution time with a sequence. Without fail, on Windows, using sys_guid() takes longer. This is one area where testing on the operating system you are using is a must, since Oracle on Linux can, and does, perform much differently than Oracle on Windows. Before you decide to change your primary key strategy to using sys_guid() test to see how it performs; you may be surprised at the results.

It would seem that 42 isn’t the only important number.

November 21, 2016

Taking Things For Granted

Filed under: General — dfitzjarrell @ 09:54

"Sometimes I find the best way of getting from one place to another is simply to erase everything and begin again."
-- Norton Juster, The Phantom Tollbooth

In one of the Oracle forums a question was asked regarding revoking selected privileges from the DBA role. Unfortunately for the person posting the question the answer is a resounding “No”; granting a role grants all privileges assigned to that role and there is no “picking and choosing” as if you were in a cafeteria. Roles are designed (or should be, at least) to grant all necessary privileges a user would need to access objects that role will use. And roles supplied by Oracle are designed for the jobs they are named after, such as DBA. Changing the role affects functionality and can seriously impact those granted that role. Let’s look at that in a bit more detail.

Oracle provides pre-configured roles with every installation of the database, and the list can vary based on the options you choose to install. A partial list of these roles from 12.1.0.2 is shown below.
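The list comes from the DBA_ROLES view; a query along these lines produces it:


select role, oracle_maintained
  from dba_roles
 where oracle_maintained = 'Y';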


ROLE                           O
------------------------------ -
CONNECT                        Y
RESOURCE                       Y
DBA                            Y
AUDIT_ADMIN                    Y
AUDIT_VIEWER                   Y
SELECT_CATALOG_ROLE            Y
EXECUTE_CATALOG_ROLE           Y
DELETE_CATALOG_ROLE            Y
CAPTURE_ADMIN                  Y
EXP_FULL_DATABASE              Y
IMP_FULL_DATABASE              Y
CDB_DBA                        Y
PDB_DBA                        Y
RECOVERY_CATALOG_OWNER         Y
LOGSTDBY_ADMINISTRATOR         Y
DBFS_ROLE                      Y
GSMUSER_ROLE                   Y
AQ_ADMINISTRATOR_ROLE          Y
AQ_USER_ROLE                   Y
DATAPUMP_EXP_FULL_DATABASE     Y
DATAPUMP_IMP_FULL_DATABASE     Y
ADM_PARALLEL_EXECUTE_TASK      Y
PROVISIONER                    Y
XS_RESOURCE                    Y
XS_SESSION_ADMIN               Y
XS_NAMESPACE_ADMIN             Y
XS_CACHE_ADMIN                 Y
GATHER_SYSTEM_STATISTICS       Y
OPTIMIZER_PROCESSING_RATE      Y
GSMADMIN_ROLE                  Y
RECOVERY_CATALOG_USER          Y
EM_EXPRESS_BASIC               Y
EM_EXPRESS_ALL                 Y
SCHEDULER_ADMIN                Y
HS_ADMIN_SELECT_ROLE           Y
HS_ADMIN_EXECUTE_ROLE          Y
HS_ADMIN_ROLE                  Y
GLOBAL_AQ_USER_ROLE            Y
OEM_ADVISOR                    Y
OEM_MONITOR                    Y
XDBADMIN                       Y
XDB_SET_INVOKER                Y
AUTHENTICATEDUSER              Y
XDB_WEBSERVICES                Y
XDB_WEBSERVICES_WITH_PUBLIC    Y
XDB_WEBSERVICES_OVER_HTTP      Y
GSM_POOLADMIN_ROLE             Y
GDS_CATALOG_SELECT             Y
WM_ADMIN_ROLE                  Y
JAVAUSERPRIV                   Y
JAVAIDPRIV                     Y
JAVASYSPRIV                    Y
JAVADEBUGPRIV                  Y
EJBCLIENT                      Y
JMXSERVER                      Y
JAVA_ADMIN                     Y
JAVA_DEPLOY                    Y
CTXAPP                         Y
ORDADMIN                       Y
OLAP_XS_ADMIN                  Y
OLAP_DBA                       Y
OLAP_USER                      Y
SPATIAL_WFS_ADMIN              Y
WFS_USR_ROLE                   Y
SPATIAL_CSW_ADMIN              Y
CSW_USR_ROLE                   Y
LBAC_DBA                       Y
APEX_ADMINISTRATOR_ROLE        Y
APEX_GRANTS_FOR_NEW_USERS_ROLE Y
DV_SECANALYST                  Y
DV_MONITOR                     Y
DV_ADMIN                       Y
DV_OWNER                       Y
DV_ACCTMGR                     Y
DV_PUBLIC                      Y
DV_PATCH_ADMIN                 Y
DV_STREAMS_ADMIN               Y
DV_GOLDENGATE_ADMIN            Y
DV_XSTREAM_ADMIN               Y
DV_GOLDENGATE_REDO_ACCESS      Y
DV_AUDIT_CLEANUP               Y
DV_DATAPUMP_NETWORK_LINK       Y
DV_REALM_RESOURCE              Y
DV_REALM_OWNER                 Y

The ‘O’ header is for the ORACLE_MAINTAINED column, which indicates the role is supplied by Oracle. [This is a new column in the DBA_ROLES view for 12.1; earlier releases do not have it.] That list contains 84 different roles, all created when your database was created. What privileges do these roles have? That’s a question answered by the ROLE_SYS_PRIVS and ROLE_TAB_PRIVS views; let’s look at the DBA role and see what Oracle deems necessary system privileges for an effective DBA.
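A query along these lines generates the listing that follows:


select privilege
  from role_sys_privs
 where role = 'DBA';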


PRIVILEGE
----------------------------------------
CREATE SESSION
ALTER SESSION
DROP TABLESPACE
BECOME USER
DROP ROLLBACK SEGMENT
SELECT ANY TABLE
INSERT ANY TABLE
UPDATE ANY TABLE
DROP ANY INDEX
SELECT ANY SEQUENCE
CREATE ROLE
EXECUTE ANY PROCEDURE
ALTER PROFILE
CREATE ANY DIRECTORY
CREATE ANY LIBRARY
EXECUTE ANY LIBRARY
ALTER ANY INDEXTYPE
DROP ANY INDEXTYPE
DEQUEUE ANY QUEUE
EXECUTE ANY EVALUATION CONTEXT
EXPORT FULL DATABASE
CREATE RULE
ALTER ANY SQL PROFILE
ADMINISTER ANY SQL TUNING SET
CHANGE NOTIFICATION
DROP ANY EDITION
DROP ANY MINING MODEL
ALTER ANY MINING MODEL
ALTER ANY CUBE DIMENSION
CREATE CUBE
DROP ANY CUBE BUILD PROCESS
USE ANY SQL TRANSLATION PROFILE
CREATE PLUGGABLE DATABASE
ALTER ROLLBACK SEGMENT
DELETE ANY TABLE
ALTER DATABASE
FORCE ANY TRANSACTION
ALTER ANY PROCEDURE
DROP ANY TRIGGER
DROP ANY MATERIALIZED VIEW
UNDER ANY TYPE
ALTER ANY LIBRARY
CREATE DIMENSION
DEBUG ANY PROCEDURE
CREATE RULE SET
ALTER ANY RULE SET
ANALYZE ANY DICTIONARY
ALTER ANY EDITION
CREATE ANY ASSEMBLY
ALTER ANY CUBE
SELECT ANY CUBE
DROP ANY MEASURE FOLDER
RESTRICTED SESSION
CREATE TABLESPACE
ALTER TABLESPACE
CREATE USER
ALTER USER
LOCK ANY TABLE
CREATE VIEW
DROP ANY VIEW
GRANT ANY ROLE
CREATE TRIGGER
CREATE TYPE
EXECUTE ANY OPERATOR
CREATE ANY DIMENSION
ALTER ANY DIMENSION
CREATE ANY OUTLINE
ADMINISTER DATABASE TRIGGER
RESUMABLE
FLASHBACK ANY TABLE
CREATE ANY RULE SET
EXECUTE ANY RULE SET
IMPORT FULL DATABASE
EXECUTE ANY RULE
EXECUTE ANY PROGRAM
CREATE ANY EDITION
CREATE ASSEMBLY
ALTER ANY ASSEMBLY
CREATE CUBE DIMENSION
CREATE ANY CUBE BUILD PROCESS
UPDATE ANY CUBE DIMENSION
EM EXPRESS CONNECT
SET CONTAINER
ALTER ANY MEASURE FOLDER
CREATE ANY TABLE
CREATE ANY INDEX
CREATE ANY SEQUENCE
ALTER ANY ROLE
ANALYZE ANY
DROP ANY LIBRARY
CREATE ANY OPERATOR
CREATE INDEXTYPE
UNDER ANY TABLE
DROP ANY DIMENSION
SELECT ANY DICTIONARY
GRANT ANY OBJECT PRIVILEGE
CREATE EVALUATION CONTEXT
CREATE ANY EVALUATION CONTEXT
DROP ANY EVALUATION CONTEXT
CREATE ANY RULE
CREATE JOB
CREATE ANY JOB
CREATE MINING MODEL
INSERT ANY CUBE DIMENSION
DROP ANY CUBE
UPDATE ANY CUBE BUILD PROCESS
EXEMPT DML REDACTION POLICY
READ ANY TABLE
ALTER SYSTEM
AUDIT SYSTEM
CREATE ROLLBACK SEGMENT
DROP ANY TABLE
COMMENT ANY TABLE
REDEFINE ANY TABLE
CREATE CLUSTER
ALTER ANY INDEX
DROP PUBLIC DATABASE LINK
CREATE PROFILE
ALTER ANY MATERIALIZED VIEW
ALTER ANY TYPE
DROP ANY TYPE
UNDER ANY VIEW
EXECUTE ANY INDEXTYPE
DROP ANY CONTEXT
ALTER ANY OUTLINE
ADMINISTER RESOURCE MANAGER
MANAGE SCHEDULER
MANAGE FILE GROUP
CREATE ANY MINING MODEL
SELECT ANY MINING MODEL
CREATE ANY MEASURE FOLDER
DELETE ANY MEASURE FOLDER
CREATE ANY SQL TRANSLATION PROFILE
CREATE ANY CREDENTIAL
EXEMPT DDL REDACTION POLICY
SELECT ANY MEASURE FOLDER
SELECT ANY CUBE BUILD PROCESS
ALTER ANY CUBE BUILD PROCESS
CREATE TABLE
BACKUP ANY TABLE
CREATE ANY CLUSTER
DROP ANY SYNONYM
DROP PUBLIC SYNONYM
CREATE ANY VIEW
CREATE SEQUENCE
ALTER ANY SEQUENCE
FORCE TRANSACTION
CREATE PROCEDURE
CREATE ANY PROCEDURE
ALTER RESOURCE COST
DROP ANY DIRECTORY
CREATE ANY TYPE
ALTER ANY OPERATOR
CREATE ANY INDEXTYPE
ENQUEUE ANY QUEUE
ON COMMIT REFRESH
DEBUG CONNECT SESSION
DROP ANY RULE SET
EXECUTE ANY CLASS
MANAGE ANY FILE GROUP
EXECUTE ANY ASSEMBLY
EXECUTE ASSEMBLY
COMMENT ANY MINING MODEL
CREATE ANY CUBE DIMENSION
DELETE ANY CUBE DIMENSION
SELECT ANY CUBE DIMENSION
DROP ANY SQL TRANSLATION PROFILE
CREATE CREDENTIAL
ALTER ANY TABLE
DROP ANY CLUSTER
CREATE SYNONYM
CREATE PUBLIC SYNONYM
DROP ANY SEQUENCE
DROP ANY ROLE
AUDIT ANY
DROP ANY PROCEDURE
CREATE ANY TRIGGER
ALTER ANY TRIGGER
DROP PROFILE
GRANT ANY PRIVILEGE
CREATE LIBRARY
CREATE OPERATOR
DROP ANY OUTLINE
MERGE ANY VIEW
ADMINISTER SQL TUNING SET
UPDATE ANY CUBE
INSERT ANY MEASURE FOLDER
ADMINISTER SQL MANAGEMENT OBJECT
CREATE SQL TRANSLATION PROFILE
LOGMINING
MANAGE TABLESPACE
DROP USER
ALTER ANY CLUSTER
CREATE ANY SYNONYM
CREATE DATABASE LINK
CREATE PUBLIC DATABASE LINK
CREATE MATERIALIZED VIEW
CREATE ANY MATERIALIZED VIEW
EXECUTE ANY TYPE
DROP ANY OPERATOR
QUERY REWRITE
GLOBAL QUERY REWRITE
MANAGE ANY QUEUE
CREATE ANY CONTEXT
ALTER ANY EVALUATION CONTEXT
ALTER ANY RULE
DROP ANY RULE
ADVISOR
SELECT ANY TRANSACTION
DROP ANY SQL PROFILE
CREATE ANY SQL PROFILE
READ ANY FILE GROUP
CREATE EXTERNAL JOB
DROP ANY ASSEMBLY
DROP ANY CUBE DIMENSION
CREATE ANY CUBE
CREATE MEASURE FOLDER
CREATE CUBE BUILD PROCESS
ALTER ANY SQL TRANSLATION PROFILE
FLASHBACK ARCHIVE ADMINISTER

220 system privileges are required to be an effective DBA for an Oracle database, an impressive list, indeed. Of course once a user is granted the DBA role he or she gets ALL of those system privileges, and since the role is the only direct grant providing those privileges the list cannot be modified by selectively revoking one or more of them:


SQL> grant DBA to blorpo identified by gussyflorp;

Grant succeeded.

SQL> revoke select any transaction from blorpo;
revoke select any transaction from blorpo
*
ERROR at line 1:
ORA-01952: system privileges not granted to 'BLORPO'


SQL>

Yes, the user DOES have that privilege, albeit indirectly. It’s indirect because it’s the ROLE that was granted that privilege, among others, and no attempt was made to revoke the role from the user. It’s a ‘package deal’; you grant a role to a user and it’s all or nothing, and even though it behaves like the user has the privileges granted directly that’s not the case.
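Tracing where the privilege actually comes from takes two quick queries; a sketch:


-- the user's only direct grant is the role
select granted_role
  from dba_role_privs
 where grantee = 'BLORPO';

-- the privilege lives in the role, not in the user
select privilege
  from role_sys_privs
 where role = 'DBA'
   and privilege = 'SELECT ANY TRANSACTION';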

You could, of course, get all of the privileges the DBA role has (both system and table) and create a script to grant each individual privilege to the desired user. It would be a LONG script, and such grants require attention from the DBA granting them to ensure they are both current and not being abused. Individual privileges could then be revoked, but that would be a maintenance nightmare for the DBA, having to keep track of which user has which set of privileges. Another option presents itself: creating a new role with only the privileges the DBA wants to assign. The privilege list for DBA could be shortened to create, say, a DB_OPERATOR or DBO role, as sketched below. The privileges granted would depend upon the job description; creating such a role makes granting such access easier and maintenance simpler, since when the role’s grants are changed those granted the role have their privileges adjusted the next time they log in.
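As a sketch of that approach (the privilege list here is illustrative, not a recommendation):


create role db_operator;

grant create session          to db_operator;
grant select any dictionary   to db_operator;
grant manage tablespace       to db_operator;
grant create user, alter user to db_operator;

grant db_operator to blorpo;

-- later adjustments to the role flow to every grantee
revoke manage tablespace from db_operator;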

Roles make granting privileges very easy and straightforward, provided the role is properly created and maintained. Roles also make it impossible to “pick and choose” privileges a user should have. It’s an “all or nothing” proposition and there’s no way around that when using a pre-defined role.

Sometimes you just need to begin again.

November 17, 2016

“That STILL Ain’t Right”

Filed under: General — dfitzjarrell @ 07:45

"If you want sense, you'll have to make it yourself."
- Norton Juster, The Phantom Tollbooth

In a recent post to an Oracle forum a query was presented and a tuning request was made. It appears that the query was taking 20 hours to complete. Through further interrogation and responses it was discovered that the dates were being stored in a VARCHAR2 column and implicit date conversion was being used. To show how much of a problem this can cause the following example was created; notice the results returned and the execution plans generated for each run of the query, once with the table defined in the manner the original poster described and once with the date column using the DATE datatype. We begin:


SQL>
SQL> create table datetst(
  2  myid    number,
  3  res_value	varchar2(3),
  4  mydt    varchar2(20));

Table created.

SQL>
SQL> create index datetest_idx on datetst(res_value);

Index created.

SQL>
SQL> begin
  2  	     for i in 1..10000 loop
  3  		     if mod(i,23) = 0 then
  4  		     insert into datetst
  5  		     values(i, 'PUR', to_char(sysdate+i, 'MM/DD/RRRR'));
  6  		     else
  7  		     insert into datetst
  8  		     values(i, 'BID', to_char(sysdate+i, 'MM/DD/RRRR'));
  9  		     end if;
 10  	     end loop;
 11
 12  	     commit;
 13  end;
 14  /

PL/SQL procedure successfully completed.

SQL>

Let’s now run a query using conditions similar to the posted query and see what Oracle returns:


SQL>
SQL>
SQL> select *
  2  from datetst
  3  where mydt <= sysdate +230;
where mydt <= sysdate +230
      *
ERROR at line 3:
ORA-01843: not a valid month


SQL>

As expected the implicit date conversion failed; modifying the query to explicitly convert the date to a string (so the comparison is character-based, just as the table stores the values) produces ‘interesting’ results:


SQL>
SQL> select *
  2  from datetst
  3  where res_value = 'PUR'
  4  and mydt <= to_char(sysdate +230, 'MM/DD/RRRR');

      MYID RES MYDT
---------- --- --------------------
       920 PUR 05/26/2019
       943 PUR 06/18/2019
      1150 PUR 01/11/2020
      1173 PUR 02/03/2020
      1196 PUR 02/26/2020
      1219 PUR 03/20/2020
      1242 PUR 04/12/2020
      1265 PUR 05/05/2020
      1288 PUR 05/28/2020
      1311 PUR 06/20/2020
        46 PUR 01/02/2017
        69 PUR 01/25/2017
        92 PUR 02/17/2017
       115 PUR 03/12/2017
       138 PUR 04/04/2017
       161 PUR 04/27/2017
       184 PUR 05/20/2017
       207 PUR 06/12/2017
       230 PUR 07/05/2017
       414 PUR 01/05/2018
       437 PUR 01/28/2018
       460 PUR 02/20/2018
       483 PUR 03/15/2018
       506 PUR 04/07/2018
       529 PUR 04/30/2018
       552 PUR 05/23/2018
       575 PUR 06/15/2018
       782 PUR 01/08/2019
       805 PUR 01/31/2019
       828 PUR 02/23/2019
       851 PUR 03/18/2019
       874 PUR 04/10/2019
       897 PUR 05/03/2019
      2622 PUR 01/22/2024
      2645 PUR 02/14/2024
      2668 PUR 03/08/2024
      2691 PUR 03/31/2024
      2714 PUR 04/23/2024
      3335 PUR 01/04/2026
      3358 PUR 01/27/2026
      3381 PUR 02/19/2026
      3404 PUR 03/14/2026
      3427 PUR 04/06/2026
      3450 PUR 04/29/2026
      3473 PUR 05/22/2026
      3496 PUR 06/14/2026
      3703 PUR 01/07/2027
      3726 PUR 01/30/2027
      3749 PUR 02/22/2027
      3772 PUR 03/17/2027
      3795 PUR 04/09/2027
      3818 PUR 05/02/2027
      3841 PUR 05/25/2027
      3864 PUR 06/17/2027
      2737 PUR 05/16/2024
      2760 PUR 06/08/2024
      2783 PUR 07/01/2024
      2967 PUR 01/01/2025
      2990 PUR 01/24/2025
      3013 PUR 02/16/2025
      3036 PUR 03/11/2025
      3059 PUR 04/03/2025
      3082 PUR 04/26/2025
      3105 PUR 05/19/2025
      3128 PUR 06/11/2025
      3151 PUR 07/04/2025
      1518 PUR 01/13/2021
      1541 PUR 02/05/2021
      1564 PUR 02/28/2021
      1587 PUR 03/23/2021
      1610 PUR 04/15/2021
      1633 PUR 05/08/2021
      1656 PUR 05/31/2021
      1679 PUR 06/23/2021
      1886 PUR 01/16/2022
      1909 PUR 02/08/2022
      1932 PUR 03/03/2022

      MYID RES MYDT
---------- --- --------------------
      1955 PUR 03/26/2022
      1978 PUR 04/18/2022
      2001 PUR 05/11/2022
      2024 PUR 06/03/2022
      2047 PUR 06/26/2022
      2254 PUR 01/19/2023
      2277 PUR 02/11/2023
      2300 PUR 03/06/2023
      2323 PUR 03/29/2023
      2346 PUR 04/21/2023
      2369 PUR 05/14/2023
      2392 PUR 06/06/2023
      2415 PUR 06/29/2023
      5543 PUR 01/21/2032
      5566 PUR 02/13/2032
      5589 PUR 03/07/2032
      5612 PUR 03/30/2032
      5635 PUR 04/22/2032
      5658 PUR 05/15/2032
      5681 PUR 06/07/2032
      5704 PUR 06/30/2032
      5911 PUR 01/23/2033
      5934 PUR 02/15/2033
      5957 PUR 03/10/2033
      5980 PUR 04/02/2033
      6003 PUR 04/25/2033
      6026 PUR 05/18/2033
      4071 PUR 01/10/2028
      4094 PUR 02/02/2028
      4117 PUR 02/25/2028
      4140 PUR 03/19/2028
      4163 PUR 04/11/2028
      4186 PUR 05/04/2028
      4209 PUR 05/27/2028
      4232 PUR 06/19/2028
      4439 PUR 01/12/2029
      4462 PUR 02/04/2029
      4485 PUR 02/27/2029
      4508 PUR 03/22/2029
      4531 PUR 04/14/2029
      4554 PUR 05/07/2029
      4577 PUR 05/30/2029
      4600 PUR 06/22/2029
      4807 PUR 01/15/2030
      4830 PUR 02/07/2030
      4853 PUR 03/02/2030
      4876 PUR 03/25/2030
      4899 PUR 04/17/2030
      4922 PUR 05/10/2030
      4945 PUR 06/02/2030
      4968 PUR 06/25/2030
      5175 PUR 01/18/2031
      5198 PUR 02/10/2031
      5221 PUR 03/05/2031
      5244 PUR 03/28/2031
      5267 PUR 04/20/2031
      5290 PUR 05/13/2031
      5313 PUR 06/05/2031
      5336 PUR 06/28/2031
      6992 PUR 01/09/2036
      7015 PUR 02/01/2036
      7038 PUR 02/24/2036
      7061 PUR 03/18/2036
      7084 PUR 04/10/2036
      7107 PUR 05/03/2036
      7130 PUR 05/26/2036
      7153 PUR 06/18/2036
      7866 PUR 06/01/2038
      7889 PUR 06/24/2038
      8096 PUR 01/17/2039
      8119 PUR 02/09/2039
      8142 PUR 03/04/2039
      8165 PUR 03/27/2039
      8188 PUR 04/19/2039
      8211 PUR 05/12/2039
      8234 PUR 06/04/2039
      8257 PUR 06/27/2039

      MYID RES MYDT
---------- --- --------------------
      7360 PUR 01/11/2037
      7383 PUR 02/03/2037
      7406 PUR 02/26/2037
      7429 PUR 03/21/2037
      7452 PUR 04/13/2037
      7475 PUR 05/06/2037
      7498 PUR 05/29/2037
      7521 PUR 06/21/2037
      7728 PUR 01/14/2038
      7751 PUR 02/06/2038
      7774 PUR 03/01/2038
      7797 PUR 03/24/2038
      7820 PUR 04/16/2038
      7843 PUR 05/09/2038
      6049 PUR 06/10/2033
      6072 PUR 07/03/2033
      6256 PUR 01/03/2034
      6279 PUR 01/26/2034
      6302 PUR 02/18/2034
      6325 PUR 03/13/2034
      6348 PUR 04/05/2034
      6371 PUR 04/28/2034
      6394 PUR 05/21/2034
      6417 PUR 06/13/2034
      6624 PUR 01/06/2035
      6647 PUR 01/29/2035
      6670 PUR 02/21/2035
      6693 PUR 03/16/2035
      6716 PUR 04/08/2035
      6739 PUR 05/01/2035
      6762 PUR 05/24/2035
      6785 PUR 06/16/2035
      9982 PUR 03/17/2044
      8464 PUR 01/20/2040
      8487 PUR 02/12/2040
      8510 PUR 03/06/2040
      8533 PUR 03/29/2040
      8556 PUR 04/21/2040
      8579 PUR 05/14/2040
      8602 PUR 06/06/2040
      8625 PUR 06/29/2040
      8832 PUR 01/22/2041
      8855 PUR 02/14/2041
      8878 PUR 03/09/2041
      8901 PUR 04/01/2041
      8924 PUR 04/24/2041
      8947 PUR 05/17/2041
      8970 PUR 06/09/2041
      8993 PUR 07/02/2041
      9177 PUR 01/02/2042
      9200 PUR 01/25/2042
      9223 PUR 02/17/2042
      9246 PUR 03/12/2042
      9269 PUR 04/04/2042
      9292 PUR 04/27/2042
      9315 PUR 05/20/2042
      9338 PUR 06/12/2042
      9545 PUR 01/05/2043
      9568 PUR 01/28/2043
      9591 PUR 02/20/2043
      9614 PUR 03/15/2043
      9637 PUR 04/07/2043
      9660 PUR 04/30/2043
      9683 PUR 05/23/2043
      9706 PUR 06/15/2043
      9913 PUR 01/08/2044
      9936 PUR 01/31/2044
      9959 PUR 02/23/2044

222 rows selected.

SQL>

The query should have returned no more than 10 rows meeting the criteria, yet it returned 222. Looking at the plan we see:


SQL>
SQL> select * From table(dbms_xplan.display_cursor());

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  6qatrtphp5wjt, child number 0
-------------------------------------
select * from datetst where res_value = 'PUR' and mydt <=
to_char(sysdate +230, 'MM/DD/RRRR')

Plan hash value: 3255216368

----------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |              |       |       |     2 (100)|          |
|*  1 |  TABLE ACCESS BY INDEX ROWID BATCHED| DATETST      |   222 |  6216 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | DATETEST_IDX |   434 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("MYDT"<=TO_CHAR(SYSDATE@!+230,'MM/DD/RRRR'))
   2 - access("RES_VALUE"='PUR')

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)


25 rows selected.

SQL>

Oracle found 222 rows that ‘matched’ the conditions, illustrating the problem with using an incorrect datatype: Oracle can’t know these strings are dates, so it compares them character by character as ASCII strings, creating a result set that is much larger than it should be.
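
The string comparison is easy to demonstrate in isolation; the comparison stops at the first differing character, so any string starting with ‘05’ sorts lower than one starting with ‘07’ regardless of the years involved. A small illustration using contrived values:


-- As strings '05/26/2019' compares as 'less than' '07/05/2017'
-- because '5' < '7' at the second character; as dates the
-- opposite is true
select case
            when '05/26/2019' <= '07/05/2017' then 'string sorts earlier'
            else 'string sorts later'
       end str_cmp
from dual;

We drop the table and start over: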


SQL>
SQL> drop table datetst purge;

Table dropped.

SQL>
SQL> create table datetst(
  2  myid    number,
  3  res_value varchar2(3),
  4  mydt    date);

Table created.

SQL>
SQL> create index datetest_idx on datetst(res_value);

Index created.

SQL>
SQL> begin
  2  	     for i in 1..10000 loop
  3  		     if mod(i,23) = 0 then
  4  		     insert into datetst
  5  		     values(i, 'PUR', sysdate+i);
  6  		     else
  7  		     insert into datetst
  8  		     values(i, 'BID', sysdate+i);
  9  		     end if;
 10  	     end loop;
 11
 12  	     commit;
 13  end;
 14  /

PL/SQL procedure successfully completed.

SQL>

We now run the original query (without the explicit date conversion, since it’s no longer needed) and examine the results:


SQL>
SQL> select *
  2  from datetst
  3  where res_value = 'PUR'
  4  and mydt <= sysdate +230;

      MYID RES MYDT
---------- --- ---------
        23 PUR 10-DEC-16
        46 PUR 02-JAN-17
        69 PUR 25-JAN-17
        92 PUR 17-FEB-17
       115 PUR 12-MAR-17
       138 PUR 04-APR-17
       161 PUR 27-APR-17
       184 PUR 20-MAY-17
       207 PUR 12-JUN-17
       230 PUR 05-JUL-17

10 rows selected.

SQL>
SQL> select * From table(dbms_xplan.display_cursor());

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  0m2c2mv7zhx49, child number 0
-------------------------------------
select * from datetst where res_value = 'PUR' and mydt <= sysdate +230

Plan hash value: 3255216368

----------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |              |       |       |     2 (100)|          |
|*  1 |  TABLE ACCESS BY INDEX ROWID BATCHED| DATETST      |    10 |   250 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | DATETEST_IDX |   434 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("MYDT"<=SYSDATE@!+230)
   2 - access("RES_VALUE"='PUR')

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)


24 rows selected.

SQL>

Oracle now found the 10 rows we sought using the conditions we specified because the date data was correctly stored as a DATE datatype. Using VARCHAR2 inflated the result set to more than 22 times its correct size, a 2,120 percent increase, and that was for a 10,000-row table. Let’s re-run the example with 1,000,000 rows and see what numbers Oracle produces:


SQL>
SQL> create table datetst(
  2  myid    number,
  3  res_value	varchar2(3),
  4  mydt    varchar2(20));

Table created.

SQL>
SQL> create index datetest_idx on datetst(res_value);

Index created.

SQL>
SQL> begin
  2  	     for i in 1..1000000 loop
  3  		     if mod(i,23) = 0 then
  4  		     insert into datetst
  5  		     values(i, 'PUR', to_char(sysdate+i, 'MM/DD/RRRR'));
  6  		     else
  7  		     insert into datetst
  8  		     values(i, 'BID', to_char(sysdate+i, 'MM/DD/RRRR'));
  9  		     end if;
 10  	     end loop;
 11
 12  	     commit;
 13  end;
 14  /

PL/SQL procedure successfully completed.

SQL>
SQL> select *
  2  from datetst
  3  where mydt <= sysdate +230;
where mydt <= sysdate +230
      *
ERROR at line 3:
ORA-01843: not a valid month


SQL>
SQL> select *
  2  from datetst
  3  where res_value = 'PUR'
  4  and mydt <= to_char(sysdate +230, 'MM/DD/RRRR');

      MYID RES MYDT
---------- --- --------------------
       920 PUR 05/26/2019
       943 PUR 06/18/2019
      1150 PUR 01/11/2020
      1173 PUR 02/03/2020
      1196 PUR 02/26/2020
      1219 PUR 03/20/2020
      1242 PUR 04/12/2020
      1265 PUR 05/05/2020
      1288 PUR 05/28/2020
      1311 PUR 06/20/2020
        46 PUR 01/02/2017
        69 PUR 01/25/2017
        92 PUR 02/17/2017
       115 PUR 03/12/2017
       138 PUR 04/04/2017
       161 PUR 04/27/2017
       184 PUR 05/20/2017
       207 PUR 06/12/2017
       230 PUR 07/05/2017
       414 PUR 01/05/2018
       437 PUR 01/28/2018
       460 PUR 02/20/2018
       483 PUR 03/15/2018
       506 PUR 04/07/2018
       529 PUR 04/30/2018
       552 PUR 05/23/2018
       575 PUR 06/15/2018
       782 PUR 01/08/2019
       805 PUR 01/31/2019
       828 PUR 02/23/2019
       851 PUR 03/18/2019
       874 PUR 04/10/2019
       897 PUR 05/03/2019
      2622 PUR 01/22/2024
      2645 PUR 02/14/2024
      2668 PUR 03/08/2024
      2691 PUR 03/31/2024
      2714 PUR 04/23/2024
      3335 PUR 01/04/2026
      3358 PUR 01/27/2026
      3381 PUR 02/19/2026
      3404 PUR 03/14/2026
      3427 PUR 04/06/2026
      3450 PUR 04/29/2026
      3473 PUR 05/22/2026
      3496 PUR 06/14/2026
      3703 PUR 01/07/2027
      3726 PUR 01/30/2027
      3749 PUR 02/22/2027
      3772 PUR 03/17/2027
      3795 PUR 04/09/2027
      3818 PUR 05/02/2027
      3841 PUR 05/25/2027
      3864 PUR 06/17/2027
      2737 PUR 05/16/2024
      2760 PUR 06/08/2024
      2783 PUR 07/01/2024
      2967 PUR 01/01/2025
      2990 PUR 01/24/2025
      3013 PUR 02/16/2025
      3036 PUR 03/11/2025
      3059 PUR 04/03/2025
      3082 PUR 04/26/2025
      3105 PUR 05/19/2025
      3128 PUR 06/11/2025
      3151 PUR 07/04/2025
      1518 PUR 01/13/2021
      1541 PUR 02/05/2021
      1564 PUR 02/28/2021
      1587 PUR 03/23/2021
      1610 PUR 04/15/2021
      1633 PUR 05/08/2021
      1656 PUR 05/31/2021
      1679 PUR 06/23/2021
      1886 PUR 01/16/2022
      1909 PUR 02/08/2022
      1932 PUR 03/03/2022
...
    986309 PUR 04/21/4717
    986332 PUR 05/14/4717
    986355 PUR 06/06/4717
    986378 PUR 06/29/4717
    990955 PUR 01/09/4730
    990978 PUR 02/01/4730
    991001 PUR 02/24/4730
    991024 PUR 03/19/4730
    995394 PUR 03/06/4742
    995417 PUR 03/29/4742
    995440 PUR 04/21/4742
    995463 PUR 05/14/4742
    995486 PUR 06/06/4742
    995509 PUR 06/29/4742
    986585 PUR 01/22/4718
    986608 PUR 02/14/4718
    986631 PUR 03/09/4718
    986654 PUR 04/01/4718
    986677 PUR 04/24/4718
    991047 PUR 04/11/4730
    991070 PUR 05/04/4730
    991093 PUR 05/27/4730
    991116 PUR 06/19/4730
    991323 PUR 01/12/4731
    995716 PUR 01/22/4743
    995739 PUR 02/14/4743
    995762 PUR 03/09/4743
    995785 PUR 04/01/4743
    995808 PUR 04/24/4743
    995831 PUR 05/17/4743
    995854 PUR 06/09/4743
    995877 PUR 07/02/4743

22054 rows selected.

SQL>
SQL> select * From table(dbms_xplan.display_cursor());

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  6qatrtphp5wjt, child number 0
-------------------------------------
select * from datetst where res_value = 'PUR' and mydt <=
to_char(sysdate +230, 'MM/DD/RRRR')

Plan hash value: 3255216368

----------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |              |       |       |   158 (100)|          |
|*  1 |  TABLE ACCESS BY INDEX ROWID BATCHED| DATETST      | 25757 |   704K|   158   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | DATETEST_IDX | 49112 |       |   158   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("MYDT"<=TO_CHAR(SYSDATE@!+230,'MM/DD/RRRR'))
   2 - access("RES_VALUE"='PUR')

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)


25 rows selected.

SQL>
SQL> drop table datetst purge;

Table dropped.

SQL>

SQL> create table datetst(
  2  myid    number,
  3  res_value varchar2(3),
  4  mydt    date);

Table created.

SQL>
SQL> create index datetest_idx on datetst(res_value);

Index created.

SQL>
SQL> begin
  2  	     for i in 1..1000000 loop
  3  		     if mod(i,23) = 0 then
  4  		     insert into datetst
  5  		     values(i, 'PUR', sysdate+i);
  6  		     else
  7  		     insert into datetst
  8  		     values(i, 'BID', sysdate+i);
  9  		     end if;
 10  	     end loop;
 11
 12  	     commit;
 13  end;
 14  /

PL/SQL procedure successfully completed.

SQL>
SQL> select *
  2  from datetst
  3  where res_value = 'PUR'
  4  and mydt <= sysdate +230;

      MYID RES MYDT
---------- --- ---------
        23 PUR 10-DEC-16
        46 PUR 02-JAN-17
        69 PUR 25-JAN-17
        92 PUR 17-FEB-17
       115 PUR 12-MAR-17
       138 PUR 04-APR-17
       161 PUR 27-APR-17
       184 PUR 20-MAY-17
       207 PUR 12-JUN-17
       230 PUR 05-JUL-17

10 rows selected.

SQL>
SQL> select * From table(dbms_xplan.display_cursor());

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  0m2c2mv7zhx49, child number 0
-------------------------------------
select * from datetst where res_value = 'PUR' and mydt <= sysdate +230

Plan hash value: 3255216368

----------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |              |       |       |   148 (100)|          |
|*  1 |  TABLE ACCESS BY INDEX ROWID BATCHED| DATETST      |    35 |   875 |   148   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | DATETEST_IDX | 46482 |       |   148   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("MYDT"<=SYSDATE@!+230)
   2 - access("RES_VALUE"='PUR')

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)


24 rows selected.

SQL>

With 1,000,000 rows of data Oracle inflated the original 10-row result set to 22,054 rows, a whopping 220,440 percent increase. As data volumes grow the result set will grow with them, producing extremely long query times and vast numbers of incorrect results, something I doubt the original poster had counted on.

Using the correct datatype is essential in ensuring Oracle can do its job and do it properly, returning result sets that are reliable. Storing data in a format that doesn’t reflect the actual datatype can be disastrous, as illustrated here. It pays, when writing or purchasing applications, to ensure the proper datatype is in force for the columns being used.
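
When string dates are already in place, one remedy is to convert the column in place; a minimal sketch, assuming every stored value matches the MM/DD/RRRR mask used in the example above:


-- Add a real DATE column, backfill it from the strings, then swap;
-- this assumes all existing values are valid for the mask used
alter table datetst add (mydt_dt date);

update datetst
set mydt_dt = to_date(mydt, 'MM/DD/RRRR');

commit;

alter table datetst drop column mydt;
alter table datetst rename column mydt_dt to mydt;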

It only makes sense.

November 14, 2016

“That’s Not What I Wanted.”

Filed under: General — dfitzjarrell @ 11:40

"Let me try once more," Milo said in an effort to explain. "In other words--"
"You mean you have other words?" cried the bird happily. "Well, by all means, 
use them. You're certainly not doing very well with the ones you have now."
-- Norton Juster, The Phantom Tollbooth

A recent question in the Oracle database forums has a user wanting to open his physical standby database in READ WRITE mode but still keep it as a physical standby. He found a blog post, followed its instructions and did, indeed, get what used to be his standby database opened in READ WRITE mode. What the blogger failed to mention was that those instructions performed a failover of the standby to primary, and did so in a manner Oracle does not recommend.

To be honest there isn’t a way to open a physical standby database READ WRITE, keep the primary as a primary and maintain the physical standby as a physical standby. Active Data Guard allows the physical standby to be opened READ ONLY while redo apply continues, keeping the standby in sync with the primary. (This option carries extra cost as it requires additional licensing.) Once a failover has occurred the only recourse is to rebuild the standby database; it can’t simply be ‘brought back’ at the will and whim of the DBA.
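
For reference, enabling that Active Data Guard ‘real-time query’ mode on a licensed physical standby is, as a minimal sketch, a matter of stopping the apply, opening READ ONLY and restarting the apply:


-- On the physical standby, stop redo apply
alter database recover managed standby database cancel;

-- Open the standby READ ONLY
alter database open read only;

-- Restart redo apply; the standby stays open for queries
alter database recover managed standby database using current logfile disconnect from session;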

What was suggested by the author of that blog post was to use:


alter database activate standby database;

which WILL eventually open the standby READ WRITE, failing it over to be the new primary in the process. Yet Oracle, in the documentation, clearly does not recommend using this option:


ACTIVATE [PHYSICAL|LOGICAL] STANDBY DATABASE [FINISH APPLY]

Performs a failover. The standby database must be mounted before it can be activated with this statement.

Note: Do not use the ALTER DATABASE ACTIVATE STANDBY DATABASE statement to failover because it causes
data loss. Instead, use the following best practices:

For physical standby databases, use the ALTER DATABASE RECOVER MANAGED STANDBY DATABASE statement 
with the FINISH keyword to perform the role transition as quickly as possible with little or no data loss
and without rendering other standby databases unusable.

For logical standby databases, use the ALTER DATABASE PREPARE TO SWITCHOVER and
ALTER DATABASE COMMIT TO SWITCHOVER statements.

Notice the warning given by Oracle:


Note: Do not use the ALTER DATABASE ACTIVATE STANDBY DATABASE statement to failover because it causes
data loss.

A far better option, although not permanent, would be to execute a switchover, where the primary and physical standby switch roles for a period of time before restoring the physical standby to its original role as PHYSICAL STANDBY. This takes more work to accomplish but it is reversible. An example of a switchover is shown below:


Ensure that no gaps exist in the redo apply:

SQL> select status, gap_status
  2  from v$archive_dest_status
  3  where dest_id = 2;
 
STATUS    GAP_STATUS
--------- ------------------------
VALID     NO GAP
 
SQL>

Make certain that the temp file name or names on the standby match those on the primary.  On the primary execute:

SQL> select file_name from dba_temp_files;
 
FILE_NAME
--------------------------------------------------------------------------------
/u01/app/oradata/yumpo/temp01.dbf
 
SQL>

and on the standby execute:

SQL> select name from v$tempfile;
 
NAME
--------------------------------------------------------------------------------
/u01/app/oradata/yumpo/temp01.dbf
 
SQL> 

There may be a delay configured for redo apply on the standby; to determine if a delay is configured execute:

SQL> select delay_mins from v$archive_dest where dest_id = 2; 
 
DELAY_MINS
----------
         0
 
SQL>

(If DELAY_MINS is greater than 0 the delay can be removed by executing:

 ALTER DATABASE RECOVER MANAGED STANDBY DATABASE NODELAY;)

Check the SWITCHOVER_STATUS on the primary:

SQL> select switchover_status from v$database;
 
SWITCHOVER_STATUS
--------------------
TO STANDBY
 
SQL>

Valid values are TO STANDBY and SESSIONS ACTIVE.  Any other value signals that the redo transport 
isn't functioning and the switchover will fail.  If the status is SESSIONS ACTIVE query the
V$SESSION view for active sessions:

SQL> select sid, process, program
  2  from v$session where type = 'USER';

       SID PROCESS                  PROGRAM
---------- ------------------------ ---------------------------------------
         6 16952:11396              sqlplus.exe
        17 15728                    ORACLE.EXE (J002)
        18 10668                    ORACLE.EXE (J006)
       131 13196                    ORACLE.EXE (J003)
       134 16576:16772              sqlplus.exe
       135 9720                     ORACLE.EXE (J007)
       244 17404                    ORACLE.EXE (J004)
       248 17844                    ORACLE.EXE (J008)
       254 12992                    ORACLE.EXE (J000)
       364 8280                     ORACLE.EXE (J005)
       366 2320                     ORACLE.EXE (J001)

11 rows selected.

SQL>

When everything looks good execute the switchover; if the status is TO STANDBY the command is:

SQL> alter database commit to switchover to physical standby;
 
Database altered.

SQL>

If the status is SESSIONS ACTIVE you should modify the command as shown below:

SQL> alter database commit to switchover to physical standby with session shutdown;
 
Database altered.
 
SQL>

Shut down the 'new' standby and then mount it:

SQL> shutdown immediate
ORA-01507: database not mounted
 
 
ORACLE instance shut down.
SQL> startup mount
ORACLE instance started.
 
...
Database mounted.
SQL> 

If all has gone according to plan the alert log on the 'new' standby should report:

Successful mount of redo thread 1, with mount id ...
Physical Standby Database mounted.

It's now time to check the 'old' standby for SWITCHOVER_STATUS.  The two valid values are  
TO PRIMARY and SESSIONS ACTIVE:

SQL> select switchover_status from v$database;
 
SWITCHOVER_STATUS
--------------------
TO PRIMARY
 
SQL>

Since no sessions are active the WITH SESSION SHUTDOWN clause will not be necessary:

SQL> alter database commit to switchover to primary;
 
Database altered.
 
SQL>

Open the 'new' primary database:

SQL> alter database open;
 
Database altered.
 
SQL>

Looking in the alert log you should see:

Standby became primary SCN: ...
Switchover: Complete - Database mounted as primary
Completed: alter database commit to switchover to primary

Check the LOG_ARCHIVE_DEST_n parameters and set up a new destination, if necessary:


SQL> alter system set log_archive_dest_2='service="yumpo" LGWR ASYNC NOAFFIRM delay=0 optional compression=disable max_failure=0 max_connections=1 reopen=300 db_unique_name="yumpo" net_timeout=30 valid_for=(all_logfiles,primary_role)' scope = both;

 
System altered.
 
SQL>

Start the redo apply on the 'new' standby:

SQL> alter database recover managed standby database using current logfile disconnect from session;
 
Database altered.
 
SQL>

It's a good idea to verify that the redo transport is working; execute at least two log switches
on the 'new' primary to prove the redo is being transported:

SQL> alter system switch logfile;
 
System altered.
 
SQL> /
 
System altered.
 
SQL> /
 
System altered.
 
SQL>/
 
System altered.
 
SQL> select status, gap_status from v$archive_dest_status where dest_id = 2;
 
STATUS    GAP_STATUS
--------- ------------------------
VALID     NO GAP
 
SQL>

Presuming the status is VALID and there is no gap the switchover is complete.
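
The apply progress on the ‘new’ standby can be verified as well; a quick check, assuming the standard Data Guard views are available:


select name, value
from v$dataguard_stats
where name in ('transport lag', 'apply lag');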

To reverse the switchover the same steps are applied, starting at the ‘new’ primary and ending with the ‘new’ standby. Should there be no problems the primary database will be returned to the PRIMARY role and, likewise, the physical standby will return to the PHYSICAL STANDBY role. In between the switchover steps the original physical standby will temporarily be the primary database, opened in READ WRITE mode.

If you have Data Guard Broker configured a switchover is considerably easier to execute; you can read the Oracle documentation for a complete example.
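
As a taste of the difference, with the Broker managing the configuration the entire sequence above reduces to a single DGMGRL command (the standby name shown is a placeholder):


DGMGRL> switchover to 'yumpo_stby';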

Understanding Data Guard/Standby concepts is essential for knowing what can, and cannot, be done. And blog advice, even from this one, should be investigated before taking any action as the title and text may not accurately reflect what the recommended actions actually accomplish. It would be a shame to put yourself in an unrecoverable situation just because you jumped first and looked later.

In other words …

November 13, 2016

“If You Change This, Right Here …”

Filed under: Performance,stats — dfitzjarrell @ 10:47

"changes are so frightening."
-- Norton Juster, The Phantom Tollbooth

It isn’t unusual for Oracle to expand functionality with each new release; a short list includes AWR, ADDM, the SYSAUX tablespace, Flashback Query, Flashback Database and Edition-based Redefinition. Oracle 12.1 is no different, providing an interesting performance enhancement: the in-memory column store, which can improve performance by making tables memory-resident and reducing the physical I/O load (to simplify the description somewhat). In addition to in-memory access to table data Oracle has introduced a new optimization called “Vector Transformation, including Vector Optimization” for in-memory operations. Even when this new optimization isn’t in use there are new code paths to take advantage of this technology. As a result execution plans can change; let’s look at an example originally provided by Jonathan Lewis that illustrates this.

(In-memory is fairly easy to configure. Set inmemory_size to a ‘reasonable’ non-zero value and then restart the database:


SQL> alter system set inmemory_size=500m scope=spfile;

System altered.

SQL>

For a complete reference to in-memory configuration consult the Oracle documentation.)
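
Once segments have been marked INMEMORY and accessed, population into the column store can be verified; a quick check, assuming access to the v$ views:


-- Segments currently populated in the in-memory column store;
-- BYTES_NOT_POPULATED drops to 0 when population is complete
select segment_name, populate_status, bytes_not_populated
from v$im_segments;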

Two tables are created, neither one especially noteworthy; an index is created on each table and extended statistics are generated on the (n1,n2) column grouping. (The original listing lost text in transcription, so the script below is a representative reconstruction; the row counts and column definitions are assumptions rather than a verbatim copy.)


-- Representative reconstruction of the test setup; the row counts,
-- column definitions and data distribution are assumptions
create table t1
nologging
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        rownum              id,
        mod(rownum,100)     n1,
        mod(rownum,100)     n2,
        lpad(rownum,10,'0') v1
from
        generator
;

create table t2
nologging
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        rownum                          id,
        trunc(dbms_random.value(0,1e4)) rand,
        lpad(rownum,10,'0')             v1
from
        generator g1,
        generator g2
where
        rownum <= 1e6
;

create index t1_n1 on t1(n1);
create index t2_i1 on t2(rand);

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          => 'T1',
                method_opt       => 'for columns (n1,n2) size 1'
        );
end;
/

Executing the following query and displaying the execution plan shows no surprises since neither table has been altered to be in-memory:


SQL> 
SQL> select
  2  	     /*+
  3  		     qb_name(main)
  4  	     */
  5  	     count(*)
  6  from    (
  7  	     select
  8  		     /*+ qb_name(inline) */
  9  		     distinct t1.v1, t2.v1
 10  	     from
 11  		     t1,t2
 12  	     where
 13  		     t1.n1 = 50
 14  	     and     t1.n2 = 50
 15  	     and     t2.rand = t1.id
 16  	     )
 17  ;

      9912                                                                                                                                            
SQL> 
SQL> select * from table(dbms_xplan.display_cursor);

SQL_ID  a7gb3ncf922zz, child number 0                                                                                                                 
-------------------------------------                                                                                                                 
select         /*+                 qb_name(main)         */                                                                                           
count(*) from    (         select                 /*+ qb_name(inline)                                                                                 
*/                 distinct t1.v1, t2.v1         from                                                                                                 
t1,t2         where                 t1.n1 = 50         and     t1.n2 =                                                                                
50         and     t2.rand = t1.id         )                                                                                                          
                                                                                                                                                      
Plan hash value: 1718706536                                                                                                                           
                                                                                                                                                      
-------------------------------------------------------------------------------------------------                                                     
| Id  | Operation                               | Name  | Rows  | Bytes | Cost (%CPU)| Time     |                                                     
-------------------------------------------------------------------------------------------------                                                     
|   0 | SELECT STATEMENT                        |       |       |       |  5127 (100)|          |                                                     
|   1 |  SORT AGGREGATE                         |       |     1 |       |            |          |                                                     
|   2 |   VIEW                                  |       | 10001 |       |  5127   (1)| 00:00:01 |                                                     
|   3 |    HASH UNIQUE                          |       | 10001 |   351K|  5127   (1)| 00:00:01 |                                                     
|*  4 |     HASH JOIN                           |       | 10001 |   351K|  5125   (1)| 00:00:01 |                                                     
|*  5 |      TABLE ACCESS BY INDEX ROWID BATCHED| T1    |   100 |  2100 |     3   (0)| 00:00:01 |                                                     
|*  6 |       INDEX RANGE SCAN                  | T1_N1 |   100 |       |     1   (0)| 00:00:01 |                                                     
|   7 |      TABLE ACCESS FULL                  | T2    |  1000K|    14M|  5113   (1)| 00:00:01 |                                                     
-------------------------------------------------------------------------------------------------                                                     
                                                                                                                                                      
Predicate Information (identified by operation id):                                                                                                   
---------------------------------------------------                                                                                                   
                                                                                                                                                      
   4 - access("T2"."RAND"="T1"."ID")                                                                                                                  
   5 - filter("T1"."N2"=50)                                                                                                                           
   6 - access("T1"."N1"=50)                                                                                                                           
                                                                                                                                                      
SQL> 

Having the extended statistics allowed the optimizer to estimate the correct cardinality between columns n1 and n2, providing a better initial plan.
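
Column-group statistics like these can also be created explicitly before gathering; a brief sketch against the example table (the method_opt route used above achieves the same result):


-- Create the (n1,n2) column group explicitly ...
select dbms_stats.create_extended_stats(user, 'T1', '(N1,N2)') from dual;

-- ... and verify it was recorded
select extension_name, extension
from user_stat_extensions
where table_name = 'T1';

Now things will get a bit interesting; table T2 is altered and put in-memory: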


SQL> 
SQL> alter table t2 inmemory;
SQL> 

The same query is executed again and the plan changes because T2 is now in-memory, even though the new vector transformation has not been applied:


SQL> 
SQL> select
  2  	     /*+
  3  		     qb_name(main)
  4  	     */
  5  	     count(*)
  6  from    (
  7  	     select
  8  		     /*+ qb_name(inline) */
  9  		     distinct t1.v1, t2.v1
 10  	     from
 11  		     t1,t2
 12  	     where
 13  		     t1.n1 = 50
 14  	     and     t1.n2 = 50
 15  	     and     t2.rand = t1.id
 16  	     )
 17  ;

      9912                                                                                                                                            
SQL> 
SQL> select * from table(dbms_xplan.display_cursor);

SQL_ID  a7gb3ncf922zz, child number 0                                                                                                                 
-------------------------------------                                                                                                                 
select         /*+                 qb_name(main)         */                                                                                           
count(*) from    (         select                 /*+ qb_name(inline)                                                                                 
*/                 distinct t1.v1, t2.v1         from                                                                                                 
t1,t2         where                 t1.n1 = 50         and     t1.n2 =                                                                                
50         and     t2.rand = t1.id         )                                                                                                          
                                                                                                                                                      
Plan hash value: 106371239                                                                                                                            
                                                                                                                                                      
----------------------------------------------------------------------------------------------------                                                  
| Id  | Operation                                | Name    | Rows  | Bytes | Cost (%CPU)| Time     |                                                  
----------------------------------------------------------------------------------------------------                                                  
|   0 | SELECT STATEMENT                         |         |       |       |   223 (100)|          |                                                  
|   1 |  SORT AGGREGATE                          |         |     1 |       |            |          |                                                  
|   2 |   VIEW                                   |         | 10001 |       |   223  (15)| 00:00:01 |                                                  
|   3 |    HASH UNIQUE                           |         | 10001 |   351K|   223  (15)| 00:00:01 |                                                  
|*  4 |     HASH JOIN                            |         | 10001 |   351K|   222  (14)| 00:00:01 |                                                  
|   5 |      JOIN FILTER CREATE                  | :BF0000 |   100 |  2100 |     3   (0)| 00:00:01 |
|*  6 |       TABLE ACCESS BY INDEX ROWID BATCHED| T1      |   100 |  2100 |     3   (0)| 00:00:01 |                                                  
|*  7 |        INDEX RANGE SCAN                  | T1_N1   |   100 |       |     1   (0)| 00:00:01 |                                                  
|   8 |      JOIN FILTER USE                     | :BF0000 |  1000K|    14M|   209  (11)| 00:00:01 |
|*  9 |       TABLE ACCESS INMEMORY FULL         | T2      |  1000K|    14M|   209  (11)| 00:00:01 |                                                  
----------------------------------------------------------------------------------------------------                                                  
                                                                                                                                                      
Predicate Information (identified by operation id):                                                                                                   
---------------------------------------------------                                                                                                   
                                                                                                                                                      
   4 - access("T2"."RAND"="T1"."ID")                                                                                                                  
   6 - filter("T1"."N2"=50)                                                                                                                           
   7 - access("T1"."N1"=50)                                                                                                                           
   9 - inmemory(SYS_OP_BLOOM_FILTER(:BF0000,"T2"."RAND"))
       filter(SYS_OP_BLOOM_FILTER(:BF0000,"T2"."RAND"))
                                                                                                                                                      
SQL> 

The cost of the tablescan was reduced considerably, from 5113 to 209, but that isn’t the most thought-provoking part of the plan: a serial Bloom filter is used for the hash join due to the in-memory code path. With a table stored ‘normally’ (not in-memory) a Bloom filter would not be effective; with the in-memory option a Bloom filter can use enhancements such as simultaneous multiple probe access into table t2 (in this example) in addition to the benefits of avoiding disk access, disk buffer activity and row-by-row data retrieval. All of those avoidances reduce CPU time considerably when compared to a ‘regular’ table, one that is not in-memory.
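
Which tables carry the in-memory attribute, and therefore qualify for these code paths, can be confirmed from the dictionary; a quick check, with output depending on your settings:


-- Table-level in-memory attributes
select table_name, inmemory, inmemory_priority, inmemory_compression
from user_tables
where table_name in ('T1','T2');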

The in-memory column store is a very nice addition to an already powerful database, made even better by new optimizer transformations that speed data access by providing execution paths not available for tables accessed from disk. ‘Experiment’ with the in-memory option to see what performance improvements it might provide; you might be pleasantly surprised at what you find.

Sometimes unexpected change is good.

October 29, 2016

It’s A Match

Filed under: General — dfitzjarrell @ 10:29

"So many things are possible as long as you don't know they're impossible."
-- Norton Juster, The Phantom Tollbooth

With every new release Oracle provides new features that can be very useful, sometimes in ways you hadn’t thought of before. In 12.1.0.2 Oracle provides new functionality for finding pattern matches in a data set; using the MATCH_RECOGNIZE function it’s fairly easy to generate results based on a defined pattern in the data. Let’s look at some examples and see how to use MATCH_RECOGNIZE to find and report user-defined patterns in a data set. The examples provided were taken from the Live SQL website (https://livesql.oracle.com) and from the Oracle documentation (https://docs.oracle.com/database/121/DWHSG/pattern.htm#DWHSG8981), where MATCH_RECOGNIZE is more thoroughly explained. That being said, let’s go through an introduction to the function and some of what it can accomplish.

These examples begin with a table of click data, similar to what an advertising agency might collect from users clicking on an advertisement. Sessions are defined as being 10 minutes or more apart; all data within a 10-minute window for the same user is considered to be the same session. We create the table and load the data:


SQL> 
SQL> --
SQL> -- Create table and populate
SQL> --
SQL> CREATE TABLE clickdata (tstamp integer, userid varchar2(15));

Table created.

SQL> INSERT INTO clickdata VALUES(1, 'Mary');

1 row created.

SQL> INSERT INTO clickdata VALUES(2, 'Sam');

1 row created.

SQL> INSERT INTO clickdata VALUES(11, 'Mary');

1 row created.

SQL> INSERT INTO clickdata VALUES(12, 'Sam');

1 row created.

SQL> INSERT INTO clickdata VALUES(22, 'Sam');

1 row created.

SQL> INSERT INTO clickdata VALUES(23, 'Mary');

1 row created.

SQL> INSERT INTO clickdata VALUES(32, 'Sam');

1 row created.

SQL> INSERT INTO clickdata VALUES(34, 'Mary');

1 row created.

SQL> INSERT INTO clickdata VALUES(43, 'Sam');

1 row created.

SQL> INSERT INTO clickdata VALUES(44, 'Mary');

1 row created.

SQL> INSERT INTO clickdata VALUES(47, 'Sam');

1 row created.

SQL> INSERT INTO clickdata VALUES(48, 'Sam');

1 row created.

SQL> INSERT INTO clickdata VALUES(53, 'Mary');

1 row created.

SQL> INSERT INTO clickdata VALUES(59, 'Sam');

1 row created.

SQL> INSERT INTO clickdata VALUES(60, 'Sam');

1 row created.

SQL> INSERT INTO clickdata VALUES(63, 'Mary');

1 row created.

SQL> INSERT INTO clickdata VALUES(68, 'Sam');

1 row created.

SQL> commit;

Commit complete.

Using a very basic MATCH_RECOGNIZE query we return the timestamp and the user id to see how many sessions each user had:


SQL> 
SQL> --
SQL> -- Return rows that are part of the same
SQL> -- session for a login
SQL> --
SQL> -- Any session tstamp value within 10
SQL> -- minutes of the prior tstamp is considered
SQL> -- part of the same session
SQL> --
SQL> -- In MATCH_RECOGNIZE the PATTERN is the
SQL> -- sequence or data pattern we are looking for
SQL> --
SQL> -- In this case b maps to any row, which gives
SQL> -- the ability to check for matches
SQL> --
SQL> -- s* states that 0 or more rows must be mapped
SQL> -- to confirm a match
SQL> --
SQL> -- s is defined as the difference between the
SQL> -- current tstamp value and the prior tstamp
SQL> -- value and the difference must be less than
SQL> -- or equal to 10
SQL> --
SQL> -- This will give a list of the activity of the
SQL> -- different continuous sessions
SQL> --
SQL> -- The ALL ROWS PER MATCH returns every matching
SQL> -- row
SQL> --
SQL> 
SQL> SELECT
  2   tstamp,
  3   userid
  4  FROM clickdata MATCH_RECOGNIZE(
  5  	ALL ROWS PER MATCH
  6  	PATTERN (b s*)
  7  	DEFINE
  8  	    s as (s.tstamp - prev(s.tstamp) <= 10)
  9   );

    TSTAMP USERID                                                                                                                                     
---------- ---------------                                                                                                                            
         1 Mary                                                                                                                                       
         2 Sam                                                                                                                                        
        11 Mary                                                                                                                                       
        12 Sam                                                                                                                                        
        22 Sam                                                                                                                                        
        23 Mary                                                                                                                                       
        32 Sam                                                                                                                                        
        34 Mary                                                                                                                                       
        43 Sam                                                                                                                                        
        44 Mary                                                                                                                                       
        47 Sam                                                                                                                                        

    TSTAMP USERID                                                                                                                                     
---------- ---------------                                                                                                                            
        48 Sam                                                                                                                                        
        53 Mary                                                                                                                                       
        59 Sam                                                                                                                                        
        60 Sam                                                                                                                                        
        63 Mary                                                                                                                                       
        68 Sam                                                                                                                                        

17 rows selected.

Two interesting things are found in this construct, the PATTERN keyword and the DEFINE keyword. PATTERN defines a generalized pattern, using Perl-style regular expression syntax, to declare a particular pattern in the data. DEFINE sets up the definitions for the symbols used in the PATTERN statement. Since b has no conditions on it, every row is returned; the conditions on s state that the difference between the current timestamp and the previous timestamp must be less than or equal to 10. There are 17 rows of data in the table and all 17 are returned with this MATCH_RECOGNIZE query. Modifying that query to find the unique session identifier for each session fails because not enough pieces of the MATCH_RECOGNIZE syntax are included; it ‘fails’ by returning 1 as the session number for every row, and that isn’t true:


SQL> 
SQL> --
SQL> -- The MATCH_NUMBER() Function should return the
SQL> -- session id for each different session
SQL> --
SQL> -- In this case it doesn't as we are missing a
SQL> -- piece of the puzzle
SQL> --
SQL> SELECT
  2   tstamp,
  3   userid,
  4   session_id
  5  FROM clickdata MATCH_RECOGNIZE(
  6  	MEASURES match_number() as session_id
  7  	ALL ROWS PER MATCH
  8  	PATTERN (b s*)
  9  	DEFINE
 10  	    s as (s.tstamp - prev(s.tstamp) <= 10)
 11   );

    TSTAMP USERID          SESSION_ID                                                                                                                 
---------- --------------- ----------                                                                                                                 
         1 Mary                     1                                                                                                                 
         2 Sam                      1                                                                                                                 
        11 Mary                     1                                                                                                                 
        12 Sam                      1                                                                                                                 
        22 Sam                      1                                                                                                                 
        23 Mary                     1                                                                                                                 
        32 Sam                      1                                                                                                                 
        34 Mary                     1                                                                                                                 
        43 Sam                      1                                                                                                                 
        44 Mary                     1                                                                                                                 
        47 Sam                      1                                                                                                                 

    TSTAMP USERID          SESSION_ID                                                                                                                 
---------- --------------- ----------                                                                                                                 
        48 Sam                      1                                                                                                                 
        53 Mary                     1                                                                                                                 
        59 Sam                      1                                                                                                                 
        60 Sam                      1                                                                                                                 
        63 Mary                     1                                                                                                                 
        68 Sam                      1                                                                                                                 

17 rows selected.

Proving Mary has three sessions is fairly easy:


SQL> 
SQL> --
SQL> -- As all records show 1 session this isn't accurate
SQL> -- since Mary has 3 separate sessions according to
SQL> -- the previously established criteria
SQL> --
SQL> SELECT *
  2  FROM clickdata
  3  WHERE userid = 'Mary';

    TSTAMP USERID                                                                                                                                     
---------- ---------------                                                                                                                            
         1 Mary                                                                                                                                       
        11 Mary                                                                                                                                       
        23 Mary                                                                                                                                       
        34 Mary                                                                                                                                       
        44 Mary                                                                                                                                       
        53 Mary                                                                                                                                       
        63 Mary                                                                                                                                       

7 rows selected.

The first two records belong to the first session, the third is another session and the remaining records make up the third session. Correcting the MATCH_RECOGNIZE query is a simple matter of partitioning (grouping) by userid and ordering each group by the tstamp values:


SQL> 
SQL> --
SQL> -- To 'fix' this we need to add some instructions
SQL> -- to the MATCH_RECOGNIZE function call
SQL> --
SQL> -- By partitioning  by userid  (collecting the data
SQL> -- based on the userid column) and ordering each
SQL> -- grouping by tstamp the MATCH_RECOGNIZE() call
SQL> -- can  accurately return the session id
SQL> --
SQL> SELECT
  2   tstamp,
  3   userid,
  4   session_id
  5  FROM clickdata MATCH_RECOGNIZE(
  6  	PARTITION BY userid ORDER BY tstamp
  7  	MEASURES match_number() as session_id
  8  	ALL ROWS PER MATCH
  9  	PATTERN (b s*)
 10  	DEFINE
 11  	    s as (s.tstamp - prev(s.tstamp) <=10)
 12   );

    TSTAMP USERID          SESSION_ID                                                                                                                 
---------- --------------- ----------                                                                                                                 
         1 Mary                     1                                                                                                                 
        11 Mary                     1                                                                                                                 
        23 Mary                     2                                                                                                                 
        34 Mary                     3                                                                                                                 
        44 Mary                     3                                                                                                                 
        53 Mary                     3                                                                                                                 
        63 Mary                     3                                                                                                                 
         2 Sam                      1                                                                                                                 
        12 Sam                      1                                                                                                                 
        22 Sam                      1                                                                                                                 
        32 Sam                      1                                                                                                                 

    TSTAMP USERID          SESSION_ID                                                                                                                 
---------- --------------- ----------                                                                                                                 
        43 Sam                      2                                                                                                                 
        47 Sam                      2                                                                                                                 
        48 Sam                      2                                                                                                                 
        59 Sam                      3                                                                                                                 
        60 Sam                      3                                                                                                                 
        68 Sam                      3                                                                                                                 

17 rows selected.

Now the report correctly displays that there were three sessions for each user.
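
As a quick aside, the CLASSIFIER() built-in (which appears again later in this post) names the pattern variable each row matched, making it easy to see which rows opened a session (variable b) and which continued one (variable s). A minimal sketch against the same clickdata table; the row_role alias is just illustrative:

SELECT
 tstamp,
 userid,
 session_id,
 row_role
FROM clickdata MATCH_RECOGNIZE(
	PARTITION BY userid ORDER BY tstamp
	MEASURES match_number() as session_id,
		 classifier() as row_role
	ALL ROWS PER MATCH
	PATTERN (b s*)
	DEFINE
	    s as (s.tstamp - prev(s.tstamp) <= 10)
 );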

Before we go further let’s look at some parts of the MATCH_RECOGNIZE statement by defining the keywords being used. PARTITION BY groups the data by the column or columns specified, similar to the way the analytic functions work. ORDER BY orders the data in each group by the column or columns provided. The MEASURES keyword allows you to define the values you want to return and how to compute them based on the pattern being matched. The MATCH_NUMBER() function returns the number of the pattern match, starting with 1 and increasing incrementally as each pattern match is found. The FIRST and LAST keywords return the first and last values for a given pattern match, and can be used in calculations. Finally, the ALL ROWS PER MATCH directive returns every row of every match found in the data. Using these in the following modifications to the MATCH_RECOGNIZE query produces a report of the session number, the number of events in each session, the start time and the session duration:


SQL> 
SQL> --
SQL> -- Let's expand this report
SQL> --
SQL> -- Setting up the MEASURES clause allows us
SQL> -- to return:
SQL> --
SQL> --  * the number of events per session
SQL> --  * the start time of each session
SQL> --  * the session duration
SQL> --
SQL> -- We have already successfully returned the session id
SQL> -- so now we add the count(*) for each group, the
SQL> -- start time for each group and the elapsed time for
SQL> -- each group calculated from the last tstamp for
SQL> -- each group
SQL> --
SQL> -- In one group first and last are the same so the difference
SQL> -- is 0
SQL> --
SQL> SELECT
  2   tstamp,
  3   userid,
  4   session_id,
  5   no_of_events,
  6   start_time,
  7   session_duration
  8  FROM clickdata MATCH_RECOGNIZE(
  9  	PARTITION BY userid ORDER BY tstamp
 10  	MEASURES match_number() as session_id,
 11  		 count(*) as no_of_events,
 12  		 first(tstamp) start_time,
 13  		 last(tstamp) - first(tstamp) session_duration
 14  	ALL ROWS PER MATCH
 15  	PATTERN (b s*)
 16  	DEFINE
 17  	    s as (s.tstamp - prev(s.tstamp) <= 10)
 18   );

    TSTAMP USERID          SESSION_ID NO_OF_EVENTS START_TIME SESSION_DURATION                                                                        
---------- --------------- ---------- ------------ ---------- ----------------                                                                        
         1 Mary                     1            1          1                0                                                                        
        11 Mary                     1            2          1               10                                                                        
        23 Mary                     2            1         23                0                                                                        
        34 Mary                     3            1         34                0                                                                        
        44 Mary                     3            2         34               10                                                                        
        53 Mary                     3            3         34               19                                                                        
        63 Mary                     3            4         34               29                                                                        
         2 Sam                      1            1          2                0                                                                        
        12 Sam                      1            2          2               10                                                                        
        22 Sam                      1            3          2               20                                                                        
        32 Sam                      1            4          2               30                                                                        

    TSTAMP USERID          SESSION_ID NO_OF_EVENTS START_TIME SESSION_DURATION                                                                        
---------- --------------- ---------- ------------ ---------- ----------------                                                                        
        43 Sam                      2            1         43                0                                                                        
        47 Sam                      2            2         43                4                                                                        
        48 Sam                      2            3         43                5                                                                        
        59 Sam                      3            1         59                0                                                                        
        60 Sam                      3            2         59                1                                                                        
        68 Sam                      3            3         59                9                                                                        

17 rows selected.

As mentioned above, using ALL ROWS PER MATCH can produce a long report. Taking this one step further and changing ALL ROWS PER MATCH to ONE ROW PER MATCH produces a shorter summary report of the activity:


SQL> 
SQL> --
SQL> -- One of the nice aspects of MATCH_RECOGNIZE is the
SQL> -- ability to generate summary reports
SQL> --
SQL> -- The last report was nice but a bit lengthy
SQL> --
SQL> -- Using the ONE ROW PER MATCH directive and removing
SQL> -- the tstamp column from the select statement
SQL> -- allows MATCH_RECOGNIZE to return only the summary
SQL> -- information, making a more compact and, in this case,
SQL> -- usable report
SQL> --
SQL> SELECT
  2   userid,
  3   session_id,
  4   no_of_events,
  5   start_time,
  6   session_duration
  7  FROM clickdata MATCH_RECOGNIZE(
  8  	PARTITION BY userid ORDER BY tstamp
  9  	MEASURES match_number() as session_id,
 10  		 count(*) as no_of_events,
 11  		 first(tstamp) start_time,
 12  		 last(tstamp) - first(tstamp) session_duration
 13  	ONE ROW PER MATCH
 14  	PATTERN (b s*)
 15  	DEFINE
 16  	    s as (s.tstamp - prev(s.tstamp) <= 10)
 17   );

USERID          SESSION_ID NO_OF_EVENTS START_TIME SESSION_DURATION
--------------- ---------- ------------ ---------- ----------------
Mary                     1            2          1               10
Mary                     2            1         23                0
Mary                     3            4         34               29
Sam                      1            4          2               30
Sam                      2            3         43                5
Sam                      3            3         59                9

6 rows selected.

SQL>
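
Because MATCH_RECOGNIZE acts as a row source in the FROM clause, its measures can be filtered like ordinary columns. A minimal sketch, assuming we only want sessions with at least three events (the threshold here is arbitrary):

SELECT
 userid,
 session_id,
 no_of_events,
 start_time,
 session_duration
FROM clickdata MATCH_RECOGNIZE(
	PARTITION BY userid ORDER BY tstamp
	MEASURES match_number() as session_id,
		 count(*) as no_of_events,
		 first(tstamp) start_time,
		 last(tstamp) - first(tstamp) session_duration
	ONE ROW PER MATCH
	PATTERN (b s*)
	DEFINE
	    s as (s.tstamp - prev(s.tstamp) <= 10)
 )
WHERE no_of_events >= 3;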

Let’s create another table and data set, this time for stock prices for the month of April:


SQL> 
SQL> --
SQL> -- Create table for stock data
SQL> --
SQL> -- Populate the table with data
SQL> -- for April
SQL> --
SQL> CREATE TABLE Ticker (SYMBOL VARCHAR2(10), tstamp DATE, price NUMBER);

Table created.

SQL> 
SQL> INSERT INTO Ticker VALUES('ACME', '01-Apr-11', 12);

1 row created.

SQL> INSERT INTO Ticker VALUES('ACME', '02-Apr-11', 17);

1 row created.

SQL> INSERT INTO Ticker VALUES('ACME', '03-Apr-11', 19);

1 row created.

SQL> INSERT INTO Ticker VALUES('ACME', '04-Apr-11', 21);

1 row created.

SQL> INSERT INTO Ticker VALUES('ACME', '05-Apr-11', 25);

1 row created.

SQL> INSERT INTO Ticker VALUES('ACME', '06-Apr-11', 12);

1 row created.

SQL> INSERT INTO Ticker VALUES('ACME', '07-Apr-11', 15);

1 row created.

SQL> INSERT INTO Ticker VALUES('ACME', '08-Apr-11', 20);

1 row created.

SQL> INSERT INTO Ticker VALUES('ACME', '09-Apr-11', 24);

1 row created.

SQL> INSERT INTO Ticker VALUES('ACME', '10-Apr-11', 25);

1 row created.

SQL> INSERT INTO Ticker VALUES('ACME', '11-Apr-11', 19);

1 row created.

SQL> INSERT INTO Ticker VALUES('ACME', '12-Apr-11', 15);

1 row created.

SQL> INSERT INTO Ticker VALUES('ACME', '13-Apr-11', 25);

1 row created.

SQL> INSERT INTO Ticker VALUES('ACME', '14-Apr-11', 25);

1 row created.

SQL> INSERT INTO Ticker VALUES('ACME', '15-Apr-11', 14);

1 row created.

SQL> INSERT INTO Ticker VALUES('ACME', '16-Apr-11', 12);

1 row created.

SQL> INSERT INTO Ticker VALUES('ACME', '17-Apr-11', 14);

1 row created.

SQL> INSERT INTO Ticker VALUES('ACME', '18-Apr-11', 24);

1 row created.

SQL> INSERT INTO Ticker VALUES('ACME', '19-Apr-11', 23);

1 row created.

SQL> INSERT INTO Ticker VALUES('ACME', '20-Apr-11', 22);

1 row created.

SQL> 
SQL> COMMIT;

Commit complete.

This data has a different pattern structure, where V-shaped patterns are generated by the changing stock prices. Let’s use MATCH_RECOGNIZE to find and report those patterns. This first query returns one summary row for each V-shaped pattern found in the data, reporting when each pattern started, bottomed out and ended:


SQL> 
SQL> --
SQL> -- Use MATCH_RECOGNIZE to find areas
SQL> -- in the data where the stock price
SQL> -- 'dipped' then rose
SQL> --
SQL> -- The data provided describes three
SQL> -- patterns of dipping and rising prices
SQL> --
SQL> -- The query will return three rows,
SQL> -- one per matching pattern
SQL> --
SQL> -- The only stock symbol in the table
SQL> -- is 'ACME' but this query would return
SQL> -- one row for each pattern matched per
SQL> -- stock symbol
SQL> --
SQL> -- Here we group by symbol and order
SQL> -- by the tstamp values
SQL> --
SQL> -- The MEASURES section defines the
SQL> -- starting timestamp of the pattern,
SQL> -- the lowest value of the group and
SQL> -- the highest value after the low has
SQL> -- been reached
SQL> --
SQL> -- Note that in one case the ending
SQL> -- tstamp for one pattern is the starting
SQL> -- tstamp for the next pattern
SQL> --
SQL> -- This query introduces a new wrinkle,
SQL> -- AFTER MATCH SKIP TO LAST UP, which
SQL> -- causes the pattern matching for the
SQL> -- next match to resume at the last UP
SQL> -- row of the current pattern
SQL> --
SQL> SELECT *
  2  FROM Ticker MATCH_RECOGNIZE (
  3  	  PARTITION BY symbol
  4  	  ORDER BY tstamp
  5  	  MEASURES  STRT.tstamp AS start_tstamp,
  6  		    LAST(DOWN.tstamp) AS bottom_tstamp,
  7  		    LAST(UP.tstamp) AS end_tstamp
  8  	  ONE ROW PER MATCH
  9  	  AFTER MATCH SKIP TO LAST UP
 10  	  PATTERN (STRT DOWN+ UP+)
 11  	  DEFINE
 12  	     DOWN AS DOWN.price < PREV(DOWN.price),
 13  	     UP AS UP.price > PREV(UP.price)
 14  	  ) MR
 15  ORDER BY MR.symbol, MR.start_tstamp;

SYMBOL     START_TST BOTTOM_TS END_TSTAM                                                                                                              
---------- --------- --------- ---------                                                                                                              
ACME       05-APR-11 06-APR-11 10-APR-11                                                                                                              
ACME       10-APR-11 12-APR-11 13-APR-11                                                                                                              
ACME       14-APR-11 16-APR-11 18-APR-11                                                                                                              

We expand the report to return more data. The comments explain what is being returned and how the MEASURES are coded:


SQL> 
SQL> --
SQL> -- This next query returns all rows for
SQL> -- each pattern match and returns a bit more
SQL> -- data
SQL> --
SQL> -- There are 5 MEASURES being used
SQL> --
SQL> -- The start timestamp
SQL> --
SQL> -- The bottom of the v-shaped pattern
SQL> --
SQL> -- The top of the v-shaped pattern
SQL> --
SQL> -- The row number to know which values
SQL> -- belong to a given pattern match
SQL> --
SQL> -- The classifier (UP, DOWN, STRT)
SQL> -- to know where the value lies in the
SQL> -- pattern
SQL> --
SQL> SELECT *
  2  FROM Ticker MATCH_RECOGNIZE (
  3  	  PARTITION BY symbol
  4  	  ORDER BY tstamp
  5  	  MEASURES  STRT.tstamp AS start_tstamp,
  6  		    FINAL LAST(DOWN.tstamp) AS bottom_tstamp,
  7  		    FINAL LAST(UP.tstamp) AS end_tstamp,
  8  		    MATCH_NUMBER() AS match_num,
  9  		    CLASSIFIER() AS var_match
 10  	  ALL ROWS PER MATCH
 11  	  AFTER MATCH SKIP TO LAST UP
 12  	  PATTERN (STRT DOWN+ UP+)
 13  	  DEFINE
 14  	     DOWN AS DOWN.price < PREV(DOWN.price),
 15  	     UP AS UP.price > PREV(UP.price)
 16  	  ) MR
 17  ORDER BY MR.symbol, MR.match_num, MR.tstamp;

SYMBOL     TSTAMP    START_TST BOTTOM_TS END_TSTAM  MATCH_NUM VAR_MATCH                           PRICE                                               
---------- --------- --------- --------- --------- ---------- ------------------------------ ----------                                               
ACME       05-APR-11 05-APR-11 06-APR-11 10-APR-11          1 STRT                                   25                                               
ACME       06-APR-11 05-APR-11 06-APR-11 10-APR-11          1 DOWN                                   12                                               
ACME       07-APR-11 05-APR-11 06-APR-11 10-APR-11          1 UP                                     15                                               
ACME       08-APR-11 05-APR-11 06-APR-11 10-APR-11          1 UP                                     20                                               
ACME       09-APR-11 05-APR-11 06-APR-11 10-APR-11          1 UP                                     24                                               
ACME       10-APR-11 05-APR-11 06-APR-11 10-APR-11          1 UP                                     25                                               
ACME       10-APR-11 10-APR-11 12-APR-11 13-APR-11          2 STRT                                   25                                               
ACME       11-APR-11 10-APR-11 12-APR-11 13-APR-11          2 DOWN                                   19                                               
ACME       12-APR-11 10-APR-11 12-APR-11 13-APR-11          2 DOWN                                   15                                               
ACME       13-APR-11 10-APR-11 12-APR-11 13-APR-11          2 UP                                     25                                               
ACME       14-APR-11 14-APR-11 16-APR-11 18-APR-11          3 STRT                                   25                                               

SYMBOL     TSTAMP    START_TST BOTTOM_TS END_TSTAM  MATCH_NUM VAR_MATCH                           PRICE                                               
---------- --------- --------- --------- --------- ---------- ------------------------------ ----------                                               
ACME       15-APR-11 14-APR-11 16-APR-11 18-APR-11          3 DOWN                                   14                                               
ACME       16-APR-11 14-APR-11 16-APR-11 18-APR-11          3 DOWN                                   12                                               
ACME       17-APR-11 14-APR-11 16-APR-11 18-APR-11          3 UP                                     14                                               
ACME       18-APR-11 14-APR-11 16-APR-11 18-APR-11          3 UP                                     24                                               

15 rows selected.
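
It’s worth noting what the FINAL keyword buys us here. Without it, measures under ALL ROWS PER MATCH default to RUNNING semantics, so LAST(DOWN.tstamp) and LAST(UP.tstamp) would report only the last DOWN or UP timestamp seen so far within the match (NULL before the first one) rather than repeating the final values on every row. A sketch of the same query with the defaults, for comparison:

SELECT *
FROM Ticker MATCH_RECOGNIZE (
      PARTITION BY symbol
      ORDER BY tstamp
      MEASURES  STRT.tstamp AS start_tstamp,
                LAST(DOWN.tstamp) AS bottom_tstamp,  -- running value, not final
                LAST(UP.tstamp) AS end_tstamp,       -- running value, not final
                MATCH_NUMBER() AS match_num,
                CLASSIFIER() AS var_match
      ALL ROWS PER MATCH
      AFTER MATCH SKIP TO LAST UP
      PATTERN (STRT DOWN+ UP+)
      DEFINE
         DOWN AS DOWN.price < PREV(DOWN.price),
         UP AS UP.price > PREV(UP.price)
      ) MR
ORDER BY MR.symbol, MR.match_num, MR.tstamp;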

Going even further let’s generate a report that displays where each row falls in the pattern (STRT, DOWN, UP), the total number of UP days in the pattern, the total number of days in each pattern, the day number within each pattern, the running price difference from the start of the pattern and the price at each row. Remember that the FINAL keyword computes a measure over the entire match, so every row of that match reports the final value:


SQL> 
SQL> --
SQL> -- This query aggregates data and returns
SQL> -- the number of days mapped to the UP pattern
SQL> -- (up_days), the total number of days each
SQL> -- for each pattern found, the day number within
SQL> -- each match (a running count that resets when
SQL> -- a new pattern match begins) and the running
SQL> -- price difference between each row in a match
SQL> --
SQL> SELECT *
  2  FROM Ticker MATCH_RECOGNIZE (
  3    PARTITION BY symbol
  4    ORDER BY tstamp
  5    MEASURES
  6  	 MATCH_NUMBER() AS match_num,
  7  	 CLASSIFIER() AS var_match,
  8  	 FINAL COUNT(UP.tstamp) AS up_days,
  9  	 FINAL COUNT(tstamp) AS total_days,
 10  	 RUNNING COUNT(tstamp) AS cnt_days,
 11  	 price - STRT.price AS price_dif
 12    ALL ROWS PER MATCH
 13    AFTER MATCH SKIP TO LAST UP
 14    PATTERN (STRT DOWN+ UP+)
 15    DEFINE
 16  	 DOWN AS DOWN.price < PREV(DOWN.price),
 17  	 UP AS UP.price > PREV(UP.price)
 18    ) MR
 19  ORDER BY MR.symbol, MR.match_num, MR.tstamp;

SYMBOL     TSTAMP     MATCH_NUM VAR_MATCH                         UP_DAYS TOTAL_DAYS   CNT_DAYS  PRICE_DIF      PRICE                                 
---------- --------- ---------- ------------------------------ ---------- ---------- ---------- ---------- ----------                                 
ACME       05-APR-11          1 STRT                                    4          6          1          0         25                                 
ACME       06-APR-11          1 DOWN                                    4          6          2        -13         12                                 
ACME       07-APR-11          1 UP                                      4          6          3        -10         15                                 
ACME       08-APR-11          1 UP                                      4          6          4         -5         20                                 
ACME       09-APR-11          1 UP                                      4          6          5         -1         24                                 
ACME       10-APR-11          1 UP                                      4          6          6          0         25                                 
ACME       10-APR-11          2 STRT                                    1          4          1          0         25                                 
ACME       11-APR-11          2 DOWN                                    1          4          2         -6         19                                 
ACME       12-APR-11          2 DOWN                                    1          4          3        -10         15                                 
ACME       13-APR-11          2 UP                                      1          4          4          0         25                                 
ACME       14-APR-11          3 STRT                                    2          5          1          0         25                                 

SYMBOL     TSTAMP     MATCH_NUM VAR_MATCH                         UP_DAYS TOTAL_DAYS   CNT_DAYS  PRICE_DIF      PRICE                                 
---------- --------- ---------- ------------------------------ ---------- ---------- ---------- ---------- ----------                                 
ACME       15-APR-11          3 DOWN                                    2          5          2        -11         14                                 
ACME       16-APR-11          3 DOWN                                    2          5          3        -13         12                                 
ACME       17-APR-11          3 UP                                      2          5          4        -11         14                                 
ACME       18-APR-11          3 UP                                      2          5          5         -1         24                                 

15 rows selected.

Remember that the original pattern to match was a V-shape, and there were three. If we change the query to look for W shapes, where the end of one V shape is the start of the next, we find only one such pattern in the data:


SQL> 
SQL> --
SQL> -- This query looks for W shapes, of which
SQL> -- there is one in the data
SQL> --
SQL> -- The query was changed to look for two
SQL> -- consecutive down/up patterns where the
SQL> -- ending value of the first down/up is the
SQL> -- starting value of the next down/up
SQL> -- pattern
SQL> --
SQL> SELECT *
  2  FROM Ticker MATCH_RECOGNIZE (
  3    PARTITION BY symbol
  4    ORDER BY tstamp
  5    MEASURES
  6  	 MATCH_NUMBER() AS match_num,
  7  	 CLASSIFIER()  AS  var_match,
  8  	 STRT.tstamp AS start_tstamp,
  9  	 FINAL LAST(UP.tstamp) AS end_tstamp
 10    ALL ROWS PER MATCH
 11    AFTER MATCH SKIP TO LAST UP
 12    PATTERN (STRT DOWN+ UP+ DOWN+ UP+)
 13    DEFINE
 14  	 DOWN AS DOWN.price < PREV(DOWN.price),
 15  	 UP AS UP.price > PREV(UP.price)
 16    ) MR
 17  ORDER BY MR.symbol, MR.match_num, MR.tstamp;

SYMBOL     TSTAMP     MATCH_NUM VAR_MATCH                      START_TST END_TSTAM      PRICE                                                         
---------- --------- ---------- ------------------------------ --------- --------- ----------                                                         
ACME       05-APR-11          1 STRT                           05-APR-11 13-APR-11         25                                                         
ACME       06-APR-11          1 DOWN                           05-APR-11 13-APR-11         12                                                         
ACME       07-APR-11          1 UP                             05-APR-11 13-APR-11         15                                                         
ACME       08-APR-11          1 UP                             05-APR-11 13-APR-11         20                                                         
ACME       09-APR-11          1 UP                             05-APR-11 13-APR-11         24                                                         
ACME       10-APR-11          1 UP                             05-APR-11 13-APR-11         25                                                         
ACME       11-APR-11          1 DOWN                           05-APR-11 13-APR-11         19                                                         
ACME       12-APR-11          1 DOWN                           05-APR-11 13-APR-11         15                                                         
ACME       13-APR-11          1 UP                             05-APR-11 13-APR-11         25                                                         

9 rows selected.

SQL> 

Why weren’t there two W-shaped patterns? The last V-shape didn’t start at the same point as the end of the prior V-shape so it didn’t make a second W-shaped pattern in the data.

MATCH_RECOGNIZE is a powerful and useful addition to Oracle 12.1.0.x as it enables you to define and report patterns in your data. It’s ideal for data warehousing applications and as an analytical tool to report repeating patterns in user data. Not everyone will benefit from MATCH_RECOGNIZE but, for those who do, it’s a welcome change to an already exceptional RDBMS.

Sometimes the impossible is possible.
