Oracle Tips and Tricks — David Fitzjarrell

July 6, 2016

He’s Making A List

Filed under: General — dfitzjarrell @ 10:16

"But I suppose there's a lot to see everywhere, if only you keep your eyes open."
-- Norton Juster, The Phantom Tollbooth

Recently a tweet by Chris Saxon (@chrisrsaxon) with respect to SQL, the language and implementation, started a list of contributions which should cause people to think a bit differently about how the language works and how queries should be structured. The list, found here, is by no stretch of the imagination complete but it is a very good place to begin re-thinking how to write SQL. Let’s look at a few of the listed items a bit further (since the list is likekly to grow not all of the list members can be covered).

Franck Pachot start the list with the following items:


				1. SQL is a declarative language
				2. It is compiled to procedural operations
				3. It operates on row sets

So what, exactly, is a “declarative language”? According to the accepted definition it is:


	... non-procedural and very high-level (4th generation). This means the programmer specifies what
        needs to be done rather than how to do it.

Unlike languages such as C/C++, Pascal, FORTRAN, BASIC, Java, Perl and Python, the SQL language simply requires that you decide what you want done and, in the case of Oracle, the optimizer decides HOW to get it done. That doesn’t mean that you can’t have some input to influence the ‘how’, it simply means you don’t need to think about it. Certainly hints are available to ‘nudge’ the optimizer in the direction more experienced coders may think is ‘best’, but there is no requirement to use them.

His second item is a reference to PL/SQL, the procedural component to Oracle’s SQL*Plus, the ADA-based language extension where procedures, functions, triggers and anonymous procedural blocks are written. Yes, you can go happily along with many tasks and not need PL/SQL but it does come in handy when repeated tasks are necessary or when intermediate feedback (well, sort of) is desired. [Pipelined functions do provide immediate feedback during execution, but they are a special case.] Many PL/SQL examples have been posted here, so look around and they can be found.

His third item was addressed by Jonathan Lewis in an Oracle Forums response; let’s look at the problem that initiated the set-based response and Jonathan Lewis’ offering.

A query and part of the execution plan were posted with the poster asking how to get rid of the table scan. The table has a primary key index so, on the face of it, it is interesting that the primary key wasn’t used.

Looking at the definition for the ‘problem’ table (LF_HOTEL_TEMP) we find only two columns, both of which are in the primary key definition. The tables were created:


SQL> create table LF_HOTEL_TEMP
  2  (
  3    HOTEL_CODE VARCHAR2(4000),
  4    SERVICE_ID NUMBER(3)
  5  )
  6  /

Table created.

SQL>
SQL> create table LF_TS_ROOMTYPE_PROPERTIES
  2  (
  3    HOTEL_CODE    VARCHAR2(20),
  4    ROOM_TYPE     VARCHAR2(500),
  5    BOARD_TYPE    VARCHAR2(500),
  6    ROOM_AMT      FLOAT,
  7    SERVICE_ID    NUMBER,
  8    CURRENCY_CODE VARCHAR2(10)
  9  )
 10  /

Table created.

SQL>
SQL> create table LF_HB_ROOMTYPE_PROPERTIES
  2  (
  3    HOTEL_CODE    VARCHAR2(20),
  4    ROOM_TYPE     VARCHAR2(500),
  5    BOARD_TYPE    VARCHAR2(500),
  6    ROOM_AMT      FLOAT,
  7    SERVICE_ID    NUMBER,
  8    CURRENCY_CODE VARCHAR2(10)
  9  )
 10  /

Table created.

SQL>
SQL> create table LF_GTA_ROOMTYPE_PROPERTIES
  2  (
  3    HOTEL_CODE    VARCHAR2(20),
  4    ROOM_TYPE     VARCHAR2(500),
  5    BOARD_TYPE    VARCHAR2(500),
  6    ROOM_AMT      FLOAT,
  7    SERVICE_ID    NUMBER,
  8    CURRENCY_CODE VARCHAR2(10)
  9  )
 10  /

Table created.

SQL>

and then the primary key and additional indexes were created:


SQL> alter table lf_hotel_temp
  2    add constraint lf_hotel_temp_PK primary key (HOTEL_CODE,service_id)
  3  /

Table altered.

SQL>
SQL> create  index LF_hb_roomtype_prop_IDX on lf_hb_roomtype_properties (HOTEL_CODE)
  2  /

Index created.

SQL>
SQL> create  index LF_ts_roomtype_prop_IDX on lf_ts_roomtype_properties (HOTEL_CODE)
  2  /

Index created.

SQL>
SQL> create index LF_gta_roomtype_prop_IDX on lf_gta_roomtype_properties (HOTEL_CODE)
  2  /

Index created.

SQL>

The original poster reported 278,000 rows in the LF_HOTEL_TEMP table so data was generated to reproduce that nunber of rows. Next the remaining tables were populated so that none of the three remaining tables shared any data:


SQL> begin
  2  	     for i in 1..278000 loop
  3  		     insert into lf_hotel_temp
  4  		     values(i, mod(i,999)+1);
  5  	     end loop;
  6
  7  	     commit;
  8  end;
  9  /

PL/SQL procedure successfully completed.

SQL>
SQL> insert into lf_ts_roomtype_properties
  2    (hotel_code)
  3  select hotel_code from lf_hotel_temp
  4  where service_id  insert into lf_ts_roomtype_properties
  2    (hotel_code)
  3  select hotel_code from lf_hotel_temp
  4  where service_id between 313 and 642;

91740 rows created.

SQL> insert into lf_gta_roomtype_properties
  2    (hotel_code)
  3  select hotel_code from lf_hotel_temp
  4  where service_id between 271 and 823;

153743 rows created.

SQL> insert into lf_hb_roomtype_properties
  2    (hotel_code)
  3  select hotel_code from lf_hotel_temp
  4  where service_id between 571 and 999;

119262 rows created.

SQL> COMMIT;

Commit complete.

SQL>

Let’s look at the original query and its execution plan:


SQL> set autotrace on linesize 140
SQL>
SQL> -- original query
SQL> SELECT a.hotel_code
  2    FROM lf_hotel_temp a
  3  WHERE a.service_id = : p_service_id
  4  	    AND (NOT EXISTS (SELECT *
  5  	       FROM lf_ts_roomtype_properties b
  6  	      WHERE a.hotel_code = b.hotel_code)
  7  	     or NOT EXISTS (SELECT *
  8  	       FROM lf_gta_roomtype_properties b
  9  	      WHERE a.hotel_code = b.hotel_code)
 10  	    or	NOT EXISTS (SELECT *
 11  	       FROM lf_hb_roomtype_properties b
 12  	      WHERE a.hotel_code = b.hotel_code));

HOTEL_CODE
-------------------------------------------------------------------
1998
999
5994
...
243756
235764
238761

278 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 4111332730

-----------------------------------------------------------------------------------------------
| Id  | Operation          | Name                     | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                          |     3 |    33 |   512   (4)| 00:00:01 |
|*  1 |  FILTER            |                          |       |       |            |          |
|*  2 |   TABLE ACCESS FULL| LF_HOTEL_TEMP            |   278 |  3058 |    94  (16)| 00:00:01 |
|*  3 |   INDEX RANGE SCAN | LF_TS_ROOMTYPE_PROP_IDX  |     1 |     7 |     3   (0)| 00:00:01 |
|*  4 |   INDEX RANGE SCAN | LF_GTA_ROOMTYPE_PROP_IDX |     1 |     7 |     1   (0)| 00:00:01 |
|*  5 |   INDEX RANGE SCAN | LF_HB_ROOMTYPE_PROP_IDX  |     1 |     7 |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter( NOT EXISTS (SELECT 0 FROM "LF_TS_ROOMTYPE_PROPERTIES" "B" WHERE
              "B"."HOTEL_CODE"=:B1) OR  NOT EXISTS (SELECT 0 FROM "LF_GTA_ROOMTYPE_PROPERTIES" "B"
              WHERE "B"."HOTEL_CODE"=:B2) OR  NOT EXISTS (SELECT 0 FROM "LF_HB_ROOMTYPE_PROPERTIES"
              "B" WHERE "B"."HOTEL_CODE"=:B3))
   2 - filter("A"."SERVICE_ID"=TO_NUMBER(:P_SERVICE_ID))
   3 - access("B"."HOTEL_CODE"=:B1)
   4 - access("B"."HOTEL_CODE"=:B1)
   5 - access("B"."HOTEL_CODE"=:B1)



Statistics
----------------------------------------------------------
         11  recursive calls
          0  db block gets
       2077  consistent gets
        872  physical reads
          0  redo size
       4119  bytes sent via SQL*Net to client
        570  bytes received via SQL*Net from client
         20  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        278  rows processed


SQL>

All but LF_HOTEL_TEMP use the indexes to speed access to the data; this may be due to the fact that LF_HOTEL_TEMP is a two-column table and the priomary key index will be larger than the table since it contains the table data and a rowid for each row. The main idea illustrated by this problem involves being able to find a logically equivalent way of writing the SQL. Jonathan Lewis decided that set operations might produce a ‘better’ plan and re-wrote the query as shown below, producing a plan accessing LF_HOTEL_TEMP by the primary key index:


SQL> -- JL solution
SQL> var p_service_id number
SQL> exec : p_service_id := 1

PL/SQL procedure successfully completed.

SQL>
SQL> select  /*+ dynamic_sampling(0) */
  2  	     hotel_code
  3  from    lf_hotel_temp
  4  where   service_id = :p_service_id
  5  minus   (
  6  	     select  hotel_code
  7  	     from    lf_ts_roomtype_properties
  8  	     where   hotel_code is not null
  9  	     intersect
 10  	     select  hotel_code
 11  	     from    lf_gta_roomtype_properties
 12  	     where   hotel_code is not null
 13  	     intersect
 14  	     select  hotel_code
 15  	     from    lf_hb_roomtype_properties
 16  	     where   hotel_code is not null
 17  	     )
 18  ;

HOTEL_CODE
---------------------------------------------------------------
100899
101898
102897
...
999
9990
99900


278 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 775735246

--------------------------------------------------------------------------------------------------
| Id  | Operation             | Name                     | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |                          |     1 |  2159 |     4 (100)| 00:00:01 |
|   1 |  MINUS                |                          |       |       |            |          |
|   2 |   SORT UNIQUE NOSORT  |                          |     1 |  2015 |     1 (100)| 00:00:01 |
|*  3 |    INDEX FULL SCAN    | LF_HOTEL_TEMP_PK         |     1 |  2015 |     0   (0)| 00:00:01 |
|   4 |   INTERSECTION        |                          |       |       |            |          |
|   5 |    INTERSECTION       |                          |       |       |            |          |
|   6 |     SORT UNIQUE NOSORT|                          |     4 |    48 |     1 (100)| 00:00:01 |
|*  7 |      INDEX FULL SCAN  | LF_TS_ROOMTYPE_PROP_IDX  |     4 |    48 |     0   (0)| 00:00:01 |
|   8 |     SORT UNIQUE NOSORT|                          |     4 |    48 |     1 (100)| 00:00:01 |
|*  9 |      INDEX FULL SCAN  | LF_GTA_ROOMTYPE_PROP_IDX |     4 |    48 |     0   (0)| 00:00:01 |
|  10 |    SORT UNIQUE NOSORT |                          |     4 |    48 |     1 (100)| 00:00:01 |
|* 11 |     INDEX FULL SCAN   | LF_HB_ROOMTYPE_PROP_IDX  |     4 |    48 |     0   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("SERVICE_ID"=TO_NUMBER(:P_SERVICE_ID))
       filter("SERVICE_ID"=TO_NUMBER(:P_SERVICE_ID))
   7 - filter("HOTEL_CODE" IS NOT NULL)
   9 - filter("HOTEL_CODE" IS NOT NULL)
  11 - filter("HOTEL_CODE" IS NOT NULL)



Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       5996  consistent gets
       2416  physical reads
     234680  redo size
       4119  bytes sent via SQL*Net to client
        570  bytes received via SQL*Net from client
         20  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        278  rows processed

SQL>

Notice that using set-based processing improved the execution plan. Because of the first item on this list it’s easy to forget that SQL is a set-based implementation; sometimes it takes ‘interesting’ problems to lead us down the set-based pathway.

As the tweet thread progressed Jeff Smith (@thatjeffsmith) offered additional items that, not specific to SQL, are good to remember nonetheless:


				4. getting results back doesn't mean your code is right.
				5. learning never stops

Number 4 is an important lesson to remember; just because you get data doesn’t necessarily mean that data is right. Many times queries are written and run only to discover that the join conditions aren’t correct or are missing entirely, making the results useless. Always think through the process, and, if possible, manually generate a few rows to know what SHOULD be returned. Comparing those to the query results should show if the logic in the query is correct.

That learning never stops is one reason blogs are written and read; there’s always something new, different, or changed that makes ‘resting on one’s laurels’ nearly impossible, especially in IT. A day doesn’t go by that I don’t learn something, and, to be honest, if that day ever comes I’ll consider it wasted.

There are more items in that list that I won’t mention here but just because they aren’t commented on doesn’t make them any less important. Read through the list (the link provided should bring up the latest chain of tweets) and ponder those responses. It’s time to think about SQL anew, especially if some of these comments weren’t in your idea of what SQL is and what it does. Perspectives change, and ideas along with them, so never stop learning. The more you know, the more you can grow.

Santa isn’t the only one who makes lists, you know.

Advertisements

Leave a Comment »

No comments yet.

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Blog at WordPress.com.

%d bloggers like this: