Oracle Tips and Tricks — David Fitzjarrell

October 14, 2008

Is Anybody There?

Filed under: General — dfitzjarrell @ 18:14

It’s interesting to see what people submit to google.com; lately this query came to my attention:

“how to see when the table is populated in oracle”

which, on the face of it, seems to be a rather silly question, with a somewhat obvious answer:

“select count(*) from [insert table name here];”

But, is that a truly reliable way to take the inventory of a table’s contents? Let’s investigate this further and see what might not be so obvious from a cursory glance.

I’ve discussed, in more than one post, how Oracle provides a consistent image of the data for the point in time when a query begins. And I’ve proven you can’t rely upon a select statement to provide a repeatable and usable result to manually generate sequential numbers. This same mechanism can wreak havoc on determining whether a table is populated or not because at the time you may be querying the table’s contents someone else may be deleting those same records that appear before your very eyes. Absent a commit in the ‘offending’ session you’ll never know that Mercantile Flabbenjammitz just obliterated all of that data because, through the miracle of the UNDO records, you can still see every byte that used to be stored in that table. Along comes a commit and — whoosh! — the data you saw just a moment earlier now vanishes.

So how DOES one know when a table is populated? Popular ‘wisdom’ provides several possible answers:

Table statistics
The High-Water Mark (HWM)
The aforementioned “select count(*) from …”
Querying DBA_EXTENTS

How many of those are reliable? Let’s find out.

Table statistics are a really good indicator of whether or not a table is populated, until, well, it isn’t populated any more:

SQL> create table blarzenflotz(
  2        glerm   number,
  3        snangul varchar2(40),
  4        borm    number
  5  );

Table created.

SQL>
SQL> begin
  2        for i in 1..1000000 loop
  3         insert into blarzenflotz
  4         values(i, 'Schnerkenporf vasul '||i, mod(i*47, 3));
  5        end loop;
  6
  7        commit;
  8
  9  end;
 10  /

PL/SQL procedure successfully completed.

SQL>
SQL> begin
  2        for i in 1..1000000 loop
  3         insert into blarzenflotz
  4         values(i, 'Schnerkenporf vasul '||i, mod(i*47, 3));
  5        end loop;
  6
  7        commit;
  8
  9  end;
 10  /

PL/SQL procedure successfully completed.

SQL>
SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BLARZENFLOTZ', estimate_percent => 100)

PL/SQL procedure successfully completed.

SQL>
SQL> select count(*)
  2  from blarzenflotz;

  COUNT(*)
----------
   2000000

SQL>
SQL> select num_rows
  2  from user_tables
  3  where table_name = 'BLARZENFLOTZ';

  NUM_ROWS
----------
   2000000

SQL>
SQL> delete from blarzenflotz;

2000000 rows deleted.

SQL>
SQL> select count(*)
  2  from blarzenflotz;

  COUNT(*)
----------
         0

SQL>
SQL> select num_rows
  2  from user_tables
  3  where table_name = 'BLARZENFLOTZ';

  NUM_ROWS
----------
   2000000

SQL>

Without gathering new statistics after such an operation (and that’s not likely to occur since statistics gathering is usually a scheduled task) you can see how deceiving they can be. There are no rows in the table, yet the statistics state otherwise. Hmmmm. A similar fate befalls you if someone else, in another session, is deleting rows from the table of interest and has yet to commit the deed. You’ll go happily along, ‘knowing’ there are two-million rows in the table, blissfully unaware that Morgenstern Slapdragon has just deleted every last one of them. Your ‘select count(*) from …’ query returns the expected 2000000 as a result, and old Morgenstern, running the same query, reports that nothing is left of that data. Sneak in a ‘commit’ and now both queries return what Morgenstern knew five minutes ago — the data in that table is gone.

Will querying DBA_EXTENTS tell you anything of worth with respect to the actual data population? I’m afraid not, as that view will only tell you that X extents have been allocated to the object/segment; there is no information in that view to report if those extents are actually populated.

The high-water mark won’t tell you anything, either, as that is the end of the blocks which have or did have data in them. The high-water mark is not reset by deletes, only by a truncate, so deleting all of the data in a table won’t move the HWM, and using that to indicate data population is as reliable as the statistics or the ‘select count(*) from …’ query in a multi-user environment. And even using ROWID is suspect, as those are retrieved from the UNDO blocks to return a consistent data set regardless of the state or number of uncommitted transactions against the table or tables queried.

So how DO you know when a table is populated? You play your cards and you take your chances. Odds are in your favor such tools as described here will provide a good idea whether a table is populated or not. Just remember that because of Oracle’s read consistency mechanism sometimes those results could be wrong.

But the odds of the results being correct are far better than any you can get in Las Vegas.

Advertisements

Leave a Comment »

No comments yet.

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Create a free website or blog at WordPress.com.

%d bloggers like this: