Oracle Tips and Tricks — David Fitzjarrell

August 29, 2017

Parallel Reality

Filed under: Performance — dfitzjarrell @ 14:33

"Expect everything, I always say, and the unexpected never happens."
-- Norton Juster, The Phantom Tollbooth

The following question was recently posted in an Oracle forum:


hi Friends,
I see  this wait event latch: parallel query alloc buffer, when a job meant for doing some cleanup ran this query.
Why does this wait event come happen? , i searched google,MOS no exact hit for explanation of the exact same event.
Looking at query does it happen because of the incorrect use of parallel hint i.e. no object mentioned and also it
is not specified like parallel(5) .  It is not causing any big  issues, just want to know for my understanding sake.
 
Regd,
Sachin

There were also queries, results and parallel execution plans posted but those aren’t included here. The main issue with this question is the lack of information on the following event:


				latch: parallel query alloc buffer

Parallel query slaves and coordinators allocate message buffers; what may be surprising is the sheer number of these message buffers that can be allocated for a given parallel execution. There’s a forumla that can be used to calculate the number of buffers required based upon the parallel degree calculated or specified; let’s call that value p. Given that information the buffers required would be calculated as:


				p(p+1)*3

for non-RAC databases and:


				p(p+1)*4

for RAC installations. Let’s go through some calculations for non-RAC systems to see how many buffers could be required for various degrees of parallelism. We’ll start with something simple and consider a query executed with a parallel degree of 4; using the first formula provided we get:


				4(4+1) * 3 -> 4(5) * 3 -> 20 * 3 -> 60

so, for a relatively small-scale parallel execution Oracle will need 60 message buffers for the parallel query coordinator and its slaves. Let’s consider a parallel degree of 16; since the degree is 4 times the size of the original query (4) one might expect that the number of buffers might be 240 but that would be considerably less than Oracle would require:


				16(16+1) * 3 -> 16(17) * 3 -> 272 * 3 -> 816

By running at a parallel degree of 16 Oracle now requires 816 message buffers, 13.6 times the number of buffers for the parallel degree 4 execution. Since it’s not uncommon with today’s servers let’s run one more calculation, this time at a parallel degree of 128:


				126(128+1) * 3 -> 128(129) * 3 -> 16512 * 3 -> 45936

Now Oracle is allocating (or attempting to, anyway) 45,936 message buffers. Given that number of buffers to allocate it’s not difficult to understand why such a latch wait could be in the top waits for a given database. One factor affecting the length of time it takes Oracle to allocate such a large number of message buffers is where the PX message pool (‘PX msg pool’) is located; the shared pool or the large pool are common areas where this pool can be found. To find out where, in your database configuration, the ‘PX msg pool’ is found the following query can be run:


break on pool skip 1
column name format a32
 
select * from v$sgastat where pool = 'large pool' or name like '%PX%' or name like '%para%' 
order by pool, name;

The results might look like these:


FNERBLE @ uulooloo > break on pool skip 1
FNERBLE @ uulooloo > column name format a32
FNERBLE @ uulooloo >
FNERBLE @ uulooloo > select * from v$sgastat where pool = 'large pool' or name like '%PX%' 
                2  > or name like '%para%' order by pool, name;

POOL         NAME                                  BYTES     CON_ID
------------ -------------------------------- ---------- ----------
large pool   PX msg pool                        15728640          0
             free memory                        17825792          0

shared pool  PX QC deq stats                        1696          0
             PX QC msg stats                        3080          0
             PX list of chunk lists                 1088          0
             PX msg pool struct                       56          0
             PX scan rate stats                     1080          0
             PX server deq stats                    1696          0
             PX server msg stats                    3080          0
             PX subheap                           314824          0

POOL         NAME                                  BYTES     CON_ID
------------ -------------------------------- ---------- ----------
shared pool  PX subheap desc                         256          0
             parallel kcbibr                          32          0
             parallel kcbibr dbwr bitv                16          0
             parallel_max_servers                  29440          0
             param hash values                     15872          0
             parameter blocks                      31744          0
             parameter handle                     153624          0
             parameter string values              290632          0
             parameter table block               1174584          0
             parameter text value                   9576          0
             parameter value memory                 1376          0

21 rows selected.

FNERBLE @ uulooloo >

In my database the ‘PX msg pool’ is found in the large pool, a desirable location since it’s probably easier to allocate such buffers there than in the shared pool; using the shared pool could cause the ‘latch: parallel query alloc buffer’ wait to have wait times that could put it as a top-10 wait in an AWR or Statspack report due to contention for available resources.

Looking at another database, this time using version 11.2.0.3, with that same query produces:


FNERBLE @ uulooloo > break on pool skip 1
FNERBLE @ uulooloo > 
FNERBLE @ uulooloo > column name format a32
FNERBLE @ uulooloo > select * from v$sgastat where pool = 'large pool' or name like '%PX%' 
                2  > or name like '%para%' order by pool, name;



POOL         NAME                                  BYTES
------------ -------------------------------- ----------
large pool   PX msg pool                         8192016
             free memory                        16973808

shared pool  PX QC deq stats                        1696
             PX QC msg stats                        2904
             PX list of chunk lists                 1632
             PX msg pool struct                       56
             PX server deq stats                    1696
             PX server msg stats                    2904
             PX subheap                            95592
             parallel kcbibr dbwr bitv                 8

POOL         NAME                                  BYTES
------------ -------------------------------- ----------
shared pool  parallel_max_servers                   5472
             param hash values                     11008
             parameter blocks                      22008
             parameter handle                     292400
             parameter string values              197856
             parameter table block               2179872
             parameter text value                   7328
             parameter value memory                  712


18 rows selected.

FNERBLE @ uulooloo >

Again we see the ‘PX msg pool’ located in the large pool, but don’t take that for granted as earlier releases may put that pool in the shared pool which could create long waits on the ‘latch: parallel query alloc buffer’ event.

Let’s now go through some calculations for RAC systems to see how those numbers change. We’ll use the same parallel degrees found in the first set of calculations:


				4(4+1) * 4 -> 4(5) * 4 -> 20 * 4 -> 80

so, for a relatively small-scale parallel execution Oracle will need 80 message buffers for the parallel query coordinator and its slaves for a RAC configuration, 20 more than the non-RAC example. Let’s consider a parallel degree of 16:


				16(16+1) * 4 -> 16(17) * 4 -> 272 * 4 -> 1088

By running at a parallel degree of 16 in a RAC configuration Oracle now requires 1088 message buffers. As Exadata and ODA sytems are more common now than they were a few years ago a parallel degree of 128 is not unheard of, even with Auto DOP. For such a query execution we find that Oracle will need:


				126(128+1) * 4 -> 128(129) * 4 -> 16512 * 4 -> 66048

Now Oracle is allocating 66,048 message buffers, a large number, indeed. On ‘lesser’ systems (read that as ‘commodity servers’) such a buffer allocation could easily and quickly generate long event wait times for the ‘parallel query alloc buffer’ latch; such long waits could still be possible on Oracle’s engineered systems if the overall utilization is high.

It may not be a top-10 wait in your database but it’s good to be aware of what that latch wait event represents and why it can possibly generate long wait times, especially with a parallel degree of 16 or higher, because the number of message buffers Oracle requires can skyrocket.

And that should be expected.

Blog at WordPress.com.