It is possible to run afoul of an ORA-04031 error for no discernable reason (at least at the time the error presents itself). Being that ORA-04031 errors can have a multitude of causes and, as a result, a multitude of solutions it’s difficult to address the entire situation in one post. One particular situation, which presented two possible initial interpretations, will be discussed including the investigation and eventual solution. So, let’s dig in and see what brought forth such an investigation and where it led.
ORA-04031 errors are not usually written to the alert log; the exception to this is if a background database process is affected. From 10gR1 onwards trace files are written to the user_dump_dest; these files provide information for Oracle support in diagnosing these errors. And those same trace files can be a help to you in discovering why your database is suddenly throwing ORA-04031 errors.
Since 10gR1 Oracle has subdivided the shared pool into subheaps (actually since 9i but the algorithm has been expanded and improved from 10gR1 onwards), the number of which is determined by the SGA size and the number of processors available. An ‘extra’ subheap, subheap 0, may also be available should there be free shared pool space unallocated to any of the other subheaps (found in the shared pool reserve, usually configured to be 10% of the shared pool allocation); this subheap contains memory available for allocation to any of the active subheaps in the shared pool. Since subheap 0 does not appear in any of the SGA stat reports no unallocated memory exists; this is not to say these subheaps do not have free memory in them, just that no additional memory is available for subheap expansion.
The trace files report. in this instance, that subheap 7 is the subheap generating the ORA-04031 errors:
HEAP DUMP heap name="sga heap(7,0)" desc=380079e88
The subheap affected is the first number in the parenthesised list with the second number indicating the sub-subheap (and, since that is 0 the subheaps are not further subdivided). Using Tanel Poder’s sgastatx.sql script the overall subheap allocations can be displayed; notice that subheap 7 has a smaller allocation than the rest:
SQL> @sgastatx %
-- All allocations:
SUBPOOL BYTES MB
------------------------------ ---------- ----------
shared pool (1): 318767456 304
shared pool (2): 352326152 336
shared pool (3): 318767528 304
shared pool (4): 352321896 336
shared pool (5): 318767672 304
shared pool (6): 503317024 480
shared pool (7): 302046408 288.05
shared pool (Total): 2466314136 2352.06
8 rows selected.
-- Allocations matching "%":
old 15: AND LOWER(ksmssnam) LIKE LOWER('%&1%')
new 15: AND LOWER(ksmssnam) LIKE LOWER('%%%')
SUBPOOL NAME SUM(BYTES) MB
------------------------------ -------------------------- ---------- ----------
shared pool (1): free memory 57472568 54.81
db_block_hash_buckets 53327376 50.86
library cache 43470152 41.46
...
shared pool (2): free memory 77810608 74.21
db_block_hash_buckets 53327576 50.86
library cache 43899072 41.87
sql area 22086320 21.06
...
shared pool (4): free memory 92167560 87.9
db_block_hash_buckets 53327576 50.86
library cache 42164432 40.21
Checkpoint queue 21498432 20.5
...
shared pool (5): free memory 64019296 61.05
db_block_hash_buckets 53327888 50.86
library cache 37621344 35.88
...
shared pool (6): free memory 230536936 219.86
db_block_hash_buckets 57521680 54.86
library cache 39577624 37.74
sql area 22639112 21.59
...
shared pool (7): db_block_hash_buckets 53327376 50.86
library cache 40609216 38.73
free memory 39128512 37.32
sql area 24575688 23.44
Checkpoint queue 21760608 20.75
FileOpenBlock 18539544 17.68
ASM extent pointer array 15474656 14.76
ASH buffers 10485760 10
kglsim heap 5306112 5.06
CCursor 4617920 4.4
trace buffer 4210688 4.02
KCB Table Scan Buffer 4198400 4
PCursor 4048272 3.86
event statistics per sess 3804800 3.63
XDB Schema Cac 3671336 3.5
Sort Segment 3594240 3.43
private strands 3434496 3.28
Heap0: KGL 3413608 3.26
parameter table block 2479640 2.36
simulator hash buckets 2404928 2.29
transaction 2336176 2.23
PX subheap 2291080 2.18
sessions 2264240 2.16
dbwriter coalesce buffer 2105344 2.01
object queue 1864800 1.78
KTI-UNDO 1831024 1.75
enqueue 1336592 1.27
state objects 1313720 1.25
KGLS heap 1273704 1.21
kglsim object batch 1078056 1.03
row cache 1073184 1.02
FileIdentificatonBlock 1037624 .99
partitioning d 949920 .91
krbmror 946400 .9
procs: ksunfy 752928 .72
MTTR advisory 673920 .64
object queue hash buckets 673056 .64
buffer handles 609104 .58
db_files 599104 .57
kglsim hash table bkts 598016 .57
call 553608 .53
obj stat memo 426816 .41
type object de 371224 .35
DML lock 351232 .33
ksfqpar 332072 .32
...
1602 rows selected.
SQL>
The free memory in subheap 7 is reported as 37.32 MB where the free memory in the remaining 6 subheaps ranges from around 55MB to almost 220MB:
shared pool (1): free memory 57472568 54.81
shared pool (2): free memory 77810608 74.21
shared pool (3): free memory 59792512 57.02
shared pool (4): free memory 92167560 87.9
shared pool (5): free memory 64019296 61.05
shared pool (6): free memory 230536936 219.86
shared pool (7): free memory 39128512 37.32
Once memory is allocated to a subheap it cannot be reallocated to another subheap which may need it, thus subheap 7 cannot benefit from the ‘excess’ space in subheap 6 and throws the ORA-04031 error when it tries to expand after consuming the 37+ MB of free space it has available. If there is over 37 MB of free space left why is the allocation failing? Let’s look at V$SHARED_POOL_RESERVED and report on the total number of ORA-04031 errors generated since startup and the size of the last failing expansion attempt:
SQL> select request_failures "ORA-04031 Count",
2 last_failure_size "Size",
3 to_Char(sysdate, 'DD-MON-RRRR HH24:MI:SS') curr_dt
4 from v$shared_pool_reserved
5 /
ORA-04031 Count Size CURR_DT
--------------- ---------- --------------------
10871 4192 07-JUN-2010 14:17:34
[The data in these columns is available whether or not shared_pool_reserved_size is set, so should you have this parameter set to 0 you need not worry as the above query will still return the desired results.] Notice the last failing allocation was for just over 4 KB of memory; since ASMM is in use memory is allocated in granules of 4 MB or 16 MB in size (depending upon the size of the SGA) and in this case the granule size is 16 MB. And since the failing allocation is just over 4 KB it points to a lack of available resource rather than fragmentation of the shared pool.
If ASMM is in use (sga_target and sga_max_size are set) then increasing the value for sga_target may correct the situation by allocating more memory to the shared pool and the shared pool reserve. If manual shared memory management is in place then increasing the shared_pool_size and bouncing the database would possibly solve the problem.
Yet another option exists, which can be used in conjunction with increasing the shared pool size: setting an undocumented parameter, _kghdsidx_count, to a lower value to reduce the number of shared pool subheaps. Since this is not a dynamic parameter it will need to be set in the init.ora file or with the
alter system set “_kghdsidx_count”=[some number] scope=spfile;
statement and then ‘bounce’ the database. Of course one should not fiddle with undocumented parameters without Oracle Support’s blessings so if this situation ever affects you an SR should be created to ensure you’re performing correct and supported actions to resolve the issue. As a side note Oracle support would suggest such an adjustment in 9i databases when similar situations arose.
Oracle also states that Bugs 4467058, 5552515, and 6981690 can also throw ORA-04031 errors; these bugs still plague 10.2.0.4 (if you’re still on this release) as they are fixed in 11.1 and/or 11.2. Bug 7340448 affects 10.2.0.4 as well, increasing the shared pool memory growth from ‘create table … as select …’ statements and use of the REGEXP_LIKE functionality. These bugs are described in My Oracle Support document 396940.1.
ORA-04031 errors can be frustrating to investigate as what seems like the correct route to a solution sometimes ends up at a dead end. There are plenty of resources for this error, though (check google.com and see how many results come back), thus plenty of help is available for those who take the time to carefully examine all of the evidence presented to them via trace files and data dictionary views.
The answer is out there.
Maybe.
you said "Since ASMM is in use memory is allocated in granules of 4 MB or 16 MB in size (depending upon the size of the SGA) and in this case the granule size is 16 MB. And since the failing allocation is just over 4 KB it points to a lack of available resource rather than fragmentation of the shared pool. "Can you please explain the statement above. Shared pool subheap has 37MB and session makes over 4KB request to this pool with 16MB granule how did you come to conclusion it is because of resource but not fragmentation ? Doesn't subheap have enough room in this conditions ? thanks
Comment by Coskan Gundogar — December 30, 2010 @ 09:30 |
The slightly over 4KB allocation that fails occurs after the 37 MB of free space has been requested; the free space necessary for subheap expansion is the 37 MB of existing free space plus the 4.1 KB that the free space pool doesn't have. All of the trace files generated reported subheap 7 as the offender and a query of the shared pool fragmentation shows no sections of the shared pool that are less than at least 8 KB in size(almost all are in the MB range) thus the 37 MB of free space is, in this case, contiguous. That presumption is proven by increasing the sga_target by 1 GB, which adds space to subheap 0 (the unallocated free space pool) which then provides subheap 7 with the additional resource necessary for expansion. The revised numbers for subheap 7 after the sga_target increase and the subheap expansion:shared pool (7): free memory 60062456 57.28 Subheap 0, after allocation, was again exhausted providing no further memory to allocate to any of the 7 active subheaps (all active subheaps received additional memory from what remained after buffer pool and cache management).In an attempt to keep the post fairly brief and readable I apparently left out some important information from which the posted conclusions were drawn.
Comment by d_d_f — January 4, 2011 @ 04:14 |