Using Google Chrome?

Download my
free Chrome Extension, Power Notes Searcher, to make searching for and evaluating SAP notes, much easier.

Recent Posts

Monday, August 01, 2011

SAP BI - ORA-04031 - Oracle 10.2.0.5 - Bug 9737897 - End Of.

Print Friendly and PDF

If you use SAP BI 7.0 on an Oracle 10.2.0.5 64bit database you may experience this issue.
Our BI system performs tha delta loads for FI data overnight.
These are large loads of records into the cubes.

The problem is Oracle error 04031:
----------------------------------------------------------------------------------
Mon Jul 11 00:46:26 GMT Daylight Time 2011
Errors in file j:\oracle\bwp\saptrace\usertrace\bwp_ora_3716.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable to allocate 4120 bytes of shared memory ("shared pool","select name,online$,contents...","Typecheck","kgghteInit")

ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable to allocate 4120 bytes of shared memory ("shared pool","select name,online$,contents...","Typecheck","kgghteInit")

ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable to allocate 4120 bytes of shared memory ("shared pool","select name,online$,contents...","Typecheck","kgghteInit")

Mon Jul 11 00:46:26 GMT Daylight Time 2011
Errors in file j:\oracle\bwp\saptrace\background\bwp_smon_8756.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable to allocate 4120 bytes of shared memory ("shared pool","select file#, block#, ts# fr...","Typecheck","seg:kggfaAllocSeg")

Mon Jul 11 00:46:27 GMT Daylight Time 2011
Errors in file j:\oracle\bwp\saptrace\background\bwp_smon_8756.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable to allocate 4120 bytes of shared memory ("shared pool","select file#, block#, ts# fr...","Typecheck","seg:kggfaAllocSeg")

----------------------------------------------------------------------------------


Once it hits this problem, it causes the rest of the chain to fail on the index rebuilds.
Sometimes the user queries also fail in the Bex Web front-end. The user sees a similar error as those shown above.

Using V$SGASTAT I was able to take snapshots of the Shared Pool usage via a Windows scheduled task.
I traced the problem to the memory allocation of "obj stat memo" in the Shared Pool of the database.
It's constantly growing, and although the "free memory" reduces slightly, it doesn't seem to correlate to the increase in "obj stat memo" usage.

So next step was looking on Oracle Support.
I found bug 9737897, which is valid *upto* 10.2.0.4. But it does not state that it has actually been fixed in 10.2.0.5.
The bug description says: "An SGA memory leak is possible for "obj stat memo" and "obj htab chun"
memory if a segment is dropped and recreated many times."

The bug note recommends querying x$ksolsfts (see here about this table http://yong321.freeshell.org/computer/x$table.html).
The following SQL is recommended by Oracle:

select distinct fts_objd, fts_objn from x$ksolsfts x
where not exists (select 1 from obj$ o
where o.dataobj#=x.fts_objd and o.obj#=x.fts_objn);

The note says that a high number of records returned could indicate that you are seeing the symptoms of the bug.
I was seeing over 160,000 records in a database that had been restarted 24 hours prior.

I produced a lovely Excel chart showing the snapshot times against the memory use of the "obj stat memo" component.
The chart showed significant increases (step ups) of approximately 10MB at specific points in time.
I was then able to search SM37 using the "extended" option, to show background jobs active during the memory "step ups" identified by the chart.
Generally I saw 10-20 jobs active at the time, but the correlation showed that more often than not, the "step ups" were at near enough the same time as a BI_PROCESS_INDEX job.

Looking in the background job log, I was able to identify the process chain and the type of index/load that was being performed.
The indexes in question were FI cube related. They were quite large indexes and they were bitmap indexes ("CREATE BITMAP INDEX...").
Each night these indexes got re-built after a load.
The re-build command was visible in the background job log and included "COMPUTE STATISTIC".

So I had successfully identified a segment(s) that was being dropped and re-created regularly.
This met with the Oracle bug description.

At the bottom of the bug note, it states:

"Note:
If you suffer "obj stat memo" memory leak problem in 10.2.0.5, you can try
_object_statistics = false as a workaround, or following init parameter to see
if it solves problem.
_disable_objstat_del_broadcast=false"

So what does SAP say about this.
Well the following SAP notes were/are available:
1120481 - ORA-4031 - memory allocation errors for object-level stat
1070102 - High memory allocation by 'obj stat memo'
869006 - Composite SAP note: ORA-04031

The only note of relevance is 1120481 which has a short reference to Oracle 10.2.0.4.
The other notes tend to point the read at just blindly increasing the Share Pool. I tried this, it didn't work when 900MB was used by "obj stat memo"!

So, stuck with only one possible solution (other than upgrading to 11g), I can confirm that setting "_object_statistics = false" (alter system set "_object_statistics"=false scope=spfile;) and bouncing the database solved the problem as it completely removes the use of the "obj stat memo" memory area (it's no longer visible in V$SGASTAT).

2 comments:

Anonymous said...

Thank you for your sharing

giri said...

Excellent explanation, this really helped me in resolving same kind of issue. Thanks for your posting.