Siebel OM crash RCA and reproduction scenario identified in 2 hours

Jul 20


One of our customers just realized they would have avoided between 2 weeks and 1.5 months of severe business impact (visible at the executive level) had they used germain CRT and germain APM together. (Thanks to our client for providing this feedback!) Oracle Support spent several weeks investigating the issue and, as of today (1.5 months later), has still not found the exact location of the crash. germain CRT found the location of the crash (at the eScript line level) and the user scenario that caused it in just 2 hours.

Here are some details on how this investigation went:

6/21/18 – CUSTOMER tells the GERMAIN TEAM: “Our team has delayed taken corrective action on a production bug for a week now trying to do RCA on their own. It is becoming “nuclear” with our business….we do not know the exact scenario of the crash. We are trying to work with Oracle support for the moment and they are telling us it has something to do with a problem with us destroying objects in our scripts”

6/22/18 – GERMAIN TEAM responds to the CUSTOMER: “germain CRT found the root-cause of the crash and tuning recommendations and here are below the germain CRT screenshots showing you how we did it”

6/27/18 – ORACLE tells the CUSTOMER:
“Oracle SR# 3-17748586591.
1. you are not nullifying the objects in right order.
2. There is no try catch finally block at Action BD Remarks Custom BC, we are not nullifying the objects.
3. Enable trace parameters at OM level as we cannot find the exact location of the crash”

6/27/18 – GERMAIN TEAM responds to CUSTOMER: “nullifying objects in the wrong order is true (as reported on 06/22/18) yet it is only one of the several causes. Germain CRT found a number of other issues that will lead to that Siebel OM crash. See screenshots below.”
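
For readers unfamiliar with the practice Oracle refers to in points 1 and 2, here is a minimal sketch of releasing objects in the right order inside a try/catch/finally block. It is written in TypeScript rather than Siebel eScript, and TrackedObject and updateRemarks are hypothetical stand-ins, not the customer's script or the Siebel API. It only assumes the usual convention that a child object (such as a business component) is released before its parent (the business object), i.e. in reverse order of creation.

```typescript
// Hypothetical stand-in for a Siebel object; NOT the Siebel API.
class TrackedObject {
  released = false;

  constructor(
    readonly name: string,
    readonly parent: TrackedObject | null,
    private log: string[],
  ) {}

  // Refuses to release a child whose parent is already gone,
  // which is the "wrong order" situation described above.
  release(): void {
    if (this.parent !== null && this.parent.released) {
      throw new Error(`${this.name} released after its parent ${this.parent.name}`);
    }
    this.released = true;
    this.log.push(this.name);
  }
}

// Hypothetical script body: create parent then child, do some work,
// and always release child then parent, even when the work fails.
function updateRemarks(log: string[], failMidway: boolean): void {
  let busObject: TrackedObject | null = null;
  let busComp: TrackedObject | null = null;
  try {
    busObject = new TrackedObject("BusObject", null, log);
    busComp = new TrackedObject("BusComp", busObject, log);
    if (failMidway) {
      throw new Error("simulated script error");
    }
  } catch (e) {
    log.push("handled: " + (e as Error).message);
  } finally {
    // Reverse order of creation: child first, parent last.
    if (busComp !== null) {
      busComp.release();
      busComp = null;
    }
    if (busObject !== null) {
      busObject.release();
      busObject = null;
    }
  }
}

const demoLog: string[] = [];
updateRemarks(demoLog, true);
console.log(demoLog.join(" -> "));
```

The point of the finally block is that the cleanup runs whether or not the script fails midway, so no object is left behind and the release order never depends on where the error happened.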

6/28/18 – CUSTOMER says to GERMAIN TEAM: “This is the First time I am feeling so much satisfied with Germain CRT/APM . Since from the beginning when I started using these Germain Tools, I felt it is not giving proper information and it was easy for me to search in the log instead of Germain APM. Now I totally agree with you that if we would have used this Germain CRT/APM we could have saved lot of time and effort !!”

07/25/18 – (as we are writing this article) Oracle is still looking for the exact location of the crash.

Using germain APM:
germain APM alerted the Siebel Ops team that a crash had occurred and provided the root-cause analysis on its dashboard. Here are bits and pieces of it for that crash:

First, germain APM identified the list of users affected by that OM crash

Then germain APM provided the user scenario that caused the OM crash, a scenario that can be used to reproduce it (our customer just had to “watch” what the user was doing via germain APM’s video recording/replay of user sessions, which works for any web app, Siebel included)

Then germain APM identified that this OM crash was not caused by a memory leak

Then germain APM analyzed the FDR file and reported the Siebel eScript that was executed before the OM crash, and placed a “tuning recommendations” light bulb that lets you view the germain CRT tuning recommendations for that Siebel eScript and/or repository object

Using germain CRT now:
(A click on that light bulb rerouted the user to germain CRT:)

germain CRT identified the 1st root-cause of the OM crash

The first root cause of that crash happens to be in a Siebel eScript:



…the germain CRT recommendations are crystal clear: the Siebel OM will crash!


Then there is the second root cause. Oracle did report, based on the core stack, that this one should be the root cause of the crash, but 1.5 months later, as we are writing this article, Oracle has still not found the exact eScript that is causing it. germain CRT, on the other hand, did find the exact eScript, the actual location and the variable!


And the actual Siebel eScript is right here (the proof!):


Then germain CRT found a 3rd root-cause causing this OM crash


And in that case, germain CRT even provides the actual Oracle Tech Note (these notes are typically meant to emphasize critical best practices that developers need to follow!)


Then germain CRT also found a 4th root-cause causing this OM crash

And once more here are the tuning recommendations provided by germain CRT:

All this analysis was performed within 2 hours (or within minutes once you know germain APM and germain CRT)! This is our customer smiling: