Monday, March 19, 2012

ORADEBUG How To


What is ORADEBUG:

ORADEBUG is a debugger tool, can be used for (tracing any session, dump DB memory structure, suspend/resume a session and the most useful use is to analyze the hang when an instance is hanging by creating a report shows the blocked/blocker sessions).

How to use ORADEBUG:

First login to SQLPLUS with prelim option AS SYSDBA. prelim option will allow you connect to SGA without opening a session, it cannot be used in an already connected session.

# sqlplus /nolog
SQL> set _prelim on
SQL> conn / as sysdba

or:
# sqlplus -prelim "/ as sysdba"


Second set your PID:

SQL> oradebug setmypid

To perform a DB cluster wide analysis:

SQL> oradebug setinst all

To make the tracefile size unlimited:

SQL> oradebug unlimit

To Run the Hang Analysis: <The most useful use for ORADEBUG>

SQL> oradebug -g def hanganalyze <level#>

-> This will create a logfile and will give you it's path.

-Available levels are:
10     Dump all processes (IGN state)
5      Level 4 + Dump all processes involved in wait chains (NLEAF state)
4*    Level 3 + Dump leaf nodes (blockers) in wait chains (LEAF,LEAF_NW,IGN_DMP state)
3      Level 2 + Dump only processes thought to be in a hang (IN_HANG state)
1-2  Only HANGANALYZE basic output, no process dump at all

*Level 4 and above are resource intensive + large output, which can impact the instance performance.
and thus Oracle recommends to use level 3: 
SQL> ORADEBUG -g def hanganalyze 3


How to read HANGANALYZE trace:

Generally: Read the log and search for these keywords "final blocker" if not found search for "blocked by" & "blocking".

Note: Open chains consist of processes waiting for a resource held by a blocker. These are not true hangs and can be resolved by killing the holding session and allowing the blocked processes to proceed. This can be useful when a process is holding a latch, frequent example is the library latch, and other processes are waiting for the latch.


Check section "State of ALL nodes" At the end of the HANGANALYZE log :

The data under this line means:
([nodenum]/cnode/sid/sess_srno/session/ospid/state/[adjlist]):

"cnode" column means: cluster node number
"sid" is the Oracle session ID!
"sess_srno" column means: SERIAL#
"ospid" The operating system process ID
"state" columns have the following vlaues:

Session with status "LEAF" or "LEAF_NW" means a session blocking others "Good candidate for kill" specially "LEAF".
Session with status "NLEAF" means waiting session and appear to be hanging. 
Session with status "IN_HANG" means there is a problem!
Session with status "IGN" or "IGN_DMP" means the session is idle.


Caution: Do not kill critical processes like SMON or PMON as that would terminate the instance.

Other Commands:

To Check the current trace file name:

SQL> oradebug tracefile_name

Flush any pending writes to the trace file and close it:

SQL> oradebug flush
SQL> oradebug close_trace

To trace a specific session with it's OS PID:

SQL> oradebug setospid

To trace a specific session with it's Oracle PID:

SQL> oradebug setorapid


Using ORADEBUG to freeze/suspend a session:

First you have to get the PID for that session and set ORADEBUG as it:
SQL> oradebug setospid 12518

*12518 is the PID for the session you want to kill

Second Freeze/Hang the session even it running in a middle of something it will freeze:
SQL> oradebug suspend

To UnFreeze/Resume the session to continue it's work:
SQL> oradebug resume


Trace Oracle Clusterware:

Trace CRS events:

SQL> oradebug dump crs 3

Trace CSS behaviour:

SQL> oradebug dump css 3

Trace OCR:

SQL> oradebug dump ocr 3

Remember:
-You can run oradebug in parallel using different sqlplus sessions.

-ORADEBUG utility is poorly documented by Oracle because the caveats with using this tool to avoid potential damage to the database when calling kernel functions.

Reference: http://mgogala.byethost5.com/oradebug.pdf

No comments: