After dozens of experiments, I finally arrived at the following sequence of gdb actions as the most universal and reliable one.
Good case: we have access to the client's libraries
- Check the gdb version. It should be 6.x; the later, the better.
- Create a directory D inside your working directory
- Create a D/gdbrc file containing the *full* path to that directory:
set solib-absolute-prefix /home/dms/Sept12/12_09_2008_20_00_node4/D
Note: set substitute-path doesn't work here because gdb applies it to source files only.
- symlink the appropriate java binary as D/java
- run
gdb -x D/gdbrc D/java core
- type
info shared
You will see something like:
(gdb) info shared
From        To          Syms Read   Shared Object Library
                        No          /lib/tls/libpthread.so.0
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
0x00a05bb0  0x00a068c4  Yes         /home/dms/Sept12/D/lib/libdl.so.2
...
0x008c9c00  0x009b9800  Yes         /home/dms/Sept12/D/lib/tls/libc.so.6
                        No          /lib/libnsl.so.1
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- leave gdb and copy or link the missing libraries under D.
In my case:
D/lib/tls/libpthread.so.0
D/lib/libnsl.so.1
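Spotting the unresolved libraries by eye gets tedious when the list is long. Below is a minimal sketch of a filter that prints only the paths gdb could not read. The function name missing_libs is my own, and it assumes the `info shared` layout shown above, where the path is the last column.

```shell
#!/bin/sh
# missing_libs: read `info shared` output on stdin and print the path of
# every library whose "Syms Read" column says No.
# Assumption: the library path is the last whitespace-separated field.
missing_libs() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "No") { print $NF; break }
    }'
}

# Possible use, with the same gdb invocation style as in this post:
#   gdb -batch -x D/gdbrc --eval "info shared" D/java core 2> /dev/null | missing_libs
```

Each printed path P then has to be copied or linked as D followed by P, so that gdb finds it under the prefix.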
To gather all the libraries needed to open a core dump on another machine, run:
gdb -batch --eval "info shared" D/java core 2> /dev/null |\
sed -n -e 's/^.*Yes[^\/]*\//\//p' -e 's/^.*No[^\/]*\//\//p' > filelist
on your own machine, and then:
cat filelist | zip zipme.zip -@
on the client's machine.
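Before zipping on the client's machine, it may be worth checking that every path in filelist actually exists there. A minimal sketch, assuming filelist holds one absolute path per line; the function name check_filelist is hypothetical:

```shell
#!/bin/sh
# check_filelist FILE: print every path listed in FILE that is not a
# readable file on this machine. Prints nothing if all are present.
check_filelist() {
    while IFS= read -r path; do
        [ -r "$path" ] || printf 'missing: %s\n' "$path"
    done < "$1"
}

# Possible use on the client's machine:
#   check_filelist filelist
```

An empty result means the zip step should pick up every library gdb listed.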
Bad case: we don't have access to the original libraries. We can still restore the JVM part of the stack trace manually.
To do it:
- Run some Java app (e.g. Java2Demo.jar) with exactly the same JDK version and the same JVM part of the command line.
- Kill it with kill -BUS to get a core.
- Open the core with gdb and check whether the upper part of the stack trace matches the customer's:
#0  0xffffe424 in __kernel_vsyscall ()
#1  0xb76f96e0 in raise () from /lib/libc.so.6
#2  0xb76faf15 in abort () from /lib/libc.so.6
#3  0xb70abbaf in os::abort(bool)
#4  0xb71de555 in VMError::report_and_die()
#5  0xb70b257c in JVM_handle_linux_signal ()
#6  0xb70ae7a4 in signalHandler(int, siginfo*, void*) ()
#7  <signal handler called>
Even without symbols, you should have exactly 6 entries before <signal handler called>, and os::abort should come right after the libc abort.
- Type info shared and get the addresses where the JVM is loaded:
0xb6bf2bd0 0xb71fe250 Yes (*) /opt/jdk1.6.0_18/jre/lib/i386/server/libjvm.so
- Calculate the offset of os::abort: 0xb70abbaf - 0xb6bf2bd0 = 0x4b8fdf
- Calculate the JVM size: 0xb71fe250 - 0xb6bf2bd0 = 0x60b680
- Go to the customer's core and calculate the JVM start and end addresses:
<os::abort> - 0x4b8fdf = N (start), N + 0x60b680 (end)
- Calculate the offset between your JVM and the customer's JVM:
<os::abort> - 0xb70abbaf (os::abort from my core; alternatively, compare the two JVM start addresses)
- Take each address from the customer's stack trace, check whether it falls within that range, and apply the difference
- Go to your coredump and type
info symbol <recalculated_address>
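The address arithmetic from the steps above can be sketched in plain shell. The three MY_* values are the ones quoted from my core. The helper names hex and translate are my own, and the customer addresses used in the example call below are purely hypothetical.

```shell
#!/bin/sh
# Addresses from my own core, as quoted in the steps above.
MY_JVM_START=0xb6bf2bd0
MY_JVM_END=0xb71fe250
MY_OS_ABORT=0xb70abbaf

# Evaluate an arithmetic expression and print the result in hex.
hex() { printf '0x%x\n' "$(( $1 ))"; }

# translate CU_ADDR CU_OS_ABORT
# Map an address from the customer's trace onto my own core, given the
# address of the os::abort frame in the customer's trace.
translate() {
    cu_addr=$1
    cu_os_abort=$2
    # Customer JVM start: N = <os::abort> - offset of os::abort.
    cu_start=$(( $cu_os_abort - ($MY_OS_ABORT - $MY_JVM_START) ))
    # Customer JVM end: N + JVM size.
    cu_end=$(( $cu_start + ($MY_JVM_END - $MY_JVM_START) ))
    if [ "$(( $cu_addr >= $cu_start && $cu_addr < $cu_end ))" -ne 1 ]; then
        echo "$cu_addr is outside the JVM range" >&2
        return 1
    fi
    # Apply the difference between the customer's JVM and mine.
    hex "$cu_addr - ($cu_os_abort - $MY_OS_ABORT)"
}

hex "$MY_OS_ABORT - $MY_JVM_START"   # offset of os::abort: 0x4b8fdf
hex "$MY_JVM_END - $MY_JVM_START"    # JVM size: 0x60b680
```

For instance, if the customer's os::abort frame were at 0xb7123baf (a made-up value), then `translate 0xb6f0a1c4 0xb7123baf` would print 0xb6e921c4, which is the address to look up in your own core.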