When your application dies with the "Segmentation fault" error
(generated by the default SIGSEGV signal handler) and generates a
core file, you can analyze the core file using gdb or a similar
debugger to find out what caused the segmentation fault (or
segfault).
21.6.1. Getting Ready to Debug
To debug the
core file, you may need to recompile Perl and
mod_perl so that their executables contain debugging symbols. Usually
you have to recompile only mod_perl, but if the
core dump happens in the
libperl.so library and you want to see the whole
backtrace, you will probably want to recompile Perl as well.
For example, sometimes people send this kind of backtrace to the
mod_perl list:
#0 0x40448aa2 in ?? ( )
#1 0x40448ac9 in ?? ( )
#2 0x40448bd1 in ?? ( )
#3 0x4011d5d4 in ?? ( )
#4 0x400fb439 in ?? ( )
#5 0x400a6288 in ?? ( )
#6 0x400a5e34 in ?? ( )
This kind of trace is absolutely useless, since you cannot tell where
the problem happens from just looking at machine addresses. To
preserve the debug symbols and get a meaningful backtrace, recompile
Perl with
-DDEBUGGING during the
./Configure stage (or with
-Doptimize="-g", which, in addition to adding
the -DDEBUGGING option, adds the
-g option, which allows you to debug the Perl
interpreter itself).
After recompiling Perl,
recompile mod_perl with
PERL_DEBUG=1 during the perl
Makefile.PL stage. Building mod_perl with
PERL_DEBUG=1 will:
Add -g to EXTRA_CFLAGS, passed to your C compiler during compilation.
Turn on the PERL_TRACE option.
Set PERL_DESTRUCT_LEVEL=2.
Link against libperld if -e $Config{archlibexp}/CORE/libperld$Config{lib_ext} (i.e., if you've compiled perl with -DDEBUGGING).
During make install, Apache strips all the
debugging symbols. To prevent this, you should use the Apache
--without-execstrip ./configure option. So if you configure Apache
via mod_perl, pass this option through to Apache's ./configure, for
example with the APACI_ARGS argument to perl Makefile.PL.
Alternatively, you can copy the unstripped binary manually. For
example, we copied the unstripped httpd binary out of the build
directory to give us an Apache binary called httpd_perl that
contains debugging symbols.
21.6.2. Creating a Faulty Package
The next stage is to create a package that
aborts abnormally with a segfault, so you will be able to reproduce
the problem and exercise the debugging technique explained here.
Luckily, you can download
Debug::DumpCore from CPAN, which does a very simple
thing—it segfaults when called as:
use Debug::DumpCore;
Debug::DumpCore::segv( );
Debug::DumpCore::segv( ) calls a function, which
calls another function, which dereferences a NULL
pointer, which causes the segfault:
int *p;
p = NULL;
printf("%d", *p); // cause a segfault
For those unfamiliar with C programming, p is a
pointer to a segment of memory. Setting it to NULL
ensures that we try to read from a segment of memory to which the
operating system does not allow us access, so of course dereferencing
the NULL pointer through *p
causes a segmentation fault. And that's what we
want.
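To see the same failure outside Perl, here is a minimal C sketch, assuming a POSIX system. It is not the Debug::DumpCore source: the function names crash_now_for_real and crash_now are borrowed from the backtraces shown later in this section, their bodies are simplified, and crash_in_child is a hypothetical helper. The crash happens in a forked child, so the calling program survives and can report which signal killed the child.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Innermost frame: read through a NULL pointer. The pointer variable is
 * volatile so the compiler cannot optimize the faulty load away. */
static int crash_now_for_real(void)
{
    int *volatile p = NULL;
    return *p;                  /* raises SIGSEGV */
}

/* One more frame, so that a backtrace has something to show. */
static int crash_now(void)
{
    return crash_now_for_real() + 1;
}

/* Fork a child that crashes. Returns the number of the signal that
 * killed the child, 0 if it exited normally, or -1 on error. */
int crash_in_child(void)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        crash_now();
        _exit(0);               /* not reached */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Calling crash_in_child( ) from a main( ) function and comparing the result with SIGSEGV confirms that the NULL dereference is what terminated the child.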
Of course, you can use Perl's CORE::dump( ) function, which causes
a core dump, but you don't get the nice long trace provided by
Debug::DumpCore, which on purpose calls a few other functions
before causing a segfault.
21.6.3. Dumping the core File
Now let's dump the
core file from within the mod_perl server.
Sometimes the program aborts abnormally via the
SIGSEGV signal (a segfault), but no
core file is dumped. And without the
core file it's hard to find the
cause of the problem, unless you run the program inside
gdb or another debugger in the first place. In
order to get the core file, the application
must:
Have the same effective UID as the real UID (the same goes for GID).
This is the case with mod_perl unless you modify these settings in
the server configuration file.
Be running from a directory that is writable by the process at the
moment of the segmentation fault. Note that the program might change
its current directory during its run, so it's
possible that the core file will need to be
dumped in a different directory from the one from which the program
was started. For example, when mod_perl runs an
Apache::Registry script, it changes its directory
to the one in which the script's source is located.
Be started from a shell process with sufficient resource allocations
for the core file to be dumped. You can override
the default setting from within a shell script if the process is not
started manually. In addition, you can use
BSD::Resource to manipulate the setting from
within the code as well.
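The three conditions above can also be checked from inside a process. What follows is a minimal C sketch under the assumption of a POSIX system; core_dump_obstacles and the CORE_BLOCK_* constants are hypothetical names invented for this illustration, and the check is not exhaustive (other system settings may also influence whether and where a core file is written).

```c
#include <sys/resource.h>
#include <unistd.h>

#define CORE_BLOCK_IDS   1  /* real and effective UID or GID differ */
#define CORE_BLOCK_CWD   2  /* current directory is not writable */
#define CORE_BLOCK_LIMIT 4  /* core size limit is zero or unreadable */

/* Return a bitmask of the conditions that would prevent a core dump,
 * or 0 if none of the three checks finds a problem. */
int core_dump_obstacles(void)
{
    struct rlimit rl;
    int found = 0;

    if (getuid() != geteuid() || getgid() != getegid())
        found |= CORE_BLOCK_IDS;

    /* access() tests with the real IDs, which match the effective
     * ones whenever the first check passes. */
    if (access(".", W_OK) != 0)
        found |= CORE_BLOCK_CWD;

    if (getrlimit(RLIMIT_CORE, &rl) != 0 || rl.rlim_cur == 0)
        found |= CORE_BLOCK_LIMIT;

    return found;
}
```

Because the current directory can change during the run, a check like this is only meaningful at the point where it is called.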
You can use ulimit for a Bourne-style shell and
limit for a C-style shell to check and adjust
the resource allocation. For example, inside
bash, you may set the core
file size to unlimited with:
panic% ulimit -c unlimited
or for csh:
panic% limit coredumpsize unlimited
You can also set a finite limit. bash counts the core file size in
1,024-byte blocks, so an upper limit of 8 MB is set with:
panic% ulimit -c 8192
This ensures that a core file never grows beyond 8 MB; depending on
the platform, a larger dump is either truncated or not written at all.
You must make sure that you have enough disk space to create a big
core file (mod_perl core files tend to be a few MB in size).
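The shell commands adjust a per-process resource limit that a program can also change for itself. As a rough C sketch, assuming a POSIX system: getrlimit( ) and setrlimit( ) with RLIMIT_CORE are the standard calls, while allow_core_dumps and cap_core_size are hypothetical helper names. At this level the limit is expressed in bytes.

```c
#include <sys/resource.h>

/* Raise the soft core-file limit to the hard limit, the highest value
 * an unprivileged process may set. Returns 0 on success, -1 on error. */
int allow_core_dumps(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;
    rl.rlim_cur = rl.rlim_max;
    return setrlimit(RLIMIT_CORE, &rl);
}

/* Cap the core file at max_bytes without exceeding the hard limit.
 * Returns 0 on success, -1 on error. */
int cap_core_size(rlim_t max_bytes)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;
    if (rl.rlim_max != RLIM_INFINITY && max_bytes > rl.rlim_max)
        rl.rlim_cur = rl.rlim_max;  /* cannot go above the hard limit */
    else
        rl.rlim_cur = max_bytes;
    return setrlimit(RLIMIT_CORE, &rl);
}
```

A child process inherits these limits, so a small wrapper program could call allow_core_dumps( ) and then exec the server, which has the same effect as setting the limit in the shell that starts it.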
Note that when you are running the program under a debugger like
gdb, which traps the SIGSEGV
signal, the core file will not be dumped.
Instead, gdb allows you to examine the program
stack and other things without having the core
file.
First let's test that we get the
core file from the command line (under
tcsh):
Indeed, we can see that the core file was
dumped. Let's write a simple script that uses
Debug::DumpCore, as shown in Example 21-9.
Example 21-9. core_dump.pl
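use strict;
use Debug::DumpCore ( );
use Cwd ( );                  # nothing is imported, so we call Cwd::getcwd below
my $r = shift;                # the request object passed in by Apache::Registry
$r->send_http_header("text/plain");
my $dir = Cwd::getcwd( );
$r->print("The core should be found at $dir/core\n");
Debug::DumpCore::segv( );     # segfaults inside the XS code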
In this script we load the Debug::DumpCore and
Cwd modules. Then we acquire the request object
and send the HTTP headers. Now we come to the real part—we get
the current working directory, print out the location of the
core file that we are about to dump, and finally
call Debug::DumpCore::segv( ), which dumps the
core file.
Before we run the script we make sure that the shell sets the
core file size to be unlimited, start the server
in single-server mode as a non-root user, and
generate a request to the script:
panic% cd /home/httpd/httpd_perl/bin
panic% limit coredumpsize unlimited
panic% ./httpd_perl -X
# issue a request here
Segmentation fault (core dumped)
Our browser prints out:
The core should be found at /home/httpd/perl/core
And indeed the core file appears where we were
told it would (remember that Apache::Registry
scripts change their directory to the location of the script source).
21.6.4. Analyzing the core File
To see the backtrace, run gdb against the executable and the dumped
core file, then execute the where or bt commands:
(gdb) where
#0 0x4039f781 in crash_now_for_real (
suicide_message=0x403a0120 "Cannot stand this life anymore")
at DumpCore.xs:10
#1 0x4039f7a3 in crash_now (
suicide_message=0x403a0120 "Cannot stand this life anymore",
attempt_num=42) at DumpCore.xs:17
#2 0x4039f824 in XS_Debug__DumpCore_segv (cv=0x84ecda0)
at DumpCore.xs:26
#3 0x401261ec in Perl_pp_entersub ( )
from /usr/lib/perl5/5.6.1/i386-linux/CORE/libperl.so
#4 0x00000001 in ?? ( )
Notice that only the symbols from the
DumpCore.xs file are available (plus
Perl_pp_entersub from
libperl.so), since by default
Debug::DumpCore always compiles itself with the
-g flag. However, we cannot see the rest of the
trace, because our Perl and mod_perl libraries and Apache server were
built without the debug symbols. We need to recompile them all with
the debug symbols, as explained earlier in this chapter.
Then we repeat the process of starting the server, issuing a request,
and getting the core file, after which we run
gdb again against the executable and the dumped
core file:
(gdb) bt
#0 0x40448aa2 in crash_now_for_real (
suicide_message=0x404499e0 "Cannot stand this life anymore")
at DumpCore.xs:10
#1 0x40448ac9 in crash_now (
suicide_message=0x404499e0 "Cannot stand this life anymore",
attempt_num=42) at DumpCore.xs:17
#2 0x40448bd1 in XS_Debug__DumpCore_segv (my_perl=0x8133b60, cv=0x861d1fc)
at DumpCore.xs:26
#3 0x4011d5d4 in Perl_pp_entersub (my_perl=0x8133b60) at pp_hot.c:2773
#4 0x400fb439 in Perl_runops_debug (my_perl=0x8133b60) at dump.c:1398
#5 0x400a6288 in S_call_body (my_perl=0x8133b60, myop=0xbffff160, is_eval=0)
at perl.c:2045
#6 0x400a5e34 in Perl_call_sv (my_perl=0x8133b60, sv=0x85d696c, flags=4)
at perl.c:1963
#7 0x0808a6e3 in perl_call_handler (sv=0x85d696c, r=0x860bf54, args=0x0)
at mod_perl.c:1658
#8 0x080895f2 in perl_run_stacked_handlers (hook=0x8109c47 "PerlHandler",
r=0x860bf54, handlers=0x82e5c4c) at mod_perl.c:1371
#9 0x080864d8 in perl_handler (r=0x860bf54) at mod_perl.c:897
#10 0x080d2560 in ap_invoke_handler (r=0x860bf54) at http_config.c:517
#11 0x080e6796 in process_request_internal (r=0x860bf54) at http_request.c:1308
#12 0x080e67f6 in ap_process_request (r=0x860bf54) at http_request.c:1324
#13 0x080ddba2 in child_main (child_num_arg=0) at http_main.c:4595
#14 0x080ddd4a in make_child (s=0x8127ec4, slot=0, now=1028133659)
#15 0x080ddeb1 in startup_children (number_to_start=4) at http_main.c:4792
#16 0x080de4e6 in standalone_main (argc=2, argv=0xbffff514) at http_main.c:5100
#17 0x080ded04 in main (argc=2, argv=0xbffff514) at http_main.c:5448
#18 0x40215082 in __libc_start_main ( ) from /lib/i686/libc.so.6
Reading the trace from bottom to top, we can see that it starts with
Apache functions, moves on to the mod_perl and then Perl functions,
and finally calls functions from the
Debug::DumpCore package. At the top we can see the
crash_now_for_real( ) function, which was the one
that caused the segmentation fault; we can also see that the faulty
code was at line 10 of the DumpCore.xs file. And
indeed, if we look at that line number we can see the reason for the
segfault—the dereferencing of the NULL
pointer:
9: int *p = NULL;
10: printf("%d", *p); /* cause a segfault */
In our example, we knew which Perl script had caused the segmentation
fault. In the real world, it is likely that you'll
have only the core file, without any clue as to
which handler or script triggered it. The special
curinfo gdb macro can help.
Start the gdb debugger as before.
.gdbinit, the file with various useful
gdb macros, is located in the source tree of
mod_perl. We use the gdb source function to load these macros, and when
we run the curinfo macro we learn that the
core was dumped when
/home/httpd/perl/core_dump.pl was executing the
code at line 9.
These are the bits of information that are important in order to
reproduce and resolve a problem: the filename and line number of the
Perl code that triggered the fault (the call to
Debug::DumpCore::segv( ) in our case) and the
actual line where the segmentation fault occurred (the
printf("%d", *p) call in XS code). The former is
important for reproducing the problem, since it's
possible that if the same function were called from a different script
the problem wouldn't show up (not the case in our
example, where dereferencing a NULL pointer
will always cause a segmentation fault).
21.6.5. Extracting the Backtrace Automatically
With the help of Debug::FaultAutoBT, you can try to get the
backtrace extracted automatically, without any need for the
core file. As of this writing, this CPAN module
is very new and doesn't work on all platforms.
To use this module we simply add the following code in the startup
file:
use Debug::FaultAutoBT;
use File::Spec::Functions;
my $tmp_dir = File::Spec::Functions::tmpdir;
die "cannot find out a temp dir" if $tmp_dir eq '';
my $trace = Debug::FaultAutoBT->new(dir => "$tmp_dir");
$trace->ready( );
This code tries to automatically figure out the location of the
temporary directory, initializes the
Debug::FaultAutoBT object with it, and finally
uses the method ready( ) to set the signal
handler, which will attempt to automatically get the backtrace. Now
when we repeat the process of starting the server and issuing a
request, if we look at the error_log file, it
says:
SIGSEGV (Segmentation fault) in 29072
writing to the core file /tmp/core.backtrace.29072
And indeed the file /tmp/core.backtrace.29072
includes a backtrace similar to the one we extracted before, using
the core file.