[mp2] Win32 apache2 "Restarting" underload

[prev] [thread] [next] [Date index for 2004/12/01]

From: Ben Mehling
Subject: [mp2] Win32 apache2 "Restarting" underload
Date: 00:55 on 01 Dec 2004
Thanks in advance for looking at this post.

We are running into some trouble porting an existing PERL app running
under an older cgi accelerator (Blue Titan's Velocigen) on IIS to run
under Apache2/mod_perl(1.99) on Win32.  We are testing on a Dell dual
P4 xeon(ht), 1GB ram, running win2k3. The DB server is on a separate
host, connecting via DBI/DBD::ODBC.

Our application consists of a handful of perl scripts (.cat files,
essentially .pl's) that call our own libraries (standard .pm files). 
The webapp runs fine under our install (Pre-compiled binary:
Perl-5.8-win32-bin-0.8.exe through perl.apache.org) until we place a
load on the server (using jMeter).

Under load we are seeing multiple Apache restarts in the error logs
(see below).  The error produced ("Parent: child process exited with
status 3221226324 -- Restarting.") when Apache dies led us to several
mail list threads discussing 'non thread safe' modules (loaded into
Apache) killing it in a similar way (with different status codes)...

We searched through our various modules in use (Apache, Perl 5.8,
mod_perl, Apache::DBI, DBI, DBD::ODBC, DBD::MySQL, Date::Calc,
Net::SMTP) to find if any weren't thread safe.  Everything looked
clean.  We couldn't find any reason to believe any of the modules
weren't thread safe.

Finally, we wrote a simple script that runs under mod_perl and
accesses the MS SQL server via DBI/DBD::ODBC performing a simple query
(returns 2 columns of 25 rows of data).  We loaded up the server using
jMeter to hit this simple script and Apache still dies under load.

Somewhat different results occur if Apache::DBI is enabled or not,
however the difference only appears to be the time it takes to kill
the server.

Overall we can't find any consistency about how long it takes to die
or why it's dying.  We may be including a known-bad module or we've
missed something obvious.  I hope someone on the list can point us in
the right direction as we are facing a deadline in the next few weeks.
 Thank you for your time.

- Ben

=== 

(I hope including links to public URLs is acceptable, this email is
already quite long.) Here's the relevant conf/version/startup files:

Apache/PERL version info (3KB):
http://projects.gsm.uci.edu/files/gsm/serverupgrade4/beast_config.txt
httpd.conf (36KB):
http://projects.gsm.uci.edu/files/gsm/serverupgrade4/httpd.conf.txt
perl.conf (1KB):
http://projects.gsm.uci.edu/files/gsm/serverupgrade4/perl.conf.txt
startup.pl (1KB):
http://projects.gsm.uci.edu/files/gsm/serverupgrade4/startup.pl.txt
test.cat (test script) (1KB):
http://projects.gsm.uci.edu/files/gsm/serverupgrade4/test.cat.txt

Here are the relevant snippets of testing data and error logs:

/tools/test.cat (jMeter: 50 threads, 1sec ramp-up, infinite loop):

 - with apache::dbi ENABLED in startup.pl
 - built in ms sql connection pooling disabled

start: 11:28
end: 11:32

Restart after ~10 seconds:

[Mon Nov 22 17:02:51 2004] [notice] Child 5560: Starting 250 worker threads.
[Tue Nov 23 11:28:18 2004] [error] Can't coerce array into hash at
(null) line 589.\n
[Tue Nov 23 11:28:18 2004] [error] Can't coerce array into hash at
(null) line 589.\n
[Tue Nov 23 11:28:18 2004] [error] Can't coerce array into hash at
(null) line 589.\n
[Tue Nov 23 11:28:22 2004] [notice] Parent: child process exited with
status 3221226324 -- Restarting.
[Tue Nov 23 11:28:23 2004] [notice] Parent: Created child process 4152
[Tue Nov 23 11:28:24 2004] [notice] Child 4152: Child process is running
[Tue Nov 23 11:28:25 2004] [info] Parent: Duplicating socket 220 and
sending it to child process 4152
[Tue Nov 23 11:28:25 2004] [notice] Child 4152: Acquired the start mutex.
[Tue Nov 23 11:28:25 2004] [notice] Child 4152: Starting 250 worker threads.

Restart after 1 minute:

[Tue Nov 23 11:29:35 2004] [error] Can't coerce array into hash at
(null) line 589.\n
[Tue Nov 23 11:29:35 2004] [error] Can't coerce array into hash at
(null) line 589.\n
[Tue Nov 23 11:29:35 2004] [error] Can't coerce array into hash at
(null) line 589.\n
[Tue Nov 23 11:29:35 2004] [error] Can't coerce array into hash at
(null) line 589.\n
[Tue Nov 23 11:29:36 2004] [notice] Parent: child process exited with
status 3221226324 -- Restarting.
[Tue Nov 23 11:29:37 2004] [notice] Parent: Created child process 1160
[Tue Nov 23 11:29:39 2004] [notice] Child 1160: Child process is running
[Tue Nov 23 11:29:39 2004] [info] Parent: Duplicating socket 220 and
sending it to child process 1160
[Tue Nov 23 11:29:39 2004] [notice] Child 1160: Acquired the start mutex.
[Tue Nov 23 11:29:39 2004] [notice] Child 1160: Starting 250 worker threads.

Restarts after 3 minutes:

[Tue Nov 23 11:30:47 2004] [error] Can't coerce array into hash at
(null) line 589.\n
[Tue Nov 23 11:30:47 2004] [error] Can't coerce array into hash at
(null) line 589.\n
[Tue Nov 23 11:30:48 2004] [notice] Parent: child process exited with
status 3221226324 -- Restarting.
[Tue Nov 23 11:30:49 2004] [notice] Parent: Created child process 4464
[Tue Nov 23 11:30:51 2004] [notice] Child 4464: Child process is running
[Tue Nov 23 11:30:51 2004] [info] Parent: Duplicating socket 220 and
sending it to child process 4464
[Tue Nov 23 11:30:51 2004] [notice] Child 4464: Acquired the start mutex.
[Tue Nov 23 11:30:51 2004] [notice] Child 4464: Starting 250 worker threads.
[Tue Nov 23 11:30:54 2004] [notice] Parent: child process exited with
status 3221226324 -- Restarting.
[Tue Nov 23 11:30:55 2004] [notice] Parent: Created child process 5168
[Tue Nov 23 11:30:56 2004] [notice] Child 5168: Child process is running
[Tue Nov 23 11:30:56 2004] [info] Parent: Duplicating socket 220 and
sending it to child process 5168
[Tue Nov 23 11:30:56 2004] [notice] Child 5168: Acquired the start mutex.
[Tue Nov 23 11:30:56 2004] [notice] Child 5168: Starting 250 worker threads.
[Tue Nov 23 11:31:00 2004] [notice] Parent: child process exited with
status 3221226324 -- Restarting.
[Tue Nov 23 11:31:01 2004] [notice] Parent: Created child process 2032
[Tue Nov 23 11:31:02 2004] [notice] Child 2032: Child process is running
[Tue Nov 23 11:31:02 2004] [info] Parent: Duplicating socket 220 and
sending it to child process 2032
[Tue Nov 23 11:31:02 2004] [notice] Child 2032: Acquired the start mutex.
[Tue Nov 23 11:31:02 2004] [notice] Child 2032: Starting 250 worker threads.

/tools/test.cat (jMeter: 50 threads, 1sec ramp-up, infinite loop)

 - with apache::dbi DISABLED in startup.pl
 - built in ms sql connection pooling disabled

start: 11:35
end: 11:48

Everything went well for over 10 minutes (111688 requests, rate of
138.6/sec). Control-reloading the page was pretty snappy too. Then
apache restarts a bunch of times and jmeter gets confused, stops
testing:
error log:

[Tue Nov 23 11:34:37 2004] [notice] Child 1868: Starting 250 worker threads.
[Tue Nov 23 11:45:41 2004] [notice] Parent: child process exited with
status 3221226324 -- Restarting.
[Tue Nov 23 11:45:41 2004] [notice] Parent: Created child process 4484
[Tue Nov 23 11:45:42 2004] [notice] Child 4484: Child process is running
[Tue Nov 23 11:45:42 2004] [info] Parent: Duplicating socket 220 and
sending it to child process 4484
[Tue Nov 23 11:45:42 2004] [notice] Child 4484: Acquired the start mutex.
[Tue Nov 23 11:45:42 2004] [notice] Child 4484: Starting 250 worker threads.
[Tue Nov 23 11:45:45 2004] [notice] Parent: child process exited with
status 3221226324 -- Restarting.
[Tue Nov 23 11:45:46 2004] [notice] Parent: Created child process 2336
[Tue Nov 23 11:45:46 2004] [notice] Child 2336: Child process is running
[Tue Nov 23 11:45:46 2004] [info] Parent: Duplicating socket 220 and
sending it to child process 2336
[Tue Nov 23 11:45:46 2004] [notice] Child 2336: Acquired the start mutex.
[Tue Nov 23 11:45:46 2004] [notice] Child 2336: Starting 250 worker threads.
[Tue Nov 23 11:45:50 2004] [notice] Parent: child process exited with
status 3221226324 -- Restarting.
[Tue Nov 23 11:45:50 2004] [notice] Parent: Created child process 5364
[Tue Nov 23 11:45:51 2004] [notice] Child 5364: Child process is running
[Tue Nov 23 11:45:51 2004] [info] Parent: Duplicating socket 220 and
sending it to child process 5364
[Tue Nov 23 11:45:51 2004] [notice] Child 5364: Acquired the start mutex.
[Tue Nov 23 11:45:51 2004] [notice] Child 5364: Starting 250 worker threads.
[Tue Nov 23 11:45:55 2004] [notice] Parent: child process exited with
status 3221226324 -- Restarting.
[Tue Nov 23 11:45:55 2004] [notice] Parent: Created child process 4004
[Tue Nov 23 11:45:56 2004] [notice] Child 4004: Child process is running
[Tue Nov 23 11:45:56 2004] [info] Parent: Duplicating socket 220 and
sending it to child process 4004
[Tue Nov 23 11:45:56 2004] [notice] Child 4004: Acquired the start mutex.
[Tue Nov 23 11:45:56 2004] [notice] Child 4004: Starting 250 worker threads.
[Tue Nov 23 11:47:04 2004] [error] [client 128.200.186.240] request
failed: error reading the headers
[Tue Nov 23 11:48:16 2004] [error] [client 128.200.186.240] request
failed: error reading the headers

        -- 
        Report problems: http://perl.apache.org/bugs/
Mail list info: http://perl.apache.org/maillist/modperl.html
List etiquette: http://perl.apache.org/maillist/email-etiquette.html

[mp2] Win32 apache2 "Restarting" underload
Ben Mehling 00:55 on 01 Dec 2004

Generated at 11:26 on 21 Dec 2004 by mariachi v0.52