Without question, the most popular Apache/Perl module is the Apache::Registry module. This module emulates the CGI environment, allowing programmers to write scripts that run under CGI or mod_perl without change. Existing CGI scripts may require some changes, simply because a CGI script has a very short lifetime of one HTTP request, allowing you to get away with ``quick and dirty'' scripting. Using mod_perl and Apache::Registry requires you to be more careful, but it also gives new meaning to the word ``quick''! Apache::Registry maintains a cache of compiled scripts; a script is compiled the first time it is accessed by a child server and again whenever the file is updated on disk.
Although it may be all you need, a speedy CGI replacement is only a small part of this project. Callback hooks are in place for each stage of a request. Apache-Perl modules may step in during the handler, header parser, uri translate, authentication, authorization, access, type check, fixup and logger stages of a request.
Mike Stok: http://www.stok.co.uk/~mike/mod_perl/ and http://www.tiac.net/users/stok/mod_perl/
See the lib/ directory for example modules and apache-modlist.html for a comprehensive list.
See the eg/ directory for example scripts.
You may load modules at server startup via:
PerlModule Apache::SSI SomeOther::Module
There is a limit of 10 PerlModule's. If you need more to be loaded when the server starts, use one PerlModule to pull in many, or use the PerlScript directive described below.
Optionally:
PerlScript perl-scripts/script_to_load_at_startup.pl
This script will be loaded when the server starts. See eg/startup.pl for an example to start with.
In an httpd.conf <Location /foo> or .htaccess you need:
PerlHandler sub_routine_name
This is the name of the subroutine to call to handle each request. e.g. in the PerlModule Apache::Registry this is ``Apache::Registry::handler''.
If PerlHandler is not a defined subroutine, mod_perl assumes it is a package name which defines a subroutine named ``handler''.
PerlHandler Apache::Registry
This would load Registry.pm (if it is not already loaded) and call its subroutine ``handler''.
There are several stages of a request where the Apache API allows a module to step in and do something. The Apache documentation will tell you all about those stages and what your modules can do. By default, these hooks are disabled at compile time; see the INSTALL document for information on enabling them. The following configuration directives take one argument, which is the name of the subroutine to call. If the value is not a subroutine name, mod_perl assumes it is a package name which implements a 'handler' subroutine.
PerlChildInitHandler (requires apache_1.3b1 or higher)
PerlPostReadRequestHandler (requires apache_1.3b1 or higher)
PerlInitHandler
PerlTransHandler
PerlHeaderParserHandler (requires apache_1.2.0 or higher)
PerlAccessHandler
PerlAuthenHandler
PerlAuthzHandler
PerlTypeHandler
PerlFixupHandler
PerlHandler
PerlLogHandler
PerlCleanupHandler
PerlChildExitHandler (requires apache_1.3b1 or higher)
Only ChildInit, ChildExit, PostReadRequest and Trans handlers are not allowed in .htaccess files.
%ENV is magical in that it inherits environment variables from the parent process and passes them on whenever a process spawns a child. However, with mod_perl we are already in the parent process that would normally set up the common environment variables before spawning a CGI process. Therefore, mod_perl must feed these variables to %ENV directly. Normally, this does not happen until the response stage of a request, when PerlHandler is called. If you wish to set variables that will be available before then, such as for a PerlAuthenHandler, you may use the PerlSetEnv configuration directive:
PerlSetEnv SomeKey SomeValue
The GATEWAY_INTERFACE environment variable is set to CGI-Perl/1.1 when running under mod_perl.
if(exists $ENV{MOD_PERL}) {
    #we're running under mod_perl
    ...
}
else {
    #we're NOT running under mod_perl
}
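The check above can be wrapped in a small helper so that a script shared between CGI and mod_perl asks the question in one place. This is only a sketch: the names running_under_mod_perl and runtime_label are made up for illustration and are not part of mod_perl.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the MOD_PERL environment variable exists.
sub running_under_mod_perl {
    return exists $ENV{MOD_PERL} ? 1 : 0;
}

# Hypothetical helper: a label for log messages, depending on the environment.
sub runtime_label {
    return running_under_mod_perl() ? "mod_perl" : "CGI or command line";
}

print "Running under: ", runtime_label(), "\n";
```

Run from the command line, this takes the second branch, because nothing has set MOD_PERL.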
Perl runs END blocks from its perl_run() function, which is called once each time the Perl program is executed, e.g. once per (mod_cgi) CGI script. However, mod_perl only calls perl_run() once, during server startup. Any END blocks encountered during main server startup, i.e. those pulled in by the PerlScript or by any PerlModule, are suspended and run at server shutdown, aka child_exit (requires apache 1.3b1+). Any END blocks that are encountered during compilation of Apache::Registry scripts are called after the script has finished running, including subsequent invocations when the script is cached in memory. All other END blocks encountered during other Perl*Handler callbacks, e.g. PerlChildInitHandler, will be suspended while the process is running and called during child_exit when the process is shutting down. Module authors may wish to use $r->register_cleanup as an alternative to END blocks if this behavior is not desirable.
Here I'm just running
% /usr/bin/perl -e '1 while 1'
  PID USERNAME PRI NICE  SIZE   RES STATE   TIME   WCPU    CPU COMMAND
10214 dougm     67    0  668K  212K run     0:04 71.55% 21.13% perl
Now with a few random modules:
% /usr/bin/perl -MDBI -MDBD::mSQL -MLWP::UserAgent -MFileHandle -MIO -MPOSIX -e '1 while 1'
10545 dougm 49 0 3732K 3340K run 0:05 54.59% 21.48% perl
Here's my httpd linked with libperl.a, not having served a single request:
10386 dougm 5 0 1032K 324K sleep 0:00 0.12% 0.11% httpd-a
You can reduce this if you configure perl 5.004+ with -Duseshrplib. Here's my httpd linked with libperl.sl, not having served a single request:
10393 dougm 5 0 476K 368K sleep 0:00 0.12% 0.10% httpd-s
Now, once the server starts receiving requests, the embedded interpreter will compile code for each 'require' file it has not seen yet, each new Apache::Registry subroutine that's compiled, along with whatever modules it's use'ing or require'ing. Not to mention AUTOLOADing. (Modules that you 'use' will be compiled when the server starts unless they are inside an eval block.) httpd will grow just as big as our /usr/bin/perl would, or a CGI process for that matter, it all depends on your setup.
Newer Perl versions also have other options to reduce runtime memory consumption. See Perl's INSTALL file for details on -DPACK_MALLOC and -DTWO_POT_OPTIMIZE. With these options, my httpd shrinks down ~150K.
The mod_perl INSTALL document explains how to build the Apache:: extensions as shared libraries (with 'perl Makefile.PL DYNAMIC=1'). This may save you some memory; however, it doesn't work on a few systems such as aix and unixware.
For me, once everything is compiled, the processes no longer grow, and I can live with the size at that point. For others, this size might be too big, or they might be using a module that leaks or have code of their own that leaks. In any case, the apache configuration directive 'MaxRequestsPerChild' is your best bet to keep the size down, but at the same time you'll be slowing things down when Apache::Registry scripts have to recompile. Tradeoffs...
TTY   PID USERNAME PRI NI  SIZE   RES STATE   TIME %WCPU  %CPU COMMAND
p4   5016 dougm    154 20 3808K 2636K sleep   0:01  9.62  4.07 httpd
Here's a freshly started httpd that has served one request for the same script using the CGI.pm function interface:
TTY   PID USERNAME PRI NI  SIZE   RES STATE   TIME %WCPU  %CPU COMMAND
p4   5036 dougm    154 20 3900K 2708K sleep   0:01  3.19  2.18 httpd
Now do the math: take that difference, figure in how many other scripts import the same functions and how many children you have running. It adds up!
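To make that arithmetic concrete, here is a small sketch using the SIZE columns from the two listings above. The number of children is an assumption chosen purely for illustration, and the helper name extra_kb is hypothetical; plug in your own figures.

```perl
use strict;
use warnings;

# SIZE column from the two listings above, in kilobytes.
my $size_first_kb    = 3808;   # first listing
my $size_function_kb = 3900;   # CGI.pm function interface

# Assumed for illustration only: number of httpd children.
my $children = 10;

# Hypothetical helper: extra memory across all children.
sub extra_kb {
    my ($kb, $count) = @_;
    return $kb * $count;
}

my $per_child_kb = $size_function_kb - $size_first_kb;
printf "%dK per child, %dK across %d children\n",
    $per_child_kb, extra_kb($per_child_kb, $children), $children;
```

With these inputs the difference is 92K per child, or 920K for ten children, and that is for a single script.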
use strict and use vars keep modules clean and reduce a bit of noise. However, use vars also creates aliases as the Exporter does, which eat up more space. When possible, try to use fully qualified names instead of use vars. Example:
package MyPackage;
use strict;
@MyPackage::ISA = qw(...);
$MyPackage::VERSION = "1.00";
vs.
package MyPackage;
use strict;
use vars qw(@ISA $VERSION);
@ISA = qw(...);
$VERSION = "1.00";
-w or -T. Since the command line is only parsed once, when the server starts, these switches are unavailable to mod_perl scripts. However, most command line arguments have a perl special variable equivalent. For example, the $^W variable corresponds to the -w switch. Consult perlvar for more details. The switch which enables taint checks does not have a special variable, so mod_perl provides the PerlTaintCheck directive to turn on taint checks. In httpd.conf, enable with:
PerlTaintCheck On
Now, any and all code compiled inside httpd will be checked.
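Returning to the $^W variable mentioned above, the following sketch shows it switching warnings on and off at run time, which is what a script has to do when it cannot pass -w. The helper name warnings_seen is made up for this example. The file deliberately omits "use warnings", because lexical warnings would take precedence over $^W.

```perl
use strict;
# Deliberately no "use warnings" here: lexical warnings would take
# precedence over $^W for code compiled in this file.

# Hypothetical helper: run $code with $^W set to $flag and return the
# warnings it produced.
sub warnings_seen {
    my ($flag, $code) = @_;
    my @seen;
    local $SIG{__WARN__} = sub { push @seen, @_ };
    local $^W = $flag;
    $code->();
    return @seen;
}

# Adding to an undefined value warns only when warnings are enabled.
my $sloppy = sub { my $undef; my $sum = $undef + 1; return $sum };

my @quiet = warnings_seen(0, $sloppy);
my @noisy = warnings_seen(1, $sloppy);
printf "without \$^W: %d warning(s), with \$^W: %d warning(s)\n",
    scalar @quiet, scalar @noisy;
```

Using local keeps the change confined to the call, so other code in the same long-lived process is not affected.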
#Apache::Registry script
use strict;
use vars qw($dbh);

$dbh ||= SomeDbPackage->connect(...);
Since $dbh is a global variable, it will not go out of scope, keeping the connection open for the lifetime of a server process, establishing it during the script's first request for that process.
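The effect of the ||= idiom can be seen outside of httpd with a stand-in for the database driver. In this sketch, FakeDb and handle_request are invented names: FakeDb only counts how often connect is called, and each call to handle_request plays the part of one request served by the same process.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a database driver; it only counts connections.
package FakeDb;
my $connections = 0;
sub connect     { my $class = shift; $connections++; return bless {}, $class }
sub connections { return $connections }

package main;

# Package global: it survives between calls, as it would between requests
# served by the same server process.
our $dbh;

sub handle_request {
    $dbh ||= FakeDb->connect;   # connects on the first call only
    return $dbh;
}

handle_request() for 1 .. 3;
print "connections opened: ", FakeDb->connections, "\n";
```

Three simulated requests open a single connection, since the second and third calls find $dbh already set.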
It's recommended that you use one of the Apache::* database connection wrappers. Currently for DBI users there is Apache::DBI and for Sybase users Apache::Sybase::DBlib. These modules hide the peculiar code example above. In addition, different scripts may share a connection, minimizing resource consumption. Example:
Example:
#httpd.conf has
# PerlModule Apache::DBI
#DBI scripts look exactly as they do under CGI
use strict;
my $dbh = DBI->connect(...);
Although $dbh shown here will go out of scope when the script ends, the Apache::DBI module's reference to it does not, keeping the connection open.
Perl*Handler directives can define any number of subroutines, e.g. (in config files):
PerlTransHandler OneTrans TwoTrans RedTrans BlueTrans
With the method Apache->push_handlers, callbacks can be added to the stack at runtime by mod_perl scripts.
Apache->push_handlers takes the callback hook name as its first argument and a subroutine name or reference as its second, e.g.:
Apache->push_handlers("PerlLogHandler", \&first_one);
$r->push_handlers("PerlLogHandler", sub { print STDERR "__ANON__ called\n"; return 0; });
After each request, this stack is cleared out.
All handlers will be called unless a handler returns a status other than OK or DECLINED; this needs to be considered more. Releases after apache-1.2 will have a DONE return code to signal termination of a stage, which Rob and I came up with a while back when first discussing the idea of stacked handlers. 2.0 won't come for quite some time, so mod_perl will most likely handle this before then.
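The rule that a stack keeps running until some handler returns a status other than OK or DECLINED can be simulated in plain Perl. This is a sketch of the rule only, not mod_perl's internal code: run_stack and the hash layout of each stack entry are assumptions made for the example, while the numeric values of OK and DECLINED are those of the Apache C API.

```perl
use strict;
use warnings;

# Status values as defined by the Apache C API.
use constant OK       => 0;
use constant DECLINED => -1;

# Hypothetical runner: call each handler in order; stop early on any
# status other than OK or DECLINED and report how far we got.
sub run_stack {
    my ($r, @handlers) = @_;
    my $status = DECLINED;
    my @called;
    for my $handler (@handlers) {
        push @called, $handler->{name};
        $status = $handler->{code}->($r);
        last unless $status == OK || $status == DECLINED;
    }
    return ($status, @called);
}

my @stack = (
    { name => 'first',  code => sub { OK } },
    { name => 'second', code => sub { 403 } },   # e.g. FORBIDDEN
    { name => 'third',  code => sub { OK } },
);

my ($status, @called) = run_stack({}, @stack);
print "status $status after calling: @called\n";
```

Here the third handler is never reached, because the second one returns 403.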
example uses:
CGI.pm maintains a global object for its plain function interface. Since the object is global, it does not go out of scope, so DESTROY is never called. CGI->new can call:
Apache->push_handlers("PerlCleanupHandler", \&CGI::_reset_globals);
This function will be called during the final stage of a request, refreshing CGI.pm's globals before the next request comes in.
Apache::DCELogin establishes a DCE login context which must exist for the lifetime of a request, so the DCE::Login object is stored in a global variable. Without stacked handlers, users must set
PerlCleanupHandler Apache::DCELogin::purge
in the configuration files to destroy the context. This is not ``user-friendly''. Now, Apache::DCELogin::handler can call:
Apache->push_handlers("PerlCleanupHandler", \&purge);
Persistent database connection modules such as Apache::DBI could push a PerlCleanupHandler handler that iterates over %Connected, refreshing connections or just checking that none have gone stale. Remember, by the time we get to PerlCleanupHandler, the client has what it wants and has gone away, so we can spend as much time as we want here without slowing down response time to the client.
PerlTransHandlers may decide, based on the URI or some other condition, whether or not to handle a request, e.g. Apache::MsqlProxy. Without stacked handlers, users must configure:
PerlTransHandler Apache::MsqlProxy::translate
PerlHandler Apache::MsqlProxy
PerlHandler is never actually invoked unless translate() sees that the request is a proxy request ($r->proxyreq). If it is a proxy request, translate() sets $r->handler(``perl-script''), and only then will PerlHandler handle the request. Now, users do not have to specify 'PerlHandler Apache::MsqlProxy'; the translate() function can set it with push_handlers().
Includes, footers, headers, etc., piecing together a document, imagine (no need for SSI parsing!):
PerlHandler My::Header Some::Body A::Footer
This was my first test:
#My.pm
package My;

sub header {
    my $r = shift;
    $r->content_type("text/plain");
    $r->send_http_header;
    $r->print("header text\n");
}
sub body   { shift->print("body text\n") }
sub footer { shift->print("footer text\n") }
1;
__END__

#in config
<Location /foo>
SetHandler "perl-script"
PerlHandler My::header My::body My::footer
</Location>
Parsing the output of another PerlHandler? This is a little more tricky, but consider:
<Location /foo>
SetHandler "perl-script"
PerlHandler OutputParser SomeApp
</Location>

<Location /bar>
SetHandler "perl-script"
PerlHandler OutputParser AnotherApp
</Location>
Now, OutputParser goes first, but it unties *STDOUT and re-ties it to its own package like so:
package OutputParser;

sub handler {
    my $r = shift;
    untie *STDOUT;
    tie *STDOUT => 'OutputParser', $r;
}

sub TIEHANDLE {
    my($class, $r) = @_;
    bless { r => $r}, $class;
}

sub PRINT {
    my $self = shift;
    for (@_) {
        #do whatever you want to $_
        $self->{r}->print($_ . "[insert stuff]");
    }
}

1;
__END__
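The tie mechanism that OutputParser relies on can be tried without a server. The sketch below ties STDOUT to a class that appends each printed item, followed by the same "[insert stuff]" marker, to a buffer instead of sending it to a request object. AppendingOut and filtered_output are names invented for this example.

```perl
use strict;
use warnings;

package AppendingOut;

# Tie constructor: remember where the processed output should go.
sub TIEHANDLE {
    my ($class, $buffer_ref) = @_;
    return bless { buffer => $buffer_ref }, $class;
}

# Called for every print on the tied handle.
sub PRINT {
    my $self = shift;
    ${ $self->{buffer} } .= $_ . "[insert stuff]" for @_;
    return 1;
}

package main;

# Hypothetical helper: capture everything $code prints to STDOUT,
# post-processed by PRINT.
sub filtered_output {
    my ($code) = @_;
    my $buffer = '';
    tie *STDOUT, 'AppendingOut', \$buffer;
    $code->();
    untie *STDOUT;
    return $buffer;
}

my $result = filtered_output(sub { print "body text\n" });
print $result;
```

The code passed to filtered_output uses an ordinary print and does not need to know that its output is being intercepted, which is the point of the stacked OutputParser arrangement.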
To build in this feature, configure with:
% perl Makefile.PL PERL_STACKED_HANDLERS=1 [PERL_FOO_HOOK=1,etc]
Another method 'Apache->can_stack_handlers' will return TRUE if mod_perl was configured with PERL_STACKED_HANDLERS=1, FALSE otherwise.
package My;
@ISA = qw(BaseClass);

sub handler ($$) {
    my($class, $r) = @_;
    ...;
}

package BaseClass;

sub method ($$) {
    my($class, $r) = @_;
    ...;
}

__END__
Configuration:
PerlHandler My
or
PerlHandler My->handler
Since the handler is invoked as a method, it may inherit from other classes:
PerlHandler My->method
In this case, the 'My' class inherits this method from 'BaseClass'.
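How a configured value such as 'My', 'My->handler' or 'My->method' ends up calling the right subroutine can be illustrated with ordinary Perl method resolution. This is a sketch of the idea and not mod_perl's own dispatch code: the dispatch function is hypothetical, and a plain string stands in for the request object.

```perl
use strict;
use warnings;

package BaseClass;
sub method ($$) {
    my ($class, $r) = @_;
    return "$class handled $r via BaseClass::method";
}

package My;
our @ISA = ('BaseClass');
sub handler ($$) {
    my ($class, $r) = @_;
    return "$class handled $r via My::handler";
}

package main;

# Hypothetical dispatcher: resolve a "Class->method" string (or a bare
# class name, which falls back to "handler") and invoke it as a class method.
sub dispatch {
    my ($spec, $r) = @_;
    my ($class, $method) = split /->/, $spec, 2;
    $method = 'handler' unless defined $method;
    my $code = $class->can($method)
        or die "$class cannot $method\n";
    return $class->$code($r);
}

print dispatch($_, 'request'), "\n" for 'My', 'My->handler', 'My->method';
```

The third call finds method in BaseClass through @ISA, yet the class passed as the first argument is still 'My'.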
To build in this feature, configure with:
% perl Makefile.PL PERL_METHOD_HANDLERS=1 [PERL_FOO_HOOK=1,etc]
<Perl> sections can contain *any* and as much Perl code as you wish. These sections are compiled into a special package whose symbol table mod_perl can then walk and grind the names and values of Perl variables/structures through the Apache core config gears. Most of the configuration directives can be represented as $Scalars or @Lists. A @List inside these sections is simply converted into a single-space delimited string for you. Here's an example:
#httpd.conf
<Perl>
@PerlModule = qw(Mail::Send Devel::Peek);

#run the server as whoever starts it
$User  = getpwuid($>) || $>;
$Group = getgrgid($)) || $);

$ServerAdmin = $User;

</Perl>
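The idea of walking a package's symbol table and turning variables into directive lines can be sketched in plain Perl. This only illustrates the concept and is not how mod_perl itself is implemented; the package name My::ConfigSketch, the function directives_from_package and the ServerAdmin value are all made up for the example.

```perl
use strict;
use warnings;

package My::ConfigSketch;   # hypothetical package name
our @PerlModule  = qw(Mail::Send Devel::Peek);
our $ServerAdmin = 'webmaster@example.com';   # placeholder value

package main;

# Hypothetical helper: walk a package's symbol table and render each
# scalar or array found there as a "Directive value ..." line.
sub directives_from_package {
    my ($pkg) = @_;
    no strict 'refs';
    my @lines;
    for my $name (sort keys %{"${pkg}::"}) {
        my $array  = *{"${pkg}::${name}"}{ARRAY};
        my $scalar = *{"${pkg}::${name}"}{SCALAR};
        if ($array && @$array) {
            # a list becomes a single-space delimited string
            push @lines, join ' ', $name, @$array;
        }
        elsif ($scalar && defined $$scalar) {
            push @lines, "$name $$scalar";
        }
    }
    return @lines;
}

print "$_\n" for directives_from_package('My::ConfigSketch');
```

Each variable name doubles as the directive name, which is why the variables in a <Perl> section are spelled exactly like the directives they configure.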
Block sections such as <Location></Location> are represented in a %Hash, e.g.:
$Location{"/~dougm/"} = {
    AuthUserFile => '/tmp/htpasswd',
    AuthType => 'Basic',
    AuthName => 'test',
    DirectoryIndex => [qw(index.html index.htm)],
    Limit => {
        METHODS => 'GET POST',
        require => 'user dougm',
    },
};
#If a Directive can take say, two *or* three arguments
#you may push strings and the lowest number of arguments
#will be shifted off the @List
#or use array reference to handle any number greater than
#the minimum for that directive
push @Redirect, "/foo", "http://www.foo.com/";

push @Redirect, "/imdb", "http://www.imdb.com/";

push @Redirect, [qw(temp "/here" "http://www.there.com")];
Other section counterparts include %VirtualHost, %Directory and %Files.
These are somewhat boring examples, but they should give you the basic idea. You can mix in any Perl code your heart desires. See eg/httpd.conf.pl and eg/perl_sections.txt for some examples.
Currently, for <Perl> sections to work, the PerlScript configuration directive must be defined; /dev/null will do just fine.
A tip for syntax checking outside of httpd:
<Perl>
#!perl

#... code here ...

__END__
</Perl>
Now you may run perl -cx httpd.conf.
It may be the case that <Perl> sections are not complete or that an oversight was made in a certain area. If they do not behave as you expect, please send a report to the modperl mailing list.
To configure this feature build with 'perl Makefile.PL PERL_SECTIONS=1'
A `sub' key value may be anything a Perl*Handler can be: subroutine name, package name (defaults to package::handler), Class->method call or anonymous sub {}
Example:
Child <!--#perl sub="sub {print $$}" --> accessed <!--#perl sub="sub {print ++$Access::Cnt }" --> times. <br>
<!--#perl sub="Package::handler" arg="one" arg="two" -->
The Apache::Include module makes it simple to include Apache::Registry scripts with the mod_include perl directive.
Example:
<!--#perl sub="Apache::Include" arg="/perl/ssi.pl" -->
You can also use 'virtual include' to include Apache::Registry scripts of course. However, using #perl will save the overhead of making Apache go through the motions of creating/destroying a subrequest and making all the necessary access checks to see that the request would be allowed outside of a 'virtual include' context.
To enable perl in mod_include parsed files, when building apache the following must be present in the Configuration file:
EXTRA_CFLAGS=-DUSE_PERL_SSI -I. `perl -MExtUtils::Embed -ccopts`
mod_perl's Makefile.PL script can take care of this for you as well:
perl Makefile.PL PERL_SSI=1
If you're interested in sprinkling Perl code inside your HTML documents, you'll also want to look at the Apache::Embperl, Apache::ePerl and Apache::SSI modules.
See the benchmark/ directory for some examples.