Perl Program Anatomy

This is a skeleton for a typical Perl program.
#!/usr/local/bin/perl

use 5.8.8;
use strict;
use Getopt::Std;
use Pod::Usage;

my %Opts;
my($Param1, $Param2);

ParseArgs();
Foo();
Bar();

for ($Param1)
{
    /flip/ and Flop();
    /flop/ and Flip();
}

Baz();

sub ParseArgs
{
    getopt('', \%Opts);

    $Opts{H} and pod2usage(VERBOSE=>1);
    $Opts{M} and pod2usage(VERBOSE=>2);

   ($Param1, $Param2) = @ARGV;
    $Param2 or 
        pod2usage(VERBOSE=>0);
}

sub Foo
{
}

sub Bar
{
}

sub Baz
{
}

__END__
#!/usr/local/bin/perl
This is the sharp-bang, or shebang. It tells the system which interpreter should run the script. N.B. The shebang doesn't usually work under Windows.

The shebang may be followed by command-line switches, such as -w; these will be passed to the perl interpreter.

use 5.8.8;
This line causes a compile-time error if the version of the Perl interpreter is less than 5.8.8.

Perl has evolved over the years. It's a good idea to require the version number of the interpreter on which you're developing the program; it might not run on earlier versions.
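To see the check in action without breaking your own program, you can compile a use VERSION line inside a string eval. This is only a minimal sketch; version_ok is a hypothetical helper and is not part of the skeleton.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if this interpreter satisfies "use $version".
# The string eval compiles the use line at run time, so a failure is
# trapped in $@ instead of aborting the whole program.
sub version_ok {
    my ($version) = @_;
    return eval("use $version; 1") ? 1 : 0;
}

print "5.8.8 or later: ", version_ok('5.8.8'), "\n";
print "999 or later:   ", version_ok('999'),   "\n";
```

In a real program you would simply write the use line at the top, as the skeleton does, and let the interpreter refuse to run.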

use strict;
use strict enforces a restricted programming model on your code. It is strongly recommended.

The most obvious effect of use strict is that every variable must either be declared (with my, for example) or be referred to through a fully qualified package name, e.g. $Foo::Bar::baz rather than $baz. This has the practical consequence of flagging typos in the names of lexically declared (my) variables.
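Here is a small sketch of the typo-catching effect. It compiles the same misspelled snippet with and without strict, using a string eval so that the compile error can be captured. The helper compile_error and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: compile and run a snippet in a string eval.
# Returns the error message, or the empty string if there was none.
sub compile_error {
    my ($code) = @_;
    my $ok = eval "$code; 1";
    return $ok ? '' : $@;
}

# $cuont is a misspelling of $count.
my $typo = 'my $count = 1; $cuont = 2';

print "with strict:    ", compile_error("use strict; $typo") || "no error\n";
print "without strict: ", compile_error("no strict; $typo")  || "no error\n";
```

Under strict, the misspelled name is rejected as an undeclared global; without strict, Perl quietly creates a new global variable and carries on.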

use Getopt::Std;
Getopt::Std is a module that processes command-line switches. Using Getopt::Std will save you typing and debugging, and will lend a consistent interface to your programs.

There are other Getopt:: modules that support different styles of command-line switches. See Parsing the Command Line with Getopt::* for a survey.
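As a quick taste of the module, here is a sketch that uses getopts(), the sibling of getopt() in Getopt::Std. The switch letters (-v and -o), the file names, and the helper parse_switches are all hypothetical; @ARGV is localized so the routine can be tried out on any list of arguments.

```perl
use strict;
use warnings;
use Getopt::Std;

# Hypothetical helper: parse a list of arguments the way a program
# would parse @ARGV. Returns the switches found (hash ref) and the
# leftover parameters (array ref), or an empty list on a bad switch.
sub parse_switches {
    my (@args) = @_;
    local @ARGV = @args;          # getopts() always works on @ARGV
    my %opts;
    getopts('vo:', \%opts)        # -v is a flag, -o takes a value
        or return;
    return (\%opts, [@ARGV]);
}

my ($opts, $rest) = parse_switches(qw(-v -o out.txt in.txt));
print "verbose: $opts->{v}\n";
print "output:  $opts->{o}\n";
print "params:  @$rest\n";
```

Whatever getopts() doesn't consume is left in @ARGV, which is how the positional parameters survive for later processing.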

use Pod::Usage;
Pod::Usage is a module that reads the POD (Plain Old Documentation) from the program source and prints it in various formats. Pod::Usage allows programs to be self-documenting.

my %Opts;
Many programs take command-line switches. I like to store these in a hash at file scope.

my($Param1, $Param2);
Many programs take command-line parameters. I like to store these in my variables at file scope. In an actual program, I give the parameters mnemonic names.

ParseArgs();
Foo();
Bar();
for ($Param1)
{
    /flip/ and Flop();
    /flop/ and Flip();
}
Baz();
This is where the program runs.

ParseArgs is discussed below; it parses and validates the command line. The rest of the code does the substantive work of the program.

This should be a view of the program from 50,000 feet. If it runs more than 20 or 30 lines, consider breaking it up into smaller subroutines.

On the other hand, if this section is nothing more than a call to a single subroutine, named (what else?) Main(), then get rid of Main() and hoist its contents up to here.
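The for ($Param1) block in the skeleton deserves a word. A for loop over a single scalar aliases $_ to that scalar for one pass, so each bare pattern match inside the block tests $Param1. The sketch below shows the idiom in isolation. Flip() and Flop() are not defined in the skeleton, so this hypothetical dispatch routine just records the name of the subroutine that would be called.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the skeleton's dispatch block: returns the
# names of the subroutines that would be called for a given parameter.
sub dispatch {
    my ($param) = @_;
    my @called;
    for ($param) {                          # aliases $_ to $param for one pass
        /flip/ and push @called, 'Flop';    # each match tests $_
        /flop/ and push @called, 'Flip';
    }
    return @called;
}

print join(',', dispatch('flip')),     "\n";
print join(',', dispatch('flipflop')), "\n";
```

Note that the tests are independent: a parameter that matches both patterns triggers both calls.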

sub ParseArgs
{
ParseArgs parses and validates the command line.

    getopt('', \%Opts);
getopt() is a routine in Getopt::Std that parses command-line switches. Its first argument names the switches that take a value; the empty string here means that every switch is treated as a Boolean flag. For each switch that it finds, it sets a key in a hash. For example, if the user specifies -H on the command line, then getopt('', \%Opts) will set $Opts{H} to 1.

    $Opts{H} and
        pod2usage(VERBOSE=>1);

    $Opts{M} and 
        pod2usage(VERBOSE=>2);
pod2usage is a routine in Pod::Usage. It locates the program source, parses the POD from it, prints it to the screen, and exits. With VERBOSE=>0, it prints just the synopsis. VERBOSE=>1 adds the options, and VERBOSE=>2 prints the entire man page.

    ($Param1, $Param2) = @ARGV;
    $Param2 or 
        pod2usage(VERBOSE=>0);
}
These lines pick up positional parameters from the command line. I design my interfaces so that there is always at least one positional parameter. If the user doesn't provide the required number (two, in this skeleton), the program prints a usage line and exits.
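The decision logic of ParseArgs can be tried out without triggering pod2usage(), which exits the program. The sketch below is a hypothetical variant, classify_args, that returns a label where the skeleton would call pod2usage(). It departs from the skeleton in one respect: it tests the second parameter with defined rather than for truth, so that a literal 0 counts as a parameter.

```perl
use strict;
use warnings;
use Getopt::Std;

# Hypothetical variant of ParseArgs: classifies a command line and
# returns a label instead of calling pod2usage().
sub classify_args {
    local @ARGV = @_;                   # getopt() always works on @ARGV
    my %opts;
    getopt('', \%opts);                 # no switch takes a value: all are flags

    return 'help'   if $opts{H};        # skeleton: pod2usage(VERBOSE=>1)
    return 'manual' if $opts{M};        # skeleton: pod2usage(VERBOSE=>2)

    my ($param1, $param2) = @ARGV;
    return 'usage' unless defined $param2;   # skeleton: pod2usage(VERBOSE=>0)
    return "run $param1 $param2";
}

print classify_args('-H'), "\n";
print classify_args('a'), "\n";
print classify_args('a', 'b'), "\n";
```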
sub Foo
{
}

sub Bar
{
}

sub Baz
{
}
This is where subroutines go.

__END__
__END__ is the logical end of program text. You can put anything you want after this token; it will be ignored by the compiler.
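The space after __END__ is a natural home for the POD that pod2usage prints. Here is a minimal sketch of what might follow the token. The program name myprog and the descriptions are placeholders; the switches and parameters are the ones from the skeleton. Pod::Usage looks for the SYNOPSIS and OPTIONS sections by name, so those headings matter.

```perl
__END__

=head1 NAME

myprog - one-line description of the program

=head1 SYNOPSIS

myprog [-H] [-M] param1 param2

=head1 OPTIONS

=over 4

=item B<-H>

Print the synopsis and options, then exit.

=item B<-M>

Print the entire manual page, then exit.

=back

=head1 DESCRIPTION

A longer description of what the program does.

=cut
```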
Steven McDougall / swmcd@world.std.com / 1999 Oct 17