At left you see a 3D rendition of the classic sin(x)/x function as produced by
the PDL (Perl Data Language) extension to
perl. If you compare it with the plots in the Octave Review, Scilab Review, Yorick Review, and the R Language Review, you'll see that
it is similar to the graphs made by those other languages. Other types of PDL graphs
are shown later on this web page.
A Bit of History
If you're unfamiliar with the PDL, perhaps a little history of its
development would be helpful. The PDL is an extension of the report generating
language perl. Perl, if you've never used it, is quite a conceptual departure
from conventional languages. It is a scripting language with the most
interesting and powerful text manipulation libraries.
In perl, text manipulation expressions are called "regular
expressions." They give perl an incredibly powerful way of parsing,
combining, searching, and manipulating textual information.
These unusual text handling capabilities of perl go far beyond report
generating. I use perl all the time for working with ASCII data files. Data can
be extracted from almost any kind of ASCII file and presented in a more tabular
format for perl or other programs to process. Perl has also grown to be one of
the main CGI languages used to support the web. Many extensions are available
for perl, including networking.
Perl and the PDL are available in Windows, Unix and Linux, but the full
graphics of the PDL are only available in Unix and Linux. This review only
covers the Linux version.
In recent years, perl has grown to become an object oriented language. This
object oriented capability gave some astronomers an opening to envision a
mathematical language with far greater reach.
Most matrix math languages are very good at doing all manner of
mathematical, engineering, and statistical calculations. But all are very
limited in the general handling of data. That is, arcane ASCII files of data
often must be parsed and converted by some other process into something more
digestible by these languages. And report or document making from data
processed by a typical matrix language must usually be done by some other
process. Matrix math languages mostly do math and graphics, and that's all.
But these astronomers thought that if one could use the object oriented
capabilities of perl to add a powerful matrix math capability, then one
would have in a single language everything needed to manipulate data -- from
beginning to end. A language that could tackle data in almost any form, process
it as well and as quickly as any matrix language, and output data in virtually
any form -- including document format or reports. The whole enchilada.
That's the PDL. It's a matrix math language extension to the already
incredible language perl. It lets one tackle the same mathematical,
engineering, signal processing, and statistical problems that can be done with any
other matrix language. The PDL is even more capable than some math languages,
being able to handle matrices with much more than 2 dimensions. And since
astronomers tend to work with astronomical images, the PDL also has
capabilities that make image processing much easier than is possible with other
matrix languages.
PDL Syntax
The basic syntax of perl, aside from the "regular expressions," is very
similar to C. Interactive use of the PDL is through a program
called perldl. The PDL nomenclature is expressed largely in the object
nomenclature of perl. PDL operations are mostly object member function
calls. To many, the strange and lengthy commands of the PDL might be
objectionable, and the PDL is not particularly convenient to work with
interactively by using the perldl utility. But if perl programs are run as
executable script files (easily done), then the verboseness of the commands is
of little consequence.
It takes some time to get used to using the PDL set of data handling
objects. Following is an example of how one creates a 4 column, 3 row matrix in
Octave and in the PDL.
In Octave:
x = [1,2,3,4;11,12,13,14;21,22,23,34];
In the PDL, use the pdl operator
$x = pdl([1,2,3,4],[11,12,13,14],[21,22,23,34]);
The pdl operator in PDL makes an object that holds the matrix data
and the member functions that can operate on it. The object created with the
pdl operator is called a piddle.
Below is the Octave and PDL syntax for making a matrix that is a subset of a
larger matrix. The new matrix in the following example has only columns 2 and 3
of the larger matrix, with all of its rows.
In Octave:
y = x(:,2:3);
In the PDL:
$y = $x->slice("1:2,:");
Confused? Don't be, it isn't that bad. First note that data object names in
the PDL are preceded by a dollar sign (as are scalars in perl). Second, note
that in the PDL, indices run from 0 to n-1, whereas in most matrix languages they go
from 1 to n. Third, in the PDL, the column indices come first, followed by the row
indices. It's the other way around in most other matrix languages. Finally, note that
with the PDL, the sub matrix extraction is done by referencing the slice
member function of the $x matrix object.
Remember, $x is actually an object with data, not a matrix. Slice lets one
pick contiguous ranges of indices. Another operator, named dice, allows
non-contiguous ranges to be selected. You must admit, the PDL is the only
language you'll probably find that lets you slice and dice piddles.
Not obvious in this example is another attribute of the PDL. In the Octave
example, as in most other matrix languages, the y matrix is a new and separate
matrix. Subsequent operations on y will not have any effect on x because y is a
completely separate entity. But with the PDL, $y is just a handle pointing to
the subset of $x that includes columns 2 and 3 (1 and 2 in PDL index notation).
Operations on $y will actually affect the values of that subset in $x. What if
you don't want that to happen? Then you use the pdl operator or copy member
function to create a new matrix.
$y = pdl($x->slice("1:2,:"));
or
$y = $x->slice("1:2,:")->copy;
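The difference between a handle and a copy is easiest to see by modifying both. The following is a minimal sketch using the same matrix as above; the variable names $view and $copy are chosen here for illustration only.

```perl
use strict;
use warnings;
use PDL;

# Same 4 column, 3 row matrix as in the text.
my $x = pdl([1,2,3,4],[11,12,13,14],[21,22,23,34]);

# A slice is a handle onto $x: columns 1..2 (zero-based), all rows.
my $view = $x->slice("1:2,:");

# An independent copy of the same region.
my $copy = $x->slice("1:2,:")->copy;

# Assign into both with the PDL assignment operator.
$view .= 0;     # writes through to the matching elements of $x
$copy .= -1;    # changes only the copy; $x is untouched

print $x;       # columns 1 and 2 of $x are now zero
```

Printing $x afterwards shows zeros in its two middle columns, while its first and last columns keep their original values.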
The odd business of being able to create handles that represent subsets of
larger matrices may seem senseless, but it's part of the elegance of the PDL
that helps with image processing. It can often reduce memory requirements
considerably by avoiding unnecessary interim matrices. To
supplement the image processing aspects of the PDL, routines are included to
read and write fits format image files, and the PDL::IO::Pic package is available
to handle other kinds of image formats.
Making Your Own PDL Functions
Perl user defined functions can be stored in files with a pm
extension. An odd requirement is that each pm file must return a true value when
loaded, so a 1 followed by a semicolon must be at the end of a pm
file.
For a pm file to load properly, put this as the last line:
1;
Pm files can contain as many functions as you like, though for efficiency
it's better to package similar functions together. These packages can then
be loaded with the use command. One issue is that you must tell perl
where your library files are, and perl doesn't automatically expand shorthand
file indicators. That is, the common practice of using the tilde character to
indicate your home directory isn't directly expanded by perl. But you can
use the glob command to force expansion of a path.
To set library location to a directory named perl in your home directory
and then load libraries from there:
use lib glob("~/perl/");
use lib1;
use lib2;
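As an illustration, a library file such as lib1.pm might look like the sketch below. The rms function is a made-up example, not part of the PDL, and exporting it through the standard Exporter module is one common choice rather than a requirement.

```perl
# Hypothetical contents of ~/perl/lib1.pm
package lib1;

use strict;
use warnings;
use PDL;
use Exporter 'import';

# Names made available to the caller by "use lib1;"
our @EXPORT = qw(rms);

# Root-mean-square of all elements of a piddle (illustrative function).
sub rms {
    my ($p) = @_;
    return sqrt(avg($p * $p));
}

1;  # a pm file must end with a true value
```

With this file in place, a script that has run the use lib and use lib1 lines above can call rms($x) directly.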
Packaging related subroutines into pm libraries is a handy way to quickly
load the routines. By editing the .perldlrc file in your home directory,
you can enter your commonly used libraries and have them always loaded when
you run the interactive perldl environment.
One issue with the pm files is that if you're working on a new routine and
make an error, you must exit perldl each time and re-enter in order to try
changes to a pm library.
There is a way to have perldl auto load routines in the same way as
MATLAB and Octave. Like MATLAB and Octave, these auto load files must have
file names the same as their function names (except for the .pdl extension
on the file name). By creating your own functions in .pdl files, you don't
have to load them with the use command as with pm files; you can just
call the functions, and they'll automatically be loaded when called.
To have the autoloader always available to you when you run perldl, put
the following command into your .perldlrc file:
use PDL::AutoLoader;
If you want the autoloader to reload if you've edited a .pdl file,
add this line to the .perldlrc file also:
$PDL::AutoLoader::Rescan=1;
You can set the environment variable PDLLIB to the path of your own
pdl files so the PDL can find them.
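An auto load file is just an ordinary file of perl code. The sketch below shows what a file named normalize.pdl, placed in a directory listed in PDLLIB, might contain. The function itself is a made-up example, and the use PDL line is included only so the file also runs on its own outside perldl.

```perl
# Hypothetical autoload file: normalize.pdl
# The file name (minus .pdl) matches the name of the sub it defines.
use PDL;

# Rescale a piddle linearly so its values run from 0 to 1.
# Assumes the piddle is floating point and not constant.
sub normalize {
    my ($p) = @_;
    my ($lo, $hi) = ($p->min, $p->max);
    return ($p - $lo) / ($hi - $lo);
}

1;  # ending with a true value is a safe habit for any loaded perl file
```

Once the autoloader is active, typing normalize($x) in perldl should find and load this file on the first call.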
Help is easy to find within the perl PDL system. In many languages, you can
get help on specific routines, but if you don't know what name a routine might
have, help is much harder to find. With perl (in the perldl interactive
environment), just enter ??topic to get a list of all functions whose
documentation contains the topic you enter. To then get information on a
specific routine, use a single question mark. For example, ?avg displays the
documentation on the avg routine.
PDL Graphics
The perl PDL can accomplish a wide variety of graphical presentations,
and to do so it can call upon a number of graphics packages. I commonly use
the pgplot library for line plots, contour plots, and image plots. The
contour graph you see above left is an example of a line contour plot created
with the pgplot interface.
The graph on the right is a color filled contour graph of the sin(x)/x
function, also created using the pgplot package. In addition to creating
a graph for the screen, pgplot can directly output files in many standard
graphic formats, including gif, postscript, and png.
There are a large number of color schemes included in the PDL pgplot
package, and it's pretty easy to make your own. The color schemes are 3
row vectors of 256 values from 0 to 1, one row each for the red, blue, and
green component. These are saved in the fits file format.
The image above left is a map produced from topographic elevation values,
and is colored using a terrain colormap. As you can see, the images produced
by the PDL with the pgplot package can be pretty elaborate. This graphic has
almost a satellite photograph appearance.
The image above right is a digital image of the lunar crater Tycho,
displayed by the pgplot package. The image was taken through my 6 inch
reflecting telescope using an inexpensive web cam astrocamera (see Cheap
Astrophotography).
I chose this moon image to emphasize the collection of image
processing functions included with the perl PDL. I processed this image with a
PDL program I wrote, adding some sharpening and contrast adjustments with the
GIMP image processing program. I was able to
make a similar program with other matrix languages, but the PDL made the
process easy, and many more enhancements could be added that would be difficult
in other languages.
The PDL program I created allowed me to select a sub-region of the image,
select which of about 40 frames of the crater I wished to use, then align the
frames and compute an average image. Using this process averages out the pixel
anomalies and variations produced by the inexpensive web cam and the
atmosphere. The result: photographs like this fine image of the Tycho
crater.
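The averaging step at the heart of this process is short in the PDL. The sketch below is not my actual stacking program; it leaves out the sub-region selection, frame selection, and alignment, and it uses small synthetic arrays in place of real web cam frames. The function name average_frames is made up for this example.

```perl
use strict;
use warnings;
use PDL;

# Average a list of equally sized 2D frames, pixel by pixel.
sub average_frames {
    my @frames = @_;
    die "no frames given\n" unless @frames;
    # Accumulate in double precision, whatever the frame type is.
    my $sum = zeroes(double(), $frames[0]->dims);
    $sum += $_ for @frames;
    return $sum / scalar(@frames);
}

# Three tiny synthetic 3x2 "frames" standing in for web cam images.
my @frames = map { sequence(3, 2) + $_ } (0, 1, 2);
my $avg = average_frames(@frames);
print $avg;
```

Because the sum is a single piddle operation per frame, no explicit loop over pixels is needed.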
The PDL also includes an interface to the plplot package. The
plplot package has similar capabilities to the pgplot package.
The pgplot interface lets the user interact with a plot, allowing the
user to obtain mouse location, button pushed, and area selected. These features
are employed in my image stacking program to allow me to effectively stack
astrophotography images.
For 3D and surface plots, PDL provides the TriD package. This package, which
uses the computer's graphics card accelerator, plots all manner of 3D plots,
and allows the user to interactively rotate the images around different axes to
see the images from any perspective.
The left image above is one of the simplest of 3D plots provided by the
TriD package. It's a simple mesh plot made with lines. The center image
is a mesh with z-axis correlated colors added, and the right image is
made with full hidden surfaces, with the lines turned off.
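For readers who want to try something similar, here is a minimal sketch of how a sin(x)/x surface can be computed. The grid size and scaling are assumptions chosen for illustration and are not the values used for the plots on this page. The TriD calls are left as comments because they need a working OpenGL display.

```perl
use strict;
use warnings;
use PDL;

# Radial distance from the center of a 51x51 grid, scaled down by 2
# so the surface shows a few ripples.
my $r = rvals(51, 51) / 2;

# sin(r)/r; the single r = 0 point comes out as NaN here.
my $z = sin($r) / $r;

# Patch the center point with the limiting value of 1.
my $center = $z->where($r == 0);
$center .= 1;

# With a display available, the surface could then be shown with TriD:
#   use PDL::Graphics::TriD;
#   mesh3d([$z]);    # line mesh
#   imag3d([$z]);    # shaded surface
```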
As you can see, the PDL can make many kinds of graphs, and these are only
some of the simpler examples you can achieve. However, the more elaborate
graphic methods take a bit of time to learn. I learn what I need, then package
what I want in easy to use forms.
PDL File Handling
The perl PDL has an impressive, though perhaps a bit arcane, set of
file I/O routines. The simple rcols routine can read in columns of data
from ASCII files with ease. It is deceptively powerful, given that it can
use the powerful regular expression mechanism to handle things like comment
lines.
In the following example, the routine asciiin can read in columns of
numbers from a file where the values are either blank or comma separated, and
the file can have comment lines with either an asterisk or hash symbol as the
first character on the line. The EXCLUDE parameter, given as a regular expression,
tells rcols to skip the asterisk or hash marked lines.
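sub asciiin
{
    my $fname = shift;
    # Read every column, skipping lines that start with '#' or '*'.
    my @tmp = rcols($fname,{EXCLUDE=>'/^[\#\*]/'});
    # Join the column piddles into one 2D piddle, then transpose it.
    my $x = transpose(cat(@tmp));
    return $x;
}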
The asciiin subroutine uses rcols to read in a list of numbers, then uses
the cat command to convert the list to a PDL piddle, then uses the
transpose routine to put the piddle in the proper row/column matrix format.
That's quite a bit of stuff being handled by about 3 lines of code.
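To see the same steps at work, the sketch below writes a small blank separated test file and reads it back. The file contents are invented for the example, and a temporary file is used so nothing is left behind.

```perl
use strict;
use warnings;
use PDL;
use File::Temp qw(tempfile);

# Write a small data file with both kinds of comment line.
my ($fh, $fname) = tempfile(SUFFIX => '.dat', UNLINK => 1);
print {$fh} "# time  value\n";
print {$fh} "* an asterisk style comment\n";
print {$fh} "1 10\n2 20\n3 30\n";
close $fh;

# Read all columns, skipping lines that start with '#' or '*'.
my @cols = rcols($fname, { EXCLUDE => '/^[\#\*]/' });

# One piddle per column comes back; combine them into a 2D piddle
# whose first index is the column and whose second is the row.
my $x = transpose(cat(@cols));
print join(' x ', $x->dims), "\n";
print $x;
```

The value on the third data line of the second column is then $x->at(1,2), with the column index first as usual in the PDL.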
As to image files, the PDL has built in routines to read and write
fits format image files. If one installs the netpbm Linux
library, then instructs PDL programs to use the PDL::IO::Pic library,
the PDL programs can also read and write gif, jpeg, pnm
and other graphics formats. This is yet another feature that makes the PDL an
effective image processing platform.
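A minimal sketch of a fits round trip follows, using the wfits and rfits routines on a synthetic image. The file name is made up, and the image is written to a temporary directory. The commented lines show the corresponding PDL::IO::Pic calls, which need netpbm installed and a real image file to read.

```perl
use strict;
use warnings;
use PDL;
use PDL::IO::FITS;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/ramp.fits";      # hypothetical file name

# A small synthetic 8x6 image of floats.
my $img = sequence(float(), 8, 6);

wfits($img, $file);               # write the piddle as a fits file
my $back = rfits($file);          # read it back into a new piddle

print "round trip ok\n" if all($back == $img);

# With netpbm installed, other formats work in a similar way:
#   use PDL::IO::Pic;
#   my $pic = rpic('tycho.jpg');  # hypothetical input file
#   wpic($pic, 'tycho.png');
```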
Summary
In summary, the perl language with the PDL extension is a scripting language
of tremendous reach. With high level concepts like objects, integration with a
number of different graphics packages, a robust set of matrix functions, and
unique text manipulation routines, perl with the PDL is a one-stop shop for
data processing. One can work on data interactively through the perldl
program, or make batch files that are easy to create and maintain. The syntax
of the PDL is a bit different, but learning it gives the user capabilities not
available in most any other mathematics language.
Below are listed my subjective pros and cons about the PDL math language.
Pros:
Freely available for both Linux and Windows. Some of the graphic utilities
are only available in Linux.
The PDL has the entire Perl programming language to draw upon for support,
making string handling and ASCII data handling second to none.
The programming syntax of Perl is very C like, making it easy to
transition to if you've used C, C++, or Java.
The Perl PDL system fully supports object oriented programming.
The Linux version can make use of the Linux Curses and TCL/TK interfaces.
Is fully integrated with the TriD 3d graphics platform and the PGPLOT
2D graphics platform.
With the PDL's integration with Perl, the combination is the most complete
language on the Math language scene.
Is very fast when problems are fully vectorized and use matrix operators,
and still quite fast even when loops are used.
Works reasonably well interactively via perldl, and makes a very
easy to use and effective batch language for mathematics programming.
Has full integration to graphics utilities so that mouse clicks can
deliver data back to the perl script.
In addition to multi-dimensional matrices, the Perl/PDL supports more
complex data forms such as lists and hashes.
Cons:
Is a big system, taking more time to load than many others.
While Perl syntax is easy to learn, being much like C, the syntax of the
PDL can take a while to master.
The PDL commands can be a bit verbose, making interactive use cumbersome.
If function libraries are used, troubleshooting new routines is cumbersome in that Perl won't reload a library. You must get out of perldl and
back in to cause a library to reload. This is not true if the autoloader feature for individual .pdl function files is used.