Linux Goodies

In Pursuit Of The Perfect O/S



A Review of the Perl PDL Matrix Language

PDL, the Powerful Math Extension for Perl

PDL graphic

Above you see a 3D rendition of the classic sin(x)/x function as produced by the PDL (Perl Data Language) extension to perl. If you compare it with the plots in the Octave Review, Scilab Review, Yorick Review, and the R Language Review, you'll see that it is similar to the graphs made by those other languages. Other types of PDL graphs are shown later on this web page.

A Bit of History

If you're unfamiliar with the PDL, perhaps a little history of its development would be helpful. The PDL is an extension of the report generating language perl. Perl, if you've never used it, is quite a conceptual departure from conventional languages. It is a scripting language with the most interesting and powerful text manipulation libraries.

In perl, text manipulation expressions are called "regular expressions." They give perl an incredibly powerful way of parsing, combining, searching, and manipulating textual information.

These unusual text handling capabilities of perl go far beyond report generating. I use perl all the time for working with ASCII data files. Data can be extracted from almost any kind of ASCII file and presented in more tabular format for perl or other programs to process. Perl has also grown to be one of the main CGI languages used to support the web. Perl has available many extensions, including networking.

Perl and the PDL are available in Windows, Unix and Linux, but the full graphics of the PDL are only available in Unix and Linux. This review only covers the Linux version.

In recent years, perl has grown to become an object oriented language. This object oriented capability gave some astronomers an opening to envision a mathematical language of unusually broad scope.

Most matrix math languages are very good at doing all manner of mathematical, engineering, and statistical calculations. But all are very limited in the general handling of data. That is, arcane ASCII files of data often must be parsed and converted by some other process into something more digestible by these languages. And report or document making from data processed by a typical matrix language must usually be done by some other process. Matrix math languages mostly do math and graphics, and that's all.

But these astronomers thought that if one could use the object oriented capabilities of perl to add a powerful matrix math capability, then one would have in a single language everything needed to manipulate data -- from beginning to end. A language that could tackle data in almost any form, process it as well and as quickly as any matrix language, and output data in virtually any form -- including document format or reports. The whole enchilada.

That's the PDL. It's a matrix math language extension to the already incredible language perl. It lets one tackle the same mathematical, engineering, signal processing, and statistical problems that can be done with any other matrix language. The PDL is even more capable than some math languages, being able to handle matrices with many more than 2 dimensions. And since astronomers tend to work with astronomical images, the PDL also has capabilities that make image processing much easier than is possible with other matrix languages.

PDL Syntax

The basic syntax of perl, aside from the "regular expressions," is very similar to C. Interactive use of the PDL is through a program called perldl. The PDL nomenclature is expressed largely in the object nomenclature of perl. PDL operations are mostly object member function calls. To many, the strange and lengthy commands of the PDL might be objectionable, and the PDL is not particularly convenient to work with interactively by using the perldl utility. But if perl programs are run as executable script files (easily done), then the verbosity of the commands is of little consequence.

It takes some time to get used to using the PDL set of data handling objects. Following is an example of how one creates a 4 column, 3 row matrix in Octave and in perl.

In Octave
x = [1,2,3,4;   11,12,13,14;   21,22,23,34];

In the PDL, use the pdl operator
$x = pdl([1,2,3,4],   [11,12,13,14],   [21,22,23,34]);

The pdl operator in PDL makes an object that holds the matrix data and the member functions that can operate on it. The object created with the pdl operator is called a piddle.
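As a small illustration of a piddle carrying both data and member functions, the sketch below builds the same matrix and calls a few methods on it. It assumes the PDL module is installed; the variable names are only for illustration.

```perl
use strict;
use warnings;
use PDL;

# Build the same 4 column, 3 row matrix used in the text.
my $x = pdl([1,2,3,4], [11,12,13,14], [21,22,23,34]);

# dims lists the size of each dimension, columns first.
my @dims = $x->dims;

# Operations are member functions called on the piddle.
my $total = $x->sum;   # sum of all 12 elements
my $max   = $x->max;   # largest element

print "dims: @dims\n";
print "sum: $total, max: $max\n";
print $x;              # a piddle prints as a formatted matrix
```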

Below is the Octave and PDL syntax for making a matrix that is a subset of a larger matrix. The new matrix in the following example contains only columns 2 and 3 of the original, with all of the rows.

In Octave:
y = x(:,2:3);

In the PDL:
$y = $x->slice("1:2,:");


There are a few things to know that will clear things up. First note that data containers for PDL data are not just arrays or matrices as in Octave, but are data objects. Object names in the PDL are preceded by a dollar sign (as are scalars in perl). Because the data containers are objects, many familiar matrix manipulations must be accomplished by referencing object member functions, slice in this case.

Second, note that in the PDL, indices run from 0 to n-1, whereas in Octave and most other matrix languages they run from 1 to n. In the PDL, the column index comes first, followed by the row index. This ordering was chosen for speed. Most other matrix languages put the row index first.
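To make the index convention concrete, here is a minimal sketch that reads single elements with the at member function, which takes the column index first and the row index second, both counted from 0. It assumes PDL is installed. For comparison, the Octave expression x(2,4) (row 2, column 4) refers to the same element as at(3, 1) below.

```perl
use strict;
use warnings;
use PDL;

my $x = pdl([1,2,3,4], [11,12,13,14], [21,22,23,34]);

# at(column, row), both indices counted from 0.
my $first  = $x->at(0, 0);   # first column, first row
my $elem   = $x->at(3, 1);   # fourth column, second row
my $corner = $x->at(3, 2);   # fourth column, third row

print "at(0,0) = $first\n";
print "at(3,1) = $elem\n";
print "at(3,2) = $corner\n";
```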

So remember, $x in PDL is actually an object with data, not a matrix. Slice lets one select contiguous ranges of indices. Another operator, named dice, allows non-contiguous ranges to be selected. You must admit, the PDL is the only language you'll probably find that lets you slice and dice piddles.
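The difference between slice and dice can be sketched as follows, assuming PDL is installed. The particular index choices are only for illustration.

```perl
use strict;
use warnings;
use PDL;

my $x = pdl([1,2,3,4], [11,12,13,14], [21,22,23,34]);

# slice: a contiguous range, here columns 1 through 2 of every row.
my $mid = $x->slice("1:2,:");

# dice: an index list per dimension, which need not be contiguous.
# Here columns 0 and 3, taken from rows 0 and 2.
my $corners = $x->dice([0, 3], [0, 2]);

print "slice:", $mid;
print "dice:",  $corners;
```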

Not obvious in this example is another attribute of the PDL. In Octave and most other matrix languages, for example, the y matrix is a new and separate matrix. Subsequent operations on y will not have any effect on x because y is a completely separate entity. But with the PDL, $y is just a handle pointing to the subset of $x that includes columns 2 and 3 (1 and 2 in PDL index notation). Operations on $y will actually affect the values of that subset in $x.
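A minimal sketch of this handle behavior, assuming PDL is installed: writing into the slice with PDL's .= assignment operator changes the parent matrix as well.

```perl
use strict;
use warnings;
use PDL;

my $x = pdl([1,2,3,4], [11,12,13,14], [21,22,23,34]);
my $y = $x->slice("1:2,:");   # a handle onto columns 1 and 2 of $x

# .= assigns into the existing data, so the write goes through to $x.
$y .= 0;

print $x;   # columns 1 and 2 of $x are now zero
```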

What if you don't want that to happen? Then you use the pdl operator or copy member function to create a new matrix. In the following examples, $y is what you might expect, a new and separate data container that is no longer connected to some section of $x.

$y = pdl($x->slice("1:2,:"));
$y = $x->slice("1:2,:")->copy;
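For contrast, the sketch below (assuming PDL is installed) makes a copy of the slice first, so the same write leaves the original untouched. The $before variable exists only to check the result.

```perl
use strict;
use warnings;
use PDL;

my $x      = pdl([1,2,3,4], [11,12,13,14], [21,22,23,34]);
my $before = $x->copy;                     # snapshot for comparison

my $y = $x->slice("1:2,:")->copy;          # separate data, no link to $x
$y .= 0;                                   # only the copy changes

print "x unchanged\n" if all($x == $before);
```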

The odd business of being able to create handles that represent subsets of larger matrices may seem senseless, but it's part of the elegance of the PDL that aids with image processing. It can often help considerably to reduce memory requirements, avoiding unnecessarily making interim matrices. To supplement the image processing aspects of the PDL, routines are included to read and write FITS format image files, and the PDL::IO::Pic package is available to handle other kinds of image formats.

Making Your Own PDL Functions

Perl user-defined functions can be stored in files with a pm extension. An odd requirement is that each pm file must return a value when loaded, so simply a 1 followed by a semicolon must be at the end of a pm file.
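A minimal pm file might look like the sketch below. The package name lib1 follows the placeholder library name used on this page, and the mean_of function is purely hypothetical. The use of Exporter is one common way to make the function callable without a package prefix, not a requirement.

```perl
# File: lib1.pm  (placeholder name; mean_of is a hypothetical function)
package lib1;
use strict;
use warnings;
use Exporter 'import';
our @EXPORT = qw(mean_of);

# Return the arithmetic mean of a list of numbers,
# or nothing (undef) for an empty list.
sub mean_of {
    my @vals = @_;
    return unless @vals;
    my $sum = 0;
    $sum += $_ for @vals;
    return $sum / @vals;
}

1;  # a true value, so that loading the file succeeds
```

A script that loads this file with use lib1; could then call mean_of(1, 2, 3) directly.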

For a pm file to load properly, put this as the last line:
1;

Pm files can contain as many functions as you like, though for efficiency it's better to package similar functions together. No use loading an entire library just to have a few functions available. Packages can be loaded with the use command. One issue is that you must tell perl where your library files are, and perl doesn't automatically expand shorthand file indicators. That is, the common practice of using the tilde character to indicate your home directory isn't directly expanded by perl. But you can use the glob command to force expansion of a path.

To set library location to a directory named perl in your home directory
and then load libraries from there:
use lib glob("~/perl/");
use lib1;
use lib2;
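To see what glob does with the tilde, a short plain-Perl sketch such as the following prints the path before and after expansion. It assumes a Unix-like system where the home directory can be determined.

```perl
use strict;
use warnings;

my $raw = "~/perl/";

# In list context glob returns the expanded path(s); a pattern with
# no wildcards comes back once, with the leading tilde expanded.
my ($expanded) = glob($raw);

print "as written: $raw\n";
print "expanded:   $expanded\n";
```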

Packaging related subroutines into pm libraries is a handy way to quickly load the routines. By editing the .perldlrc file in your home directory and adding your desired additional libraries (pm files), you can have them always loaded when you run the interactive perldl environment.

One issue to keep in mind about pm files is that if you're working on a new routine in a pm file and make a change, you must exit perldl each time and re-enter in order to try out changes to a pm library. You can't just type use libname again and expect it to reload.

There is a way to have perldl auto load routines in the same way as MATLAB and Octave. Like MATLAB and Octave, these auto load files must have file names the same as the function names they contain. The files must have a .pdl extension. By creating your own functions in .pdl files, you don't have to load them with the use command as with pm files. The functions are loaded automatically when referenced.

To have the autoloader always available to you when you run perldl, put the following commands into your .perldlrc file:

use PDL::AutoLoader;

If you want the autoloader to reload if you've edited a .pdl file while still in perldl, add this line to the .perldlrc file also:

$PDL::AutoLoader::Rescan = 1;

You can set the environment variable PDLLIB to the path of your own pdl files so the PDL can find them.
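An autoload file could look like the sketch below. The function name rmsval, and therefore the file name rmsval.pdl, is hypothetical. The closing 1; follows the pm file habit and is assumed to be harmless here even if the autoloader does not insist on it.

```perl
# File: rmsval.pdl  (hypothetical; the file name matches the function name)
use strict;
use warnings;
use PDL;   # already loaded inside perldl; needed only if run on its own

# Root mean square of all elements of a piddle, or of a plain list of numbers.
sub rmsval {
    my $x = pdl(@_);
    return sqrt(($x ** 2)->avg);
}

1;
```

With the file in a directory listed in PDLLIB, typing rmsval($x) in perldl should be enough to have it loaded on first use.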

Help is easy to find within the perl PDL system. In many languages, you can get help on specific routines, but if you don't know what name a routine might have, help is much harder to find. With perl (in the perldl interactive environment), just enter ??topic to get a list of all functions whose documentation contains the topic you enter. To then get information on a specific routine, use a single question mark. For example, ?avg displays the documentation on the avg routine.

PDL Graphics

pdl contour plot
pdl color contour plot

The perl PDL can accomplish a wide variety of graphical presentations, and to do so it can call upon a number of graphics packages. I commonly use the pgplot library for line plots, contour plots, and image plots. The contour graph you see above top is an example of a line contour plot created with the pgplot interface.

The graph on the bottom is a color filled contour graph of the sin(x)/x function, also created using the pgplot package. In addition to creating a graph for the screen, pgplot can directly output files in many standard graphic formats, including GIF, PostScript, and PNG.

pdl map plot

Color area plots like the one above can be done with different color themes. There are many color themes already in the plot library to choose from, usually shades of different colors. But you can make your own themes, as they are just fits data files with 3 row vectors of 256 values from 0 to 1, one row each for the red, blue, and green component. I created the color theme used in the above image to go from blue to white but moving through shades of green and brown between the blue and white extremes. It gives a terrain looking theme, making a plot of terrain data almost look like a satellite photograph.
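The data layout of such a theme can be sketched as a 256 by 3 piddle. The example below builds a simple blue to white ramp, not the terrain theme described above. The row order used here (red, green, blue) is an assumption and should be checked against one of the existing theme files before use.

```perl
use strict;
use warnings;
use PDL;

# 256 evenly spaced values from 0 to 1.
my $ramp = sequence(256) / 255;

# Blue at index 0, fading to white at index 255:
# red and green rise from 0 to 1 while blue stays at full strength.
my $red   = $ramp;
my $green = $ramp;
my $blue  = ones(256);

# cat stacks the three vectors into one piddle with dims (256, 3).
# Row order red, green, blue is an assumption (see the text above).
my $theme = cat($red, $green, $blue);

print join(" x ", $theme->dims), "\n";
```

Such a piddle could then be saved with wfits to produce a FITS file of the same shape.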

The image above is a map produced from topographic elevation values, and is colored using the terrain color theme. As you can see, the images produced by the PDL with the pgplot package can be pretty elaborate.

pdl Tycho plot

The image above is a digital image of the lunar crater Tycho, displayed by the pgplot package. The image was taken through my 6 inch reflecting telescope using an inexpensive web cam astrocamera (see Cheap Astrophotography). It gives an example of how the PDL can be used for graphic image manipulation.

I chose this moon image to emphasize the collection of image processing functions included with the perl PDL. I processed this image with a PDL program I wrote, adding some sharpening and contrast adjustments with the GIMP image processing program. I was able to make a similar program with other matrix languages, but the PDL made the process easy, and many more enhancements could be added that would be difficult in other languages.

The PDL program I created allowed me to interactively select a sub-region of the image, select which of about 40 frames of the crater I wished to use, then align the frames and compute an average image. Using this process averages out the pixel anomalies and variations produced by the inexpensive web cam and the atmosphere. The result: photographs like this fine image of the Tycho crater.
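The averaging step alone can be sketched in a few lines. This is not the stacking program described above: the frames here are tiny synthetic stand-ins that are assumed to be already aligned, and the interactive selection and alignment steps are left out.

```perl
use strict;
use warnings;
use PDL;

# Three 4x3 stand-in frames: one scene plus a different constant
# offset per frame, in place of real, already aligned camera frames.
my $scene  = sequence(4, 3);
my @frames = map { $scene + $_ } (-1, 0, 1);

# cat stacks the frames along a new last dimension: dims (4, 3, 3).
my $stack = cat(@frames);

# average collapses the first dimension, so move the frame axis
# to the front first. The result has the dims of a single frame.
my $mean = $stack->mv(2, 0)->average;

print $mean;
```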

The PDL also includes an interface to the plplot package. The plplot package has similar capabilities to the pgplot package.

The pgplot interface lets the user interact with a plot, allowing the user to obtain mouse location, button pushed, and area selected. These features are employed in my image stacking program to allow me to effectively stack astrophotography images.

Plain PDL mesh plot
Color PDL mesh plot
PDL color 3d surface plot

For 3D and surface plots, PDL provides the TriD package. This package, which uses the computer's graphics card accelerator, plots all manner of 3D plots, and allows the user to interactively rotate the images around different axes to see the images from any perspective.

The top image above is one of the simplest of 3D plots provided by the TriD package. It's a simple mesh plot made with lines. The center image is a mesh with z-axis correlated colors added, and the bottom image is made with full hidden surfaces, with the lines turned off.
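The surface data for plots like these can be generated with a few matrix operations. The sketch below only computes a sin(r)/r surface and is not the code behind the images above. The grid size and scaling are arbitrary choices, and the commented plotting calls assume that PDL::Graphics::TriD and a working OpenGL setup are available.

```perl
use strict;
use warnings;
use PDL;

# Radial distance of every cell from the center of a 41x41 grid,
# scaled so the radius is 10 at the middle of each edge.
my $n = 41;
my $r = rvals($n, $n) / 2;

# sin(r)/r, with the 0/0 at the center replaced by its limit, 1.
my $zero = ($r == 0);            # 1 at the center cell, 0 elsewhere
my $safe = $r + $zero;           # avoid dividing by zero
my $z    = sin($safe) / $safe * (1 - $zero) + $zero;

print "peak value: ", $z->max, "\n";

# To display the surface interactively (assumption, see above):
#   use PDL::Graphics::TriD;
#   mesh3d([$z]);    # line mesh
#   imag3d([$z]);    # shaded surface
```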

As you can see, the PDL can make many kinds of graphs, and these are only some of the simpler examples you can achieve. However, the more elaborate graphic methods take a bit of time to learn. I learn what I need, then package what I want in easy to use forms.

PDL File Handling

The perl PDL has an impressive, though perhaps a bit arcane, set of file I/O routines. The simple rcols routine can read in columns of data from ASCII files with ease. It is deceptively powerful, given that it can use the powerful regular expression mechanism to handle things like comment lines.

In the following example, the routine asciiin can read in columns of numbers from a file where the values are either blank or comma separated, and the file can have comment lines with either an asterisk or hash symbol as the first character on the line. The EXCLUDE parameter, a regular expression, tells rcols to skip the asterisk or hash marked lines.

sub asciiin {
    my $fname = shift;
    my @tmp = rcols($fname, {EXCLUDE => '/^[\#\*]/'});
    my $x = transpose(cat(@tmp));
    return $x;
}

The asciiin subroutine uses rcols to read in a list of numbers, then uses the cat command to convert the list to a PDL piddle, then uses the transpose routine to put the piddle in the proper row/column matrix format. That's quite a bit of stuff being handled by about 3 lines of code.
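To see which lines the EXCLUDE pattern rejects, the same regular expression can be tried in plain Perl, without rcols. The sample lines below are made up for illustration, and the split pattern is just one way to accept either blanks or commas as separators.

```perl
use strict;
use warnings;

# Made-up sample of what such a data file might contain.
my @lines = (
    "# header comment",
    "* another comment",
    "1 2 3",
    "4,5,6",
);

# Same pattern as the EXCLUDE option: skip lines starting with # or *.
my @data = grep { !/^[\#\*]/ } @lines;

# Split each remaining line on commas or runs of blanks.
my @rows = map { [ split /[\s,]+/, $_ ] } @data;

print join(" ", @$_), "\n" for @rows;
```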

As to image files, the PDL has built in routines to read and write fits format image files. If one installs the netpbm Linux library, then instructs PDL programs to use the PDL::IO::Pic library, the PDL programs can also read and write gif, jpeg, pnm and other graphics formats. This is yet another feature that makes the PDL an effective image processing platform.
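A FITS round trip can be sketched with the wfits and rfits routines, assuming PDL is installed. The file name and the small synthetic image are placeholders.

```perl
use strict;
use warnings;
use PDL;
use PDL::IO::FITS;
use File::Temp qw(tempdir);

# Work in a temporary directory that is removed on exit.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/ramp.fits";      # placeholder file name

my $img = sequence(8, 6);         # an 8x6 stand-in for a real image

wfits($img, $file);               # write the piddle as a FITS file
my $back = rfits($file);          # read it back into a new piddle

print "round trip ok\n" if all($back == $img);

# With netpbm installed, rpic and wpic from PDL::IO::Pic follow the
# same read/write pattern for formats such as jpeg and pnm.
```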


In summary, the perl language with the PDL extension is a scripting language of tremendous reach. With high level concepts like objects, integration with a number of different graphics packages, a robust set of matrix functions, and unique text manipulation routines, perl with the PDL is a one-stop shop for data processing. One can work on data interactively through the perldl program, or make batch files that are easy to create and maintain. The syntax of the pdl is a bit different, but learning it gives the user capabilities not available in most any other mathematics language.

Below are listed my subjective pros and cons about the PDL math language.

Pros:

  • Freely available for both Linux and Windows. Some of the graphic utilities are only available in Linux.

  • The PDL has the entire Perl programming language to draw upon for support, making string handling and ASCII data handling second to none.

  • The programming syntax of Perl is very C like, making it easy to transition to if you've used C, C++, or Java.

  • The Perl PDL system fully supports object oriented programming.

  • The Linux version can make use of the Linux Curses and TCL/TK interfaces.

  • Is fully integrated with the TriD 3d graphics platform and the PGPLOT 2D graphics platform.

  • With the PDL's integration with Perl, the combination is the most complete language on the Math language scene.

  • Is very fast when problems are fully vectorized and use matrix operators, and still quite fast even when loops are used.

  • Works reasonably well interactively via perldl, and makes a very easy to use and effective batch language for mathematics programming.

  • Has full integration to graphics utilities so that mouse clicks can deliver data back to the perl script.

  • In addition to multi-dimensional matrices, the Perl/PDL supports more complex data forms such as lists and hashes.

Cons:

  • Is a big system, taking more time to load than many others.

  • While Perl syntax is easy to learn, being much like C, the syntax of the PDL can take a while to master.

  • The PDL commands can be a bit verbose, making interactive use cumbersome.

  • If function libraries are used, troubleshooting new routines is cumbersome in that Perl won't reload a library. You must get out of perldl and back in to cause a library to reload. This is not true if the autoloader feature for individual .pdl function files is used.