Linux Goodies

In Pursuit Of The Perfect O/S

A Review of the Perl PDL Matrix Language

PDL, the Powerful Math Extension for Perl

[Image: PDL 3D plot of the sin(x)/x function]

At left you see a 3D rendition of the classic sin(x)/x function as produced by the PDL (Perl Data Language) extension to perl. The Octave Review, Scilab Review, Yorick Review, and R Language Review show similar plots made with those languages. Other types of PDL graphs are shown later on this page.



A Bit of History

If you're unfamiliar with the PDL, perhaps a little history of its development would be helpful. The PDL is an extension of the report generating language perl. Perl, if you've never used it, is quite a conceptual departure from conventional languages. It is a scripting language with unusually powerful text manipulation facilities.

In perl, text manipulation expressions are called "regular expressions." They give perl an incredibly powerful way of parsing, combining, searching, and manipulating textual information.

These unusual text handling capabilities of perl go far beyond report generating. I use perl all the time for working with ASCII data files. Data can be extracted from almost any kind of ASCII file and presented in a more tabular format for perl or other programs to process. Perl has also grown to be one of the main CGI languages used to support the web. Many extensions are available for perl, including networking.

Perl and the PDL are available in Windows, Unix and Linux, but the full graphics of the PDL are only available in Unix and Linux. This review only covers the Linux version.

In recent years, perl has grown to become an object oriented language. This object oriented capability gave some astronomers an opening to envision a mathematical language with a much broader reach than the usual matrix language.

Most matrix math languages are very good at doing all manner of mathematical, engineering, and statistical calculations. But all are very limited in the general handling of data. That is, arcane ASCII files of data often must be parsed and converted by some other process into something more digestible by these languages. And report or document making from data processed by a typical matrix language must usually be done by some other process. Matrix math languages mostly do math and graphics, and that's all.

But these astronomers thought that if one could use the object oriented capabilities of perl to add a powerful matrix math capability, then one would have in a single language everything needed to manipulate data -- from beginning to end. A language that could tackle data in almost any form, process it as well and as quickly as any matrix language, and output data in virtually any form -- including document format or reports. The whole enchilada.

That's the PDL. It's a matrix math language extension to the already incredible language perl. It lets one tackle the same mathematical, engineering, signal processing, and statistical problems that can be done with any other matrix language. The PDL is even more capable than some math languages, being able to handle matrices with more than 2 dimensions. And since astronomers tend to work with astronomical images, the PDL also has capabilities that make image processing much easier than is possible with other matrix languages.


PDL Syntax

The basic syntax of perl, aside from the "regular expressions," is very similar to C. Interactive use of the PDL is through a program called perldl. The PDL nomenclature is expressed largely in the object nomenclature of perl. PDL operations are mostly object member function calls. To many, the strange and lengthy commands of the PDL might be objectionable, and the PDL is not particularly convenient to work with interactively by using the perldl utility. But if perl programs are run as executable script files (easily done), then the verbosity of the commands is of little consequence.

It takes some time to get used to using the PDL set of data handling objects. Following is an example of how one creates a 4 column, 3 row matrix in Octave and in the PDL.

In Octave
x = [1,2,3,4;11,12,13,14;21,22,23,34];

In the PDL, use the pdl operator
$x = pdl([1,2,3,4],[11,12,13,14],[21,22,23,34]);

The pdl operator in PDL makes an object that holds the matrix data and the member functions that can operate on it. The object created with the pdl operator is called a piddle.
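
As a quick check of the example above, here is a minimal sketch that builds the same piddle in a script and inspects it. The variable names are only illustrative; dims, at, and ordinary arithmetic operators are standard parts of the PDL.

```perl
use strict;
use warnings;
use PDL;

# Build the 4 column, 3 row matrix from the example above.
my $x = pdl([1,2,3,4],[11,12,13,14],[21,22,23,34]);

# dims returns the size of each dimension, columns first.
my ($ncols, $nrows) = $x->dims;
print "columns: $ncols, rows: $nrows\n";

# Arithmetic applies element by element, with no explicit loops.
my $scaled = $x * 2 + 1;

# at() returns a single element as a perl scalar.
# Its arguments are (column, row), both counted from 0.
print "last element: ", $scaled->at(3,2), "\n";
print $scaled;
```

Saved to a file, this runs with the ordinary perl command; the same lines (minus the use statements) can also be typed into perldl.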

Below is the Octave and PDL syntax for making a matrix that is a subset of a larger matrix. The new matrix in the following example contains only columns 2 and 3, with all of the rows.

In Octave:
y = x(:,2:3);

In the PDL:
$y = $x->slice("1:2,:");

Confused? Don't be, it isn't that bad. First note that data object names in the PDL are preceded by a dollar sign (as are scalars in perl). Second, note that in the PDL, indices run from 0 to n-1, whereas in most matrix languages they go from 1 to n. In the PDL, the column indices come first, followed by the row indices. It's the other way around in most other matrix languages. Finally, note that with the PDL, the sub matrix extraction is done by referencing the slice member function of the $x matrix object.

Remember, $x is actually an object with data, not a matrix. Slice lets one pick contiguous ranges of indices. Another operator, named dice, allows non-contiguous ranges to be selected. You must admit, the PDL is the only language you'll probably find that lets you slice and dice piddles.
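
To make the difference concrete, here is a small sketch that uses slice and dice side by side on the matrix from the earlier example. The variable names are made up for illustration, and the index lists passed to dice are just one possible choice.

```perl
use strict;
use warnings;
use PDL;

my $x = pdl([1,2,3,4],[11,12,13,14],[21,22,23,34]);

# slice takes a contiguous range: columns 1 through 2, all rows.
my $middle = $x->slice("1:2,:");

# dice takes explicit index lists, one list per dimension:
# columns 0 and 3 (which are not adjacent), and rows 0, 1, and 2.
my $outer = $x->dice([0,3], [0,1,2]);

print "middle columns:", $middle;
print "outer columns:",  $outer;
```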

Not obvious in the slice example is another attribute of the PDL. In the Octave example, as in most other matrix languages, the y matrix is a new and separate matrix. Subsequent operations on y will not have any effect on x because y is a completely separate entity. But with the PDL, $y is just a handle pointing to the subset of $x that includes columns 2 and 3 (1 and 2 in PDL index notation). Operations on $y will actually affect the values of that subset in $x. What if you don't want that to happen? Then you use the pdl operator or the copy member function to create a new matrix.

$y = pdl($x->slice("1:2,:"));
or
$y = $x->slice("1:2,:")->copy;
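
The handle behavior is easiest to see by trying it. The following is a minimal sketch with illustrative variable names; it relies on the PDL assignment operator .= to write into existing data.

```perl
use strict;
use warnings;
use PDL;

my $x = pdl([1,2,3,4],[11,12,13,14],[21,22,23,34]);

# A slice is a handle onto part of $x, not a separate matrix.
my $view = $x->slice("1:2,:");

# .= assigns into the existing data, so the change flows back into $x.
$view .= 0;
print "x after zeroing the slice:", $x;

# A copy is independent: changing it leaves $x alone.
my $copy = $x->slice("0:1,:")->copy;
$copy .= -1;
print "x after changing the copy:", $x;
```

Note the use of .= rather than a plain = sign. A plain assignment such as $view = 0 would simply point the perl variable at a new value and leave $x untouched.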

The odd business of being able to create handles that represent subsets of larger matrices may seem senseless, but it's part of the elegance of the PDL that helps with image processing. It can often reduce memory requirements considerably by avoiding unnecessary interim matrices. To supplement the image processing aspects of the PDL, routines are included to read and write FITS format image files, and the PDL Pic package is available to handle other kinds of image formats.

Making Your Own PDL Functions

Perl user defined functions can be stored in files with a pm extension. An odd requirement is that each pm file must return a true value when loaded, so a 1 followed by a semicolon is placed at the end of the file.

For a pm file to load properly, put this as the last line:
1;

Pm files can contain as many functions as you like, though for efficiency it's better to package similar functions together. These packages can then be loaded with the use command. One issue is that you must tell perl where your library files are, and perl doesn't automatically expand shorthand file indicators. That is, the common practice of using the tilde character to indicate your home directory isn't directly expanded by perl. But you can use the glob command to force expansion of a path.

To set library location to a directory named perl in your home directory
and then load libraries from there:
use lib glob("~/perl/");
use lib1;
use lib2;
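
For illustration, a small pm library might look like the sketch below. The file name MyStats.pm, the package name, and the spread function are all hypothetical; only the overall layout (a package line, the subroutines, and the final 1;) reflects what perl expects.

```perl
# File: ~/perl/MyStats.pm  (hypothetical name and location)
package MyStats;

use strict;
use warnings;
use PDL;

# Make spread() callable without the MyStats:: prefix after "use MyStats;".
use Exporter 'import';
our @EXPORT = qw(spread);

# Return the difference between the largest and smallest element.
sub spread {
    my $p = shift;
    return $p->max - $p->min;
}

# A pm file must end by returning a true value.
1;
```

With the library directory set as shown above, a script would load this file with use MyStats; and could then call spread(pdl(3,9,4)) directly.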

Packaging related subroutines into pm libraries is a handy way to quickly load the routines. By editing the .perldlrc file in your home directory, you can enter your commonly used libraries and have them always loaded when you run the interactive perldl environment.

One issue with the pm files is that if you're working on a new routine and make an error, you must exit perldl each time and re-enter in order to try changes to a pm library.

There is a way to have perldl auto load routines in the same way as MATLAB and Octave. Like MATLAB and Octave, these auto load files must have file names the same as their function names (except for the .pdl extension on the file name). By creating your own functions in .pdl files, you don't have to load them with the use command as with pm files; you can just call the functions, and they'll automatically be loaded when called.

To have the autoloader always available to you when you run perldl, put the following command into your .perldlrc file:

use PDL::AutoLoader;

If you want the autoloader to reload if you've edited a .pdl file,
add this line to the .perldlrc file also:

$PDL::AutoLoader::Rescan=1;

You can set the environment variable PDLLIB to the path of your own pdl files so the PDL can find them.
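
As a sketch of what such an auto load file might contain, here is a hypothetical hypot2.pdl. The function name and its purpose are made up for illustration. The closing 1; is a precaution carried over from pm files, on the assumption that it does no harm if the autoloader doesn't need it.

```perl
# File: hypot2.pdl  (hypothetical; the file name must match the sub name)
# Place it in a directory listed in the PDLLIB environment variable.
use PDL;   # already loaded inside perldl; kept so the file also runs alone

# Element by element length of the hypotenuse, given piddles of the two legs.
sub hypot2 {
    my ($p, $q) = @_;
    return sqrt($p**2 + $q**2);
}

# Ending on a true value is harmless here.
1;
```

Once the file is on the PDLLIB path, typing something like p hypot2(pdl(3,5), pdl(4,12)) in perldl should load and run it on first use.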

Help is easy to find within the perl PDL system. In many languages, you can get help on specific routines, but if you don't know what name a routine might have, help is much harder to find. With perl (in the perldl interactive environment), just enter ??topic to get a list of all functions whose documentation contains the topic you enter. To then get information on a specific routine, use a single question mark. ?avg displays the documentation on the avg routine, as an example.

PDL Graphics

[Images: PDL contour plot (left), PDL color contour plot (right)]

The perl PDL can accomplish a wide variety of graphical presentations, and to do so it can call upon a number of graphics packages. I commonly use the pgplot library for line plots, contour plots, and image plots. The contour graph you see above left is an example of a line contour plot created with the pgplot interface.

The graph on the right is a color filled contour graph of the sin(x)/x function, also created using the pgplot package. In addition to creating a graph for the screen, pgplot can directly output files in many standard graphic formats, including GIF, PostScript, and PNG.
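
The sketch below shows one way such a surface could be computed and handed to pgplot. It is not the script used for the plots on this page; the function names, grid size, and range are assumptions chosen for illustration. The plotting routine uses the object interface in PDL::Graphics::PGPLOT::Window and is only defined here, not called, so the numeric part runs even on a system without pgplot.

```perl
use strict;
use warnings;
use PDL;

# Evaluate sin(r)/r on an n x n grid covering -range..range on both axes.
sub sinc_surface {
    my ($n, $range) = @_;
    my $x = zeroes($n, $n)->xlinvals(-$range, $range);
    my $y = zeroes($n, $n)->ylinvals(-$range, $range);
    # A tiny offset keeps the center from evaluating 0/0.
    my $r = sqrt($x**2 + $y**2) + 1e-12;
    return sin($r) / $r;
}

# Draw line contours on a pgplot device such as '/xs' or 'sinc.png/PNG'.
# pgplot is loaded here so the rest of the file works without it.
sub plot_contours {
    my ($z, $device) = @_;
    require PDL::Graphics::PGPLOT::Window;
    my $win = PDL::Graphics::PGPLOT::Window->new(Device => $device);
    $win->cont($z);
    $win->close;
}

my $z = sinc_surface(51, 10);
print "peak value: ", $z->max, "\n";

# Uncomment one of these on a system with pgplot installed:
# plot_contours($z, '/xs');           # on screen
# plot_contours($z, 'sinc.png/PNG');  # PNG file
```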

[Images: PDL map plot (left), PDL Tycho plot (right)]

There are a large number of color schemes included in the PDL pgplot package, and it's pretty easy to make your own. The color schemes are 3 row vectors of 256 values from 0 to 1, one row each for the red, blue, and green component. These are saved in the FITS file format.

The image above left is a map produced from topographic elevation values, and is colored using a terrain colormap. As you can see, the images produced by the PDL with the pgplot package can be pretty elaborate. This graphic has almost a satellite photograph appearance.

The image above right is a digital image of the lunar crater Tycho, displayed by the pgplot package. The image was taken through my 6 inch reflecting telescope using an inexpensive web cam astrocamera (see Cheap Astrophotography).

I chose this moon image to emphasize the collection of image processing functions included with the perl PDL. I processed this image with a PDL program I wrote, adding some sharpening and contrast adjustments with the GIMP image processing program. I was able to make a similar program with other matrix languages, but the PDL made the process easy, and many more enhancements could be added that would be difficult in other languages.

The PDL program I created allowed me to select a sub-region of the image, select which of about 40 frames of the crater I wished to use, then align the frames and compute an average image. Using this process averages out the pixel anomalies and variations produced by the inexpensive web cam and the atmosphere. The result: photographs like this fine image of the Tycho crater.
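
The averaging step by itself is short in the PDL. The sketch below is not the stacking program described above; it assumes the frames have already been cropped and aligned, and it uses tiny synthetic arrays in place of real web cam frames. The function name average_frames is made up for the example.

```perl
use strict;
use warnings;
use PDL;

# Average a list of equally sized 2D frames pixel by pixel.
# The frames are assumed to be aligned already.
sub average_frames {
    my @frames = @_;
    my $stack = cat(@frames);          # dims: (width, height, nframes)
    return $stack->mv(2, 0)->average;  # average along the frame axis
}

# Three tiny synthetic "frames" stand in for real web cam images.
my @frames = map { sequence(4, 3) + $_ } (0, 3, 6);

my $mean = average_frames(@frames);
print $mean;
```

Here cat stacks the frames into a 3 dimensional piddle, mv brings the frame axis to the front, and average collapses that axis, so no explicit loop over pixels is needed.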

The PDL also includes an interface to the plplot package, which has similar capabilities to the pgplot package.

The pgplot interface lets the user interact with a plot, allowing the user to obtain mouse location, button pushed, and area selected. These features are employed in my image stacking program to allow me to effectively stack astrophotography images.

[Images: plain PDL mesh plot (left), color PDL mesh plot (center), PDL color 3D surface plot (right)]

For 3D and surface plots, PDL provides the TriD package. This package, which uses the computer's graphics card accelerator, plots all manner of 3D plots, and allows the user to interactively rotate the images around different axes to see the images from any perspective.

The left image above is one of the simplest of 3D plots provided by the TriD package. It's a simple mesh plot made with lines. The center image is a mesh with z-axis correlated colors added, and the right image is made with full hidden surfaces, with the lines turned off.

As you can see, the PDL can make many kinds of graphs, and these are only some of the simpler examples you can achieve. However, the more elaborate graphic methods take a bit of time to learn. I learn what I need, then package what I want in easy to use forms.

PDL File Handling

The perl PDL has an impressive, though perhaps a bit arcane, set of file I/O routines. The simple rcols routine can read in columns of data from ASCII files with ease. It is deceptively powerful, since it can use perl's regular expression mechanism to handle things like comment lines.

In the following example, the routine asciiin can read in columns of numbers from a file where the values are either blank or comma separated, and the file can have comment lines with either an asterisk or hash symbol as the first character on the line. The EXCLUDE parameter, given as a regular expression, tells rcols to skip the asterisk or hash marked lines.

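# Read numeric columns from an ASCII file into a single piddle,
# skipping comment lines that start with # or *.
sub asciiin
{
    my $fname = shift;
    # rcols returns a list with one piddle per column of the file.
    my @tmp = rcols($fname,{EXCLUDE=>'/^[\#\*]/'});
    # cat joins the list; transpose restores the file's row/column layout.
    my $x = transpose(cat(@tmp));
    return $x;
}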

The asciiin subroutine uses rcols to read in a list of numbers, then uses the cat command to convert the list to a piddle, then uses the transpose routine to put the piddle in the proper row/column matrix format. That's quite a bit of work being handled by about 3 lines of code.
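
To see the routine in action, the sketch below writes a small sample file and reads it back. The sample data and the temporary file are invented for the example, and the data lines are blank separated only.

```perl
use strict;
use warnings;
use PDL;
use File::Temp qw(tempfile);

# Same routine as above, repeated so this example runs on its own.
sub asciiin {
    my $fname = shift;
    my @tmp = rcols($fname, {EXCLUDE => '/^[\#\*]/'});
    return transpose(cat(@tmp));
}

# Write a small sample file: two comment lines and three data rows.
my ($fh, $fname) = tempfile(UNLINK => 1);
print $fh "# sample data\n";
print $fh "* another comment\n";
print $fh "1 10\n2 20\n3 30\n";
close $fh;

# The result has one PDL row per line of data in the file.
my $m = asciiin($fname);
print "dims: ", join(" x ", $m->dims), "\n";
print $m;
```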

As to image files, the PDL has built in routines to read and write FITS format image files. If one installs the netpbm Linux library, then instructs PDL programs to use the PDL::IO::Pic library, the PDL programs can also read and write gif, jpeg, pnm and other graphics formats. This is yet another feature that makes the PDL an effective image processing platform.
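
A minimal round trip through a FITS file might look like the sketch below. The synthetic image and file names are placeholders. The PDL::IO::Pic lines are left as comments because they depend on netpbm being installed.

```perl
use strict;
use warnings;
use PDL;
use PDL::IO::FITS;
use File::Temp qw(tempdir);

# A small synthetic image stands in for a real photograph.
my $img = sequence(8, 6);

# Write the image as a FITS file, then read it back.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/demo.fits";
wfits($img, $file);
my $back = rfits($file);

print "round trip ok\n" if all($back == $img);

# With netpbm installed, PDL::IO::Pic adds rpic and wpic for other formats:
# use PDL::IO::Pic;
# wpic($img->byte, "$dir/demo.png");
# my $png = rpic("$dir/demo.png");
```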


Summary

In summary, the perl language with the PDL extension is a scripting language of tremendous reach. With high level concepts like objects, integration with a number of different graphics packages, a robust set of matrix functions, and unique text manipulation routines, perl with the PDL is a one-stop shop for data processing. One can work on data interactively through the perldl program, or make batch files that are easy to create and maintain. The syntax of the PDL is a bit different, but learning it gives the user capabilities not available in almost any other mathematics language.

Below are listed my subjective pros and cons about the PDL math language.

Pros:
  • Freely available for both Linux and Windows. Some of the graphic utilities are only available in Linux.

  • The PDL has the entire Perl programming language to draw upon for support, making string handling and ASCII data handling second to none.

  • The programming syntax of Perl is very C like, making it easy to transition to if you've used C, C++, or Java.

  • The Perl PDL system fully supports object oriented programming.

  • The Linux version can make use of the Linux Curses and TCL/TK interfaces.

  • Is fully integrated with the TriD 3d graphics platform and the PGPLOT 2D graphics platform.

  • With the PDL's integration with Perl, the combination is the most complete language on the Math language scene.

  • Is very fast when problems are fully vectorized and use matrix operators, and still quite fast even when loops are used.

  • Works reasonably well interactively via perldl, and makes a very easy to use and effective batch language for mathematics programming.

  • Has full integration to graphics utilities so that mouse clicks can deliver data back to the perl script.

  • In addition to multi-dimensional matrices, the Perl/PDL supports more complex data forms such as lists and hashes.


Cons:
  • Is a big system, taking more time to load than many others.

  • While Perl syntax is easy to learn, being much like C, the syntax of the PDL can take a while to master.

  • The PDL commands can be a bit verbose, making interactive use cumbersome.

  • If function libraries are used, troubleshooting new routines is cumbersome in that Perl won't reload a library. You must get out of perldl and back in to cause a library to reload. This is not true if the autoloader feature for individual .pdl function files is used.