Linux Goodies

In Pursuit Of The Perfect O/S



A Review of the Euler Mathematics Toolbox

Matrix Languages Head to Head

Matrix Language Advantages
Language Availability
Language Speed
Language Styles
Language Features
Benchmark Program
Language Syntax
Octave Notes
Scilab Notes
Euler Toolbox Notes
Yorick Notes
R Notes
PDL Notes

A Benchmark and Subjective Assessment of Matrix Languages

Are you searching for the best matrix language for your upcoming project? How about for those testy math or engineering classes? Perhaps you're just browsing, and would like to learn what matrix languages are all about. I've reviewed a number of matrix languages on different pages of this web site, but on this web page I will combine the individual reviews into more of a head to head review.

That head to head will show results of a benchmark produced by each of 6 matrix languages, each running an image processing program on the same data. I will also give my subjective assessment of each language as to what I see as its particular strengths and weaknesses. Hopefully the assessment will give you some insight for picking the language best suited to your programming style and your projects.


Why Matrix Languages in the First Place?

If you're new to the matrix language world, you may wonder why people in math, engineering and science flock to such languages. The concepts offered by matrix languages are discussed at Why Matrix Languages. Basically there is a desire to handle numbers on a computer easily, yet with speed. Easily tends to mean with a language that doesn't require you to do the old edit, compile, link, and run sequence. Interpretive languages avoid all that, so programming with them is much easier and code can be developed much faster. You just write code and execute.

The potential problem with interpretive languages is lack of speed. It takes much longer for an interpretive language to process large amounts of data, especially if a lot of looping is involved. Matrix languages solve that by supplying compiled routines that can process large matrices of data very quickly. In that way, the looping happens down in the heart of compiled functions.

There are two trade-offs to matrix languages. One is that you must learn to vectorize your solutions so that you do little to no looping within your programs. That means you must learn to express your solutions as matrix equations. The other limitation is that in order to do this speed magic, matrix languages must potentially hold large amounts of data in memory. One nice thing is that modern computers tend to have huge amounts of memory, making matrix languages very useful and popular.

Here's a vectorization example. The first listing shows a program that creates one constant random vector and a large number of additional vectors to be dotted (dot product) with the constant vector. The program prints out the result on each 50000th cycle. The first listing may look strange. It's written for a Forth-like tool I wrote years ago, before computers were big enough to run matrix languages effectively. The program is interpretive, working on one record of data at a time, and writing out the result as each record is processed. No big memory needed, and no compilation either. The catch with this utility? It uses a Reverse Polish Notation (RPN) language.

Non Vectorized Solution in Forth Like Language

0 = n : initialize counter
  rnd rnd rnd v= e : make initial random vector
do : start loop
  n 1 + = n : increment counter
  rnd rnd rnd v= b : make new random vector
  v@ b   v@ e   vdot = x : do dot product
  n 50000 == if   a . x . cr 0 = n   then : print, reset counter
loop : repeat loop

Though the language is likely unfamiliar, the concept is clear. Create a constant vector, then in a loop, produce a new random vector and dot product the constant vector with the new one. On a specified count, print results to the screen. In this cryptic language, the . means print. Each command has to be re-interpreted on each loop. Even though the language does a pseudo-compile to speed things up, it still can't keep up with a compiled program. But it is very easy to code and run, and is about as fast as a modern spreadsheet.

Here's a vectorized solution for the Euler Toolbox program:

Euler Toolbox Vectorized Solution

n=1000000; : n = number of vectors
y = random(3,1); : create random column vector
x = random(n,3); : create matrix of n vectors
z = x.y; : dot all rows of x with y
i = 1:50000:n; : create index for every 50000th
z[i] : print each 50000th result

Note that there are no explicit loops in the vectorized version. All of the random vectors needed are generated with a single statement. All dot products are produced with a single statement. The indices of the output values are generated in a single statement. Just referencing the resulting vector (z) with the index variable and no trailing semi-colon lists every 50000th result.

Get the idea? To vectorize, you create matrices of results with matrix operators and matrix equations. There may be no loops in the program at all. The result is a program that runs very fast.
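For comparison, here is a rough sketch of the same vectorized idea in NumPy, which is mentioned on this page but was not one of the six benchmarked languages. The variable names mirror the Euler listing, and the fixed seed is only an assumption added so the run is repeatable.

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded only so the run is repeatable

n = 1_000_000                    # number of vectors
y = rng.random(3)                # the constant random vector
x = rng.random((n, 3))           # matrix of n random vectors, one per row
z = x @ y                        # dot every row of x with y in one call
every = z[::50_000]              # every 50000th result, starting with the first
print(every)
```

As in the Euler version, there is no explicit loop. The looping happens inside the compiled matrix routines.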


Decide on Your Criteria

What's Available?

To choose the best language for your purpose, it seems to me that there are about 4 things to consider. First of all, you have to consider what's available. I'm assuming we're talking free languages here, and Linux has likely the most available free languages. In this article, I discuss Octave, Scilab, Euler Toolbox, Yorick, R, and the PDL (Perl Data Language). There are certainly others, including but not limited to Freemat, Tela, and NumPy. I've reviewed all of the languages compared on this page on other pages of this site, in addition to Tela:

Euler Toolbox


Language Speed

There are a lot of benchmarks out there for most of the common languages. Some, like the Euler Math Toolbox (EMT for short), are less represented. I wrote one application for image processing that works the languages pretty hard, and converted that to each of the languages discussed here. Details are presented later, but for this particular task, impressions are as follows, listed in order of speed:

Yorick : Fast, roughly 2.4x to 5.8x faster than the others
Octave : Moderately fast, 2.4x slower than Yorick
PDL : About 3x slower than Yorick
Euler Toolbox : About 3x slower than Yorick
R : About 3x slower than Yorick
Scilab : Nearly 6x slower than Yorick

In truth, you should only use this benchmark data as a rough guide. It's only one comparison, and though it used a number of matrix functions in each language, it surely isn't an extensive examination. Other tests using different features will likely give different results. But it is possibly useful as you begin your search, if speed is an important criterion for you.



Language Styles

You might be most interested in programming style. Each language is different in that regard. Some may look like what you've used before, others may look quite foreign. The following table gives some gross insight into the respective language styles (more detail later):

Octave : Very Similar to Matlab, Fortran-like
Scilab : Somewhat Similar to Matlab
Euler Toolbox : Basic-like, simple and easy to learn
Yorick : Very C-like
R : Roughly C-like
PDL : C-like, but PDL objects are unique

Features Emphasized

It may be useful to consider which features best suit your particular needs, irrespective of language style. Each language is good in a general purpose sense, like general data processing. But each excels at something. The following table gives some hints:

Octave : Signal Processing & file i/o, large user base
Scilab : Signal Processing, Xcos graphical modeling utility
Euler Toolbox : Simplicity and instructional ability
Yorick : Speed and scientific support
R : Statistics, has large user base
PDL : Image file flexibility and all-in-one shop

That's very concise and likely in need of elaboration, which will come later.


The Benchmark

I've worked with all of the languages compared in this article in the past, with the exception of the Euler Toolbox. I'd played with it before, but I thought I needed to take on a couple of bigger projects to get more familiar with it. One of the programs I wrote in the Euler Toolbox was a new look at an old Yorick program. The Yorick program is one I've used to process lunar and planetary images taken with some of my telescopes and my Celestron NexImage astro-camera. Some examples of processed images can be viewed at ETX Astro Photos.

The camera is a web cam modified for astro-photography use. It slides into the focuser of a telescope, replacing the eyepiece. It produces avi movie files of my selected targets. My goal is to align and correlate each successive frame with the first frame of a target movie, and ultimately average all the acceptable (chosen by correlation) frames together to produce an image free of web cam pixelization, and mostly free of atmospheric distortion. In matrix language programming, the result isn't a big program in lines of code, but a program that gives the languages lots of work to do, as correlating images takes a lot of mathematical processing.

I don't believe any of the matrix programs I use can read avi movie files directly. To solve that problem I used the Linux mplayer program to split out each frame of the movie to a pnm image file. The pnm family of file structures consists of very simple formats that some of the matrix languages could already read, and a reader is easy to code in the languages that lack one.
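To give a feel for how little code a pnm reader needs, here is a minimal sketch in Python. It is not one of the routines written for the benchmark. The function name read_pnm is hypothetical, and the sketch assumes the binary 8-bit variants only (P5 greyscale and P6 color), with header comments appearing only at the start of a token.

```python
def read_pnm(data: bytes):
    """Parse a binary PGM (P5) or PPM (P6) image held in a bytes object.

    Returns (width, height, channels, pixels), where pixels is a flat list
    of integer samples in file order. Only 8-bit files are handled.
    """
    pos = 0
    tokens = []
    # The header is four whitespace-separated tokens:
    # magic number, width, height, and maximum sample value.
    while len(tokens) < 4:
        while data[pos:pos + 1].isspace():        # skip whitespace
            pos += 1
        if data[pos:pos + 1] == b"#":             # comment runs to end of line
            while data[pos:pos + 1] not in (b"\n", b""):
                pos += 1
            continue
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        tokens.append(data[start:pos])
    pos += 1  # exactly one whitespace byte separates header from samples

    magic = tokens[0]
    width, height, maxval = (int(t) for t in tokens[1:])
    if magic not in (b"P5", b"P6") or maxval > 255:
        raise ValueError("only 8-bit binary PGM/PPM is handled in this sketch")
    channels = 1 if magic == b"P5" else 3
    count = width * height * channels
    pixels = list(data[pos:pos + count])
    if len(pixels) != count:
        raise ValueError("file is shorter than its header claims")
    return width, height, channels, pixels
```

In a matrix language the flat sample list would then be reshaped into a height by width matrix (or three of them for color) before any processing.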

The old Yorick program is an early prototype, and as I experimented and tuned my solution, the code became pretty unreadable. But the concept that evolved is simple enough, so I wrote the Euler version from scratch, taking advantage of all that I'd learned in making the prototype. The result is a program of much cleaner code, and it produces images every bit as good as the ones produced by the old Yorick program.

Since the program isn't too big in the sense of lines of code, and does employ quite a bit of file i/o as well as math calculations, I thought it would make a good exercise for programming again in the other matrix languages I have available. In addition, it seemed that it would provide a pretty good benchmark for comparing the languages.

The exercise also proved that though each of these languages has some emphasis that makes it most suitable for certain problems, all of the languages are good for general purpose work. In two of the languages, Euler Toolbox and Octave, I did have to write routines to handle image files, but otherwise all languages had the features that could be packaged to do what I needed. Though I haven't looked, I suspect that I could have found image file i/o routines for Octave.

The Benchmark Program

The goal of the benchmark program is to improve a single image, say of the moon, like this:

Raw Albategnius Crater Image

You might think you could just use a photo manipulation program like Gimp to sharpen the raw image, but that doesn't work. Even just a small amount of sharpening on a single image will produce terrible pixel noise as shown below.

Sharpened Single Frame

But if a few dozen images are aligned and averaged together, the graininess goes away, and sharpening produces images that reveal the detail nearly down to the resolution of the particular telescope.

Processed Albategnius Crater Image
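The align, correlate, select, and average idea described above can be sketched roughly as follows in Python with NumPy. This is not the benchmark program itself. The alignment method here (peak of an FFT-based cross-correlation, whole-pixel shifts with wrap-around) is an assumption chosen for brevity, and the names best_shift, stack_frames, and the 0.9 threshold are hypothetical.

```python
import numpy as np

def best_shift(ref, frame):
    """Integer (row, col) roll that best lines frame up with ref, taken
    from the peak of the FFT-based circular cross-correlation."""
    cross = np.fft.ifft2(np.fft.fft2(ref) * np.conj(np.fft.fft2(frame))).real
    return np.unravel_index(np.argmax(cross), cross.shape)

def stack_frames(frames, min_corr=0.9):
    """Align every frame to the first, keep those whose correlation
    coefficient with the first is at least min_corr, and average them."""
    ref = frames[0].astype(float)
    kept = []
    for frame in frames:
        frame = frame.astype(float)
        dr, dc = best_shift(ref, frame)
        aligned = np.roll(frame, (dr, dc), axis=(0, 1))   # wrap-around shift
        r = np.corrcoef(ref.ravel(), aligned.ravel())[0, 1]
        if r >= min_corr:                                  # selection criterion
            kept.append(aligned)
    return np.mean(kept, axis=0), len(kept)
```

There is one loop over frames, but all of the per-pixel work inside it is done by matrix routines, which is the pattern a matrix language rewards.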

The Benchmark Results

The much cleaner Euler Toolbox program was translated into the PDL language, the R language, the Yorick language, the Octave language, and the Scilab language. In the benchmark test, each program processed 154 images of Jupiter, taken with my NexStar 5SE and the Celestron NexImage camera. Each image (frame) was 640x480 pixels. Each image was aligned with the first image, and a correlation coefficient was calculated against the first image. Then each program used a correlation criterion to select images to combine into a final output. It's possible that I didn't use each language's optimal way of solving the problem, but I did solve it the same way in each. The table below shows the time required for each program to do the required operations.

Language   Time (sec)   Yorick Ratio
Yorick     34           1.0
Octave     83           2.4
PDL        106          3.1
Euler      110          3.2
R          116          3.4
Scilab     199          5.8

The Yorick Ratio column shows the ratio of each language's time to the Yorick solution time, which was the fastest. So Octave, for example, took 2.4 times as long to solve this particular problem as did Yorick.
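The ratio arithmetic is simple enough to check with a few lines of Python. The times are the ones from the table. The dictionary layout is just an illustrative choice.

```python
# Benchmark times in seconds, copied from the table above.
times = {"Yorick": 34, "Octave": 83, "PDL": 106,
         "Euler": 110, "R": 116, "Scilab": 199}

fastest = min(times.values())                      # Yorick's 34 seconds
ratios = {lang: secs / fastest for lang, secs in times.items()}

for lang, ratio in ratios.items():
    print(f"{lang:7s} {times[lang]:4d} s  {ratio:.2f}")
```

The two-decimal ratios this prints agree with the one-decimal figures in the table to within rounding.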

I've typically found this type of speed result to be true. Yorick often outperforms the other languages on the types of problems I work on, though just doing specific matrix tests sometimes doesn't reveal that. I was pleasantly surprised to find that the old GTK Linux version of the Euler Toolbox kept up well with the speed of the other languages. The difference in speed on this problem between the Euler Toolbox, R, and the PDL is quite insignificant.

Of course, this is just a single benchmark, but it includes file i/o, matrix math, and graphics. Certainly applying the languages to a different problem might yield different results. But if speed is something of a concern, this test might suggest the order in which you consider languages discussed here.

The bottom line is, for most problems there isn't a terribly significant speed difference between the languages tested. I suggest looking deeper to pick your language.


Some Syntactical Differences

If speed isn't a game changer, personal programming preferences and style might be. The examples shown below are of a simple and contrived problem, to compute miles per gallon given distance and gallons, and return the input data as well as the answer. The multiple return requirement is just to show how the different languages make that available.

All of the languages can simply pack compatibly sized objects into a larger matrix and return that. But in this contrived case I wanted to show how in principle you could return multiple arguments that may not be compatible enough to share a matrix. In all examples, the names passed and returned don't have to be the same, just match in number of arguments. Here's how the language functions look:

Function Declaration - Euler Toolbox

function getmpg(dis, gal)
  mpg = dis/gal;
  return {dis, gal, mpg};
endfunction

To Use Function:
{dis, gal, mpg} = getmpg(100,20);

Very straightforward, much like Fortran or BASIC might look. Notice that when multiple arguments are returned, they are enclosed within curly brackets. Likewise, the call statement must use curly brackets with the expected number of returned items.

Function Declaration - Octave

function [dis, gal, mpg] = getmpg(dis, gal)
  mpg = dis/gal;
endfunction

To Use Function:
[dis, gal, mpg] = getmpg(100,20);

As you can see, Octave doesn't look much different from Euler Toolbox, except the return arguments are indicated in the function declaration line, and there is no return statement. Whatever is listed in the declaration will be returned.

Function Declaration - Octave using Structure

function x = getmpg(dis, gal)
  x.dis = dis;
  x.gal = gal;
  x.mpg = dis/gal;
endfunction

To Use Function:
x = getmpg(100,20);

To access values:
x.dis for distance
x.gal for gallons
x.mpg for mpg

Octave can also return different data types in a single container using its structure technique. While in this case all variables were scalars, they could be different in size and type and be returned in this manner. Similarly, Octave can return a cell array whose cells hold different types of elements; the contents are pulled back out with curly-brace indexing or, when the cells are compatible, with the cell2mat routine.

Function Declaration - Scilab

function [dis, gal, mpg] = getmpg(dis, gal)
  mpg = dis/gal;
endfunction

To Use Function:
[dis, gal, mpg] = getmpg(100,20);

The general layout of functions in Scilab, as shown here, is quite like (exactly like in this small case) Octave. The function declaration line is the same, and the call to the function is the same as in Octave.

Function Declaration - Scilab using List

function x = getmpg(dis, gal)
  mpg = dis/gal;
  x = list(dis, gal, mpg);
endfunction

To Use Function:
x = getmpg(100,20);

To access values:
dis = x(1); gal = x(2); mpg = x(3);

Scilab can also return dissimilar elements in a list. While in this case all returned variables are of the same type and size, with a list that's not a requirement. You can then access the returned variables of the list by index. While the list extraction nomenclature in Scilab is different than that in R, the list concept in both languages is similar.

Function Declaration - Yorick

struct data{double *dis, *gal, *mpg;}

func getmpg(dis, gal){
  mpg = dis/gal;
  ret = data(dis=&dis, gal=&gal, mpg=&mpg);
  return ret;
}

To Use Function:
x = getmpg(100,20);

To access values:
*x.dis for distance
*x.gal for gallons
*x.mpg for mpg

Yorick looks a bit different. First of all, Yorick can only return one argument, and if multiple values of different shape are desired, they can be passed back as a C-style struct. Note the use of C-style pointer nomenclature. You can get around the struct by having a function simply pass back an array that holds pointers to the multiple arguments you wish returned, as shown below:

Function Declaration - Yorick, w/o struct

func getmpg(dis, gal){
  mpg = dis/gal;
  ret = [&dis, &gal, &mpg];
  return ret;
}

To Use Function:
x = getmpg(100,20);

To access values:
*x(1) for distance
*x(2) for gallons
*x(3) for mpg

Again, this is a contrived situation, because the 3 scalar values could easily be handed back in an array without the pointers. But this example shows how you could hand back multiple variables by reference that may not be scalars, but rather arrays or matrices of different sizes. The passed arrays would then be dereferenced with the pointer (*) indicator and the appropriate pointer array index. Note that the struct gives a name to each item, while the simple pointer array does not. But either can be used to return more than one argument.

Function Declaration - R

getmpg <- function(dis, gal){
  mpg <- dis/gal;
  list(dis=dis, gal=gal, mpg=mpg);
}

To Use Function:
x = getmpg(100,20);

To access values:
x$dis for distance
x$gal for gallons
x$mpg for mpg

R doesn't require a return statement; the last value calculated or listed before the end of the routine is what's returned (an explicit return() is available if you prefer one). If multiple values are to be returned, a list can be created. The items in the list don't have to be named as in the above example. You could just use list(dis, gal, mpg). If not given names in the list, the items can be accessed by indexing the result, like x[[1]] for dis, x[[2]] for gal, etc.

Function Declaration - PDL Returning Array

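sub getmpg{
  my ($dis, $gal) = @_;   # unpack the argument array in one step

  # or, equivalently, shift the arguments off one at a time:
  # my $dis = shift;
  # my $gal = shift;

  my $mpg = $dis/$gal;

  return($dis, $gal, $mpg);
}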

To Use Function:
($dis, $gal, $mpg) = getmpg(100,20);

or

@x = getmpg(100,20);

Then to access values:
$x[0] for distance
$x[1] for gallons
$x[2] for mpg

Perl, as you can see, is different. The sub statement is what declares a function (or subroutine). In most modern languages, passed values are automatically placed into local variables. In Perl, an array of arguments named @_ is always passed to subroutines. Programmers must either use the shift statement to get values from the array into variables, or assign the whole @_ array to a parenthesized list of variables, as in my ($dis, $gal) = @_.

A potential gotcha in Perl is that variables are by default global. So the my operator explicitly declares a variable to be local. If multiple values are to be handed back, they can be put into an array (between parentheses). If an array is returned, a single array variable may receive the return array, and the variable must begin with the @ symbol to designate it as an array variable. Individual variables ($ variables) can be used within parentheses to directly unpack the array into individual variables rather than using an @ array variable that will have multiple elements.

The above example places the variables $dis, $gal, and $mpg into an array for the return value. The array elements however, can in general be different size or types of elements. In this case they are all single valued numeric variables.

Function Declaration - PDL Returning Hash

sub getmpg{
  my ($dis, $gal) = @_;

  my $mpg = $dis/$gal;

  return("dis"=>$dis, "gal"=>$gal, "mpg"=>$mpg);
}

To Use Function:
%x = getmpg(100,20);

Then to access values:
$x{dis} for distance
$x{gal} for gallons
$x{mpg} for mpg

In the above illustration, I show another method of passing back multiple values from a Perl routine. Multiple elements can be passed back in a hash construct, which is much like what other languages call a record or structure. Notice that just as Perl uses the sigil $ to indicate a single valued variable or piddle, and @ to indicate an array, it uses % to indicate a hash.

Creating a hash looks much like creating an array except that labels are associated with variables in the hash. Then when using the hash, the labels can be used to reference the elements rather than referencing by numeric index.

As indicated in the description of the contrived problem, each of these languages can make it simple to just return such simple scalar variables in an array, like return [dis, gal, mpg]. The nomenclature for each language is slightly different, but all allow such a simple solution. The following examples show how in each language you can put all 3 scalar variables into a simple array called x, so the single array variable x can be returned. Thus, no need in this case for multiple argument returns, lists, or structs.

Combine Variables into Array for Return

Euler Toolbox : x = [dis, gal, mpg]
Octave : x = [dis, gal, mpg]
Scilab : x = [dis, gal, mpg]
Yorick : x = [dis, gal, mpg]
R : x = c(dis, gal, mpg)
PDL : $x = pdl($dis, $gal, $mpg)

Here's just a sampling of the syntax used to do a few matrix operations in each of these languages:

Scalar Multiply Two Matrices

z = x .* y : Octave
z = x .* y : Scilab
z = x * y : Euler
z = x * y : Yorick
z <- x * y : R
$z = $x * $y : PDL

Even in the simple example above, you can see a difference in nomenclature between the languages. First of all, Octave and Scilab assume that basic math operators are matrix operations. So if you simply want to multiply each element of x by the corresponding element in y, Octave and Scilab need the .* operator, the preceding dot designating the operation as a scalar one. The other languages default basic math operators as scalar operators, and use something different to indicate a matrix operation.

But whoa! PDL variables are actually piddles or objects, and are designated by a leading dollar sign. A leading at sign (@) designates a variable as an array. See Review PDL for more information about the PDL's peculiar syntax.

Matrix Multiply Two Matrices

z = x * y : Octave
z = x * y : Scilab
z = x . y : Euler
z = x(,+) * y(+,) : Yorick
z <- x %*% y : R
$z = $x x $y : PDL

Now things get interesting. Since most of the languages use a simple * as a scalar multiplier, how do they signify a matrix multiply? For Octave and Scilab, it's simple -- they just drop the leading dot. For Euler, dot is the matrix multiply operator. And what's with Yorick and R? Here, the PDL actually does something simple, it uses x as the operator.
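If the difference between the two kinds of multiply is new to you, a tiny worked case may help. The sketch below uses NumPy, which is not one of the six languages compared here. Like Euler, Yorick, R, and the PDL, it treats * as the element-by-element multiply and uses a separate operator (@) for the matrix product. The 2x2 values are made up for illustration.

```python
import numpy as np

x = np.array([[1.0, 2.0],
              [3.0, 4.0]])
y = np.array([[5.0, 6.0],
              [7.0, 8.0]])

elementwise = x * y   # each element times its counterpart:  [[ 5, 12], [21, 32]]
matrix = x @ y        # rows of x dotted with columns of y:  [[19, 22], [43, 50]]

print(elementwise)
print(matrix)
```

The top-left element shows the difference: 1*5 = 5 for the element-by-element product, but 1*5 + 2*7 = 19 for the matrix product.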

Scale Sub-matrix by Scalar

x(1:2,1:2) *= 10 : Octave
x(1:2,1:2) = x(1:2,1:2)*10 : Scilab
x[1:2,1:2] = x[1:2,1:2]*10 : Euler
x(1:2,1:2) *= 10 : Yorick
x[1:2,1:2] <- x[1:2,1:2]*10 : R
($tmp = $x->slice("0:1,0:1")) *= 10 : PDL

This operation doesn't vary so much, except for the PDL. You can see that some languages use parentheses for matrix indices, and some use brackets. Some (Octave, Yorick, and PDL) have a *= operator which does both the multiply and the store. Scilab, Euler and R don't have that convenience.

But again, what's up with the PDL? This strange nomenclature is the result of the fact that PDL piddles aren't really matrices, but objects. So in some cases, simple math statements don't work. Even indexing into a piddle uses the interesting slice object function operator. In PDL there's also a dice operator. One speaks of slicing and dicing piddles, if that makes you interested. The odd nomenclature shown above for scaling a sub-matrix portion makes more sense if broken into two statements:

$tmp = $x->slice("0:1,0:1");
$tmp *= 10;

That looks a bit less confusing. First a handle into the sub-matrix is created, then that handle is scaled. The nomenclature in the previous comparison list shows how one can combine both PDL statements into a single statement. Also note that while all of the other languages index the 1st element of an array or matrix with value 1, PDL starts with 0. PDL also reverses the more common [row,col] indexing with [col,row], columns being the first index. Some people think of it as [x,y].

There are certainly more syntactical differences between the languages, but in general you'll find that Octave, Scilab, and Euler Toolbox nomenclature are similar to one another. They all declare functions similar to the Euler sample shown earlier. Yorick code looks very much like C code. R uses brackets like C, but the language constructs are different.

The most unique is the PDL. Since the matrix holding container is an object, not a matrix as in other languages, PDL leads to some quite strange operations. The strange syntax has some advantages, but isn't so easy to learn. Even with these simple examples, you can see that the PDL offers more than one way to manipulate things. More to learn, but more likely you'll find a form of expression that suits you.


Other Than Syntax, What's Different

Each of these languages has something that gives it some unique character or capability. That's true of other languages as well, these just happen to be the ones I have access to.

The Octave Flavor

Octave, as you may have read, is highly compatible with the commercial language Matlab. It was created back in the 90s by John W. Eaton, and named after one of his professors (Octave Levenspiel). Many companies and schools use Matlab, not so many individuals do. Why? Matlab isn't cheap, while Octave, like all of the languages presented here, is free. When I say Octave is highly compatible with Matlab, I mean that its syntax for programming and even common i/o, graphics, and math functions are similar. If you know one, you can easily learn the other. You can even easily port your code from one to the other in most cases.

What else? Octave is heavily loaded with signal processing functions. If you're going to do signal analysis or filtering, it's a good choice. I worked for years doing time series analysis, and occasionally needed some specific filter functions. Octave was my go-to for this because it has a big collection of such functions.

You may have Octave available in your current Linux package installer. If not, try GNU Octave. If you happen to use Puppy Linux, try Slacko Archive for a package called mathslacko.sfs. A kind Puppy Linux user named Emil put together Gnuplot, Maxima, Octave, and R in an easy to use package for Slacko Puppy, and the documentation says that at least parts of it work in other versions.

Octave is the language of this selection that is most indicative of what I call The Linux way of thinking. By that I mean it makes the most of what's generally already available to augment its power. Rather than have an internally developed graphics pack, it uses Gnuplot, which is available on nearly every Linux system. Octave programs can easily be run in batch mode using the same technique as with a BASH file, or any other shell scripting language. The first line of a typical Linux script file has what's called a bang statement which tells Linux what utility runs the script. It's called a bang statement because its second character is an exclamation point.

Here's a few examples of that concept:

#!/bin/sh : Indicates Bash Shell
#!/bin/tcsh : Indicates C Shell
#!/usr/bin/octave : Indicates Octave
#!/usr/bin/perl : Indicates Perl/PDL
#!/usr/bin/yorick -i : Indicates Yorick

In each of the above cases, the code following the bang statement would be interpreted by the indicated utility. None of the other languages integrate with Linux quite that simply, though Yorick comes close. Most, like the Yorick example, need not only the bang statement, but some additional parameters. Some also need some prep commands in the following code.

Octave basically works with 2 dimensional matrices. Functions can return one or more arguments. In addition, Octave has a list type container that can hold data of dissimilar size.

Octave is one of the most flexible in its support of file i/o. In addition to simple load and save functions for matrices, it has a decent complement of C style file functions, allowing the user to likely read and write nearly any file structure, ASCII or binary.

As mentioned before, Octave uses Gnuplot as its primary plotting utility. Gnuplot can produce 2d and 3d plots with considerable flexibility, and can also show images. The version I was using for this project (version 3.2.3) didn't give me the ability to use the mouse to select points on an image plot. It would do that on a line plot, but not an image plot. Since I needed that ability for this image processing task, I had my Octave program call an external program (PDL) to do this task, and get the results from the external utility.

The user interface for most of these languages, including Octave, is bleak to some programmers. They present a blank screen within which you type commands. Of course, for all of them you can use an external editor to create programs. In Octave you generally create what are called m files. One nice feature of Octave is that you can reference any m file function within an interactive session or within a program without explicitly loading it. Octave finds the functions if they are stored in the m file path. Most other languages require you to load the functions or libraries by specific command, either by hand or within functions that require them.


The Scilab Flavor

Scilab is another language considered reasonably compatible with the commercial language Matlab. Scilab was created by the French Institute for Research in Computer Science and Automation. If not already available in your package manager, you can get it at the Scilab Download site. It is designed to be a general purpose matrix language, with particular support for signal processing, statistics, and fluid dynamics. It includes full 2d and 3d graphics support.

Scilab also includes Xcos, which provides a graphical method of laying out dynamic process solutions, similar to a flowchart. It's very handy for engineers in some fields of endeavor.

Scilab doesn't just present, even in Linux, a blank screen ready for user input. Instead it includes a GUI, with a left side column showing contents of the current working directory, and a right side column with a variable browser and a command history window. In the largest column in the center of the screen is the command window for user input.

Though like Octave in many ways, Scilab doesn't auto-load functions when referenced. They must be loaded by the operator, or loaded explicitly within a function that needs them. So as with the other languages described here other than Octave (and PDL with the AutoLoader option), the general mode is to make library files that contain several functions, and load them when needed.

Scilab works with 2 dimensional matrices. Functions can return one or more arguments. In addition, Scilab has a list type container that can hold data of dissimilar size.

Scilab has many of the Octave style i/o functions, but they are named slightly differently, such as mopen to open a file instead of fopen. In fact, a number of functions common in name between Octave and Matlab have different names in Scilab. To help with that, Scilab includes a considerable number of help screens to give guidance on converting Matlab code to Scilab code, as well as tables which show which Scilab functions do what specific Matlab functions do. Needless to say, Scilab, while comparable in many ways, isn't as close syntactically to Matlab as is Octave.

I noticed that in working with Scilab, when it encounters a syntax error when loading a file, it specifies the line number of the error without counting any comment lines included in the code file. If several functions are in the same library file, the error line indicated is not from the beginning of the file, but from the beginning of the particular function with the error.

This makes coding difficult using an external editor. But Scilab includes a syntax color highlighting editor that counts lines the way the Scilab loader does. So it's better to use the Scilab editor to create code as it makes debugging easier.


The Euler Toolbox Flavor

I should note that my experience with Euler Toolbox is with the GTK Linux version. That version is a bit behind the Windows version. Windows is the O/S for which the developer currently does all his development. You can run the up to date Windows version in Linux via Wine or Windows in a Virtual Machine. I use the native Linux version because it is sufficient for my needs, is more convenient, and runs faster because it's a native version.

My Puppy Linux version doesn't have the Euler Toolbox in its archive, so I got it from the Debian archive. Puppy Linux has a utility that can unpack a Debian deb file, and it installed easily. The GTK Linux version is also available at the Sourceforge GTK Euler Toolbox site. The Windows version is available at the Euler Math Toolbox site.

I consider Euler Math Toolbox (EMT) to be sort of an Octave Lite language. It works with two dimensional matrices as does Octave, but does not have a list type container (at least in Linux). It can, as does Octave, allow a function to return more than one argument. Each can be a matrix, and each can be a different size.

EMT is not as highly integrated into Linux as Octave is. It is possible to run EMT programs in a batch mode, but the only arguments you can hand to EMT when starting it are the names of Euler function files. So a batch program needing operator input would have to get it from a file or ask the user for it.

EMT has far fewer file i/o features than Octave. It handles ASCII files very well. As to binary, it can only read and write byte and integer files, not floating point. So floating point files from other languages must be converted to ASCII for EMT to use them.

Unlike all of the other languages, EMT does not provide a function for passing commands to the Linux system. There is a function named exec ostensibly for this purpose, but though documented, it is not functional in Linux. A solution I've used is to create a Linux pipe (fifo), and use a receiver program (I use a simple bash script) that listens to the pipe. Whatever comes to it from the pipe is handed to the system for execution. I then created a simple EMT function that takes a string argument and writes it to the pipe. This combination does what I expected the non-functional exec function to do.
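As a rough illustration of that pipe arrangement, here is a sketch in Python rather than bash. Everything in it is an assumption made for the example: the pipe name emt_pipe, the function names, and the use of Python for the receiver (the receiver described above is a bash script, and the sender is an EMT function). Keep in mind that anything written to such a pipe gets executed, so the pipe should be private to the user.

```python
import os
import subprocess
import tempfile
import threading


def ensure_fifo(path):
    """Create the named pipe (fifo) if it is not there yet."""
    if not os.path.exists(path):
        os.mkfifo(path)


def send_command(path, command):
    """Sender side: write one command line to the pipe.
    In the setup described above, an EMT function plays this role."""
    with open(path, "w") as pipe:
        pipe.write(command + "\n")


def receive_commands(path, max_commands=None):
    """Receiver side: hand every line that arrives on the pipe to the shell.
    Runs forever when max_commands is None."""
    results = []
    while max_commands is None or len(results) < max_commands:
        # open() blocks here until a sender opens the pipe for writing
        with open(path) as pipe:
            for line in pipe:
                command = line.strip()
                if command:
                    done = subprocess.run(command, shell=True,
                                          capture_output=True, text=True)
                    results.append((command, done.returncode, done.stdout))
    return results


def demo():
    """Start a receiver thread, send it one command, return what it ran."""
    fifo = os.path.join(tempfile.mkdtemp(), "emt_pipe")  # hypothetical name
    ensure_fifo(fifo)
    collected = []
    receiver = threading.Thread(
        target=lambda: collected.extend(receive_commands(fifo, max_commands=1)))
    receiver.start()
    send_command(fifo, "echo hello from the pipe")
    receiver.join(timeout=10)
    return collected
```

In real use the receiver would be started once, with max_commands left at None, and would keep running in the background while the matrix language session writes to the pipe.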

EMT has little string support. One thing I needed for the test program was the ability to have an array of file names, one file name for each frame I wanted to process. It was possible to read the list of file names in as character arrays, and have a matrix of those, with each row holding the characters of a file name. A simple function to re-concatenate the characters into a string gave me the feature I needed. The Windows version of EMT has much more string support, including the capability of keeping strings in arrays.
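The character matrix workaround can be pictured with a small Python sketch. It is not EMT code, and it assumes that each row holds character codes padded out to a common width with zeros. The padding value and both function names are made up for the illustration.

```python
def strings_to_rows(strings, pad=0):
    """Pack strings into equal-length rows of character codes."""
    width = max(len(s) for s in strings)
    return [[ord(c) for c in s] + [pad] * (width - len(s)) for s in strings]


def rows_to_strings(char_matrix, pad=0):
    """Re-concatenate each row of character codes into a string.
    Padding codes are dropped wherever they occur in the row."""
    return ["".join(chr(code) for code in row if code != pad)
            for row in char_matrix]


# One file name per frame, stored as a matrix of character codes
matrix = strings_to_rows(["frame001.dat", "frame2.dat"])
names = rows_to_strings(matrix)
```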

EMT has a pretty solid collection of math functions built in, including general matrix manipulation, linear algebra, polynomial solutions, interval and exact solution functions, and statistical functions. It also includes an FFT utility.

What EMT has in spades is graphics capability. It can do 2d and very impressive 3d plots, as well as display images. It gave me the ability to interact with an image that I needed for the image processing program. EMT also uses one of the easiest to learn language syntaxes, with few arcane operators.

EMT also has, at least with respect to the other languages tested here, a unique notebook feature. All commands entered into the EMT window (as well as their outputs) can be saved as a notebook file. Comments can be inserted anywhere within this notebook before saving, to document the activity. The notebook files can be reloaded, and the cursor keys will step up and down through the commands, skipping over the command outputs, making it easy to run and/or modify and run previous commands.

The creator of EMT, Dr. Rene Grothmann, is a professor. He created EMT to use for mathematics instruction, and the notebook files are a wonderful tool for that purpose. I've found that they are also a handy tool for product development. I can reload a notebook and be quickly back at a project with all of the exploratory commands I used, and notes. Of course, the language is fully capable of doing industrial work as well.

What I like especially about EMT is that it is quite a small and easy to install package. With some of the larger packages, it seems that when trying to install them into Linux flavors that don't have them already in their archives, you may spend a long time hunting down requirements. Less so with EMT.


The Yorick Flavor

Yorick is perhaps my favorite flavor. Probably because it uses a distinctly C like style, and I've programmed a lot in C and Java. So the Yorick style for the most part seems natural to me.

Debian and some of its derivatives have Yorick available through their respective package managers. It's also available at the Yorick Homepage. I've successfully downloaded from there and found it easy to get working in Puppy Linux.

As with Octave, the Yorick screen interface is very simplistic. Yorick comes up with a blank screen, ready for the user to type instructions. In fact, Yorick doesn't even have history support to allow the user to up-arrow to previous commands. There is a utility commonly available in many Linux versions that can solve this problem. It's named rlwrap. You can use it with Yorick like this:

rlwrap -c yorick

Rlwrap runs Yorick, and the interface looks the same, except that it now provides a history function as well as a file name completion function.

The creator of Yorick is physicist David Munro, and Yorick reflects that in the collection of science applicable functions. Yorick does 2 dimensional matrices, but that's not the limit. It can go to 7 dimensions and perhaps beyond. That's why, by the way, the Yorick matrix multiply operator looks so strange. When multiplying matrices of over 2 dimensions, there must be a way to specify what's actually to be multiplied.

While Octave and EMT primarily work with double precision floating point for all variables, Yorick has most of the data types available in C. You can have byte arrays, integer arrays, string arrays, and of course floating point arrays. This adds flexibility, but you must be careful that you know what type of variable you're doing arithmetic with. Dividing one integer variable into another integer value when you're thinking floating point may well give you the wrong answer.
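The integer division trap is easy to show with numbers. The sketch below is in Python, not Yorick, and the helper c_style_divide is a made-up name that imitates C-like integer division (truncation toward zero), which is the behavior assumed here for a language with C data types.

```python
def c_style_divide(a, b):
    """Integer division that truncates toward zero, as C does."""
    quotient = abs(a) // abs(b)
    return quotient if (a >= 0) == (b >= 0) else -quotient


pixel_sum = 7   # integer total of two pixel values
frames = 2      # integer frame count

truncated = c_style_divide(pixel_sum, frames)  # fraction is lost: 3
intended = float(pixel_sum) / frames           # convert first: 3.5
```

The usual cure is the one shown on the last line: convert one operand to floating point before dividing.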

Yorick functions can only return a single value. That can be a number, variable, matrix -- or a struct. The struct is useful if you need to return a collection of dissimilar things, as in C. The awkward thing is that the structs must be globally declared, but then can be used in functions. For example, to declare a struct to store 2 double precision matrices, which may be of different sizes,

struct data{double *x, *y;}

will do the trick. This struct declaration must occur outside of any function declaration, but is then available for use within a function.

The struct capability makes Yorick quite flexible, but clearly having a bit of C programming in your background is a plus.

Yorick has integrated graphics functions, all with cryptic names, like plg, plt, and plm, for plot graph (y versus x), plot text, and plot mesh. Strange names, but there are plenty of routines (this is just a sample), so Yorick is quite plot capable. It is likely that some help file searching will be necessary.

It's best to make libraries of related functions, as Yorick requires an include command for each library. The include commands can be part of Yorick programs, so the programs take care of themselves. Yorick is batchable, and like Octave, EMT, and PDL, all functionality in a batch file is preserved. That is, a batch file can interact with the user and present graphics.


The R Flavor

R is another language that is very popular, and so is likely to be in many Linux package archives. If you can find it there, that's likely the best way to install it. If not, you can get it at the Cran-R-Project. If you happen to use Slacko Puppy Linux, you can get the mathslacko.sfs file at ibiblio Slacko Packages.

Hopefully the few code snippets give you a flavor of R syntax. It uses brackets like C and Java, but is otherwise unique in its expressions. In general, the matrix manipulation functions look and work like those of most other matrix languages, other than the assignment operator (<-) used instead of an equals symbol.

In Linux, R also presents just a blank screen for user input. It does have built in history support. I believe the Windows version has a GUI interface.

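R can have variables of several types, including byte, integer, and float. It has conversion functions to convert from one to the other. As with the other languages that have such variable flexibility, you must be conscious of what type of variables you're doing math with. R's ordinary / operator returns a floating point result even for integer operands, but the separate %/% operator returns only the whole-number quotient, and integer arithmetic that overflows yields NA. So integer math may still not give you what you expect.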

R has a reasonable assortment of file i/o functions, and I've never run across anything I couldn't read or write. The functions aren't as simply constructed as the C style i/o functions of Octave, however. R i/o functions allow a number of options by name, which makes some calls a bit verbose. R can read and write both ASCII and binary files. A sample of reading an ASCII file follows:

x <- scan(fname,what=double(0),skip=3,sep=c("\t"));

Very flexible, but not easy to remember.
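For readers who don't know R, the named options in that call say: read double precision values (what), skip the first 3 lines of the file (skip), and treat tabs as the field separator (sep). A rough Python equivalent is sketched below. The function name scan_doubles is made up, and unlike R's scan, this sketch silently skips empty fields instead of reporting them as missing values.

```python
def scan_doubles(fname, skip=0, sep="\t"):
    """Read all numbers from a text file into one flat list of floats.

    skip -- number of leading lines (such as a header) to ignore
    sep  -- field separator within each line
    """
    values = []
    with open(fname) as handle:
        for line_number, line in enumerate(handle):
            if line_number < skip:
                continue
            for field in line.rstrip("\n").split(sep):
                if field.strip():          # ignore empty fields
                    values.append(float(field))
    return values
```

As with the R call, the result is one flat vector of numbers, so any row and column structure has to be reapplied afterwards.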

R, like Octave and PDL, has a large number of followers, and supplemental functions for all three languages are probably available to simplify your programming challenges. For example, there was a package for R that allowed me to use it to load image files for my project.

R is heavily loaded with statistical math routines, and is used a lot in the statistics field. I've used R for time series analysis and image processing as well, so it makes a good general purpose matrix language.

R has a plentiful assortment of 2d and 3d graphics routines, making it useful for generating plots of all manner. Like the i/o routines, the plot functions often have a lot of named options one can use to tailor a graph.

R can be run in batch mode, but it's not as convenient as using Octave for that purpose. To the R developers, batch means batch. That is, R goes to a dark place and processes data, and writes it out to a file (or files). But in batch it does not communicate with the user, and cannot present graphics as it's running. That capability is only available from the interactive R environment.


The PDL Flavor

PDL is a matrix math extension that can be loaded into Perl. PDL is offered by the developers as a One Stop Shop. By that they mean that within Perl you have perhaps the most effective text handling tool available, as well as a very capable report generating language. With the PDL extension added, you also get a full featured math package. In other languages, you may often find yourself resorting to an external program to convert data into some kind of digestible form before getting at it with your chosen matrix language. With PDL, you have all that capability in one package.

If the PDL isn't already in your Linux distro, check out the CPAN site for how to install it in your Linux system (you probably already have Perl). When PDL isn't in your Linux package manager, it may be challenging to get installed. In that event, check out all of the module requirements of the PDL and be sure you have them installed before attempting to install PDL itself. The easiest way to install it is then the cpan command, which gets the code directly from the CPAN site, compiles it, and installs it. The CPAN site also provides access to countless supplemental packages to further enhance the PDL. Usually PDL can be installed as follows:

cpan -i PDL

After entering this command, go grab a soda or cup of coffee, sit back, and relax. It will likely take a while.

PDL provides a user interface utility called perldl. Perldl, like most of the other Linux matrix languages, presents a blank screen ready for user typing. The basic perldl doesn't have history support, but that's available with an additional PDL extension. You can, as I do, just run perldl with rlwrap as described in the Yorick section.

rlwrap -c perldl

You can also use the cpan directive, likely already in your system, to install the ReadLine supplement to PDL, giving it a history function. Just do the following:

cpan -i Term::ReadLine::Perl

PDL doesn't auto-load referenced functions in its basic form. You must put loading commands (the use command) in library files so that they'll have the functions they need. There is a PDL module named AutoLoader that you can use, or even have always loaded by default by editing the .perldlrc file. With this enhancement, PDL, like Octave, will auto-load functions when they are referenced. This works for individual function files, but libraries will still need to be loaded by use commands.

PDL is also a language that can work with variables of many different types. There are bytes, integers, strings, and floats. So the same precaution mentioned before applies to PDL: be sure you know what kind of variables you're doing math with.

Like Octave, PDL makes a lot of use of external utilities that already exist. One of the main plotting utilities used by PDL is PGPLOT, for example. Like Octave and R, PDL has a pretty large on-line repository of donated libraries, possibly helping you with your problem solving. And, like Octave, Perl (with PDL) is easily batchable. To run a Perl or PDL script in batch, make the first line a bang line that names the Perl interpreter (typically #!/usr/bin/perl).

As with a shell script, Perl can be run with just the appropriate bang line, followed by the program. That's one of the things about Octave, Yorick, and Perl that I like most. I like to create programs that each do some particular thing to data, like transform it into another coordinate frame, or compute regression coefficients on the data and output results. Then in a shell script file I may run 2 or 3 scripts or programs in a row, each doing their magic, with the final one giving the result I am seeking. With this procedure, script languages can be mixed and matched to solve a problem, with each offering its best capability.

Note that while the .perldlrc file can be used to set what modules to autoload when perldl is used, batch executed files don't use the .perldlrc file. So a batch perl/pdl program will need to have all necessary use modules explicitly listed, including a use PDL; instruction.

Perl is also quite flexible in file i/o capability, but like R can be a bit of a struggle to get set up. ASCII i/o is pretty simple. Binary quite doable, but not so simple. If you use some kind of locally standard structure, you can set up your Perl routines and forget about them. If you have to fiddle with the structure of the i/o routines a lot, it can get tedious.

Perl is used a lot in the astronomy community for image processing work. There, the odd syntax is actually advantageous. For example, if you have a Perl piddle $x that contains an image,

$y = $x->slice("100:300,250:400");

doesn't do what you might expect. You may be thinking, as with other languages, that $y is a new data object composed of a sub-section of $x.


In the PDL nomenclature, $y is a handle to reference the sub-section of the $x piddle. Changing $y in some way will actually change that subsection of $x.

This ploy can make some image work very handy as well as saving memory. Odd though it looks, it can be advantageous. If you actually want a new matrix that is a subsection of the old one, not just a handle into it, use the copy operator:

$y = $x->slice("100:300,250:400")->copy;
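The difference between a handle and a copy can be tried out in plain Python as an analogy. This is not PDL code. It uses a bytearray as a stand-in for the image and a memoryview as a stand-in for the slice handle, and all variable names are invented for the illustration.

```python
image = bytearray(range(10))         # stand-in for the piddle $x

# A memoryview slice is a handle into the same memory, like slice in PDL
window = memoryview(image)[2:5]
window[0] = 99                       # writing through the handle...
changed_in_parent = image[2]         # ...changes the parent: 99

# Slicing the bytearray itself makes an independent copy, like ->copy
independent = bytearray(image[2:5])
independent[1] = 7                   # only the copy changes
parent_untouched = image[3]          # still 3
```

The handle costs no extra memory for the data, which is the saving mentioned above. The copy costs memory, but is safe to modify.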

In support of its image processing functionality, PDL includes the most comprehensive library of image file i/o functions of the languages reviewed here. So if working with images is your desire, the PDL may be your best solution. Perl is also a language with a lot of variation in expression. If you like expressive freedom, again Perl and the PDL may be your best solution.

I've successfully used the PDL for time series processing, regression, and filtering, in addition to image processing. It, like the others, is well adapted for general purpose matrix processing problem solving.

There are a couple of drawbacks to using Perl as your go-to matrix language. One is obvious from some of the code snippets -- there's likely some learning to do. The other is, while the PDL offers an interactive interface (perldl), it doesn't offer a development interface that's as handy for program creation and testing as some of the other languages.

With any of the others, you may work on code in one window, then in the matrix language window, reload and try out functions by hand. It's a handy way to know that each function is doing what you expect.

With perldl, you can load your program and exercise portions of it, but you can't reload it. You have to get out of perldl and back in again to load your changes. Better than conventional programming, but not as handy as just being able to reload your code in place.

While that situation is true of libraries, it's not necessarily true if you use the AutoLoader extension and individual function files. You can set up your .perldlrc file to always use AutoLoader, and to also set autoloading to always rescan for new code versions like this: $PDL::AutoLoader::Rescan=1, which makes PDL operate much like Octave. With this setup, you can modify a .pdl function file, similar to an Octave m file, and the next use of it in the perldl utility will reflect the changes.

Though a bit odd in ways, Perl with the PDL is probably the most flexible and far-reaching of matrix languages. Perl users know the term TIMTOWTDI, which means: There Is More Than One Way To Do It. That flexibility can take a bit of learning, but all you need is to pick among the flexible nomenclature that fits your style, and never look back.