R, the Free Statistics Language
The plot you see is an example of a mesh plot produced by the matrix
scripting language R.
R is a language created by statisticians and widely used by statisticians and
mathematicians. R is an open source implementation of the commercial language S.
R is an object oriented language, and every declared function is an object. The
object oriented nature makes some syntax seem peculiar, but that is because
things are done by object functions instead of by language intrinsics, as in
most other languages.
Treating everything as an object makes some things easy that are difficult or
awkward in languages without that feature. As an example, because a function in
R is itself an object, a function can be passed as an argument to another
function and executed from within the called function.
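A minimal sketch of passing a function to a function follows. The helper names apply_twice and add_one are made up for this illustration and are not part of R.

```r
# apply_twice is a hypothetical helper, made up for this illustration.
apply_twice <- function(f, x) {
  f(f(x))   # f is whatever function the caller passed in
}

# A small user-defined function to pass around.
add_one <- function(v) v + 1

r1 <- apply_twice(sqrt, 16)                # built-in function passed by name
r2 <- apply_twice(add_one, 1)              # user-defined function
r3 <- apply_twice(function(v) v * 10, 2)   # anonymous function defined in place
```

The called function never needs to know which function it was handed. It just calls f.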
In Linux, R has a simple command line interface. The Windows version comes
with a GUI. R has a nice history mechanism that lets the user scroll back
through previous commands and modify them if desired. Unless the workspace is
saved on exit, the history is not maintained from one invocation of R to another.
R is heavily populated with statistical functions, and also contains some
signal processing functions such as filtering, interpolation, and regression. R
programs can be run in batch mode, but when that is done there is no user
interaction. In batch mode, input parameters must come from files, and output
goes to a file. The user can create a file in their home directory named
.Rprofile and use it to auto-load any R modules of their own that they
use often. That makes those routines quickly accessible: start the language
with just R, and then execute the functions.
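As a rough sketch of what a batch job looks like, the script below takes its input from a file and writes its results to a file, with no user interaction. The file names and the column name x are assumptions for illustration, and the first few lines only create a stand-in input file so the example is self-contained. A script like this can be run non-interactively with Rscript or with R CMD BATCH.

```r
# Stand-in for an input file that would normally already exist on disk.
# (File names and the column name "x" are made up for this example.)
infile  <- file.path(tempdir(), "input.csv")
outfile <- file.path(tempdir(), "output.csv")
write.csv(data.frame(x = c(1, 2, 3)), infile, row.names = FALSE)

# The batch job proper: read data from a file, compute, write results to a file.
dat <- read.csv(infile)
dat$scaled <- dat$x * 2
write.csv(dat, outfile, row.names = FALSE)
```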
R Syntax
R has the most convenient method I have seen for writing functions with
optional arguments. The argument list of a function simply gives the
default values for optionally passed parameters in the function
declaration statement. No logic needs to be created in the function to
determine if parameters were passed or not. Most other math languages require
the programmer to do some coding within a function to deal with parameters that
may or may not be passed. The defaulting and matching of passed parameters in R
is done completely by the R system. The following function declaration
illustrates the method.
    scalemat <- function(mat, sf=2){
        m2 <- mat * sf   # scale every element of mat by sf
        m2               # the last value evaluated is what gets returned
    }
In the code example, the function scalemat returns a matrix
scaled by a provided scale factor (sf). The sf parameter defaults to 2.
Within the code the parameter sf can simply be used, and it
will have either the default value, if nothing was passed to the function, or
the passed value. Notice also that no return statement is used. The value of
the last expression evaluated in a function is automatically the one passed
back to the calling routine.
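To make the defaulting concrete, here is a short sketch that calls a scalemat-style function three ways. The function is repeated in compact form so the snippet runs on its own, and the test matrix m is just made-up data.

```r
# Compact form of the function discussed above.
scalemat <- function(mat, sf = 2) {
  mat * sf
}

m <- matrix(1:4, nrow = 2)        # made-up 2 x 2 test matrix

doubled <- scalemat(m)            # sf not passed, so the default of 2 is used
tripled <- scalemat(m, sf = 3)    # sf passed by name
halved  <- scalemat(m, 0.5)       # sf passed by position
```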
R has a list construct that can be used to package multiple values under
one name. The elements of the list can be named when created, and the list can
be returned by a function. The individual elements of the list can be accessed
by index or by name, if names were assigned. The following code snippet
illustrates the use of a list. Variable x is assigned a list of 3 elements.
Element a is a scalar, element b is a vector, and element
c is a string.
    x <- list(a=10, b=c(10,11,12), c="label")
    u <- x$a
    v <- x$b
    w <- x$c
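The sketch below shows the other two points made above: a function returning a list, and access by index as well as by name. The function name stats_summary and its element names are invented for this example.

```r
# stats_summary is a hypothetical function that returns several results at once.
stats_summary <- function(v) {
  list(mean = mean(v), range = range(v), label = "summary")
}

res <- stats_summary(c(10, 11, 12))

m   <- res$mean         # access by name with $
rng <- res[["range"]]   # access by name with [[ ]]
lab <- res[[3]]         # access by position
nms <- names(res)       # the names assigned when the list was created
```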
Unlike MATLAB, R does not auto-load user defined modules just because they
are referenced. Modules have to be loaded with a source command. I find
it works best to package related routines into module libraries so that when an
R module is loaded with the source command, all relevant routines are loaded at
once.
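Here is a small sketch of that workflow. The file name mylib.R and the two functions in it are placeholders, and the file is written to a temporary directory only so the example is self-contained. In practice the library file would already exist, and a single source line pointing at it could go in .Rprofile.

```r
# Write a tiny stand-in "module library" holding two related routines.
# (mylib.R, scalemat and addmat are placeholder names for this example.)
libfile <- file.path(tempdir(), "mylib.R")
writeLines(c(
  "scalemat <- function(mat, sf = 2) mat * sf",
  "addmat   <- function(a, b) a + b"
), libfile)

# One source call loads every routine defined in the file.
source(libfile)

loaded <- c(exists("scalemat"), exists("addmat"))
```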
R uses some interesting syntax that takes a bit of getting used to. Even
assignment, which most languages write with an equal sign, looks different
in R. A couple of example R statements are listed below:
    x <- c(10,20,30,40)
    y <- rbind(c(1,2),c(11,12),c(15,19))
As you can see, the <- operator is used to store into variables instead
of the more common equal sign. (R also accepts = for assignment, but <- is
the conventional form.) The first statement stores a vector of numbers
into a variable named x. The c(...) function creates the vector. Notice the
second statement, which creates a 3 row, 2 column matrix. The c(...) function
makes the vectors, and the rbind function stacks vectors and matrices as rows.
There is also a cbind function that combines vectors and matrices as columns.
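A quick way to see the difference between rbind and cbind is to feed both the same three vectors and compare the dimensions. This sketch reuses the values from the example above.

```r
x <- c(10, 20, 30, 40)                      # a plain vector of 4 numbers

y <- rbind(c(1, 2), c(11, 12), c(15, 19))   # each vector becomes a row
z <- cbind(c(1, 2), c(11, 12), c(15, 19))   # each vector becomes a column

dim_y <- dim(y)   # rows, columns of y
dim_z <- dim(z)   # rows, columns of z
```

Here y comes out with 3 rows and 2 columns, and z is its transpose with 2 rows and 3 columns.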
Unlike MATLAB and Octave, R arithmetic operators work element by element
by default. Special operators are used to specify matrix operations. For
example, the following shows the element by element (cell by cell)
multiply of matrix A by matrix B, then the matrix multiply operator.
Element by element multiply:
    C <- A * B
Matrix multiply:
    C <- A %*% B
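A small worked example, using two made-up 2 x 2 matrices, shows how different the two results are. Note that matrix() fills column by column.

```r
A <- matrix(c(1, 2, 3, 4), nrow = 2)   # rows are (1, 3) and (2, 4)
B <- matrix(c(5, 6, 7, 8), nrow = 2)   # rows are (5, 7) and (6, 8)

elementwise <- A * B     # each cell of A times the matching cell of B
matprod     <- A %*% B   # true matrix product: rows of A times columns of B
```

With these values the cell by cell result has rows (5, 21) and (12, 32), while the matrix product has rows (23, 31) and (34, 46).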
For help, the user can type help(topic) for specific documented help
topics, or help.search("subject") for a list of possible help topics
pertinent to the supplied subject.
R comes with many function libraries, and even more can be obtained from the
Comprehensive R Archive Network
known as CRAN. The CRAN website offers documentation, FAQs, and downloads of
many contributed packages.
R Graphics
The graph at the upper left is an R color contour map of the sin(x)/x
function. R can make labeled line contours as well. The graph you see at the
upper right illustrates an R color contour map with a line contour map overlay.
This shows that the line contour and color contour features can be combined.
Users can make such maps with many options, including lines only, colors only,
and different color schemes.
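A combined plot of this kind can be sketched with base R graphics as shown below. This is only one way to do it, using image() for the color layer and contour() for the line overlay. The grid size, axis range, and color scheme are my own choices for the sketch, and the plots on this page may have been made with different calls.

```r
# Radial sin(r)/r surface on a made-up grid; the value at r = 0 is set to 1.
sinc <- function(r) ifelse(r == 0, 1, sin(r) / r)

x <- seq(-10, 10, length.out = 101)
y <- x
z <- outer(x, y, function(a, b) sinc(sqrt(a^2 + b^2)))

# Color layer first, then line contours drawn on top of it.
image(x, y, z, col = heat.colors(32), xlab = "x", ylab = "y")
contour(x, y, z, add = TRUE)
```

Leaving out the contour() call gives colors only, and calling contour() without add = TRUE gives lines only.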
R has an extensive integrated graphics library for doing 2D and 3D graphics.
Being statistical in nature, R also offers a number of plots that help in the
statistical analysis of data. For example, given a matrix with related data in
columns, a simple call to a routine called pairs will produce a window full of
scatter graphs that plot each column versus each other column. This allows a
quick qualitative determination of whether any of the columns are correlated
with one another.
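The following sketch builds a made-up three column matrix in which column b depends on column a and column c is unrelated noise, then plots every column against every other. The data and column names are invented for the example, and cor() is included only to put numbers on what the scatter graphs show.

```r
set.seed(1)   # make the made-up data reproducible
n <- 100
a <- rnorm(n)

# Column b is built from a plus noise; column c is independent noise.
dat <- cbind(a = a, b = 2 * a + rnorm(n, sd = 0.3), c = rnorm(n))

pairs(dat)      # scatter graph of every column versus every other column
r <- cor(dat)   # correlation matrix for the same columns
```

In the resulting window the a versus b panels show a tight diagonal band, while the panels involving c are shapeless clouds.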
R also provides mechanisms for obtaining mouse position and button
information from graphics windows. R can present image graphics as well, and do
so with respectable speed. While R doesn't come with image I/O routines, the Comprehensive R Archive Network has
downloadable R routines for loading and saving the FITS file format. FITS is a
commonly used, color capable image format used in astronomy.
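Reading a mouse click from a graphics window can be sketched with the base graphics function locator(). The plot contents here are placeholder data, and the click is only requested when R is running interactively, since a batch run has no one to click.

```r
plot(1:10, main = "Click a point")   # placeholder data to click on

pt <- NULL
if (interactive()) {
  pt <- locator(1)                   # waits for one mouse click
  if (!is.null(pt)) {
    # pt is a list with x and y in plot coordinates; mark the chosen spot.
    points(pt$x, pt$y, col = "red", pch = 19)
  }
}
```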
As it happens, there are many utilities in Linux that can handle FITS files,
along with just about every other graphics format you might have heard of. The
utility convert from the ImageMagick package is one that can
convert just about any kind of graphics format to any other graphics format. In
the process, it can also enhance, crop, or perform many other operations on the
image during conversion. So having even just the FITS file format available for
R is sufficient in Linux, given the ability of the convert utility to
convert anything else to FITS format.
I converted a little utility from PDL to R that steps through a sequence of
web cam astro-photos allowing me to select a reference point for frame
alignment, and if desired to crop images. The graphic display of the images is
reasonably fast, and mouse control is easy to use. I created this program in
several matrix languages, and found that not all did the task well. But R
handles the problem nicely, giving me a handy utility for cropping, aligning,
and stacking lunar and planetary images. See 6 Inch Reflector
Astrophotography for examples of the images this technique can produce.
Summary
In summary, I find R to be an excellent choice for a scripting matrix
language. The syntax takes a bit of getting used to, but the speed and
functionality of the language are impressive. It is well documented, and
supports a wide variety of graphics presentation methods. It handles a wide
range of data investigation techniques, including statistics, regression,
filtering, and signal processing. It has flexible enough I/O capabilities to
handle different data formats, making it quite applicable for data processing
tasks. It is even capable of being used for some image processing.
Below is my subjective evaluation of some characteristics of R.
Pros:
Freely available for MacOS, Windows, and Linux.
Very similar to the commercial language S.
Has a very large software archive (CRAN).
Especially good for working on statistical and time-series problems.
R is not limited to 2 dimensional arrays.
R has a richer variety of data types than many matrix languages, such as
character, logical, integer, complex, and double.
R supports more data forms than just multi-dimensional matrices, such as
arrays and lists.
R has a good enough collection of file I/O routines to allow a user to
move files to and from almost any external utility.
R has the easiest method of creating functions with a variable number of
arguments that I've ever seen.
R has 2D and 3D graphics support, and mouse clicks on graphs can return
information to the R script.
R has support to help in the generation of reports based upon analysis
results.
R works well interactively and can be run in batch mode.
Cons:
The use of the <- symbol instead of the more common =
for data assignment can take a bit of getting used to.
While R is very fast at performing matrix operations, it slows down
considerably when loops are used extensively.
For interactive use, R is a bit slow on high density graphs, like
photographic images.
R doesn't directly support any graphic file formats, though FITS file
packages are available from the CRAN archive. I found adding PNM
graphics file routines to be very easy.
Because more data types are available, there's a steeper learning curve
than with, say, Octave.
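The point in the list above about loops can be illustrated with a rough sketch that computes the same sum of squares twice, once with an explicit loop and once as a single vector operation. The data size is arbitrary and the timings will vary from machine to machine, so no numbers are quoted here.

```r
n <- 1e5
v <- runif(n)   # made-up data

# Explicit loop: one element at a time.
t_loop <- system.time({
  loop_sum <- 0
  for (i in seq_len(n)) loop_sum <- loop_sum + v[i]^2
})

# Vectorized: the whole vector in one operation.
t_vec <- system.time(vec_sum <- sum(v^2))
```

Comparing t_loop with t_vec shows how the two approaches differ on a given machine. Both produce the same sum to within rounding.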