Google Sheets Review
Sin(x)/x 3D Plot
The above Mexican Hat plot looks familiar, right? It's the sin(x)/x
function that occurs in mathematics on occasion. The 3D version is always
eyecatching.
So what's special about this particular plot?
It was made with Google Sheets, the freely available spreadsheet
offered by Google. If you didn't think Google Sheets had 3D plotting ability,
you'd be right, at the time of this article anyway. But Google Sheets does have
a considerable number of plotting functions, and one will plot all rows or
columns of a selected area of the spreadsheet. To get hidden line 3D plots,
I had only to make a small custom function that makes a special matrix out
of one I want to see in 3D, with the special matrix having the hidden line
feature. Then I'm able to use the built in multirow or column plot feature
of Google Sheets on the special matrix to make the above rendition.
The default Google Sheets plot routine makes each line a different color,
but it is possible to change the color of each line with the chart editing menu
to whatever color desired  one line at a time. In this case, instead I used
Mtpaint to make
the lines the color I wanted.
Ok, so a contrived function can make a 3D plot of an arcane mathematical
function. If that's what you might be thinking, think again. The routine takes
any topographic (matrix of elevations or values), and makes a corresponding 3D
hidden line matrix ready for plotting. Below is a 3D rendering of a section of
mountainous terrain:
Terrain 3D Plot
Google Sheets Review On A Linux Site?
Okay, but what's a Google Sheet article doing on a Linux site anyway?
I will endeavor to explain.
Recently I purchased a Chromebook. Prior to
purchase, I'd studied Chromixium enough to know
pretty much what the Chromebook would do for me, so I bit. I reviewed both
Chromixium and the Chromebook on this site. Chromixium is based on a minimal
Ubuntu Linux operating system, and likewise the Chromebook OS is built on a
similar minimal Linux system. So they fit well in the subject matter of this
site.
I began to explore Google Sheets as a way of stretching the limits of what
I could accomplish with the Chromebook. While I can't do a lot of programming
on the Chromebook directly, as it supports only webbased programming with
Javascript, CSS, and HTML, I can make heavy use of the Google Office
products. And Google Sheets gives me the ability to write custom
functions, even libraries of functions.
In fact, the ability to do all of these things with a Chromebook actually
reveals something profound about Google Docs and its associated products.
It isn't that I can do office, professional, student, and hobby stuff with
Google Sheets in a Chromebook. It's that I can do all that from virtually
any platform. Through Google Docs I have a platform independent way of
functioning. I can choose to change platforms without loosing either the
products I'd been useing or the files I've created with those products.
And there's more. I don't have to worry about updating the software to
keep current and stay supported. Google does that, as the software doesn't
reside on my computer anyway. Plus, I don't have to be concerned about
backing up my Google Doc data. Google does that. And something else that
I'm sure we've all been bitten by. When using a Google Docs product, I
won't ever get bit because I forgot to do a save before exiting.
Google automatically saves every modification to a document or spreadsheet
that I'm working on.
So by way of Google Sheets, I found I also have the ability to do the kind
of programming I'm used to even with my Chromebook, creating mathematical and
data processing tools. I can do that by making extensive use of the Google
Sheets scripting tools. I'll cover information about Google Sheets scripts in
this article, but many more of Google Sheets secrets are revealed in Google
Sheets Reference and Cheat Sheet: The unofficial cheat sheet reference for
Google's free online spreadsheet application.
I figure that Google Sheets, with its scripting capability, fits well in
this site just as the reviews of many other mathematical, Matrix Languages
available to Linux users. The table below lists matrix languages reviewed
here:
Granted, a spreadsheet isn't quite like these languages, but it shares many
characteristics, is available to Linux and Chromebook users, and many people
primarily use a spreadsheet to do the things I and others tend to do with
matrix languages. Since Google Sheets is a primary tool available to me on a
Chromebook, I decided to see to what degree I could accomplish what I was used
to doing with the matrix languages in the above table.
Google Sheets For Number Crunching?
Starting out, Google Sheets offers a lot. As the table below illustrates,
at the time of this writing Google Sheets has 350 intrinsic functions, divided
over the different categories as listed in the table.
Category  Number 
Array  15 
Database  12 
Date  23 
Engineering  13 
Filter  3 
Financial  44 
Google  15 
Info  17 
Logical  7 
Lookup  15 
Math  60 
Operator  15 
Parser  6 
Statistical  73 
Text  32 
Grand Total  350 
In the Financial, Math, and Statistical areas,
users have a lot of choices. But I primarily work in the matrix area, and
that part is a bit weak, compared to the matrix languages I'm familiar with.
In addition to a generous collection of intrinsic functions, Google Sheets
has additional features that extend the spreadsheets' capabilities. It has, for
example, pivot tables, which is a feature that provides powerful ways to
build summary sheets from large data sets. Plus, I can publish sheets,
such as this one that presents a collection of Astronomical
Tables that are popular with amateur astronomers. Included are data and
summary sheets for the Messier, Caldwell, and Herschel 400 astronomical target lists. The pivot tables feature was used to create the summary pages for the
different star object collections. Each summary is available as a tab at the
top of the main sheet, and each shows how many of each type of star object are
listed in the collection. Publishing it as a web page was just a click under
the spreadsheet File menu. This is just a very limited example of
what can be accomplished with the pivot tables feature. Also available in Google Sheets is a powerful custom formatting tool that
lets users set color, font, and other formatting aspects on a cell or array of
cellls based on a wide range of criteria. Plus, I can apply names to ranges of
the spread sheet for a much easier way to reference ranges in equations and
functions. Like most spreadsheets, Google Sheets is primarily cell
oriented. One enters equations into cells, and then copies them to all other
cells where the same equation is desired. For example: the equation
=sin(A1)*10 would automatically become =sin(B1)*10 if copied to
the cell to the right, or =sin(A2)*10 if copied to the cell beneath the
initial equation cell. By thus entering equations over ranges of a spreadsheet,
matrix operations can be accomplished. Matrix languages tend to be, as the name implies, matrix
oriented. A single command is used to specify an operation over a range
of matrix elements. A matrix may be named X for example, and the
equation: Y = sin(X)*10 would automatically apply the equation to
every element of the X matrix, storing the result in the Y
matrix. So matrix languages tend to be more terse, and equations treat entire
matrices as algebraic objects instead of each element (or spreadsheet
cell) as an object. In Google Sheets' defense, however, I must mention that in the minimal
collection of array operators available, there is one called
ARRAYFORMULA, which allows a user to write an equation that refers to a
range instead of an individual cell. So if I had, say, 100 cells in the range
A1 to A100, the ARRAYFORMULA operator could be used as follows:
=ARRAYFORMULA(sin(A1:A100)*10), which would propagate 100 values in a
column, starting at the ARRAYFORMULA cell. This operator is very handy for us old matrix language users, in that it
lets us express a range solution in a single equation. Very handy if I'm
tweaking parameters or trying different equations. Combining it with the Google
Sheets feature of letting the user apply names to ranges lets me express
equations similar to that of matrix languages. If, for example, the range
A1:A100 is named X using the range naming feature (Named
Ranges under the Data menu), the new formula would become
ARRAYFORMULA(sin(X)*10). Google Sheets includes matrix multiply, transpose, and inverse functions in
its collection, so many matrix solutions can be obtained by using these
operators and the ARRAYFORMULA range operator. But I found these not to be
enough like the approach I generally took to solve problems with matrix
languages. So I explored the Google Sheets scripting feature. I found it
worked well, and for the matrix solutions I sought, was very easy to
implement.
Extend Google Sheets With ScriptsAlong the top menu bar of a Google Sheets spreadsheet is an entry called
Tools. Clicking that reveals a number of utilities, one of which is
labeled Script Editor. That utility will open another tab to an editing
screen where I can create scripts tied to the spreadsheet from which it was
spawned. A typical function is already defined, to give one a start. The
scripts are written in Javascript, though rather than being executed on the
client computer, they are executed on Google servers. The new (as of 2014 or so) version of Google Sheets makes passing range
expressions, cell values, and fixed parameters easy for script users.
Individual cells and fixed parameters are passed to script functions much like
one is used to when writing functions in any language. Ranges are simply passed
as matrices, which Javascript can handle easily. For example, if I wanted a utility to scale a range of values by some
constant, the code could look like this:
function msca(m1,sf){ 
var nr = m1.length; 
var nc = m1[0].length; 
var m = new Array(nr); 
var i; 
var j;  
for(i=0 ; i<nr ; ++i){ 
m[i] = new Array(nc); 
for (j=0 ; j<nc ; ++j){ 
m[i][j] = m1[i][j] * sf; 
} 
} 

return m; 
} 
As you can see from the coding example, a script looks similar to any
routine you've likely written in many languages like FORTRAN or C, and received
parameters are simple and easy to work with. The routine can be passed a range
of any size, as it can determine the size before operating. The nr or
number of rows value is the length of the matrix, and the nc or number
of columns values is the length of the first row (m1[0]). I can refer to
the new function in my spreadsheet as follows: =msca(A1:C200,20) to
scale a range by the factor of 20, as an example. The constant 20 could be a
value in a cell, say D1, and called thus: =msca(A1:C200,D1).
The cool thing about user script functions is that they can be called by
other user script functions, not just referenced within the spreadsheet. So I
can write pretty complex operations, broken up into a number of easy to read
and maintain functions. Or, as I've done, create a fairly extensive library of
extensions (or multiple libraries).
How To Create Script Libraries
I found it easy to create custom scripts with the script editor, but I
wanted to create some general scripts that I might wish to use in many
spreadsheets. I didn't want to have to cut and paste scripts from one
spreadsheet to another, I wanted to be able to reference a library.
Sure enough, it's possible to do that. From within the Script Editor
under the File menu, I can select Manage Versions, which
lets me save a script as a library. This library can be used by other sheets.
To use a saved library in a new spreadsheet, I need the Project Key
for the library. This key is found under the Files, Project
Properties click in the script editor used to create the library. Project
Properties opens a small window with information about the particular script,
including a rather long sequence of characters next to the label: Project
key.
In a spreadsheet needing access to a saved library, I select the
Resources menu from the spreadsheet's script editor window. From the
list display, I select Libraries. This brings up a window that allows me
to enter Project keys from libraries that are desired. This only has to
be done once, because the library selections are saved with the
spreadsheet. If I revise a library from the script editor associated with
the spreadsheet where I developed the code, I might need to visit
spreadsheets using the library, select from the respective script editors
Resources selection, and select the latest revision of the revised
library.
To actually be able to call a function that's in a library listed in
Resources, I have to write a dummy function that references the library
function, and then return the result. For example, I have a library I called
matlib, and within it is a function named mrand. The mrand
function returns an array of uniform random numbers. To use it, first I add the
library Project Key to my Script Editor ResourcesLibraries menu. Then in my
script editor, I write a small function that calls the library mrand
function:
function myrand(nr,nc){ 
var m = matlib.mrand(nr,nc); 
return m; 
} 
The local spreadsheet defined myrand function references the library
version matlib.mrand and returns the result. The dummy function may use
the same name as the library function if desired, mrand in this case. I must
have such a local function for each library function I wish to reference from
my spreadsheet.
Since I've already added the matlib key to my resources, I can further
create any other functions that reference matlib functions. All matlib
routines are at my disposal because the matlib key is in my spreadsheet
resources.
Given how easy it was to write most routines and interface with the
spreadsheet, I created a healthy library of matrix routines to mimic much of
what I had previously done with matrix languages. I added routines to do the
basics, including add, subtract, scale, scalar multiply, matrix multiply,
inverse, and transpose matrices. Then I scanned through the matrix languages
with which I'm familiar, and added routines to accomplish some of what was
intrinsic in them, like matrix builders that will return an identity matrix, a
matrix of some constant, a matrix of sequential values, or a matrix of random
numbers.
Armed with this collection, I'm able to attack problems in much the way I
was accustomed, even though working in a spreadsheet. It gives me many more
extensions that work on spreadsheet ranges (matrices) instead of working
with cell equations.
So What Can You Do With Custom Scripts?
As the 3D plot illustrated, I was able to add some interesting graphics
ability to the spreadsheet. At the time of this writing, Google Sheets has no
contour plotting function either. But using the spreadsheets Custom
Formatting feature, I'm able to set up a color scheme for a 40x40 grid area
of the spreadsheet such that the colors used for the cells depended upon what
value the cells contain. I created a function that will take any array,
stretch or compress it to a 40x40 cell array, normalize it to match the values
in the custom formatting table, then return the values to the custom formatted
area of the spreadsheet.
The result? color contour maps, like the one
below:
Color Contour Plot
Not high resolution, for sure. But a usable color contour that when combined
with the 3D plotting feature, adds the ability to do some interesting work
outside of the more common financial exploration commonly pursued with Google
Sheets. As a matter of fact, the 3D terrain plot, shown again below, is a view
of the region presented in the contour plot, viewed from the left (or west)
side. Notice the low area (blue in the contour plot) to the left foreground in
the 3d plot, and the mountain peaks (white in the contour plot) to the
right.
Terrain 3D Plot
When all of that worked well, I added additional features, including
interpolation, derivative, integration, FFT, PSD, and filter functions. A
pretty good starting point for attacking the timeseries type problems I often
did with matrix languages.
As to speed, Google Sheets intrinsic functions are very fast, yielding
virtually instant results when operating over thousands of cells. The custom
functions are slower, but still return results in just a few seconds when
operating over, say, 5000 rows and half a dozen columns. In fact, the
requirement by Google is that scripts return in under 30 seconds, and I've had
few occasions when functions took that long.
I've read of scripts that are exceedingly slow, but invariably that
problem occurred when users referenced spreadsheet cells within loops, rather
than operating as my matrix utilities do, on entire matrices passed in bulk. My
utilities receive all data at the start, and return a completed matrix at the
end, doing all work on the bulkreceived data. That works well.
Why Google Sheets Over A Matrix Language?
So why bother with a spreadsheet when I have access to matrix languages?
Actually, there are a few reasons. First, the Google Sheets with scripting
ability provides a serious tool for Chromebook users, who generally do not
have access to the faster matrix languages. Chromebooks already offer
great portability, long battery life, and quick boot up. With Google Sheets,
they offer mathematics and programming as well.
Second, Google Sheets is available to me on all of my computers, as well
as is my most recent work. I can start a project on a desk top, add a little
from my laptop when it's more convenient, and even do the work on my Chromebook.
The omnipresent access is pretty handy.
Third, though not as much for me, work on Google Sheets is easily shared or
available to groups. I can see that if I were working on a team project, or was
teaching some math technique to a class, it would be very handy.
Google Scripts Guidelines
If you decide to check out Google Sheets for its scripting feature, you
might find some of the following guidelines helpful.
Work on bulk data
As the illustrated code suggests, pass in entire sections (ranges) of
the spreadsheet, and work through the elements of those. Loading values takes
time, but if there's only one initial load (arguments passed in) and one
return, then the code works much faster.
Mind the inner loops
Most of the work in a routine is done on the inner loops. Making inner
loops most efficient will pay dividends. As an illustration, I created
a function that lets me pass in an equation as a string, plus up to three
matrices that can be used in the equation. I find it easier to work with
than the aforementioned ARRAYFORMULA (though ARRAYFUNCTION operates
faster). My function could be called thus:
=solve(x sin sqr y cos sqr + sqrt ,a1:a100,b1:b100)
This equation takes the sine of each cell in the a array, squares it,
takes the cosine of each corresponding element of the b array, squares
it, adds them together, and takes the square root of the result. It will do
this through the range of the input arrays, and return a solution array of the
size of the input arrays. I find it easier to pass the arrays in, and then be
able to reference them with x, y, and z (if all three arrays are passed), than
using range nomenclature with ARRAYFORMULA. You may have noticed that
solve is a reverse polish (RPN) solver, which is much easier to
implement.
When I first created solve, I ended up with a lengthy switch
statement to handle all of the possible equation function references, and that
being referenced on the inner loop. The routine has 3 nested loops. Looping
over rows, then columns , then applying each equation element to the
cells in the columns. The delay caused by referencing the large switch
statement for each operation on each cell was intolerable.
Javascript, being the strange animal that it is, allowed me to move my big
switch statement up before the loops that were applying the equation, wherein
the switch statement is looped only once through the equation elements, with
the respective case statments storing an anonymous function for each of
the specified equation elements into a function array. Each element of the new
array is then a function to do the referenced equation operation, like +
for add, sqrt for square root. Sort of a poor man's compile.
Now the inner loop, where the equation is applied, consists of only running
functions in an array, no if's or switches. The looping portion of the code is
shown below:
for(i=0 ; i<nr ; ++i){
for(j=0 ; j<nc ; ++j){
for(k=0 ; k<nops ; ++k){
f[k]();
}
m[i][j] = st.pop();
}
}

In the code snippet the functions in the f array work with a data
array named st (used as an RPN stack), The first loop goes through the
nr rows, the next loop the nc columns, and the inner loop through
the nops number of operations specified in the equation. The use of
anonymous functions stored in the f array makes the inner loop run quite
fast. After execution, the m array, containing the results left on the
st stack after each set of operations is performed, is returned.
Library Caveats
Google has published guidelines for writing effective and efficient
custom functions. One thing they caution against is the use of libraries.
This is primarily a caution for Cell oriented functions. If for
every cell in a large array, a call is made to a function that must
reference a library function, the delays can become significant and frustrating.
However, I've found that there is little effect with the functions I've
written, in that they all work with bulkpassed arrays, and tend to be called
only once to operate on hundreds to thousands of cells. With this technique,
the overhead of calling a library function is swamped by the moving of the
bulk data to and from the function.
In fact, as of this writing, I've found that the primary delay in using
custom functions is the movement of the bulk data. Replacing the matrix return
statement in a complex custom function with a return of a Done message
reveals that the functions actually run very fast. The time delay is primarily
the return of bulk data from the Google server to the open spreadsheet.
Reducing Scripting Delays
So a suboptimal approach when using scripts would be to use several scripts
whose inputs are dependent on one another, causing the delays to add up.
Script 2 may have to wait while the data transfers to/from servers finish for
script 1, then script 3 waits for the delays from script 2, etc.
To get around this, write an uber script that is passed data and then
internally runs the other scripts, returning all desired results in a final
transfer. This way there is only one to and one from transfer
delay between spreadsheet and servers, and little if any delay between
script calls within the uber script.
Clear as mud? Here's an example. I was checking out the function of an
FFT PSD routine and an autocorrelation PSD routine, both of which return
a Power Spectral Density computation from input data. For the data, I was
running a script that produced uniform random numbers passed through a simple
filter, the filter being used to control the frequency distribution of the
random sequence. Below is a graph of what I was examining. The noisy (blue)
line is the FFT PSD results, and the smooth (red) line is the
autocorrelation PSD results.
Initially, I had a script that produced the random numbers, a script that
applied the filter, a script that did the FFT PSD on the filtered output, and
a script that did the autocorrelation PSD on the filtered output. The filter
script had to wait for the random numbers, and the PSD scripts had to wait
for the filter output. So there were three data movement delays if I
regenerated the random numbers, or at best two delays if I reused the
same random sequence but reran the filter script with a different
coefficient.
To speed the process up, I rewrote the filter using Google Sheets intrinsic
equations, so that the filter process happened almost immediately when the
filter coefficient was user modified. So the only pause was the two PSD
scripts, which ran in tandem since they didn't depend upon one another. The
overall time for an analysis run was cut in half.
But just as fast was an uber script that was passed the number of points I
wanted, and the filter value. The uber script internally called a random number
script, a filter script, and the two PSD scripts, returning the PSD data as
two columns. This script was as fast as the version reusing the same random
sequence and doing filtering with intrinsic equations. So more work could be
done, generating random numbers each time, yet run just as fast as long as only
one data transfer between servers and the spreadsheet was required.
So a general rule of thumb is: combine dependent scripts into an uber script
so they are executed in sequence within the uber script rather than waiting on
one another in the spreadsheet. Another point is that summary type scripts work
even better, in that the amount of returned data is less. Larger data returns
cause bigger delays.
Summary
If you have a Chromebook and have the itch to write some code, but
aren't into web page making  check out the scripting utility of Google
Sheets. You get all the fun of coding, but none of the headaches of creating
web pages. You can let the spreadsheet handle all of the display issues and
file handling  you get to have fun just creating solutions. As a plus,
you'll have access to your projects from virtually any computer, anywhere.
