PAVE User Guide

4. Using formulas

One of PAVE's most powerful features is its formula capability, which lets you calculate and visualize derived variables from your datasets "on the fly". For example, you can calculate the ratio of a variable from one file to a variable from another file, and then visualize the ratio. You load formulas into PAVE using the Add/Delete/Select Formula popup window, which appears automatically when you start PAVE. You can also bring the window up manually by choosing Edit/Select From Formula List from PAVE's Formulas menu. See Quick PAVE Jumpstart for more details on loading formulas.

All PAVE visualizations are generated using one or more formulas. A formula may be very simple. For example, the formula "O3a" refers to the variable "O3" in dataset "a", which is the first dataset that was loaded into PAVE. (Datasets are assigned sequential letters as they are loaded into PAVE, and formulas refer to them by those letters.) An example of a formula to calculate the percent difference in O3 between datasets a and b is: "(O3a-O3b)*100/(O3b+0.001)".
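To make the percent difference formula concrete, here is a minimal Python sketch of the same arithmetic applied cell by cell. It is only an illustration, not PAVE code: the function name percent_difference, the eps parameter, and the sample values are assumptions made for this example.

```python
def percent_difference(a, b, eps=0.001):
    """Cell-by-cell equivalent of the formula (O3a-O3b)*100/(O3b+0.001)."""
    return [(x - y) * 100 / (y + eps) for x, y in zip(a, b)]


# Hypothetical ozone values for three cells in datasets a and b.
o3a = [0.060, 0.080, 0.000]
o3b = [0.050, 0.080, 0.000]

print(percent_difference(o3a, o3b))
```

Note that the 0.001 added to the denominator keeps the division defined in cells where O3b is zero, as in the third cell above.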

Formulas must be in infix notation, and can contain the following operators, listed in their order of precedence:

  Highest     1) abs,  sqr,  sqrt, exp,  log,  ln,   sin,  cos,  tan,
  Precedence     sind, cosd, tand, minx, miny, minz, maxx, maxy, maxz,
                 mint, maxt, mean, sum,  min,  max
              2) **
              3) /, *
              4) +, -
              5) <, <=, >, >=
              6) ==, !=
              7) &&
  Lowest      8) ||
  Precedence

Explanations of these operations are given below. If you wish to override the default operator precedence, or are uncertain which operator will take precedence, use parentheses in your formulas. Expressions within parentheses are evaluated first.
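The effect of parentheses can be illustrated with ordinary arithmetic. The Python sketch below is an analogy rather than PAVE itself: Python happens to rank *, / above +, - in the same way as the table above, and the single-cell values used here are hypothetical.

```python
# Hypothetical single-cell values standing in for O3a and O3b.
o3a = 6.0
o3b = 5.0

# Without parentheses, * and / bind more tightly than -,
# so this is evaluated as o3a - ((o3b * 100) / o3b).
without_parens = o3a - o3b * 100 / o3b

# Parentheses force the subtraction to happen first.
with_parens = (o3a - o3b) * 100 / o3b

print(without_parens)  # -94.0
print(with_parens)     # 20.0
```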

PAVE also has an occasionally used feature that allows you to specify a time step index after a variable name. For example, O3a:1 is the first hour of ozone. So, if you wanted to plot each cell averaged in time over the first six hours of your data, you could enter and plot the following formula:

     (O3a:1+O3a:2+O3a:3+O3a:4+O3a:5+O3a:6)/6

This approach is cumbersome and uses a lot of memory, but it can be useful.
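As a rough illustration of what the six-hour average computes, the following Python sketch averages each cell over a range of time steps. The data layout (a list of time steps, each holding one value per cell) and the name average_over_steps are assumptions made for this example; time steps are numbered from 1 to match the O3a:1 convention.

```python
def average_over_steps(data, first, last):
    """Average each cell over 1-based time steps first..last (inclusive).

    data[t][c] holds the value of cell c at time step t + 1.
    """
    steps = data[first - 1:last]
    # zip(*steps) regroups the values so that each tuple is one cell's series.
    return [sum(cell) / len(steps) for cell in zip(*steps)]


# Hypothetical data: seven hourly time steps for two cells.
o3a = [
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, 30.0],
    [4.0, 40.0],
    [5.0, 50.0],
    [6.0, 60.0],
    [7.0, 70.0],
]

print(average_over_steps(o3a, 1, 6))  # [3.5, 35.0]
```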

Another useful but little-known feature of the parser lets you compute and visualize the rate of change of a variable. For example, the formula d[O3a]/dt calculates the change in ozone concentration over time. A limitation of this feature is that the variable between the brackets must be an atomic variable; that is, it must be a basic variable from one of your datasets and cannot itself be a formula.
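This guide does not say which differencing scheme PAVE uses for d[O3a]/dt. Purely as an illustration of the idea, the Python sketch below assumes a simple difference between consecutive time steps with a unit time step; the function name rate_of_change, the dt parameter, and the sample values are hypothetical.

```python
def rate_of_change(series, dt=1.0):
    """Difference between consecutive time steps, divided by dt.

    This forward-difference scheme is an assumption for illustration;
    it is not taken from PAVE.
    """
    return [(later - earlier) / dt for earlier, later in zip(series, series[1:])]


# Hypothetical values of one cell over four time steps.
cell_series = [1.0, 3.0, 6.0, 4.0]

print(rate_of_change(cell_series))  # [2.0, 3.0, -2.0]
```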

Formulas may also contain integer or floating point constants, or the following operands, which PAVE's formula parser replaces with the constant values noted:

E       2.7182818284590452354
PI      3.14159265358979323846
NROWS   number of rows in the formula's currently selected domain
NCOLS   number of columns in the formula's currently selected domain
NLEVELS number of levels in the formula's currently selected domain

The following operators are binary (they have an operand on both sides of the operator), and usually return an array of data by performing that operation on each cell of the operands' arrays. The only time these operators return a single number is when both operands (ops) are themselves a single number.

+       Returns the sum of the ops
-       Returns the difference of the ops
*       Returns the product of the ops
/       Returns the ratio of the ops
**      Returns the left op raised to the power of the
        right op, calculated using the C math library's
        pow() function
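The cell-by-cell behavior of the binary operators can be sketched in Python as follows. This is an illustration under stated assumptions, not PAVE's implementation: arrays are represented as plain lists, and the helper name apply_binary is hypothetical.

```python
import operator


def apply_binary(op, left, right):
    """Apply op cell by cell; a single number is paired with every cell."""
    left_is_array = isinstance(left, list)
    right_is_array = isinstance(right, list)
    if not left_is_array and not right_is_array:
        return op(left, right)  # two single numbers give a single number
    if not left_is_array:
        left = [left] * len(right)
    if not right_is_array:
        right = [right] * len(left)
    return [op(x, y) for x, y in zip(left, right)]


print(apply_binary(operator.add, [1.0, 2.0], [3.0, 4.0]))  # [4.0, 6.0]
print(apply_binary(operator.pow, [2.0, 3.0], 2))           # [4.0, 9.0]
print(apply_binary(operator.mul, 2, 3))                    # 6
```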

The following operators are boolean binary operators. Boolean operators return either 0 or 1 for each cell of the resulting array of data, or in the case of two operands (ops) that are single numbers, just the single number 0 or 1. You may find these operators useful for screening your data so that only the ranges of particular interest remain. For example, if you are only concerned about the variable O3a when its value exceeds 0.080, you might look at the formula (O3a>0.080)*O3a. In each cell where O3a is less than or equal to 0.080, the result of the formula is 0. In every other cell, the result is the value of O3a.

<       Returns 1 if the left op is less than the right op,
        else 0
<=      Returns 1 if the left op is less than or equal to  
        the right op, else 0
>       Returns 1 if the left op is greater than the right
        op, else 0
>=      Returns 1 if the left op is greater than or equal 
        to the right op, else 0
!=      Returns 1 if the left op is not equal to the 
        right op, else 0
==      Returns 1 if the left op is equal to the right
        op, else 0
&&      Returns 1 if both ops are non-zero, else 0
||      Returns 1 if either op is non-zero, else 0
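The screening formula (O3a>0.080)*O3a can be sketched in Python like this. The function name screen and the sample values are assumptions made for illustration; the comparison yields 1 or 0 for each cell, and multiplying by it either keeps the cell's value or sets it to 0.

```python
def screen(values, threshold):
    """Equivalent of (X > threshold) * X, applied to each cell."""
    return [(1 if v > threshold else 0) * v for v in values]


# Hypothetical O3a values for four cells.
o3a = [0.050, 0.080, 0.095, 0.120]

# Cells at or below 0.080 become 0; the others keep their value.
print(screen(o3a, 0.080))
```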

The following operators are unary (they have a single operand on the right side of the operator), and usually return a time-stepped matrix of data by performing that operation on each cell of the operand's array. The only time these operators return a single number is when the operand is itself a single number. The C math library routines called are listed with most of these operators. For further information on these routines, please check your man pages.

abs     fabs(op)
sqrt    sqrt(op)
sqr     Returns the square of the op
log     log10(op)
exp     exp(op)
ln      log(op)
sin     sin(op)
cos     cos(op)
tan     tan(op)
sind    sin(op*(PI/180.0))
cosd    cos(op*(PI/180.0))
tand    tan(op*(PI/180.0))
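Two points in this table are easy to miss: log is the base-10 logarithm while ln is the natural logarithm, and sind, cosd, and tand take their operand in degrees. The Python sketch below mirrors the definitions in the table using Python's math module; the function names are chosen here for illustration only.

```python
import math


def pave_log(x):
    """log in a PAVE formula: base-10 logarithm."""
    return math.log10(x)


def pave_ln(x):
    """ln in a PAVE formula: natural logarithm."""
    return math.log(x)


def sind(x):
    """Sine of an angle given in degrees."""
    return math.sin(x * (math.pi / 180.0))


def cosd(x):
    """Cosine of an angle given in degrees."""
    return math.cos(x * (math.pi / 180.0))


def tand(x):
    """Tangent of an angle given in degrees."""
    return math.tan(x * (math.pi / 180.0))


print(pave_log(100.0), pave_ln(math.e))
print(sind(30.0), cosd(60.0), tand(45.0))
```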

The following unary operators return a single number in all cases. Their single operand must follow on the right hand side of the operator. The functionality is listed beside each operator name.

mean    average cell value for all cells in currently selected domain
sum     sum of all cell values in currently selected domain
mint    time step index with minimum value in currently selected domain
maxt    time step index with maximum value in currently selected domain
minx    x index with minimum value in currently selected domain
maxx    x index with maximum value in currently selected domain
miny    y index with minimum value in currently selected domain
maxy    y index with maximum value in currently selected domain
minz    z index with minimum value in currently selected domain
maxz    z index with maximum value in currently selected domain

Here, "currently selected domain" means the currently selected rows, columns, layers, and time steps for the currently selected formula. Thus, the minimum and maximum values in the currently selected domain occur at the locations (minx,miny,minz,mint) and (maxx,maxy,maxz,maxt), respectively. An interesting use of the sum operator is to calculate the sum of the result of a boolean expression (e.g. sum(O3a>120)) to find the number of cell-hours that meet the boolean condition.
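The cell-hour count produced by sum(O3a>120) can be sketched in Python as follows, together with the mean operator. The data layout (a list of time steps, each holding one value per cell), the function names, and the sample values are assumptions made for this example.

```python
def count_exceedances(data, threshold):
    """Equivalent of sum(X > threshold): the number of cell-hours above threshold."""
    return sum(1 for step in data for value in step if value > threshold)


def domain_mean(data):
    """Equivalent of mean(X): average over all cells and all time steps."""
    values = [value for step in data for value in step]
    return sum(values) / len(values)


# Hypothetical data: two hourly time steps for three cells.
o3a = [
    [100.0, 125.0, 130.0],
    [90.0, 121.0, 119.0],
]

print(count_exceedances(o3a, 120))  # 3
print(domain_mean(o3a))
```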

The unary min and max operators behave a little differently:

min     For each cell (i,j,k) in the currently selected domain,
        this calculates the minimum value for that cell
        over the currently selected time steps.  In other words,
        the minimum value in cells (i,j,k,tmin..tmax).

max     For each cell (i,j,k) in the currently selected domain,
        this calculates the maximum value for that cell
        over the currently selected time steps.  In other words,
        the maximum value in cells (i,j,k,tmin..tmax).
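Unlike mean and sum, the min and max operators collapse only the time dimension, so the result still has one value per cell. A minimal Python sketch of that behavior, assuming the data is stored as a list of time steps with one value per cell (the function names and values are hypothetical):

```python
def min_over_time(data):
    """For each cell, the minimum value across all time steps in data."""
    return [min(cell) for cell in zip(*data)]


def max_over_time(data):
    """For each cell, the maximum value across all time steps in data."""
    return [max(cell) for cell in zip(*data)]


# Hypothetical data: three time steps for two cells.
o3a = [
    [1.0, 5.0],
    [3.0, 2.0],
    [2.0, 4.0],
]

print(min_over_time(o3a))  # [1.0, 2.0]
print(max_over_time(o3a))  # [3.0, 5.0]
```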

NOTE: currently the unary + and - operators [as in -1 or -(x+y)] are not supported, but hopefully these will be added later.
