Using arrays in pyrealm

Most of the functionality in pyrealm is designed to work with data arrays that can have multiple dimensions. The input data can be single scalar values - representing a point estimate - or multi-dimensional inputs - such as a time-series on a spatial grid. When inputs have multiple dimensions, these can be thought of as axes: a 3D array might have a time axis, an X axis and a Y axis. The shape of an array is then just the number of observations along each axis.

If all of the inputs vary over all of the axes, then the input data all will need to have the same shape. However if one of the inputs is constant along one of the axes then its shape only needs to match the other inputs along the axes that it varies over. As an example, a function might need the following two inputs:

  • The temperature, which is a 5 x 5 spatial grid on a time series with five observations and therefore has the shape (5, 5, 5).

  • The elevation, which uses the same 5 x 5 spatial grid but is constant with time and so has the shape (5, 5).

Any calculation with these inputs has a problem: which two axes in the temperature data does the elevation map onto? In earlier versions of pyrealm (<=1.0.0), users were required to manually make all inputs the same shape by repeating data along axes. However, from pyrealm 2.0.0, users just need to make sure that inputs have compatible shapes, as described below.

Note that scalar inputs (a single value) are a special case and are automatically assumed to be constant across all the other inputs.

Array shapes

In pyrealm, we make use of the array broadcasting rules of the numpy package, with an additional restriction on the number of dimensions. It is worth reading that link but, in summary, all array inputs to a pyrealm function must be mutually broadcastable by satisfying the following rules:

  • have the same number of dimensions, and

  • have the same shape in each dimension or only a single observation in that dimension.

Some examples:

❌ Two arrays with shapes (3, 5) and (5,) are not compatible because the first array has two dimensions and the second only has a single dimension.

❌ Two arrays with shapes (3, 5) and (5, 3) are not compatible. They are both two-dimensional but have different sizes along the two axes.

✅ Two arrays with shapes (3, 5) and (3, 5) are compatible. They have the same number of dimensions and the same shape.

✅ Two arrays with shapes (3, 5) and (1, 5) are also compatible. They are both 2 dimensional and the first axis in the second array has a single observation and so can be broadcast across the three observations in the first array.

✅ Using the earlier example with temperature and precipitation, if the order of the dimensions on the temperature array is (x, y, time), then the shape of elevation should be set to be (5, 5, 1). That final singleton dimension then give an unambiguous relationship between the two inputs.

Caution

If you are used to using the R language, note that numpy does not “recycle” values in the same way. R generally recycles shorter dimensions of any length to match longer dimensions but numpy does not do this: values are only “recycled” by broadcasting when there is a single observation on a dimension. If you wanted to repeat pairs of values along a dimension, then you will need to do so manually.

As an example of how to do this in practice, in the following example most of the arguments to PModelEnvironment are 2-dimensional in time and position. But the pressure is 1-dimensional — constant in time. Therefore, the time axis is added before passing it to PModelEnvironment.

import numpy as np
from pyrealm.pmodel.pmodel_environment import PModelEnvironment

n_time = 10
n_position = 100

# 2D arrays in time and position (a single dummy value is used)
temp = np.full((n_time, n_position), 20)
co2 = np.full((n_time, n_position), 400)
vpd = np.full((n_time, n_position), 1000)
fapar = np.full((n_time, n_position), 1)
ppfd = np.full((n_time, n_position), 800)

# 1D array in position - constant in time
patm = np.full(n_position, 101325)

# Declare a new axis to make it 2D with shape (1, n_position)
patm = patm[np.newaxis, :]

# Show the shapes for comparison
print("temp shape:", temp.shape)
print("patm shape:", patm.shape)

# Call PModelEnvironment with all the inputs
env = PModelEnvironment(tc=temp, co2=co2, patm=patm, vpd=vpd, fapar=fapar, ppfd=ppfd)
temp shape: (10, 100)
patm shape: (1, 100)