Load data

Mercer & Hall uniformity trial: 500 plots with measured wheat grain and straw yield, arranged in a grid.

Reference: Mercer, W. B., & Hall, A. D. (1911). The experimental error of field trials. Journal of Agricultural Science (Cambridge), 4, 107–132.

mhw <- read.csv("mhw.csv")
summary(mhw)
##        r               c          grain           straw      
##  Min.   : 1.00   Min.   : 1   Min.   :2.730   Min.   :4.100  
##  1st Qu.: 5.75   1st Qu.: 7   1st Qu.:3.638   1st Qu.:5.878  
##  Median :10.50   Median :13   Median :3.940   Median :6.360  
##  Mean   :10.50   Mean   :13   Mean   :3.949   Mean   :6.515  
##  3rd Qu.:15.25   3rd Qu.:19   3rd Qu.:4.270   3rd Qu.:7.170  
##  Max.   :20.00   Max.   :25   Max.   :5.160   Max.   :8.850

Four variables: row & column in the square field plot; grain and straw yield in pounds per plot.

Build a model for straw yield as a function of grain yield.

OLS solution in matrix form

Here the model is \(\mathbf{y} = \mathbf{X} \mathbf{\beta} + \mathbf{\varepsilon}\).

model.straw.grain <- lm(straw ~ grain, data=mhw)
coefficients(model.straw.grain)
## (Intercept)       grain 
##   0.8662797   1.4304977

The solution is \(\mathbf{\widehat{\beta}}_{\mathrm{OLS}} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{y}\).

Let’s see how this is computed.

Design matrix X:

head(X <-model.matrix(model.straw.grain))
##   (Intercept) grain
## 1           1  3.63
## 2           1  4.07
## 3           1  4.51
## 4           1  3.90
## 5           1  3.63
## 6           1  3.16

Quadratic form (variance-covariance matrix):

(X2 <- t(X)%*%X)
##             (Intercept)    grain
## (Intercept)      500.00 1974.320
## grain           1974.32 7900.679

The inverse of the quadratic form:

(X2.inv <- solve(X2))
##             (Intercept)        grain
## (Intercept)  0.15077621 -0.037677836
## grain       -0.03767784  0.009541978

The \(y\) vector multiplied by the transpose of the design matrix:

str(y <- mhw$straw)
##  num [1:500] 6.37 6.24 7.05 6.91 5.93 5.59 5.32 5.52 6.03 5.66 ...
(X2y <- t(X)%*%y)
##                 [,1]
## (Intercept)  3257.40
## grain       13012.22

Now we can solve by OLS:

(beta.ols <- X2.inv%*%X2y)
##                  [,1]
## (Intercept) 0.8662797
## grain       1.4304977

This is the same as computed directly by lm.

Convert to a spatial object

Compute the plot length and width in meters, then make a grid covering the centre of each plot.

ha2ac <- 1/2.471054  # hectares per acre
ft2m <- .3048      # metres per foot
(field.area <- 10000*ha2ac); (plot.len <- sqrt(field.area)/20); (plot.wid <- sqrt(field.area)/25); (nrow <- length(unique(mhw$r))); (ncol <- length(unique(mhw$c)))
## [1] 4046.856
## [1] 3.180745
## [1] 2.544596
## [1] 20
## [1] 25
sx <- seq(plot.wid/2, plot.wid/2+(ncol-1)*plot.wid, length=ncol)
sy <- seq(plot.len/2+(nrow-1)*plot.len, plot.len/2, length=nrow)
xy <- expand.grid(x=sx, y=sy)
str(xy)
## 'data.frame':    500 obs. of  2 variables:
##  $ x: num  1.27 3.82 6.36 8.91 11.45 ...
##  $ y: num  62 62 62 62 62 ...
##  - attr(*, "out.attrs")=List of 2
##   ..$ dim     : Named int  25 20
##   .. ..- attr(*, "names")= chr  "x" "y"
##   ..$ dimnames:List of 2
##   .. ..$ x: chr  "x= 1.272298" "x= 3.816894" "x= 6.361490" "x= 8.906087" ...
##   .. ..$ y: chr  "y=62.024532" "y=58.843787" "y=55.663042" "y=52.482297" ...

Make a SpatialPointsDataFrame version of the dataset; use the grid as spatial coordinates.

require(sp)
## Loading required package: sp
mhw.sp <- mhw
coordinates(mhw.sp) <- xy
summary(mhw.sp)
## Object of class SpatialPointsDataFrame
## Coordinates:
##        min      max
## x 1.272298 62.34261
## y 1.590373 62.02453
## Is projected: NA 
## proj4string : [NA]
## Number of points: 500
## Data attributes:
##        r               c          grain           straw      
##  Min.   : 1.00   Min.   : 1   Min.   :2.730   Min.   :4.100  
##  1st Qu.: 5.75   1st Qu.: 7   1st Qu.:3.638   1st Qu.:5.878  
##  Median :10.50   Median :13   Median :3.940   Median :6.360  
##  Mean   :10.50   Mean   :13   Mean   :3.949   Mean   :6.515  
##  3rd Qu.:15.25   3rd Qu.:19   3rd Qu.:4.270   3rd Qu.:7.170  
##  Max.   :20.00   Max.   :25   Max.   :5.160   Max.   :8.850
spplot(mhw.sp, zcol="grain")

spplot(mhw.sp, zcol="straw", col.regions=topo.colors(64))