TITLE: Comparing coordinates of tree stems collected with GPS or
tape measures
DATE: 2022-08-27
AUTHOR: John L. Godlee
====================================================================


In 2018 and 2019 I set up some 100x100 m (1 ha) permanent
vegetation monitoring plots in Bicuar National Park, southwest
Angola. We measured the stem diameter of each tree stem >5 cm
diameter and attached a numbered metal tag to each of these stems
so we could track the growth and mortality of each stem over time.
At the same time as measuring the stem diameters and attaching the
tags, I also took a quick GPS point with a Garmin GPSMAP 65s
Handheld GPS unit.

There are two main reasons to record the location of each stem in
the plot. The first is to make it easier to relocate stems in the
future. Sometimes tags fall off or stems are completely vaporised
by fire. If we know the location of the stem, we can visit that
location and confirm what has happened to the stem since the last
survey. The second reason is that if we understand the spatial
distribution of stems, we can better understand competitive (or
facilitative) interactions among individuals, and look at how these
interactions affect biomass dynamics.

When I revisited the plots in 2022, my goal was to collect lots of
auxiliary data, to enrich our picture of the ecosystem. Alongside
collecting data on woody debris, herbaceous biomass, and tree leaf
traits, I also recorded the position of stems that had recruited
since the initial survey back in 2018/2019. Instead of using a GPS,
this time I used tape measures to record the position of each stem,
using an XY coordinate system.

 ![Diagram showing the XY grid coordinate system in a 1 ha
plot.](https://johngodlee.xyz/img_full/gps_xy/diag.png)

While we had the tape measures set up, I also re-recorded the
positions of existing stems so I could compare the two methods of
measuring the stem location. Others have encouraged me to use tape
measures, suggesting that the GPS isn't accurate enough, but tape
measures takes a long time in the field compared to a GPS. If the
accuracy of the GPS points is good enough, there's no need to waste
time with the tape measures.

To make the latitude-longitude coordinates comparable with the XY
coordinates, I converted the lat-long coordinates to XY coordinates
using a function I wrote in R. Hopefully the comments on the code
suffice to describe what's going on, but the basic idea is that the
function takes a dataframe of stem measurements in multiple plots
and an {sf} polygons object with multiple plot polygons, linked by
the plot_id column. For each plot, the function follows this
process:

1.  Find the southwest and southeast corners of the polygon
2.  Rotate both the polygon and the stem locations so that the
SW-SE axis of the polygon runs exactly E-W.
3.  Calculate the E-W and N-S distance from the southwest corner of
the plot to each stem location, in metres.

   #' Convert lat-long stem coordinates to XY grid coordinates
   #'
   #' In rectangular plots to convert latitude and longitude
coordinates to XY
   #' grid coordinates, using the southwest corner of the plot as
the origin point
   #' (0,0), with the X axis increasing west -> east.
   #'
   #' @param x dataframe of stem data
   #' @param polys sf object containing plot polygons, assumes all
are rectangular
   #' @param stem_plot_id column name string of plot IDs in x
   #' @param polys_plot_id column name string of plot IDs in polys
   #' @param longitude column name string of stem longitude in x
   #' @param latitude column name string of stem latitude in x
   #' @param x_crs coordinate reference system (CRS) or EPSG code
of stem
   #'     coordinates.
   #'
   #' @return dataframe of measurement IDs with x_grid and y_grid
filled
   #'     with estimated XY grid coordinates. NA values returned
if
   #'     longitude and latitude were missing.
   #'
   #' @examples
   #'
   #'
   #' @export
   #'
   gridCoordGen <- function(x, polys, stem_plot_id = "plot_id",
     polys_plot_id = stem_plot_id, measurement_id =
"measurement_id",
     longitude = "longitude", latitude = "latitude",
     x_crs = 4326) {

     # Split stems by plot_id
     x_split <- split(x, x[[stem_plot_id]])

     # For each plot
     grid_all <- do.call(rbind, lapply(seq_along(x_split),
function(y) {
       # Get plot ID
       plot_id <- unique(x_split[[y]][[stem_plot_id]])

       # Subset polys to plot ID
       poly_fil <- polys[polys[[polys_plot_id]] == plot_id,]

       # Convert corners to points
       poly_points <-
as.data.frame(sf::st_coordinates(poly_fil)[1:4,1:2])

       # Get UTM zone of corners
       utm_string <- UTMProj4(latLong2UTM(mean(poly_points[,1]),
mean(poly_points[,2])))

       # Convert polygons to UTM
       poly_utm <- sf::st_transform(poly_fil, utm_string)

       # Convert UTM polygons to points
       points_utm <- sf::st_cast(poly_utm, "POINT", warn =
FALSE)[1:4,]

       # Extract coordinates as dataframe
       coords_utm <-
as.data.frame(sf::st_coordinates(points_utm)[1:4,1:2])

       # Define points to match corners to
       match_point <- sf::st_sfc(sf::st_point(
           x = c(mean(coords_utm$X) - 1000, mean(coords_utm$Y) -
1000)))
       other_point <- sf::st_sfc(sf::st_point(
           x = c(mean(coords_utm$X) + 1000, mean(coords_utm$Y) -
1000)))

       # Set CRS
       sf::st_crs(other_point) <- utm_string
       sf::st_crs(match_point) <- utm_string

       # Get sw and se corner
       sw_corner <- points_utm[sf::st_nearest_feature(match_point,
points_utm),]
       se_corner <- points_utm[sf::st_nearest_feature(other_point,
points_utm),]

       # Convert to WGS for geosphere compatibility
       sw_wgs <- sf::st_coordinates(sf::st_transform(sw_corner,
4326))
       se_wgs <- sf::st_coordinates(sf::st_transform(se_corner,
4326))

       # Find rotation along SW,SE edge
       xy_bearing <- geosphere::bearing(sw_wgs, se_wgs)

       # Get centroid of polygon
       cent <- sf::st_centroid(sf::st_geometry(poly_utm))

       # Get location of sw corner
       sw_cent <- sf::st_geometry(sw_corner)

       # Convert angle to radians for rotation
       angle <- NISTunits::NISTdegTOradian(90 - xy_bearing)

       # Convert stem coords to sf object
       x_sf <- st_as_sf(
         x_split[[y]][,c(measurement_id, longitude, latitude)],
         coords = c(longitude, latitude), na.fail = FALSE)
       st_crs(x_sf) <- x_crs

       # Transform stems sf to UTM
       x_utm <- st_transform(x_sf, utm_string)

       # Rotate the stems same as polygon
       x_rot <- (sf::st_geometry(x_utm) - cent) * rot(angle) * 1 +
cent
       sw_cent_rot <- (sw_cent - cent) * rot(angle) * 1 + cent
       st_crs(sw_cent_rot) <- st_crs(points_utm)
       st_crs(x_rot) <- st_crs(points_utm)

       x_cent <- x_rot - sw_cent_rot

       x_cent_df <- as.data.frame(st_coordinates(x_cent))
       names(x_cent_df) <- c("x_grid", "y_grid")

       # Add cordinates to dataframe
       x_out <- cbind(st_drop_geometry(x_sf), x_cent_df)

       # Sub in empty stems
       if (any(!x_split[[y]][[measurement_id]] %in%
x_out[[measurement_id]])) {
         missed_stems <- x_split[[y]][
           !x_split[[y]][[measurement_id]] %in%
x_out[[measurement_id]],
           c(measurement_id)]

         x_out <- rbind(x_out, missed_stems)

         # NaN to NA
         x_out$x_grid <- fillNA(x_out$x_grid)
         x_out$y_grid <- fillNA(x_out$y_grid)
       }

       # Return
       x_out
     }))

     # Return
     return(grid_all)
   }

I then matched the measurements using the tag ID of each stem,
calculated the distance between them, and the angle between the
tape measure and the GPS using this function:

   # Define function to calculate angle of line, in radians
   calcAngle <- function(x, y) {
     dst_diff <- as.numeric(x - y)
     return(atan2(dst_diff[1], dst_diff[2]) + pi)
   }

First, here is a map where I laid all the location errors from each
of the 11 plots on top of each other.

 ![Map of all location
errors.](https://johngodlee.xyz/img_full/gps_xy/pretty_plot.png)

One really interesting thing about the plot above is that it looks
like the errors get larger in the top part of the plot, and also
that the errors on the X axis are larger than the errors on the Y
axis.

In the field we always started collecting the XY coordinates from
the SW (bottom-left) corner of the plot, doing 20 m lines of trees
running E-W up to the NE corner. I wonder if the errors are larger
at the top of the plot because we were getting tired towards the
end of the day. Or maybe they were due to tape measure stretch?
Also from this plot I get the impression that the errors are
generally larger on the E-W axis than on the N-S axis. I'd expect
the GPS to produce random errors, while the tape measures would
produce systematic errors, so this plot suggests actually that
there might be greater error in the tape measure coordinates than
in the GPS.

Now here is a simple rose plot where I plotted all the lines with
the tape measure position normalised to 0,0:

 ![Simple rose plot of all lines with tape measure at
0,0.](https://johngodlee.xyz/img_full/gps_xy/cnt_lines_rose.png)

Again, this potentially shows that there's greater error on the E-W
axis. It also highlights some particularly large errors of >40 m,
which I assume are reporting errors in the field.

And a histogram showing the distribution of error lengths:

 ![Histogram of error
lengths.](https://johngodlee.xyz/img_full/gps_xy/error_length_hist.p
ng)

The mean error length was 5.2 m +/- 4.245 m SD, median = 3.9 m,
with a range of 0.067 m to 51 m.

Next, I binned the error vectors by angle, into 20 degree bins, and
calculated the sum of error lengths within each bin:

   dat_length_bin <- dat_angle %>%
     mutate(angle_bin = cut_width(angle_deg, 20, boundary = 0,
         closed = "right")) %>%
     group_by(angle_bin) %>%
     summarise(
       line_length_sum = sum(line_length)) %>%
     mutate(angle_bin = as.numeric(gsub("\\[([0-9]+),.*", "\\1",
angle_bin)))

Then plotted a rose diagram of these summary data:

   ggplot() +
     geom_bar(data = dat_length_bin,
       aes(x = angle_bin, y = line_length_sum),
       stat = "identity", colour = "black", fill = "darkgrey",
width = 20,
       position = position_nudge(x = 10)) +
     coord_polar() +
     scale_y_continuous(expand = c(0.1, 0)) +
     scale_x_continuous(limits = c(0, 360), breaks = c(0, 90, 270,
360)) +
     theme_bw() +
     theme(
       axis.text.y = element_blank(),
       axis.ticks.y = element_blank())
     labs(x = expression("Angle"~(degree)), y = "Total length (m)")

 ![Rose diagram showing total error length per 20 degree angle
bin.](https://johngodlee.xyz/img_full/gps_xy/angle_length_rose.png)

It looks like the biggest and most frequent errors are where the
grid coordinate is west of the GPS coordinate. This systematic
error probably has something to do with how we set up the tapes in
the plots, and represents error in the grid coordinates rather than
the GPS coordinates, as I'd expect the GPS coordinate error to be
random noise. It suggests that the person measuring the X axis
coordinate consistently over-estimated the position on the tape
measure, likely because they weren't holding the tapes perfectly
perpendicular.

The main problem with this comparison between GPS and tape measure
is that neither of these methods are guaranteed to provide the
truth. There is random error in the GPS points, handheld GPS units
are only accurate to about 2-3 m. However, there are also errors in
the tape measure method:

Firstly, the edges of the plot might not be totally square. The XY
coordinate system relies on the corners of the plot having a
perfect 90 degree angle, and each edge being a perfectly straight
line, but in the field this is difficult to achieve. Sometimes the
plots come out slightly rhombic, despite our best efforts.

It's also common for the tape measures to bow slightly in or out of
the plot. If the edge bows in, the distance from the plot edge to a
stem will be less in the middle of the plot than at the edges of
the plot. This is a particularly problem in a grassy plot on a
windy day, as the grass moves the tape measure. The only way to
stop this is to place the tape measure very close to the ground and
to anchor the tape measure at regular intervals.

The tape measures may stretch slightly leading to an under-estimate
of the measurement.

The tape measures may not be laid out perfectly parallel to the
plot edges. A tape measure laid out at an angle will increase the
measured length. Using smaller strips along the X axis in the
field, 10 m rather than 20 m, can reduce this error, as angle
errors result in larger location erors with increased distance.

Finally, there may be reporting errors in the field, and there may
be transcription errors when typing the data up onto the computer.
Commonly decimal points are put in the wrong place, or 65.6 m gets
called out as 45.6 m, when we move to a new row.

Where the goal is just to aid relocation of stems in future
censuses, I would definitely advocate for using a GPS unit. It
saves a lot of time, and the location accuracy is good enough.
Where it's necessary to quantify neighbourhood effects, I think
tape measures can probably give better accuracy and higher
precision, but only if the measuring is done by a trained team
working meticuluously, and the plot is set up properly to begin
with.

Another option for measuring tree location that I've been toying
with for a long time is using a laser total station or a laser
level in conjunction with differential GPS. Something like the
Leica NA520 which can measure angle and distance to a reflective
target. First you would pick a spot where many stems are visible
and set up a survey tripod. Then using a differential GPS system
you would record the precise location of the tripod. This could
take a few minutes, but only has to be done once for each batch of
measurements. Then the GPS antenna would be swapped out for a laser
level. The distance and angle of each stem in view with respect to
the level would be recorded. The survey tripod would be moved to
the next surveying location and the process would be repeated. Then
the location of each stem can be calculated using the real-space
location of the survey tripod, the angle and distance of the stem
with respect to the tripod. You could even record the same stem
multiple times from different positions to further improve the
accuracy of the location, but I think even for measuring
neighbourhood effects it's only really necessary to have an
accuracy of about 50-100 cm.

 [Leica NA520]:
https://leica-geosystems.com/en-gb/products/levels/automatic-levels/
leica-na500-series

 ![Diagram of field setup for laser level GNSS
method.](https://johngodlee.xyz/img_full/gps_xy/diag_2.png)