New Journal of Physics
The open–access journal for physics
Disaggregation and scientific visualization of
earthscapes considering trends and spatial
dependence structures
S Grunwald
Soil and Water Science Department, University of Florida, 2169 McCarty Hall,
PO Box 110290, Gainesville, FL, USA
E-mail: [email protected]
New Journal of Physics 10 (2008) 125011 (15pp)
Received 17 April 2008
Published 1 December 2008
Online at http://www.njp.org/
doi:10.1088/1367-2630/10/12/125011
Abstract. Earth attributes show complex, heterogeneous spatial patterns
generated by exogenous environmental factors and formation processes. This
study investigates various strategies to quantify the underlying spatial patterns of
simulated fields resembling real earthscapes and to compare their performance
for describing them. The approach is to disaggregate the variability of earth
attributes into two components, deterministic trend m(xi ) and spatial dependence
ε(xi ), and determine the effects of m(xi ) and ε(xi ) on prediction accuracy under
various combinations of spatial fields of earth attributes encountered in different
earthscapes. We illustrate that cross-dependencies exist between spatial and
feature accuracy. Scientific visualization is used to transpose quantitative results
into visual space.
Contents
1. Introduction
2. Problem description
3. Analysis of data
3.1. Disaggregation of earthscapes
4. Results and discussion
5. Conclusions
References
1367-2630/08/125011+15$30.00
© IOP Publishing Ltd and Deutsche Physikalische Gesellschaft
1. Introduction
Scientific visualization (SciVis) has been employed abundantly using high-resolution x-ray
computer-assisted tomography (CAT) to map fractures in clay-rich soils [1], macropores [2] and
microstructure in heterogeneous media [3] at fine spatial scales. Such scanning methods create
dense datasets to visualize earth features, but are limited to small areas (centimetre to metre
scale) of earthscapes. At landscape scales, striking models were presented reconstructing terrain
patterns [4] and soilscapes [5]–[7] in three dimensions. Bellamy et al [8] demonstrated how soil
carbon loss/gains across a whole country (England and Wales) can be combined with SciVis
methods. Extension of such visual spatial representations of earth patterns into the temporal
dimension is still rare. Spatio-temporal simulations and visualization models of soil carbon
change and water table fluctuations for large landscapes were presented in [9, 10].
Earthscapes (or soilscapes) show highly complex patterns with diverse earth/soil attributes
in geographic space and through time. Earthscape (or soilscape) is a term derived from earth
that describes the land surface of the world, especially soils, whereas the term scape refers to
the spatial extent. Landscape is the fundamental trait of a specific geographic area, including
its composition, physical environment and anthropogenic or social patterns [11]. Environmental
soil-landscape modeling is a science devoted to understanding the spatial distribution of soils
and coevolving landscapes as part of ecosystems that change dynamically through time [12].
Biogeochemical processes as well as natural and anthropogenic induced forcing functions
generate spatial and temporal patterns observable in earthscapes. Earth (soil) properties entail a
suite of physical (e.g. bulk density and hydraulic conductivity), chemical (e.g. soil phosphorus,
nitrogen, or carbon) and biological (e.g. microbial biomass phosphorus and peptidase activity)
attributes. In this study, we focus on the geospatial aspect of disaggregating the variability of
earth attributes to better understand their behavior.
Grunwald [13] suggested an ontology-based approach to describe earthscapes involving:
(i) conceptualization of a system; (ii) reconstruction using quantitative methods; and (iii) SciVis
addressing the physical, logical, implementation and cognitive universes. SciVis has been
suggested to improve our understanding of a complex ecosystem and associated bio-, topo-,
pedo- and lithospheres [13, 14]. Barraclough and Guymer [15] argued that advanced
visualization techniques communicate complex spatial information intuitively. Maps can
visually enhance the spatial and temporal understanding of phenomena of earthscapes.
According to [16], visual interfaces maximize our natural perception abilities, improve the
comprehension of huge amounts of data, allow the perception of emergent properties that were
not anticipated and facilitate understanding of both large-scale and small-scale earth features.
SciVis relies on accurate spatial description of earthscapes. At regional scales, observation
sets of earth attributes are limited due to labor and costs. Thus, prediction models have been
used extensively to fill this data gap [17]. Statistical methods, such as multivariate regression,
classification and regression trees, or neural networks, are used to predict earth attributes at
unsampled locations using exogenous factors (e.g. land use and topographic attributes) [18].
These factorial models are focused on establishing deterministic linkages between earth
attributes and environmental variables that can be measured more rapidly and at much higher
density. Remote sensing has been valuable in providing dense layers of land use, land cover,
terrain patterns and other environmental attributes complementing sparse, site-specific earth
attribute datasets [19]. Other prediction models explicitly consider the spatial autocorrelation
structure among observed earth attributes and interpolate them taking spatial dependence
structures into account. Biogeochemical earth attributes show varying degrees of spatial patterns
with short, long, linear and sometimes overlapping spatial structures that are complex to
reconstruct and visualize [20]. Soil sensors, such as visible/near-infrared diffuse reflectance
spectroscopy or electromagnetic meters, have been used extensively to create denser observation
sets across earthscapes. However, the accuracy and precision of in situ field observations still
do not match lab-based analytical measurements.
Fundamental questions pertaining to the quantitative description and visualization of earth
features remain. How much of the variability in earth attributes can be explained by exogenous
environmental factors and how much by spatial dependence? Are mixed models that model
both environmental correlation and spatial dependence structures more accurate than models
that consider one of them? In this paper, we study spatial and non-spatial phenomena in a set of
parameterized spatial fields that exhibit a variety of different patterns. These generated patterns
may represent different geographic earth features such as biogeochemical soil properties,
topographic and land use attributes that form earthscapes. We examine various strategies to
quantify the underlying patterns and to compare their performance for describing earthscapes.
We illustrate that cross-dependencies exist between spatial and feature accuracy. SciVis is used
to render results to allow ease of comparison.
2. Problem description
Earth attributes result from many interactive physical, chemical and biological processes that
are nonlinear and/or chaotic coevolving in space and time. The outcome is so complex that the
variation appears to be random. If we adopt a stochastic view, then at each point in geographic
space there will be not just one value for an attribute but a whole set of values. Thus, at a location
xi, an earth attribute (z) is treated as a random variable with mean (µ), variance (σ²) and a
cumulative distribution function (cdf). The set of random variables, Z1(xi), Z2(xi), ..., Zn(xi),
constitutes a random function (RF) or a stochastic process [21]–[25]. A random variable is
a variable whose values are randomly generated according to some probabilistic mechanism.
The set of outcomes and their corresponding probabilities is sometimes referred to as the
probability distribution of a random variable [25]. Regionalized variable theory assumes that
the spatial variation of any variable can be expressed as the sum of three major components
(equation (1)) [22]: (i) a structural component m(xi ), having a constant mean or trend that is
spatially dependent, (ii) a spatially correlated component (ε(xi )), known as the variation of the
regionalized variable, and (iii) a spatially uncorrelated random noise or residual term (ε0). The
deterministic component is dependent on some exogenous factors such as climate, vegetation,
organisms, topography and geology and can be described by a trend model (e.g. regression,
regression variant or process model).
$$Z(x_i) = m(x_i) + \varepsilon(x_i) + \varepsilon_0, \tag{1}$$
where Z(xi) is the value of a random variable at xi; m(xi) is a deterministic function describing
the ‘structural’ component of Z at xi; ε(xi) is the stochastic, locally varying but spatially dependent
residual from m(xi), i.e. the regionalized variable; ε0 is a residual, spatially independent noise
term; and xi is the geographic position (x, y and z coordinates).
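The decomposition in equation (1) can be illustrated with a small synthetic field. The following is a minimal sketch, not the procedure used in this study: the linear trend, the exponential covariance with a 300 m scale and the noise level are all hypothetical choices made only to show how the three components add up.

```python
import numpy as np

rng = np.random.default_rng(0)

# 11 x 11 grid of nodes with 100 m spacing (121 nodes).
coords = np.array([(x, y) for y in range(11) for x in range(11)], dtype=float) * 100.0

# m(x_i): hypothetical deterministic trend increasing from west to east.
trend = 0.5 * coords[:, 0] / coords[:, 0].max()

# eps(x_i): spatially dependent residual, drawn from a multivariate normal whose
# covariance decays exponentially with distance (assumed scale: 300 m).
dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
cov = 0.02 * np.exp(-dist / 300.0)
eps = rng.multivariate_normal(np.zeros(len(coords)), cov)

# eps_0: spatially independent noise (assumed standard deviation: 0.05).
eps0 = rng.normal(0.0, 0.05, size=len(coords))

# Equation (1): Z(x_i) = m(x_i) + eps(x_i) + eps_0
z = trend + eps + eps0
```

In practice only z is observed; the difficulty discussed in the text is that the three terms on the right-hand side are not known separately.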
Commonly, earth observations are spatially autocorrelated, meaning that observations
obtained close to each other are more likely to be similar than observations taken further apart
from each other. This spatial correlation of ε(xi) is described by the semivariance γ. If γ is
plotted as a function of the lag distance h, the semivariogram is obtained. In equation (2), one
implicit assumption is that the semivariance depends only on the separation distance h and not
on the positions xi and xi + h (stationarity assumption). γ̂(h) is estimated as [21]:
$$\hat{\gamma}(h) = \frac{1}{2N(h)} \sum_{i=1}^{N(h)} [z(x_i) - z(x_i + h)]^2, \tag{2}$$
where γ̂(h) is the semivariance, h is the distance (lag) in metres, and N(h) is the number of
location pairs separated by the vector (or lag) h.
Figure 1. Example of a conceptual experimental semivariogram with nugget
variance, sill variance and range.
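The estimator in equation (2) can be sketched in a few lines. This is an illustrative implementation under simplifying assumptions: lags are treated as omnidirectional distance bins, and the function name and the `lag_edges` argument are hypothetical, not taken from the software used in the study.

```python
import numpy as np

def experimental_semivariogram(coords, values, lag_edges):
    """Estimate the semivariance per lag bin (lag_edges[k], lag_edges[k + 1]]."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)

    # Every unordered pair of locations is counted once.
    i, j = np.triu_indices(len(values), k=1)
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sqdiff = (values[i] - values[j]) ** 2

    n_bins = len(lag_edges) - 1
    gamma = np.full(n_bins, np.nan)
    counts = np.zeros(n_bins, dtype=int)
    for k in range(n_bins):
        in_bin = (dist > lag_edges[k]) & (dist <= lag_edges[k + 1])
        counts[k] = in_bin.sum()          # N(h)
        if counts[k] > 0:
            gamma[k] = sqdiff[in_bin].sum() / (2.0 * counts[k])
    return gamma, counts

# Three points on a line, 1 m apart.
gamma, counts = experimental_semivariogram(
    [[0, 0], [1, 0], [2, 0]], [0.0, 1.0, 3.0], lag_edges=[0.0, 1.0, 2.0]
)
```

Bins without any location pairs are returned as NaN so that they are not mistaken for a semivariance of zero.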
From the semivariogram, the nugget variance (C0 ), sill variance (C1 ) and range (a) can
be derived to describe the spatial behavior of the observed variable [21, 23, 24] (figure 1).
The rate of the semivariogram increase reflects the degree of dissimilarity of ever more distant
samples. If the semivariogram reaches a limiting value, called the sill, it means that there is
a distance beyond which attribute values are uncorrelated. This distance is called the range.
The nugget variance captures (i) a microstructure, namely a component of the phenomenon
with a range shorter than the sampling support (true nugget effect), (ii) a structure with a range
shorter than the smallest interpoint distance, and (iii) measurement or positioning errors [24].
The nugget to sill ratio can be used to express the magnitude of the spatial dependence in a given
dataset [24, 26]. A variable is said to be autocorrelated—or regionalized—when the measure
made at one sampling site brings information on the values recorded at a point located a given
distance apart. The autocorrelation coefficient measures the degree of autocorrelation and can
be expressed by Moran’s I coefficient [27, 28] according to the following equation:
$$I(d) = \frac{n \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij}\,(y_i - \bar{y})(y_j - \bar{y})}{W \sum_{i=1}^{n} (y_i - \bar{y})^2} \quad \text{for } i \neq j, \tag{3}$$
where
• I is Moran’s I coefficient (positive values of I correspond to positive autocorrelation);
• d is distance class, which is a function of the separating distance between sampling points;
• yi and yj are the values of the variables with i and j varying from 1 to n (the number of
data points);
• ȳ is the mean of the ys;
• wij is a weighting factor taking the value of 1 if the points belong to the same distance
class and zero otherwise; and
• W is the sum of the ws, i.e. the number of data pairs involved in the estimation of the
coefficient for the distance class d.
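Equation (3) and the definitions above translate directly into code. The sketch below is illustrative only; the function name and the way a distance class is passed (as a lower and upper bound) are assumptions made for this example.

```python
import numpy as np

def morans_i(coords, y, d_min, d_max):
    """Moran's I for the distance class d_min < distance <= d_max."""
    coords = np.asarray(coords, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)

    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    # w_ij = 1 if the pair falls into the distance class, 0 otherwise (i != j).
    w = ((dist > d_min) & (dist <= d_max)).astype(float)
    np.fill_diagonal(w, 0.0)

    dev = y - y.mean()
    # Ordered pairs are summed, so each pair appears twice in both the
    # numerator and W; the ratio is unaffected.
    W = w.sum()
    return n * (w * np.outer(dev, dev)).sum() / (W * (dev ** 2).sum())

line = [[0, 0], [1, 0], [2, 0], [3, 0]]
i_smooth = morans_i(line, [1.0, 2.0, 3.0, 4.0], 0.0, 1.0)      # gradual change
i_alternating = morans_i(line, [1.0, -1.0, 1.0, -1.0], 0.0, 1.0)  # checkerboard-like
```

The gradually changing series gives a positive coefficient, whereas the alternating series gives a negative one, which matches the interpretation of the sign of I given above.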
Ordinary kriging (OK) is a commonly used interpolation method that predicts values
at unsampled locations using a weighted linear combination of observed data in its
neighborhood [21, 23]. It takes into account the way in which a property varies in space through
the semivariogram model. The aim of kriging is to estimate the value of the random variable
Ẑ (x0 ) at unsampled points from observation data (equation (4); [21]). The weights are allocated
to the sample data within the neighborhood of the point (or block) to be predicted in such a way
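as to minimize the kriging variance (equation (5); [21]).
$$\hat{Z}(x_0) = \sum_{i=1}^{N} \lambda_i z(x_i) \quad \text{with} \quad \sum_{i=1}^{N} \lambda_i = 1, \tag{4}$$
where the constraint on the weights λi ensures that the prediction is unbiased.
The expected error is E[Ẑ(x0) − Z(x0)] = 0. The prediction variance is
$$\mathrm{var}[\hat{Z}(x_0)] = E\left[\{\hat{Z}(x_0) - Z(x_0)\}^2\right] = 2 \sum_{i=1}^{N} \lambda_i \gamma(x_i, x_0) - \sum_{i=1}^{N} \sum_{j=1}^{N} \lambda_i \lambda_j \gamma(x_i, x_j). \tag{5}$$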
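Equations (4) and (5) can be made concrete with a small solver. The following is a minimal sketch of ordinary kriging at a single target point; it assumes the standard formulation in which the weights are obtained from a linear system with a Lagrange multiplier (mu) enforcing the sum-to-one constraint. The function name and the linear semivariogram used in the example are illustrative assumptions, not the ISATIS configuration used in the study.

```python
import numpy as np

def ordinary_kriging(coords, values, target, gamma):
    """Predict at `target` from (coords, values); gamma(h) must satisfy gamma(0) = 0."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(values)

    # Kriging system: [[Gamma, 1], [1^T, 0]] [lambda, mu]^T = [gamma_0, 1]^T
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = gamma(dist)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = gamma(np.linalg.norm(coords - target, axis=1))

    sol = np.linalg.solve(A, b)
    weights, mu = sol[:n], sol[n]

    prediction = weights @ values      # equation (4)
    variance = weights @ b[:n] + mu    # algebraically equal to equation (5)
    return prediction, variance, weights

corners = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
obs = np.array([1.0, 2.0, 3.0, 4.0])
linear = lambda h: h  # hypothetical linear semivariogram without nugget

pred_c, var_c, w_c = ordinary_kriging(corners, obs, [0.5, 0.5], linear)
pred_0, var_0, w_0 = ordinary_kriging(corners, obs, [0.0, 0.0], linear)
```

At the centre of the square all four weights are equal by symmetry. At a sampled location the prediction reproduces the observation and the variance is zero, which is the exact-interpolator property of kriging without a nugget.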
Observations across earthscapes are limited due to labor and costs of data collection. An
adaptable approach that considers variation in parameter and spatial space is needed to derive
accurate predictions of earth attributes over a large region. The challenge is that it is typically
unknown to what proportion m(xi ), ε(xi ), and ε0 contribute to Z (xi ). To address this issue, it is
proposed to (i) disaggregate Z (xi ) and determine the effects of m(xi ) and ε(xi ) on prediction
accuracy under various combinations of spatial fields of earth attributes encountered in different
earthscapes; and (ii) use SciVis to transpose quantitative results into visual space.
3. Analysis of data
3.1. Disaggregation of earthscapes
To quantify earth patterns, simulated spatial fields that resemble specific spatial phenomena
within real-world earthscapes are generated. Each field consists of 121 nodes (xi ), where
i designates the spatial coordinates (x, y), with 100 × 100 m spacing following a Gaussian
distribution of values z. The first field consists of random (R) patterns of z 0 (xi ), which varies
between zero and one (z 1 (xi ) to z 121 (xi )), generated using a random number generator in ArcGIS
(Environmental Systems Research Institute (ESRI), Redlands, CA, USA). The variable z 0 (xi )
shows a mean of 0.47, median of 0.47, standard deviation of 0.28 and skewness coefficient of
0.13 providing an ideal test set to study spatial patterns. The 121 node values are rearranged
to generate linear (L), short-range (SR), long-range (LR), mixed long and short-range (LSR),
mixed long-range and random (LRR), mixed long-range and linear (LRL), mixed short-range
and random (SRR) and mixed short-range and linear (SRL) spatial patterns. To generate SR and
LR fields, respectively, the attribute values z 0 (xi ) are rearranged 250 times (i.e. 250 iterations
are performed) across the 121 nodes using an objective function to minimize the range a
(to generate the SR field) and maximize a (to generate the LR field) using a spherical function
to predict semivariograms (equation (6); [21]) in ISATIS version 8.0 (Geovariances Inc., Avon,
France).
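$$\gamma(h) = C_0 + C_1 \left\{ \frac{3h}{2a} - \frac{1}{2} \left( \frac{h}{a} \right)^3 \right\} \quad \text{if } 0 < h \leq a,$$
$$\gamma(h) = C_0 + C_1 \quad \text{if } h > a, \tag{6}$$
and
$$\gamma(0) = 0.$$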
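The spherical model of equation (6) is simple to evaluate numerically. The sketch below is illustrative; the function and argument names are assumptions, and the example parameters are the nugget, partial sill and range listed for the LR field in table 2.

```python
import numpy as np

def spherical_semivariogram(h, nugget, partial_sill, range_a):
    """Spherical semivariogram: rises to nugget + partial_sill at h = range_a."""
    h = np.asarray(h, dtype=float)
    ratio = np.minimum(h / range_a, 1.0)   # constant beyond the range
    gamma = nugget + partial_sill * (1.5 * ratio - 0.5 * ratio ** 3)
    return np.where(h > 0.0, gamma, 0.0)   # gamma(0) = 0 by definition

# LR field parameters as listed in table 2 (C0 = 0.0024, C1 = 0.0976, a = 793 m).
lags = np.array([0.0, 396.5, 793.0, 2000.0])
gamma_lr = spherical_semivariogram(lags, 0.0024, 0.0976, 793.0)
```

Beyond the range the function stays at the sill C0 + C1, which here is 0.1.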
To generate the L field attribute values, (z 1 (xi ) to z 121 (xi )) are ordered sequentially from
minimum to maximum values in north–south and west–east directions across all 121 nodes
to generate linear patterns. The mixed synthetic fields (LSR, LRR, LRL, SRR and SRL) are
derived by adding generated fields of R, L, SR and LR in various forms. For example, to
generate the synthetic field LSR, the fields of LR and SR are added. Note that all spatial fields
have values that follow a Gaussian distribution which avoids unstable predictions and stabilizes
variances [21].
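The construction of the L field and of a mixed field can be sketched as follows. This is only one possible reading of the procedure: the random values are drawn here with NumPy rather than ArcGIS, the sorted values are laid out row by row, and the sum of the linear and random fields is a stand-in that merely shows how superimposing two fields with values between zero and one yields values between zero and two. The SR and LR fields, which require the iterative rearrangement described above, are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(42)

# Random (R) field: 121 values between zero and one on an 11 x 11 grid.
z0 = rng.random(121)
field_r = z0.reshape(11, 11)

# Linear (L) field: the same 121 values, ordered from minimum to maximum
# and laid out sequentially across the grid (assumed row-by-row layout).
field_l = np.sort(z0).reshape(11, 11)

# Mixed fields are obtained by adding generated fields node by node;
# this particular sum is illustrative only.
field_mixed = field_l + field_r
```

Because the L field is a rearrangement of the R field, both share the same univariate statistics and differ only in their spatial structure.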
We use semivariograms, spatial range, nugget to sill ratio (NSR) [21] and Moran’s I
coefficient [25, 26, 29] to characterize spatial dependence structures (ε(xi )) and the strength of
spatial autocorrelations for each simulated field. Moran’s I values (equation (3)) are compared
with random patterns (‘no spatial autocorrelation’). The test is based on the null hypothesis
H0 ‘there is no spatial autocorrelation’. Under H0, the expected value of Moran’s I coefficient is
E(I) = −(n − 1)^(−1) ≈ 0, with E(I) being the expectation of I and n the number of data points [28].
OK is used to predict values at unsampled locations projected onto a grid with 10 m spacing
covering an area of 100 ha (figure 2). The spatial portion of the analysis is conducted in ISATIS
version 8.0 (Geovariances Inc., Avon, France) and visualization in ArcGIS 9.2 (ESRI, Redlands,
CA). Spatial patterns of generated fields L, SR and LR vary between zero and one; and mixed
LSR, LRR, LRL, SRR and SRL vary between zero and two. Cross-validation is used to assess
the prediction performance using the mean error (ME) [21] (equation (7)), root mean square
error (RMSE) and coefficient of determination (R 2 ).
$$\mathrm{ME} = \frac{1}{n} \sum_{i=1}^{n} [\hat{z}(x_i) - z(x_i)], \tag{7}$$
with ẑ(xi ) being the predicted values, z(xi ) the observed values and n the number of
observations.
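The three performance metrics can be computed together from paired predicted and observed values. The sketch below follows equation (7) for the ME; the text does not spell out the formulas for RMSE and R², so the usual root mean square of the errors and the squared Pearson correlation are assumed here, and the function name is hypothetical.

```python
import numpy as np

def prediction_metrics(predicted, observed):
    """Return (ME, RMSE, R2) for paired predicted and observed values."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    err = predicted - observed

    me = err.mean()                        # equation (7); signed, so errors can cancel
    rmse = np.sqrt((err ** 2).mean())      # assumed standard definition
    r2 = np.corrcoef(predicted, observed)[0, 1] ** 2  # assumed: squared correlation
    return me, rmse, r2

me, rmse, r2 = prediction_metrics([1.0, 2.0, 4.0], [1.0, 2.0, 3.0])
_, _, r2_scaled = prediction_metrics([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
```

The second call shows why the metrics are reported together: predictions that are twice the observed values have a perfect squared correlation even though ME and RMSE are large.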
Next consider predictions derived using an earth sensing method, statistical or mechanistic
model to map z(xi ) in parameter space across an earthscape. These predictions model the m(xi )
component of equation (1). Various sets of observations at 121 node locations of earth attributes
derived at xi are considered. Each set represents a scenario along trajectories of prediction
accuracies. The first set (z 0 (xi )) is considered the most accurate, e.g. an analytical method used
in the laboratory to measure a given biogeochemical earth attribute deriving the ‘true’ value.
Additional sets assume that values deviate from the original set by 10% (z10(xi)), 20% (z20(xi)),
30% (z30(xi)), 40% (z40(xi)), 50% (z50(xi)), 60% (z60(xi)), 70% (z70(xi)), 80% (z80(xi)),
90% (z90(xi)) and 100% (z100(xi)), respectively. These error fields are assumed to be stationary
and may represent deviations due to measurement error, prediction error (e.g. mechanistic or
statistical earth models), or both. For each attribute set (z10(xi), ..., z100(xi)), the R², ME
and RMSE in cross-validation mode are derived to quantify the prediction performance of
attributes.
Figure 2. Spatial patterns of generated fields: linear, short-range and long-range;
random; short-range and random, and long-range and random; and mixed short-range
and linear, long-range and linear, long-range and short-range. The mixed
spatial patterns (3rd and 4th columns) were generated by superimposing patterns
shown in the 1st and 2nd columns.
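The text does not state how the deviating sets are constructed. One simple reading, adopted here purely as an assumption, is that every value is inflated by the stated percentage (a positive, multiplicative deviation). Under that assumption the ME of scenario zp(xi) equals p% of the mean of z0(xi). The sketch below uses hypothetical random values in place of the original data.

```python
import numpy as np

rng = np.random.default_rng(1)
z0 = rng.random(121)  # hypothetical stand-in for the original attribute values

def scenario(z_true, percent):
    """Assumed construction: each value deviates upward by `percent` of itself."""
    return z_true * (1.0 + percent / 100.0)

# Mean error of each scenario relative to the original set.
mean_errors = {p: float(np.mean(scenario(z0, p) - z0)) for p in range(10, 101, 10)}
```

With a mean near 0.47, this construction would give an ME near 0.047 for the 10% scenario and a linear increase with p, but the sketch should not be taken as the procedure actually used.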
The final step is to consider a mixed model that predicts an earth attribute acknowledging
both spatial and non-spatial components of variation across an earthscape. First, the
deterministic trend is modeled for various attribute sets (z 10 (xi ), . . . , z 100 (xi )) and then the
spatial dependence structure of residuals is modeled to describe ε(xi ) for various simulated
spatial fields (L, LR, LRL, SRL, LSR, LRR and SR). Since kriging is an exact interpolator
the sequence of modeling first m(xi ) and then ε(xi ) of residuals cannot be reversed. For each
prediction model (m(xi ) and ε(xi )), the prediction accuracy is assessed in cross-validation mode
using R 2 , ME and RMSE.
Each error metric is rescaled to a range from zero to one with the aim to standardize them
(i.e. to make them comparable) where zero indicates optimized predictions and one indicates
poor predictions. Rescaling was implemented by setting the maximum ME to 1 and dividing
all MEs by the maximum ME. The same rescaling procedure was adopted for RMSE and
1 − R², respectively. Note that an inverse approach for R² is implemented with rescaled 1 − R²
of 0 indicating optimized predictions and value of 1 for poor predictions. All three previously
scaled metrics are then added and rescaled to the range of 0–1 to derive a synergy (S) value
assuming equal weighting of each error metric (ME, RMSE and R²). SciVis is used to assign
each performance metric a specific color code to allow side-by-side comparison among m(xi)
and ε(xi) for different spatial fields and parameter scenarios (z10(xi), ..., z100(xi)).
Table 1. Root mean square error (RMSE), coefficient of determination (R²) and
mean prediction error (ME) for various sets of attribute values.
Scenario    RMSE    R²      ME^a
z10(xi)     0.055   0.990   0.0474
z20(xi)     0.110   0.960   0.0948
z30(xi)     0.165   0.909   0.1422
z40(xi)     0.220   0.839   0.1896
z50(xi)     0.275   0.748   0.2370
z60(xi)     0.331   0.637   0.2844
z70(xi)     0.386   0.505   0.3317
z80(xi)     0.441   0.354   0.3791
z90(xi)     0.496   0.182   0.4265
z100(xi)    0.551   0       0.4739
^a It is assumed that deviations between predicted and original values are positive.
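The rescaling and the synergy value can be reproduced for the trend scenarios from the RMSE, R² and ME values reported in table 1. This is a sketch under one assumption: the maxima are taken over the ten trend scenarios only, whereas the study may have rescaled across trend and spatial models jointly.

```python
import numpy as np

# Values from table 1, scenarios z10(xi) to z100(xi).
rmse = np.array([0.055, 0.110, 0.165, 0.220, 0.275, 0.331, 0.386, 0.441, 0.496, 0.551])
r2 = np.array([0.990, 0.960, 0.909, 0.839, 0.748, 0.637, 0.505, 0.354, 0.182, 0.0])
me = np.array([0.0474, 0.0948, 0.1422, 0.1896, 0.2370, 0.2844, 0.3317, 0.3791, 0.4265, 0.4739])

def rescale(metric):
    """Divide by the maximum so that the poorest prediction maps to one."""
    return metric / metric.max()

# R2 is inverted (1 - R2) so that zero is best for all three metrics.
summed = rescale(me) + rescale(rmse) + rescale(1.0 - r2)

# Equal weighting: rescale the sum back to the range 0-1.
synergy = summed / summed.max()
```

Under this assumption the synergy value is approximately 0.070 for z10(xi) and 1 for z100(xi).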
4. Results and discussion
A combination of rescaled error metrics (0–1) and SciVis allows the effects of trend and
spatial components on prediction performance to be compared. In essence, the variation of earth
attributes across a larger region is disaggregated into trend and spatial components. To identify
which modeling approach is best, depending on underlying spatial patterns, is of paramount
interest in earth science studies. Options include to model (i) only the trend component
m(xi ), (ii) only the spatial component ε(xi ) of observations z(xi ), or (iii) a combination
of trend component m(xi ) followed by modeling the spatial component ε(xi ) of residuals
(m(xi ) − z(xi )).
According to [26], the NSR classifies the spatial dependence of a variable as either strong
(NSR ≤ 0.25), moderate (0.25 < NSR < 0.75) or weak (NSR > 0.75). Based on the NSR, fields
L, LRL, LR, SRL, LSR and LRR show strong spatial dependence, field SR moderate spatial
dependence and SRR and R no spatial dependence, because no sill and nugget are generated for
those two fields (table 2). According to Moran’s I index, there is less than 1% likelihood that
the clustered patterns could be the result of random chance for all fields, except fields SRR and
R, which show neither spatially autocorrelated nor dispersed patterns.
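The NSR classification can be expressed as a small helper. The sketch assumes that the sill is the sum of nugget and partial sill, which is the usual convention but is not stated explicitly in the text; the function names are hypothetical. The thresholds are those quoted from [26].

```python
def nugget_to_sill_ratio(nugget, partial_sill):
    """NSR, assuming sill = nugget + partial sill."""
    return nugget / (nugget + partial_sill)

def classify_spatial_dependence(nsr):
    """Classes quoted from [26]; a value of exactly 0.75 is treated as weak here."""
    if nsr <= 0.25:
        return "strong"
    if nsr < 0.75:
        return "moderate"
    return "weak"

# Nugget and partial sill of fields L and SR as listed in table 2.
nsr_l = nugget_to_sill_ratio(0.0001, 0.0600)
nsr_sr = nugget_to_sill_ratio(0.0539, 0.0271)
class_l = classify_spatial_dependence(nsr_l)
class_sr = classify_spatial_dependence(nsr_sr)
```

For these two fields the helper returns strong (L) and moderate (SR) spatial dependence.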
Table 2. Root mean square error (RMSE), coefficient of determination (R²) and
mean prediction error (ME) for various sets of spatial patterns of earth variable
z(xi) and their semivariogram models and parameters (nugget, sill and range).
Variable  Model^a  Nugget  Partial sill  NSR    Range (m)  RMSE   R²     ME       Moran’s I (Z score)
L         Bes.     0.0001  0.0600        0.002  461        0.017  0.996  0.0013   0.32 (37.17)
LR        Sph.     0.0024  0.0976        0.02   793        0.081  0.916  0.0025   0.44 (19.52)
LRL       Gau.     0.0145  0.1967        0.06   874        0.089  0.943  0.0095   0.26 (30.38)
SR        Sph.     0.0539  0.0271        0.66   256        0.252  0.183  0.0006   0.08 (4.02)
SRL       Sph.     0.0263  0.3376        0.07   1671^b     0.244  0.610  0.0008   0.17 (19.83)
LSR       Exp.     0.0291  0.1564        0.15   655        0.255  0.587  0.0012   0.11 (13.55)
LRR       Sph.     0.0506  0.1362        0.27   843        0.329  0.323  −0.0014  0.09 (11.50)
SRR       c        c       c             c      c          c      c      c        0.0 (0.64)
R         c        c       c             c      c          c      c      c        0.0 (0.55)
^a Models: Bessel-J (Bes.), spherical (Sph.), exponential (Exp.) and Gaussian (Gau.).
^b Spatial model is poor with long range larger than the dimension of the field.
^c Semivariogram shows pure nugget effect.
As expected, for scenarios z10(xi) to z100(xi), the RMSE (0.055–0.551) and ME
(0.0474–0.4739) increase linearly, whereas R² shows the opposite trend. Low RMSEs (≤0.089)
are only achieved within L, LRL and LR spatial fields and somewhat in fields with SR, SRL
and LSR. This suggests that linear spatial trends show strong spatial dependence, confirmed by
the high Moran’s I coefficients that reduce errors (RMSE and ME) and increase prediction
performance as indicated by R². The introduction of short-range variation in fields diminishes
the prediction performance indicated by a decreasing R² and increasing RMSE and ME. In
fields that show random or mixed short-range and random patterns (R and SRR), interpolations
are associated with large
errors. Mixed spatial models that contain random or short-range variations (SRL, LSR and LRR)
demonstrate large RMSE (≥0.244) and low R² (≤0.610) comparable with attribute predictions
that deviate more than 40% from the true values considering the RMSE (>0.220) and more than
60% considering the R² (<0.637). These findings suggest that depending on the underlying
spatial patterns of a given attribute found in an earthscape, it is important to model either the
spatial component (table 2) or the non-spatial trend (table 1). However, at this point it is not
clear to what extent the accuracy of predictions in spatial or parameter space is better or worse.
Thus, no decision on efficient use of sampling and analytical resources can be made but will be
addressed below.
The effects of m(xi) and ε(xi) on prediction accuracy under various combinations of spatial
fields of earth attributes are illustrated visually in figures 3–6. High rescaled values (maximum
of one) represent low accuracy in predictions and vice versa. The ME considers over- and
underestimations, the RMSE considers the squared deviations between predicted and observed
values, whereas the R² is focused on highlighting the unexplained variance as a ‘goodness-of-fit’
metric. As expected, rescaled values of 1 − R² increase steeply for the deterministic
trend component m(xi) under various parameter scenarios (z10(xi) to z100(xi)) from 0.01 to
1.0, respectively (figure 3). For rescaled 1 − R², there is no major effect of ε(xi) on original
observations (z0(xi)) or residuals among different parameter scenarios since it was assumed
that parameter deviations are equally distributed at all observation sites. However, there are large
variations in ε(xi) among different spatial fields (L, LR, LRL, SRL, LSR, LRR or SR). Overall,
rescaled 1 − R² is two orders of magnitude lower (<0.084) for L, LR, and LRL compared with
other spatial fields. The rescaled 1 − R² amounted to 0.408 (SRL), 0.452 (LSR), 0.662 (LRR)
and 0.724 (SR), respectively. Based on visual interpretation, a rescaled 1 − R² of z70(xi) of
0.490 is in a comparable range to the spatial model (ε(xi)) derived from a field with LSR spatial
pattern (0.452). This suggests that a prediction model m(xi) that is 70% off is similar in terms
of rescaled 1 − R² when compared with a spatial interpolation model (ε(xi)) in an earthscape
with dominant long and short-range patterns. Interestingly, the rescaled 1 − R² is similar for
modeling ε(xi) of original values of fields (i.e. z(xi)) and (ε(xi)) of residuals across various
spatial fields.
Figure 3. Rescaled 1 − R² values (0–1) for trend m(xi), spatial ε(xi) and residual
spatial ε(xi) models along trajectories of accuracies (z10(xi) to z100(xi)) and
spatial fields with linear (L), long-range (LR), long-range and linear (LRL),
short-range and linear (SRL), long and short-range (LSR), long-range and
random (LRR) and short-range (SR).
The rescaled ME increases along profiles of increasing inaccuracy (z10(xi) to z100(xi)), but
speckled patterns emerge for ε(xi) considering different spatial fields (figure 4). It is important
to note that ε(xi) residuals for rescaled ME are one order of magnitude higher for L, LR
and LRL, but much lower for SRL, LSR, LRR and SR when compared with rescaled 1 − R².
This suggests that long-range and long-range and linear patterns contribute to enlarging the ME
when modeling the spatial dependence structure of residuals across various scenarios (z10(xi) to
z100(xi)). The rescaled ME for m(xi) and ε(xi) residuals on fields with SR patterns mirror each
other along trajectories of accuracies (z10(xi) to z100(xi)). This suggests that in an earthscape
with pronounced short-range patterns it is challenging to model spatial autocorrelation patterns,
possibly requiring many observation sites to capture the underlying short-range variability of
earth attributes.
Figure 4. Rescaled mean error (ME) values (0–1) for trend m(xi), spatial ε(xi)
and residual spatial ε(xi) models along trajectories of accuracies (z10(xi) to
z100(xi)) and spatial fields with L, LR, LRL, SRL, LSR, LRR and SR.
Figure 5 illustrates the behavior of rescaled RMSE for m(xi ) and various ε(xi ) models on
different spatial fields. Similar to ME, the rescaled RMSE increases for m(xi ) along different
parameter scenarios (z10(xi) to z100(xi)). In contrast with rescaled ME, the rescaled RMSE is
two orders of magnitude lower for ε(xi ) residuals on spatial fields L, LR and LRL. The reverse
trend occurs on spatial fields SRL and LSR, whereas LRR and SR behave similarly for rescaled
RMSE and ME.
Combining all three assessment metrics (1 − R², ME and RMSE) into a rescaled synergy
value (S), the effects of m(xi ), ε(xi ) and mixed models (m(xi ) and ε(xi ) of residuals) are
contrasted (figure 6). Overall, L, LR and LRL show lowest rescaled S for ε(xi ) of residuals
(< 0.4) across all parameter scenarios (z 10 (xi ) to z 100 (xi )). The largest rescaled S is reached for
m(xi ) of parameter scenario z 100 (xi ). Other large rescaled S values occur on fields LRR and SR,
in particular in parameter scenarios z 50 (xi ) to z 100 (xi ). This illustrates the usefulness of rescaling
in combination with SciVis.
Assume that a mixed model at z10(xi) is modeled with a trend m(xi) of 0.070 and residuals
from the trend ε(xi) of 0.015 on an L spatial field. Both trend and spatial models show high
prediction performance indicated by low rescaled S values. Yet consider the case where the
spatial residual model is on an SR field with the ε(xi) residual of 0.302, three times larger than
the m(xi) trend model with rescaled S of 0.070. Thus, modeling the spatial residuals on an
SR field would possibly inflate the overall error in predicting earth attributes. Further assume a
mixed model at z50(xi) modeled with a trend m(xi) of 0.416 and residuals from the trend ε(xi) of
0.188 (L), 0.173 (LR), 0.188 (LRL), 0.300 (SRL), 0.301 (LSR), 0.531 (LRR) and 0.542 (SR).
These findings illustrate that caution is required when using mixed models on LRR and SR
spatial fields, because they would possibly introduce additional large errors into the prediction
process of earth attributes. If the parameter prediction model is more than 70% off from the true
earth attribute value z70(xi), the proportion between trend and spatial errors reverses with higher
rescaled S value for m(xi) with 0.750, but much lower values for ε(xi) residual with 0.291 (L),
0.257 (LR), 0.291 (LRL), 0.399 (SRL) and 0.391 (LSR). Only residual ε(xi) on spatial fields
LRR with 0.717 and SR with 0.724 show similar high rescaled S values when compared with the
trend m(xi). This disproportionate split between spatial and deterministic trends is also found
for scenarios z80(xi) to z100(xi).
Figure 5. Rescaled root mean square error (RMSE) values (0–1) for trend m(xi),
spatial ε(xi) and residual spatial ε(xi) models along trajectories of accuracies
(z10(xi) to z100(xi)) and spatial fields with L, LR, LRL, SRL, LSR, LRR
and SR.
One may argue that it is expected that increase in measurement errors, in particular for
rough patterns when compared with smoother patterns, will lead to poorer predictions. This is
true but the intriguing aspect of this study is that visual comparisons (figures 2–6) allowed us
to distinguish between errors in representing spatial and non-spatial variability across various
error trajectories.
Figure 6. Rescaled synergy (S) values (0–1) for trend m(xi), spatial ε(xi) and
residual spatial ε(xi) models along trajectories of accuracies (z10(xi) to z100(xi))
and spatial fields with L, LR, LRL, SRL, LSR, LRR and SR.
This approach allows modelers to make a decision regarding analytical procedures
(e.g. desired precision to measure earth/soil properties) under given circumstances
(e.g. landscape settings).
5. Conclusions
In this study, various parameter and spatial scenarios are used to demonstrate the effects of
modeling errors in predicting earth attributes. We use synthetic data with optimized Gaussian
distributions under controlled conditions (e.g. assuming a stationary error process). Although
stationarity in earth/soil attributes in nature is often not found, this study has value in contrasting
different scenarios of possible spatial realizations of properties found in different earthscapes.
Studying spatial and non-spatial behavior and variability in a controlled environment enables us
to make informed decisions on analytical measurements and sampling strategies for capturing
the underlying variability in earth attributes. In this study, our aim is to understand and contrast
the complex spatial behavior of earth attributes, which is dependent on a multifactorial system
of environmental factors (e.g. topography, micro- and macro-climate, parent material/geology,
etc) and human impact (e.g. land use and non-point and point source pollution). This knowledge
can flow into future earth/soil studies that aim to map these patterns across earthscapes.
Depending on the underlying spatial variability, modeling either the deterministic trend
m(x_i), the spatial dependence structure ε(x_i), or mixed components of m(x_i) and ε(x_i) of
residuals is most successful in resembling earth patterns. Random and short-range spatial
structures are the most problematic to model. Earthscapes with long-range and linear patterns
are modeled with higher accuracy by both the trend and the spatial components.
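The split into a deterministic trend m(x_i) and a spatially dependent residual ε(x_i) can be sketched on a synthetic one-dimensional transect. The Python example below is a minimal illustration under stated assumptions and not the procedure used in this study. The moving-average noise, the linear least-squares trend and the lag-1 autocorrelation are simple stand-ins chosen here, and all function names and parameter values are hypothetical.

```python
import math
import random


def simulate_transect(n=200, slope=0.05, window=10, noise_sd=1.0, seed=0):
    """Synthetic transect: linear trend plus an autocorrelated residual."""
    rng = random.Random(seed)
    white = [rng.gauss(0.0, noise_sd) for _ in range(n + window - 1)]
    # A moving average of white noise is correlated over about `window` steps.
    eps = [sum(white[i:i + window]) / window for i in range(n)]
    x = list(range(n))
    z = [slope * xi + e for xi, e in zip(x, eps)]
    return x, z


def fit_linear_trend(x, z):
    """Ordinary least-squares fit of m(x) = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_z = sum(z) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxz = sum((xi - mean_x) * (zi - mean_z) for xi, zi in zip(x, z))
    b = sxz / sxx
    return mean_z - b * mean_x, b


def lag1_autocorrelation(values):
    """Lag-1 autocorrelation, a crude indicator of spatial dependence."""
    n = len(values)
    mean = sum(values) / n
    den = sum((v - mean) ** 2 for v in values)
    num = sum((values[i] - mean) * (values[i + 1] - mean)
              for i in range(n - 1))
    return num / den


def rmse(predicted, observed):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed))
                     / len(observed))


x, z = simulate_transect()
a, b = fit_linear_trend(x, z)
trend = [a + b * xi for xi in x]                    # estimate of m(x_i)
residual = [zi - mi for zi, mi in zip(z, trend)]    # estimate of eps(x_i)

print("fitted slope:", b)
print("RMSE of trend-only prediction:", rmse(trend, z))
print("lag-1 autocorrelation of residuals:", lag1_autocorrelation(residual))
```

A residual autocorrelation near zero would suggest that little spatial structure is left to model after removing the trend, whereas a value near one indicates structure that a spatial model of the residuals could still capture.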
SciVis allows ease of comparison among many simulated output values that are rescaled
(standardized). The visual tables provide an overview of possible patterns encountered in real
earthscapes and will assist in making an informed decision on what strategy to use to model
variability within a specific earthscape. The significance lies in capturing the phenomena in the
form of trend and/or spatial components. As a conjecture, the findings from this study are not
limited to earth attributes, but can be employed to study vegetation species, wildlife species
or other environmental phenomena. Thus, mapping/visualization techniques presented in this
paper have value for many applications in (geo)physical sciences.
References
[1] Moreau E, Velde B and Terribile F 1999 Comparison of 2D and 3D images of fractures in a Vertisol Geoderma
92 55–72
[2] Perret J, Prasher S O, Kantzas A and Langford C 1999 Three-dimensional quantification of macropore
networks in undisturbed soil cores Soil Sci. Soc. Am. J. 63 1530–43
[3] Blair J M, Falconer R E, Milne A C, Young I M and Crawford J W 2007 Modeling three-dimensional
microstructure in heterogeneous media Soil Sci. Soc. Am. J. 71 1807–12
[4] Erskine R H, Green T R, Ramirez J A and MacDonald L H 2007 Digital elevation accuracy and grid cell size:
effects on estimated terrain attributes Soil Sci. Soc. Am. J. 71 1371–80
[5] Grunwald S, Barak P, McSweeney K and Lowery B 2000 Soil landscape models at different scales portrayed
in Virtual Reality Modeling Language (VRML) Soil Sci. 165 598–615
[6] Mendonca Santos M L, Guenat C, Bouzelboudjen M and Golay F 2000 Three-dimensional GIS cartography
applied to the study of the spatial variation of soil horizons in a Swiss floodplain Geoderma 97 351–66
[7] Grunwald S and Barak P 2003 3D geographic reconstruction and visualization techniques applied to land
resource management Trans. GIS 7 231–41
[8] Bellamy P H, Loveland P J, Bradley R I, Lark R M and Kirk G J D 2005 Carbon losses from all soils across
England and Wales 1978–2003 Nature 437 245–7
[9] Walter C, Viscarra Rossel R A and McBratney A B 2003 Spatio-temporal simulation of the field-scale
evolution of organic carbon over the landscape Soil Sci. Soc. Am. J. 67 1477–86
[10] Ramasundaram V, Grunwald S, Mangeot A, Comerford N B and Bliss C M 2005 Development of an
environmental virtual field laboratory J. Comput. Educ. 45 21–34
[11] Young A 1998 Land Resources: Now and for the Future (Cambridge: Cambridge University Press)
[12] Grunwald S 2006 What do we really know about the space-time continuum of soil-landscapes? Environmental
Soil-Landscape Modeling—Geographic Information Technologies and Pedometrics ed S Grunwald
(New York: CRC Press) pp 3–37
[13] Grunwald S 2006 Three-dimensional reconstruction and scientific visualization of soil-landscapes?
Environmental Soil-Landscape Modeling—Geographic Information Technologies and Pedometrics
ed S Grunwald (New York: CRC Press) pp 373–92
[14] DeGloria S D 1993 Visualizing soil behavior Geoderma 60 41–55
[15] Barraclough A and Guymer I 1998 Virtual reality—a role in environmental engineering education? Water Sci.
Technol. 38 303–10
[16] Fisher P and Unwin D 2002 Virtual Reality in Geography (New York: Taylor and Francis)
[17] McBratney A B, Mendonca Santos M L and Minasny B 2003 On digital soil mapping Geoderma 117 3–52
[18] McKenzie N J and Ryan P J 1999 Spatial prediction of soil properties using environmental correlation
Geoderma 89 67–94
[19] Rivero R G, Grunwald S and Bruland G L 2007 Incorporation of spectral data into multivariate geostatistical
models to map soil phosphorus variability in a Florida wetland Geoderma 140 428–43
[20] Grunwald S, Reddy K R, Prenger J P and Fisher M M 2007 Modeling of the spatial variability of
biogeochemical soil properties in a freshwater ecosystem Ecol. Model. 210 521–35
[21] Webster R and Oliver M A 2001 Geostatistics for Environmental Scientists (Chichester, UK: Wiley)
[22] Burrough P A and McDonnell R A 1998 Principles of Geographical Information Systems—Spatial
Information Systems and Geostatistics (New York: Oxford University Press)
[23] Goovaerts P 1997 Geostatistics for Natural Resources Evaluation (Oxford: Oxford University Press)
[24] Chilès J P and Delfiner P 1999 Geostatistics Modeling Spatial Uncertainty (New York: Wiley)
[25] Isaaks E H and Srivastava R M 1989 An Introduction to Applied Geostatistics (Oxford: Oxford University
Press)
[26] Cambardella C A, Moorman T B, Novak J M, Parkin T B, Karlen D L, Turco R F and Konopka A E 1994
Field scale variability of soil properties in central Iowa soils Soil Sci. Soc. Am. J. 58 1501–11
[27] Moran P A P 1950 Notes on continuous stochastic phenomena Biometrika 37 17–23
[28] Rossi J P 1996 Statistical tool for soil biology. Autocorrelogram and Mantel test Eur. J. Soil Biol. 32 195–203
[29] Goodchild M F 1986 Spatial Autocorrelation—Concepts and Techniques in Modern Geography (Norwich,
UK: Geo Books)