New Journal of Physics
The open-access journal for physics

Disaggregation and scientific visualization of earthscapes considering trends and spatial dependence structures

S Grunwald
Soil and Water Science Department, University of Florida, 2169 McCarty Hall, PO Box 110290, Gainesville, FL, USA
E-mail: [email protected]

New Journal of Physics 10 (2008) 125011 (15pp)
Received 17 April 2008
Published 1 December 2008
Online at http://www.njp.org/
doi:10.1088/1367-2630/10/12/125011
1367-2630/08/125011+15$30.00 © IOP Publishing Ltd and Deutsche Physikalische Gesellschaft

Abstract. Earth attributes show complex, heterogeneous spatial patterns generated by exogenous environmental factors and formation processes. This study investigates various strategies to quantify the underlying spatial patterns of simulated fields resembling real earthscapes and to compare their performance for describing them. The approach is to disaggregate the variability of earth attributes into two components, deterministic trend $m(x_i)$ and spatial dependence $\varepsilon(x_i)$, and determine the effects of $m(x_i)$ and $\varepsilon(x_i)$ on prediction accuracy under various combinations of spatial fields of earth attributes encountered in different earthscapes. We illustrate that cross-dependencies exist between spatial and feature accuracy. Scientific visualization is used to transpose quantitative results into visual space.

Contents
1. Introduction
2. Problem description
3. Analysis of data
3.1. Disaggregation of earthscapes
4. Results and discussion
5. Conclusions
References

1. Introduction

Scientific visualization (SciVis) has been employed abundantly using high-resolution x-ray computer-assisted tomography (CAT) to map fractures in clay-rich soils [1], macropores [2] and microstructure in heterogeneous media [3] at fine spatial scales.
Such scanning methods create dense datasets to visualize earth features, but are limited to small areas (centimetre to metre scale) of earthscapes. At landscape scales, striking models were presented reconstructing terrain patterns [4] and soilscapes [5]–[7] in three dimensions. Bellamy et al [8] demonstrated how soil carbon loss/gains across a whole country (England and Wales) can be combined with SciVis methods. Extension of such visual spatial representations of earth patterns into the temporal dimension is still rare. Spatio-temporal simulations and visualization models of soil carbon change and water table fluctuations for large landscapes were presented in [9, 10]. Earthscapes (or soilscapes) show highly complex patterns with diverse earth/soil attributes in geographic space and through time. Earthscape (or soilscape) is a term derived from earth that describes the land surface of the world, especially soils, whereas the term scape refers to the spatial extent. Landscape is the fundamental trait of a specific geographic area, including its composition, physical environment and anthropogenic or social patterns [11]. Environmental soil-landscape modeling is a science devoted to understanding the spatial distribution of soils and coevolving landscapes as part of ecosystems that change dynamically through time [12]. Biogeochemical processes as well as natural and anthropogenic induced forcing functions generate spatial and temporal patterns observable in earthscapes. Earth (soil) properties entail a suite of physical (e.g. bulk density and hydraulic conductivity), chemical (e.g. soil phosphorus, nitrogen, or carbon) and biological (e.g. microbial biomass phosphorus and peptidase activity) attributes. In this study, we focus on the geospatial aspect of disaggregating the variability of earth attributes to better understand their behavior. 
Grunwald [13] suggested an ontology-based approach to describe earthscapes involving: (i) conceptualization of a system; (ii) reconstruction using quantitative methods; and (iii) SciVis addressing the physical, logical, implementation and cognitive universes. SciVis has been suggested to improve our understanding of a complex ecosystem and associated bio-, topo-, pedo- and lithospheres [13, 14]. Barraclough and Guymer [15] argued that advanced visualization techniques communicate complex spatial information intuitively. Maps can visually enhance the spatial and temporal understanding of phenomena of earthscapes. According to [16], visual interfaces maximize our natural perception abilities, improve the comprehension of huge amounts of data, allow the perception of emergent properties that were not anticipated and facilitate understanding of both large-scale and small-scale earth features. SciVis relies on accurate spatial description of earthscapes. At regional scales, observation sets of earth attributes are limited due to labor and costs. Thus, prediction models have been used extensively to fill this data gap [17]. Statistical methods, such as multivariate regression, classification and regression trees, or neural networks, are used to predict earth attributes at unsampled locations using exogenous factors (e.g. land use and topographic attributes) [18]. These factorial models are focused on establishing deterministic linkages between earth attributes and environmental variables that can be measured more rapidly and at much higher density. Remote sensing has been valuable in providing dense layers of land use, land cover, terrain patterns and other environmental attributes complementing sparse, site-specific earth attribute datasets [19]. 
Other prediction models explicitly consider the spatial autocorrelation structure among observed earth attributes and interpolate them taking spatial dependence structures into account. Biogeochemical earth attributes show varying degrees of spatial patterns with short, long, linear and sometimes overlapping spatial structures that are complex to reconstruct and visualize [20]. Soil sensors, such as visible/near-infrared diffuse reflectance spectroscopy or electromagnetic meters, have been used extensively to create denser observation sets across earthscapes. However, the accuracy and precision of in situ field observations still do not match lab-based analytical measurements.

Fundamental questions pertaining to the quantitative description and visualization of earth features remain. How much of the variability in earth attributes can be explained by exogenous environmental factors and how much by spatial dependence? Are mixed models that model both environmental correlation and spatial dependence structures more accurate than models that consider only one of them?

In this paper, we study spatial and non-spatial phenomena in a set of parameterized spatial fields that exhibit a variety of different patterns. These generated patterns may represent different geographic earth features such as biogeochemical soil properties, topographic and land use attributes that form earthscapes. We examine various strategies to quantify the underlying patterns and to compare their performance for describing earthscapes. We illustrate that cross-dependencies exist between spatial and feature accuracy. SciVis is used to render results to allow ease of comparison.

2. Problem description

Earth attributes result from many interacting physical, chemical and biological processes that are nonlinear and/or chaotic and that coevolve in space and time. The outcome is so complex that the variation appears to be random.
If we adopt a stochastic view, then at each point in geographic space there will be not just one value for an attribute but a whole set of values. Thus, at a location $x_i$, an earth attribute ($z$) is treated as a random variable with mean ($\mu$), variance ($\sigma^2$) and a cumulative distribution function (cdf). The set of random variables, $Z_1(x_i), Z_2(x_i), \ldots, Z_n(x_i)$, constitutes a random function (RF) or a stochastic process [21]–[25]. A random variable is a variable whose values are randomly generated according to some probabilistic mechanism. The set of outcomes and their corresponding probabilities is sometimes referred to as the probability distribution of a random variable [25].

Regionalized variable theory assumes that the spatial variation of any variable can be expressed as the sum of three major components (equation (1)) [22]: (i) a structural component $m(x_i)$, having a constant mean or trend that is spatially dependent, (ii) a spatially correlated component $\varepsilon(x_i)$, known as the variation of the regionalized variable, and (iii) a spatially uncorrelated random noise or residual term $\varepsilon_0$. The deterministic component is dependent on some exogenous factors such as climate, vegetation, organisms, topography and geology and can be described by a trend model (e.g. regression, regression variant or process model).

$$Z(x_i) = m(x_i) + \varepsilon(x_i) + \varepsilon_0, \tag{1}$$

where $Z(x_i)$ is the value of a random variable at $x_i$; $m(x_i)$ is a deterministic function describing the 'structural' component of $Z$ at $x_i$; $\varepsilon(x_i)$ is the stochastic, locally varying but spatially dependent residual from $m(x_i)$, i.e. the regionalized variable; $\varepsilon_0$ is a residual, spatially independent noise term; and $x_i$ is the geographic position ($x$, $y$ and $z$ coordinates).

Commonly, earth observations are spatially autocorrelated, meaning that observations obtained close to each other are more likely to be similar than observations taken further apart from each other.
This spatial correlation of $\varepsilon(x_i)$ is described by the semivariance $\gamma$. If $\gamma$ is plotted as a function of the lag distance $h$, the semivariogram is obtained. In equation (2), one implicit assumption is that the semivariance depends only on the separation distance $h$ and not on the positions $x_i$ and $x_i + h$ (stationarity assumption). $\hat{\gamma}(h)$ is estimated as [21]:

$$\hat{\gamma}(h) = \frac{1}{2N(h)} \sum_{i=1}^{N(h)} \left[ z(x_i) - z(x_i + h) \right]^2, \tag{2}$$

where $\hat{\gamma}$ is the semivariance, $h$ is the distance (lag) in metres, and $N(h)$ is the number of location pairs separated by the vector (or lag) $h$. From the semivariogram, the nugget variance ($C_0$), sill variance ($C_1$) and range ($a$) can be derived to describe the spatial behavior of the observed variable [21, 23, 24] (figure 1). The rate of the semivariogram increase reflects the degree of dissimilarity of ever more distant samples. If the semivariogram reaches a limiting value, called the sill, it means that there is a distance beyond which attribute values are uncorrelated. This distance is called the range. The nugget variance captures (i) a microstructure, namely a component of the phenomenon with a range shorter than the sampling support (true nugget effect), (ii) a structure with a range shorter than the smallest interpoint distance, and (iii) measurement or positioning errors [24]. The nugget to sill ratio can be used to express the magnitude of the spatial dependence in a given dataset [24, 26].

Figure 1. Example of a conceptual experimental semivariogram with nugget variance, sill variance and range.

A variable is said to be autocorrelated (or regionalized) when the measure made at one sampling site brings information on the values recorded at a point located a given distance apart.
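The estimator in equation (2) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the ISATIS procedure used in the study: the function name `empirical_semivariogram`, the isotropic lag classes of width `lag_width`, and the uniformly random stand-in field are all assumptions introduced here.

```python
import numpy as np


def empirical_semivariogram(coords, values, lag_width, n_lags):
    """Estimate gamma_hat(h) of equation (2) for isotropic distance classes.

    coords: (n, 2) array-like of x, y positions in metres.
    values: (n,) array-like of observations z(x_i).
    Returns lag-class centres, semivariances and pair counts N(h).
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    # All unordered pairs (i < j) of locations.
    i, j = np.triu_indices(len(values), k=1)
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sq_diff = (values[i] - values[j]) ** 2
    # Pair k belongs to lag class floor(dist / lag_width).
    lag_class = np.floor(dist / lag_width).astype(int)
    centres = (np.arange(n_lags) + 0.5) * lag_width
    gamma = np.full(n_lags, np.nan)
    counts = np.zeros(n_lags, dtype=int)
    for k in range(n_lags):
        in_class = lag_class == k
        counts[k] = in_class.sum()
        if counts[k] > 0:
            gamma[k] = sq_diff[in_class].sum() / (2.0 * counts[k])
    return centres, gamma, counts


# Two-node toy case: a single pair 100 m apart with values 1 and 3.
toy_centres, toy_gamma, toy_counts = empirical_semivariogram(
    [[0.0, 0.0], [100.0, 0.0]], [1.0, 3.0], lag_width=150.0, n_lags=1
)

# Hypothetical stand-in for a spatially random field: 11 x 11 nodes, 100 m spacing.
rng = np.random.default_rng(0)
gx, gy = np.meshgrid(np.arange(11) * 100.0, np.arange(11) * 100.0)
grid = np.column_stack([gx.ravel(), gy.ravel()])
z_random = rng.random(121)
centres, gamma, counts = empirical_semivariogram(grid, z_random, lag_width=150.0, n_lags=6)
```

For a spatially random field such as `z_random`, the estimated semivariances are expected to scatter around a constant level with no clear rise, which corresponds to a pure nugget semivariogram.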
The autocorrelation coefficient measures the degree of autocorrelation and can be expressed by Moran's I coefficient [27, 28] according to the following equation:

$$I(d) = \frac{n \sum_{i} \sum_{j} w_{ij} (y_i - \bar{y})(y_j - \bar{y})}{W \sum_{i} (y_i - \bar{y})^2} \quad \text{for } i \neq j, \tag{3}$$

where
• $I$ is Moran's I coefficient (positive values of $I$ correspond to positive autocorrelation);
• $d$ is the distance class, which is a function of the separating distance between sampling points;
• $y_i$ and $y_j$ are the values of the variables with $i$ and $j$ varying from 1 to $n$ (the number of data points);
• $\bar{y}$ is the mean of the $y$s;
• $w_{ij}$ is a weighting factor taking the value of 1 if the points belong to the same distance class and zero otherwise; and
• $W$ is the sum of the $w$s, i.e. the number of data pairs involved in the estimation of the coefficient for the distance class $d$.

Ordinary kriging (OK) is a commonly used interpolation method that predicts values at unsampled locations using a weighted linear combination of observed data in its neighborhood [21, 23]. It takes into account the way in which a property varies in space through the semivariogram model. The aim of kriging is to estimate the value of the random variable $\hat{Z}(x_0)$ at unsampled points from observation data (equation (4); [21]). The weights are allocated to the sample data within the neighborhood of the point (or block) to be predicted in such a way as to minimize the kriging variance (equation (5); [21]).

$$\hat{Z}(x_0) = \sum_{i=1}^{N} \lambda_i z(x_i) \quad \text{with} \quad \sum_{i=1}^{N} \lambda_i = 1 \tag{4}$$

to ensure that the prediction is unbiased. The expected error is $E[\hat{Z}(x_0) - Z(x_0)] = 0$. The prediction variance is

$$\mathrm{var}[\hat{Z}(x_0)] = E\left[ \{ \hat{Z}(x_0) - Z(x_0) \}^2 \right] = 2 \sum_{i=1}^{N} \lambda_i \gamma(x_i, x_0) - \sum_{i=1}^{N} \sum_{j=1}^{N} \lambda_i \lambda_j \gamma(x_i, x_j). \tag{5}$$

Observations across earthscapes are limited due to labor and costs of data collection.
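Equations (4) and (5) can be made concrete with a small numerical sketch. It assumes the usual Lagrange-multiplier form of the ordinary kriging system and a spherical semivariogram; the functions `spherical` and `ordinary_kriging`, the four-node layout and the observation values are hypothetical. The semivariogram parameters are borrowed from the LR row of table 2 purely for illustration. This is not the ISATIS implementation used in the study.

```python
import numpy as np


def spherical(h, nugget, partial_sill, a):
    """Spherical semivariogram with gamma(0) = 0."""
    h = np.asarray(h, dtype=float)
    rising = nugget + partial_sill * (1.5 * h / a - 0.5 * (h / a) ** 3)
    gamma = np.where(h <= a, rising, nugget + partial_sill)
    return np.where(h == 0.0, 0.0, gamma)


def ordinary_kriging(coords, values, x0, gamma_fn):
    """Return prediction, kriging variance, weights and Lagrange multiplier at x0."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    n = len(values)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    d0 = np.linalg.norm(coords - np.asarray(x0, dtype=float), axis=1)
    # Kriging system: [Gamma 1; 1^T 0] [lambda; mu] = [gamma_0; 1].
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = gamma_fn(d)
    A[n, n] = 0.0
    b = np.append(gamma_fn(d0), 1.0)
    solution = np.linalg.solve(A, b)
    lam, mu = solution[:n], solution[n]
    prediction = lam @ values  # equation (4)
    # Equation (5) evaluated with the solved weights.
    variance = 2.0 * lam @ gamma_fn(d0) - lam @ gamma_fn(d) @ lam
    return prediction, variance, lam, mu


# Hypothetical observations at the corners of a 100 m cell.
coords = np.array([[0.0, 0.0], [100.0, 0.0], [0.0, 100.0], [100.0, 100.0]])
values = np.array([0.2, 0.4, 0.5, 0.9])
model = lambda h: spherical(h, nugget=0.0024, partial_sill=0.0976, a=793.0)

# Prediction at the cell centre and at an observed node.
pred, var, lam, mu = ordinary_kriging(coords, values, (50.0, 50.0), model)
pred_at_node, var_at_node, _, _ = ordinary_kriging(coords, values, coords[0], model)
```

At the cell centre the four weights are equal by symmetry and sum to one, and at the optimum equation (5) reduces to the sum of the weighted semivariances plus the Lagrange multiplier. Predicting at an observed node returns the observed value with zero variance, which is the exact-interpolator property of kriging.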
An adaptable approach that considers variation in parameter and spatial space is needed to derive accurate predictions of earth attributes over a large region. The challenge is that it is typically unknown in what proportion $m(x_i)$, $\varepsilon(x_i)$ and $\varepsilon_0$ contribute to $Z(x_i)$. To address this issue, it is proposed to (i) disaggregate $Z(x_i)$ and determine the effects of $m(x_i)$ and $\varepsilon(x_i)$ on prediction accuracy under various combinations of spatial fields of earth attributes encountered in different earthscapes; and (ii) use SciVis to transpose quantitative results into visual space.

3. Analysis of data

3.1. Disaggregation of earthscapes

To quantify earth patterns, simulated spatial fields that resemble specific spatial phenomena within real-world earthscapes are generated. Each field consists of 121 nodes ($x_i$), where $i$ designates the spatial coordinates ($x$, $y$), with 100 × 100 m spacing following a Gaussian distribution of values $z$. The first field consists of random (R) patterns of $z_0(x_i)$, which varies between zero and one ($z_1(x_i)$ to $z_{121}(x_i)$), generated using a random number generator in ArcGIS (Environmental Systems Research Institute (ESRI), Redlands, CA, USA). The variable $z_0(x_i)$ shows a mean of 0.47, median of 0.47, standard deviation of 0.28 and skewness coefficient of 0.13, providing an ideal test set to study spatial patterns. The 121 node values are rearranged to generate linear (L), short-range (SR), long-range (LR), mixed long and short-range (LSR), mixed long-range and random (LRR), mixed long-range and linear (LRL), mixed short-range and random (SRR) and mixed short-range and linear (SRL) spatial patterns. To generate SR and LR fields, respectively, the attribute values $z_0(x_i)$ are rearranged 250 times (i.e.
250 iterations are performed) across the 121 nodes using an objective function to minimize the range $a$ (to generate the SR field) and maximize $a$ (to generate the LR field) using a spherical function to model semivariograms (equation (6); [21]) in ISATIS version 8.0 (Geovariances Inc., Avon, France).

$$\gamma(h) = \begin{cases} C_0 + C_1 \left\{ \dfrac{3h}{2a} - \dfrac{1}{2} \left( \dfrac{h}{a} \right)^3 \right\} & \text{if } 0 < h \le a, \\ C_0 + C_1 & \text{if } h > a, \\ 0 & \text{if } h = 0. \end{cases} \tag{6}$$

To generate the L field, the attribute values ($z_1(x_i)$ to $z_{121}(x_i)$) are ordered sequentially from minimum to maximum values in north–south and west–east directions across all 121 nodes to generate linear patterns. The mixed synthetic fields (LSR, LRR, LRL, SRR and SRL) are derived by adding generated fields of R, L, SR and LR in various forms. For example, to generate the synthetic field LSR, the fields of LR and SR are added. Note that all spatial fields have values that follow a Gaussian distribution, which avoids unstable predictions and stabilizes variances [21].

We use semivariograms, spatial range, nugget to sill ratio (NSR) [21] and Moran's I coefficient [25, 26, 29] to characterize spatial dependence structures ($\varepsilon(x_i)$) and the strength of spatial autocorrelations for each simulated field. Moran's I values (equation (3)) are compared with random patterns ('no spatial autocorrelation'). The test is based on the null hypothesis $H_0$ 'there is no spatial autocorrelation'. Under $H_0$, the value of Moran's I coefficient is $E(I) = -(n - 1)^{-1} \approx 0$, with $E(I)$ being the expectation of $I$ and $n$ the number of data points [28].

OK is used to predict values at unsampled locations projected onto a grid with 10 m spacing covering an area of 100 ha (figure 2). The spatial portion of the analysis is conducted in ISATIS version 8.0 (Geovariances Inc., Avon, France) and visualization in ArcGIS 9.2 (ESRI, Redlands, CA). Spatial patterns of generated fields L, SR and LR vary between zero and one; and mixed LSR, LRR, LRL, SRR and SRL vary between zero and two.
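The contrast between a random and a linear field under Moran's I (equation (3)) can be sketched as follows. The sketch makes several assumptions that are not taken from the study: the function `morans_i` is hypothetical, the distance class is limited to nearest neighbours on the 100 m grid, uniformly random numbers stand in for the R field, and the L field is approximated by sorting those values along the grid rows. The resulting coefficients therefore illustrate the behavior only and are not expected to reproduce the values obtained in ISATIS.

```python
import numpy as np


def morans_i(coords, y, d_min, d_max):
    """Moran's I of equation (3) for the distance class d_min < d <= d_max."""
    coords = np.asarray(coords, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    # w_ij = 1 if the pair falls in the distance class, 0 otherwise (i != j).
    w = ((dist > d_min) & (dist <= d_max)).astype(float)
    np.fill_diagonal(w, 0.0)
    dev = y - y.mean()
    W = w.sum()
    return (n / W) * (dev @ w @ dev) / (dev @ dev)


# 11 x 11 nodes with 100 m spacing (121 nodes in total).
rng = np.random.default_rng(42)
gx, gy = np.meshgrid(np.arange(11) * 100.0, np.arange(11) * 100.0)
coords = np.column_stack([gx.ravel(), gy.ravel()])

z_random = rng.random(121)    # stand-in for the R field
z_linear = np.sort(z_random)  # stand-in for the L field: same values, ordered row by row

i_random = morans_i(coords, z_random, 0.0, 110.0)
i_linear = morans_i(coords, z_linear, 0.0, 110.0)
expected_null = -1.0 / (len(z_random) - 1)  # E(I) under H0
```

Because both fields contain exactly the same 121 values, any difference between `i_random` and `i_linear` comes from the spatial arrangement alone, which is the idea behind rearranging one set of node values into different patterns.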
Cross-validation is used to assess the prediction performance using the mean error (ME) [21] (equation (7)), root mean square error (RMSE) and coefficient of determination ($R^2$).

$$\mathrm{ME} = \frac{\sum_{i=1}^{n} \left[ \hat{z}(x_i) - z(x_i) \right]}{n}, \tag{7}$$

with $\hat{z}(x_i)$ being the predicted values, $z(x_i)$ the observed values and $n$ the number of observations.

Figure 2. Spatial patterns of generated fields: linear, short-range and long-range; random; short-range and random, and long-range and random; and mixed short-range and linear, long-range and linear, long-range and short-range. The mixed spatial patterns (3rd and 4th columns) were generated by superimposing patterns shown in the 1st and 2nd columns.

Next consider predictions derived using an earth sensing method, statistical or mechanistic model to map $z(x_i)$ in parameter space across an earthscape. These predictions model the $m(x_i)$ component of equation (1). Various sets of observations at 121 node locations of earth attributes derived at $x_i$ are considered. Each set represents a scenario along trajectories of prediction accuracies. The first set ($z_0(x_i)$) is considered the most accurate, e.g. an analytical method used in the laboratory to measure a given biogeochemical earth attribute deriving the 'true' value. Additional sets assume that values deviate from the original set by 10% ($z_{10}(x_i)$), 20% ($z_{20}(x_i)$), 30% ($z_{30}(x_i)$), 40% ($z_{40}(x_i)$), 50% ($z_{50}(x_i)$), 60% ($z_{60}(x_i)$), 70% ($z_{70}(x_i)$), 80% ($z_{80}(x_i)$), 90% ($z_{90}(x_i)$) and 100% ($z_{100}(x_i)$), respectively. These error fields are assumed to be stationary and may represent deviations due to measurement error, prediction error (e.g. mechanistic or statistical earth models), or both. For each attribute set ($z_{10}(x_i), \ldots, z_{100}(x_i)$), the $R^2$, ME and RMSE in cross-validation mode are derived to quantify the prediction performance of attributes.
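A minimal sketch of the error scenarios and of two of the metrics is given below. It rests on an assumption that the text does not spell out: a deviation of p% is read as every value being inflated by the fraction p/100 (positive deviations only). The helper names `mean_error` and `rmse` and the uniformly random stand-in for $z_0(x_i)$ are likewise hypothetical. $R^2$ is left out because its exact formula is not given in the text.

```python
import numpy as np


def mean_error(pred, obs):
    """Equation (7): average signed deviation between predictions and observations."""
    return float(np.mean(np.asarray(pred, dtype=float) - np.asarray(obs, dtype=float)))


def rmse(pred, obs):
    """Root mean square error between predictions and observations."""
    diff = np.asarray(pred, dtype=float) - np.asarray(obs, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


rng = np.random.default_rng(1)
z0 = rng.random(121)  # stand-in for the 'true' set z_0(x_i) at 121 nodes

# Assumption: scenario z_p inflates every value by p percent.
scenarios = {p: z0 * (1.0 + p / 100.0) for p in range(10, 101, 10)}
errors = {p: (mean_error(z, z0), rmse(z, z0)) for p, z in scenarios.items()}
```

Under this reading, ME equals p/100 times the mean of $z_0(x_i)$ and RMSE equals p/100 times the root mean square of $z_0(x_i)$, so both metrics grow linearly with the deviation. With the reported mean of 0.47, a 10% deviation gives an ME of about 0.047.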
The final step is to consider a mixed model that predicts an earth attribute acknowledging both spatial and non-spatial components of variation across an earthscape. First, the deterministic trend is modeled for various attribute sets ($z_{10}(x_i), \ldots, z_{100}(x_i)$) and then the spatial dependence structure of residuals is modeled to describe $\varepsilon(x_i)$ for various simulated spatial fields (L, LR, LRL, SRL, LSR, LRR and SR). Since kriging is an exact interpolator, the sequence of modeling first $m(x_i)$ and then $\varepsilon(x_i)$ of residuals cannot be reversed. For each prediction model ($m(x_i)$ and $\varepsilon(x_i)$), the prediction accuracy is assessed in cross-validation mode using $R^2$, ME and RMSE.

Table 1. Root mean square error (RMSE), coefficient of determination ($R^2$) and mean prediction error (ME) for various sets of attribute values.

| Scenario | RMSE | $R^2$ | ME (a) |
| $z_{10}(x_i)$ | 0.055 | 0.990 | 0.0474 |
| $z_{20}(x_i)$ | 0.110 | 0.960 | 0.0948 |
| $z_{30}(x_i)$ | 0.165 | 0.909 | 0.1422 |
| $z_{40}(x_i)$ | 0.220 | 0.839 | 0.1896 |
| $z_{50}(x_i)$ | 0.275 | 0.748 | 0.2370 |
| $z_{60}(x_i)$ | 0.331 | 0.637 | 0.2844 |
| $z_{70}(x_i)$ | 0.386 | 0.505 | 0.3317 |
| $z_{80}(x_i)$ | 0.441 | 0.354 | 0.3791 |
| $z_{90}(x_i)$ | 0.496 | 0.182 | 0.4265 |
| $z_{100}(x_i)$ | 0.551 | 0 | 0.4739 |

(a) It is assumed that deviations between predicted and original values are positive.

Each error metric is rescaled to a range from zero to one with the aim to standardize them (i.e. to make them comparable), where zero indicates optimized predictions and one indicates poor predictions. Rescaling was implemented by setting the maximum ME to 1 and dividing all MEs by the maximum ME. The same rescaling procedure was adopted for RMSE and $1 - R^2$, respectively. Note that an inverse approach for $R^2$ is implemented, with a rescaled $1 - R^2$ of 0 indicating optimized predictions and a value of 1 indicating poor predictions.
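The rescaling described above, together with the equally weighted synergy value (S) that the text goes on to define, can be sketched with the trend scenarios of table 1. This is a sketch under stated assumptions: the helper `rescale` is hypothetical, and the maxima are taken over the ten trend scenarios only, whereas the study may have rescaled across trend and spatial models together.

```python
import numpy as np

# Table 1 values for scenarios z_10 ... z_100.
rmse = np.array([0.055, 0.110, 0.165, 0.220, 0.275, 0.331, 0.386, 0.441, 0.496, 0.551])
r2 = np.array([0.990, 0.960, 0.909, 0.839, 0.748, 0.637, 0.505, 0.354, 0.182, 0.0])
me = np.array([0.0474, 0.0948, 0.1422, 0.1896, 0.2370, 0.2844, 0.3317, 0.3791, 0.4265, 0.4739])


def rescale(metric):
    """Divide by the maximum so that 0 is best and 1 is worst."""
    metric = np.asarray(metric, dtype=float)
    return metric / metric.max()


# Rescale ME, RMSE and 1 - R^2 separately, then add them with equal weights.
scaled = np.vstack([rescale(me), rescale(rmse), rescale(1.0 - r2)])
synergy = rescale(scaled.sum(axis=0))
```

With these inputs the synergy value for $z_{10}(x_i)$ comes out at about 0.070 and the value for $z_{100}(x_i)$ at 1, since all three metrics reach their maximum in the last scenario.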
All three previously scaled metrics are then added and rescaled to the range of 0–1 to derive a synergy (S) value assuming equal weighting of each error metric (ME, RMSE and $R^2$). SciVis is used to assign each performance metric a specific color code to allow side-by-side comparison among $m(x_i)$ and $\varepsilon(x_i)$ for different spatial fields and parameter scenarios ($z_{10}(x_i), \ldots, z_{100}(x_i)$).

4. Results and discussion

A combination of rescaled error metrics (0–1) and SciVis allows the effects of trend and spatial components on prediction performance to be compared. In essence, the variation of earth attributes across a larger region is disaggregated into trend and spatial components. Identifying which modeling approach is best for the underlying spatial patterns is of paramount interest in earth science studies. The options are to model (i) only the trend component $m(x_i)$, (ii) only the spatial component $\varepsilon(x_i)$ of observations $z(x_i)$, or (iii) a combination of the trend component $m(x_i)$ followed by modeling the spatial component $\varepsilon(x_i)$ of residuals ($m(x_i) - z(x_i)$).

According to [26], the NSR classifies the spatial dependence of a variable as either strong (NSR ≤ 0.25), moderate (0.25 < NSR < 0.75) or weak (NSR > 0.75). Based on the NSR, fields L, LRL, LR, SRL, LSR and LRR show strong spatial dependence, field SR moderate spatial dependence and SRR and R no spatial dependence, because no sill and nugget are generated for those two fields (table 2). According to Moran's I index, there is less than 1% likelihood that the clustered patterns could be the result of random chance for all fields, except fields SRR and R, which show neither spatially autocorrelated nor dispersed patterns.

As expected, for scenarios $z_{10}(x_i)$ to $z_{100}(x_i)$, the RMSE (0.055–0.551) and ME (0.0474–0.4739) increase linearly, whereas $R^2$ shows the opposite trend. Low RMSEs (<0.089) are only achieved within L, LRL and LR spatial fields and somewhat in fields with SR, SRL and LSR.
This suggests that linear spatial trends show strong spatial dependence, confirmed by the high Moran's I coefficients, and that this dependence reduces errors (RMSE and ME) and increases prediction performance as indicated by $R^2$. The introduction of short-range variation in fields diminishes the prediction performance, as indicated by a decreasing $R^2$ and increasing RMSE and ME. In fields that show mixed spatial patterns (R and SRR), interpolations are associated with large errors. Mixed spatial models that contain random or short-range variations (SRL, LSR and LRR) demonstrate large RMSE (>0.244) and low $R^2$ (<0.610), comparable with attribute predictions that deviate more than 40% from the true values considering the RMSE (>0.220) and more than 60% considering the $R^2$ (<0.637).

Table 2. Root mean square error (RMSE), coefficient of determination ($R^2$) and mean prediction error (ME) for various sets of spatial patterns of earth variable $z(x_i)$ and their semivariogram models and parameters (nugget, sill and range).

| Variable $z(x_i)$ | Model (a) | Nugget | Partial sill | NSR | Range (m) | RMSE | $R^2$ | ME | Moran's I (Z score) |
| L | Bes. | 0.0001 | 0.0600 | 0.002 | 461 | 0.017 | 0.996 | 0.0013 | 0.32 (37.17) |
| LR | Sph. | 0.0024 | 0.0976 | 0.02 | 793 | 0.081 | 0.916 | 0.0025 | 0.44 (19.52) |
| LRL | Gau. | 0.0145 | 0.1967 | 0.06 | 874 | 0.089 | 0.943 | 0.0095 | 0.26 (30.38) |
| SR | Sph. | 0.0539 | 0.0271 | 0.66 | 256 | 0.252 | 0.183 | 0.0006 | 0.08 (4.02) |
| SRL | Sph. | 0.0263 | 0.3376 | 0.07 | 1671 (b) | 0.244 | 0.610 | 0.0008 | 0.17 (19.83) |
| LSR | Exp. | 0.0291 | 0.1564 | 0.15 | 655 | 0.255 | 0.587 | 0.0012 | 0.11 (13.55) |
| LRR | Sph. | 0.0506 | 0.1362 | 0.27 | 843 | 0.329 | 0.323 | −0.0014 | 0.09 (11.50) |
| SRR | (c) | (c) | (c) | (c) | (c) | (c) | (c) | (c) | 0.0 (0.64) |
| R | (c) | (c) | (c) | (c) | (c) | (c) | (c) | (c) | 0.0 (0.55) |

(a) Models: Bessel-J (Bes.), spherical (Sph.), exponential (Exp.) and Gaussian (Gau.).
(b) Spatial model is poor with long range larger than the dimension of the field.
(c) Semivariogram shows pure nugget effect.
These findings suggest that depending on the underlying spatial patterns of a given attribute found in an earthscape, it is important to model either the spatial component (table 2) or the non-spatial trend (table 1). However, at this point it is not clear to what extent the accuracy of predictions in spatial or parameter space is better or worse. Thus, no decision on the efficient use of sampling and analytical resources can be made yet; this is addressed below.

The effects of $m(x_i)$ and $\varepsilon(x_i)$ on prediction accuracy under various combinations of spatial fields of earth attributes are illustrated visually in figures 3–6. High rescaled values (maximum of one) represent low accuracy in predictions and vice versa. The ME considers over- and underestimations, the RMSE considers the squared deviations between predicted and observed values, whereas the $R^2$ is focused on highlighting the unexplained variance as a 'goodness-of-fit' metric.

As expected, rescaled values of $1 - R^2$ for the deterministic trend component $m(x_i)$ increase steeply from 0.01 to 1.0 across the parameter scenarios ($z_{10}(x_i)$ to $z_{100}(x_i)$) (figure 3). For rescaled $1 - R^2$, there is no major effect of $\varepsilon(x_i)$ on original observations ($z_0(x_i)$) or residuals among different parameter scenarios, since it was assumed that parameter deviations are equally distributed at all observation sites. However, there are large variations in $\varepsilon(x_i)$ among different spatial fields (L, LR, LRL, SRL, LSR, LRR or SR). Overall, rescaled $1 - R^2$ is two orders of magnitude lower (<0.084) for L, LR and LRL compared with other spatial fields. The rescaled $1 - R^2$ amounted to 0.408 (SRL), 0.452 (LSR), 0.662 (LRR) and 0.724 (SR), respectively. Based on visual interpretation, a rescaled $1 - R^2$ of $z_{70}(x_i)$ of 0.490 is in a comparable range to the spatial model ($\varepsilon(x_i)$) derived from a field with LSR spatial pattern (0.452). This suggests that a prediction model $m(x_i)$ that is 70% off is similar in terms of rescaled $1 - R^2$ when compared with a spatial interpolation model ($\varepsilon(x_i)$) in an earthscape with dominant long and short-range patterns. Interestingly, the rescaled $1 - R^2$ is similar for modeling $\varepsilon(x_i)$ of original values of fields (i.e. $z(x_i)$) and $\varepsilon(x_i)$ of residuals across various spatial fields.

Figure 3. Rescaled $1 - R^2$ values (0–1) for trend $m(x_i)$, spatial $\varepsilon(x_i)$ and residual spatial $\varepsilon(x_i)$ models along trajectories of accuracies ($z_{10}(x_i)$ to $z_{100}(x_i)$) and spatial fields with linear (L), long-range (LR), long-range and linear (LRL), short-range and linear (SRL), long and short-range (LSR), long-range and random (LRR) and short-range (SR).

The rescaled ME increases along profiles of increasing inaccuracy ($z_{10}(x_i)$ to $z_{100}(x_i)$), but speckled patterns emerge for $\varepsilon(x_i)$ considering different spatial fields (figure 4). It is important to note that $\varepsilon(x_i)$ residuals for rescaled ME are one order of magnitude higher for L, LR and LRL, but much lower for SRL, LSR, LRR and SR when compared with rescaled $1 - R^2$. This suggests that long-range patterns, and long-range and linear patterns, contribute to enlarging the ME when modeling the spatial dependence structure of residuals across various scenarios ($z_{10}(x_i)$ to $z_{100}(x_i)$). The rescaled ME for $m(x_i)$ and $\varepsilon(x_i)$ residuals on fields with SR patterns mirror each other along trajectories of accuracies ($z_{10}(x_i)$ to $z_{100}(x_i)$). This suggests that in an earthscape with pronounced short-range patterns it is challenging to model spatial autocorrelation patterns, possibly requiring many observation sites to capture the underlying short-range variability of earth attributes.

Figure 4. Rescaled mean error (ME) values (0–1) for trend $m(x_i)$, spatial $\varepsilon(x_i)$ and residual spatial $\varepsilon(x_i)$ models along trajectories of accuracies ($z_{10}(x_i)$ to $z_{100}(x_i)$) and spatial fields with L, LR, LRL, SRL, LSR, LRR and SR.

Figure 5 illustrates the behavior of rescaled RMSE for $m(x_i)$ and various $\varepsilon(x_i)$ models on different spatial fields. Similar to ME, the rescaled RMSE increases for $m(x_i)$ along different parameter scenarios ($z_{10}(x_i)$ to $z_{100}(x_i)$). In contrast with rescaled ME, the rescaled RMSE is two orders of magnitude lower for $\varepsilon(x_i)$ residuals on spatial fields L, LR and LRL. The reverse trend occurs on spatial fields SRL and LSR, whereas LRR and SR behave similarly for rescaled RMSE and ME.

Figure 5. Rescaled root mean square error (RMSE) values (0–1) for trend $m(x_i)$, spatial $\varepsilon(x_i)$ and residual spatial $\varepsilon(x_i)$ models along trajectories of accuracies ($z_{10}(x_i)$ to $z_{100}(x_i)$) and spatial fields with L, LR, LRL, SRL, LSR, LRR and SR.

Combining all three assessment metrics ($1 - R^2$, ME and RMSE) into a rescaled synergy value (S) allows the effects of $m(x_i)$, $\varepsilon(x_i)$ and mixed models ($m(x_i)$ and $\varepsilon(x_i)$ of residuals) to be contrasted (figure 6). Overall, L, LR and LRL show the lowest rescaled S for $\varepsilon(x_i)$ of residuals (<0.4) across all parameter scenarios ($z_{10}(x_i)$ to $z_{100}(x_i)$). The largest rescaled S is reached for $m(x_i)$ of parameter scenario $z_{100}(x_i)$. Other large rescaled S values occur on fields LRR and SR, in particular in parameter scenarios $z_{50}(x_i)$ to $z_{100}(x_i)$. This illustrates the usefulness of rescaling in combination with SciVis.

Assume that a mixed model at $z_{10}(x_i)$ is modeled with a trend $m(x_i)$ of 0.070 and residuals from the trend $\varepsilon(x_i)$ of 0.015 on an L spatial field. Both trend and spatial models show high prediction performance, indicated by low rescaled S values. Yet consider the case where the spatial residual model is on an SR field with the $\varepsilon(x_i)$ residual of 0.302, three times larger than the $m(x_i)$ trend model with rescaled S of 0.070. Thus, modeling the spatial residuals on an SR field would possibly inflate the overall error in predicting earth attributes. Further assume a mixed model at $z_{50}(x_i)$ modeled with a trend $m(x_i)$ of 0.416 and residuals from the trend $\varepsilon(x_i)$ of 0.188 (L), 0.173 (LR), 0.188 (LRL), 0.300 (SRL), 0.301 (LSR), 0.531 (LRR) and 0.542 (SR). These findings illustrate that caution is required when using mixed models on LRR and SR spatial fields, because they would possibly introduce additional large errors into the prediction process of earth attributes. If the parameter prediction model is more than 70% off from the true earth attribute value ($z_{70}(x_i)$), the proportion between trend and spatial errors reverses, with a higher rescaled S value for $m(x_i)$ of 0.750, but much lower values for the $\varepsilon(x_i)$ residual of 0.291 (L), 0.257 (LR), 0.291 (LRL), 0.399 (SRL) and 0.391 (LSR). Only the residual $\varepsilon(x_i)$ on spatial fields LRR (0.717) and SR (0.724) shows similarly high rescaled S values when compared with the trend $m(x_i)$. This disproportionate split between spatial and deterministic trends is also found for scenarios $z_{80}(x_i)$ to $z_{100}(x_i)$.

One may argue that an increase in measurement errors, in particular for rough patterns when compared with smoother patterns, is expected to lead to poorer predictions. This is true, but the intriguing aspect of this study is that visual comparisons (figures 2–6) allowed us to distinguish between errors in representing spatial and non-spatial variability across various error trajectories. This approach allows modelers to make a decision regarding analytical procedures (e.g. desired precision to measure earth/soil properties) under given circumstances (e.g. landscape settings).

Figure 6. Rescaled synergy (S) values (0–1) for trend $m(x_i)$, spatial $\varepsilon(x_i)$ and residual spatial $\varepsilon(x_i)$ models along trajectories of accuracies ($z_{10}(x_i)$ to $z_{100}(x_i)$) and spatial fields with L, LR, LRL, SRL, LSR, LRR and SR.

5. Conclusions

In this study, various parameter and spatial scenarios are used to demonstrate the effects of modeling errors in predicting earth attributes. We use synthetic data with optimized Gaussian distributions under controlled conditions (e.g. assuming a stationary error process). Although stationarity in earth/soil attributes in nature is often not found, this study has value in contrasting different scenarios of possible spatial realizations of properties found in different earthscapes. Studying spatial and non-spatial behavior and variability in a controlled environment enables us to make informed decisions on analytical measurements and sampling strategies for capturing the underlying variability in earth attributes. In this study, our aim is to understand and contrast the complex spatial behavior of earth attributes, which is dependent on a multifactorial system of environmental factors (e.g. topography, micro- and macro-climate, parent material/geology, etc) and human impact (e.g. land use and non-point and point source pollution). This knowledge can flow into future earth/soil studies that aim to map these patterns across earthscapes. Depending on the underlying spatial variability, either modeling the deterministic trend $m(x_i)$, the spatial dependence structure $\varepsilon(x_i)$, or mixed components of $m(x_i)$ and $\varepsilon(x_i)$ of residuals is most successful in resembling earth patterns.
Random and short-range spatial structures are the most problematic to model when describing earthscapes. The trend and spatial components of earthscapes with long-range and linear patterns are modeled with higher accuracy. SciVis allows ease of comparison among many simulated output values that are rescaled (standardized). The visual tables provide an overview of possible patterns encountered in real earthscapes and will assist in making an informed decision on which strategy to use to model variability within a specific earthscape. The significance lies in capturing the phenomena in the form of trend and/or spatial components. As a conjecture, the findings from this study are not limited to earth attributes, but can be employed to study vegetation species, wildlife species or other environmental phenomena. Thus, the mapping/visualization techniques presented in this paper have value for many applications in the (geo)physical sciences.

References

[1] Moreau E, Velde B and Terribile F 1999 Comparison of 2D and 3D images of fractures in a Vertisol Geoderma 92 55–72
[2] Perret J, Prasher S O, Kantzas A and Langford C 1999 Three-dimensional quantification of macropore networks in undisturbed soil cores Soil Sci. Soc. Am. J. 63 1530–43
[3] Blair J M, Falconer R E, Milne A C, Young I M and Crawford J W 2006 Modeling three-dimensional microstructure in heterogeneous media Soil Sci. Soc. Am. J. 71 1807–12
[4] Erskine R H, Green T R, Ramirez J A and MacDonald L H 2007 Digital elevation accuracy and grid cell size: effects on estimated terrain attributes Soil Sci. Soc. Am. J. 71 1371–80
[5] Grunwald S, Barak P, McSweeney K and Lowery B 2000 Soil landscape models at different scales portrayed in Virtual Reality Modeling Language (VRML) Soil Sci. 165 598–615
[6] Mendonca Santos M L, Guenat C, Bouzelboudjen M and Golay F 2000 Three-dimensional GIS cartography applied to the study of the spatial variation of soil horizons in a Swiss floodplain Geoderma 97 351–66
[7] Grunwald S and Barak P 2003 3D geographic reconstruction and visualization techniques applied to land resource management Trans. GIS 7 231–41
[8] Bellamy P H, Loveland P J, Bradley R I, Lark R M and Kirk G J D 2005 Carbon losses from all soils across England and Wales 1978–2003 Nature 437 245–7
[9] Walter C, Viscarra Rossel R A and McBratney A B 2003 Spatio-temporal simulation of the field evolution of organic carbon over the landscape Soil Sci. Soc. Am. J. 67 1477–86
[10] Ramasundaram V, Grunwald S, Mangeot A, Comerford N B and Bliss C M 2005 Development of an environmental virtual field laboratory J. Comput. Educ. 45 21–34
[11] Young A 1998 Land Resources: Now and for the Future (Cambridge: Cambridge University Press)
[12] Grunwald S 2006 What do we really know about the space-time continuum of soil-landscapes? Environmental Soil-Landscape Modeling: Geographic Information Technologies and Pedometrics ed S Grunwald (New York: CRC Press) pp 3–37
[13] Grunwald S 2006 Three-dimensional reconstruction and scientific visualization of soil-landscapes? Environmental Soil-Landscape Modeling: Geographic Information Technologies and Pedometrics ed S Grunwald (New York: CRC Press) pp 373–92
[14] DeGloria S D 1993 Visualizing soil behavior Geoderma 60 41–55
[15] Barraclough A and Guymer I 1998 Virtual reality: a role in environmental engineering education? Water Sci. Technol. 38 303–10
[16] Fisher P and Unwin D 2002 Virtual Reality in Geography (New York: Taylor and Francis)
[17] McBratney A B, Mendonca Santos M L and Minasny B 2003 On digital soil mapping Geoderma 117 3–52
[18] McKenzie N J and Ryan P J 1999 Spatial prediction of soil properties using environmental correlation Geoderma 89 67–94
[19] Rivero R G, Grunwald S and Bruland G L 2007 Incorporation of spectral data into multivariate geostatistical models to map soil phosphorus variability in a Florida wetland Geoderma 140 428–43
[20] Grunwald S, Reddy K R, Prenger J P and Fisher M M 2007 Modeling of the spatial variability of biogeochemical soil properties in a freshwater ecosystem Ecol. Model. 210 521–35
[21] Webster R and Oliver M A 2001 Geostatistics for Environmental Scientists (Chichester, UK: Wiley)
[22] Burrough P A and McDonnell R A 1998 Principles of Geographical Information Systems: Spatial Information Systems and Geostatistics (New York: Oxford University Press)
[23] Goovaerts P 1997 Geostatistics for Natural Resources Evaluation (Oxford: Oxford University Press)
[24] Chilès J P and Delfiner P 1999 Geostatistics Modeling Spatial Uncertainty (New York: Wiley)
[25] Isaaks E H and Srivastava R M 1989 An Introduction to Applied Geostatistics (Oxford: Oxford University Press)
[26] Cambardella C A, Moorman T B, Novak J M, Parkin T B, Karlan D L, Turco R F and Konopka A E 1994 Field scale variability of soil properties in central Iowa soils Soil Sci. Soc. Am. J. 58 1501–11
[27] Moran P A P 1950 Notes on continuous stochastic phenomena Biometrika 37 17–23
[28] Rossi J P 1996 Statistical tool for soil biology. Autocorrelogram and Mantel test Eur. J. Soil Biol. 32 195–203
[29] Goodchild M F 1986 Spatial Autocorrelation: Concepts and Techniques in Modern Geography (Norwich, UK: Geo Books)