Soil Science Society of America Journal 64:1674-1680 (2000)
© 2000 Soil Science Society of America
DIVISION S-4-SOIL FERTILITY & PLANT NUTRITION
Soil Sampling Strategies for Precision Agriculture Research under Sahelian Conditions
J.W. van Groenigen^a, M. Gandah^b, and J. Bouma^c
a Soil Sci. Div., Int. Inst. for Aerospace Survey and Earth Sci. (ITC), P.O. Box 6, 7500 AA Enschede, the Netherlands, currently at Univ. of California-Davis, Dep. of Agron. and Range Sci., 1 Shield Av., Davis, CA 95616-0515 USA
b Institut National de Recherche Agronomique du Niger (INRAN), B.P. 429, Niamey, Niger
c Dep. of Environ. Sci., Wageningen Agricultural Univ., P.O. Box 37, 6700 AA Wageningen, The Netherlands
groenigen{at}ucdavis.edu
ABSTRACT
The cost of soil samples to characterize field variability is a key problem in precision agriculture. This study was conducted to investigate whether yield maps can be used to optimize soil sampling for characterizing soil variables that determine yield variability. Using an inexpensive, low-tech scoring technique, yield maps of pearl millet [Pennisetum glaucum (L.) Br.] were produced for a zero-input farm in Niger. The soil was classified as a Typic Plinthustalf. The Spatial Simulated Annealing (SSA) algorithm was used to optimize three sampling schemes. Scheme 1 optimized coverage over the whole area. Scheme 2 covered the whole yield range. Scheme 3 covered the low-producing areas. Yield varied from 0 to 2500 kg ha-1, measured per planting hill. Using correlation coefficients, Scheme 2 found significant correlations between five soil variables and yield. Scheme 1 found only one significant correlation and explained 37% of the variation in yield using multivariate regression of yield on soil variables. Scheme 2 explained 70% of the variation in yield. Differences between Scheme 3 and Scheme 1 proved to be significant for distance to shrubs, relief, soil pH, and cation-exchange capacity (CEC). We concluded that shrubs are the main factor influencing millet yield by means of catching eroded materials and improving soil fertility. The possibilities of planting shrubs to improve soil fertility should be investigated. Variograms of relief and yield suggested that spatial correlation is largely confined to distances of 3 to 5 m. Since Scheme 2 was most effective in establishing soil-yield relationships, we concluded that yield maps can be used to optimize soil sampling.
Abbreviations: CEC, cation-exchange capacity; MMSD, minimization of the mean of the shortest distances; OM, organic matter; SSA, spatial simulated annealing.
INTRODUCTION
IN RECENT YEARS, an increased interest among soil scientists regarding soil variability on the field scale has been observed. Geostatistics plays an important part in the quantitative evaluation of such spatial variability. Interpolation techniques such as ordinary (block) kriging and indicator kriging have been applied for optimal interpolation of point observations (e.g., Van Uffelen et al., 1997; Stein et al., 1997). Also, stochastic simulations have been used to reproduce the spatial variability that has been modeled from the observations on simulated maps (e.g., Gómez-Hernández and Srivastava, 1990; Goovaerts, 1997).
Apart from these applications and adaptations of existing geostatistical tools, precision agriculture poses some more specific challenges to geostatisticians. One of these is the increasing availability of maps of data that can be helpful for purposes such as mapping of soil properties or yield prediction. Examples of such data are maps of soil tillage resistance (Van Bergeijk and Goense, 1997), remotely sensed imagery (Booltink and Verhagen, 1997), and yield maps collected using low-tech (Stoorvogel, 1995) or high-tech (Bouma, 1997) approaches.
Franzen et al. (1998) found that partitioning the area according to topography might decrease the number of samples needed to characterize the soil variability compared with a regular grid. In this study, we combined such partitioning of the area with geostatistical analysis and interpolation. Furthermore, we present an algorithm that optimizes spreading of the observations across the study area.
Spatial variability of yield and soils is an important aspect of farming in the zero-input subsistence millet farming systems of the Sahelian Zone. Soils are highly variable in terms of both chemical and physical fertility. In a study in northern Niger, Bouma et al. (1996) mentioned five reasons for this: (i) relief and crusting, which cause crucial redistribution of water infiltration (Gaze et al., 1997); (ii) termites, which may enrich the soil locally; (iii) local effects of trees and shrubs; (iv) differences associated with landscape position; and (v) soil fertility gradients around villages.
Brouwer et al. (1993) reported a yield stabilizing effect of this variability in fields in the same area over years. In relatively wet years, yields were highest at the highest sites, due to catchment of fertile eroded particles. In dry years these spots were nonproductive due to lack of water, while the lower areas ensured a subsistence level of millet yield.
Since both fertility and water supply are often low, even moderate spatial variability of these factors can have profound influence on crop yields. Stein et al. (1997) found a millet yield range of 0 to 2885 kg ha-1, measured in 5 by 5 m blocks. They were able to explain 30% of the yield variability by multiple regression on soil variables. Gandah et al. (1998), using the same support size for yield sampling, were able to explain only 5 to 28% of the yield variability by regression.
These results, combined with observations in the field, suggested that the support size of yield sampling should be smaller to explain most of the yield variability. This study investigates whether yield maps based on a smaller support size can be used to optimize soil sampling for this purpose.
MATERIALS AND METHODS
Study Area
The study area is located on a subsistence farmer's millet field near the village of Tchigo Tagui, 80 km northeast of the city of Niamey in western Niger. The field is located on a laterite plateau with eolian sand deposits. Size of the field is 1.3 ha. The soils can be classified as Typic Plinthustalfs (Soil Survey Staff, 1996). The rainy season is from May to September, and mean annual rainfall is 480 mm (Sivakumar et al., 1993). The field is located in an area where several other studies on millet yield variability were situated (Bouma et al., 1996; Brouwer et al., 1993; Gandah et al., 1998; Gaze et al., 1997; Stein et al., 1997).
Spatial Simulated Annealing
Spatial Simulated Annealing is an algorithm that was designed for optimization of spatial sampling schemes (Van Groenigen and Stein, 1998). Its features include the incorporation of preliminary observations and taking into account sampling constraints and boundaries. Furthermore, SSA allows the use of several quantitative optimization criteria, among them minimization of the ordinary kriging variance and estimation of the experimental variogram (Van Groenigen et al., 1998). In this study, use was made of the MMSD (minimization of the mean of the shortest distances) criterion, which aims at a uniform spreading of the observations across the area of interest. This is done by minimizing the expectation of the distance between an arbitrarily chosen location, and its nearest observation point. This leads to the following minimization function for the MMSD criterion
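$$\phi_{\mathrm{MMSD}}(S) = \int_{A} \left\lVert \vec{x} - V_S(\vec{x}) \right\rVert \, d\vec{x} \qquad (1)$$
where S denotes the sampling scheme, A represents the area of interest, and $\vec{x}$ is a random location vector in A. $V_S(\vec{x})$ represents the location vector of the nearest sampling point $s \in S$. As this minimization function cannot be solved analytically, we estimated it using the following function:
$$\phi_{\mathrm{MMSD}}(S) \approx \frac{1}{n_e} \sum_{j=1}^{n_e} \left\lVert \vec{x}_j^{\,e} - V_S(\vec{x}_j^{\,e}) \right\rVert \qquad (2)$$
where the location vectors $\vec{x}_1^{\,e}, \ldots, \vec{x}_{n_e}^{\,e}$ denote the nodes of a fine raster grid over A, $n_e$ denotes the number of location vectors, and j indexes the individual location vectors.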
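The raster approximation of the MMSD criterion can be illustrated with a minimal Python sketch. The function names `mmsd` and `raster`, the 0.5 m raster spacing, and the two four-point example schemes are assumptions made for illustration only; they are not taken from the SSA implementation used in this study.

```python
import math

def mmsd(scheme, grid_nodes):
    """Mean of the shortest distances from each raster node to its nearest sample."""
    total = 0.0
    for gx, gy in grid_nodes:
        total += min(math.hypot(gx - sx, gy - sy) for sx, sy in scheme)
    return total / len(grid_nodes)

def raster(width, height, step):
    """Cell-centre nodes of a regular raster over a width-by-height rectangle."""
    nx = int(round(width / step))
    ny = int(round(height / step))
    return [((i + 0.5) * step, (j + 0.5) * step)
            for i in range(nx) for j in range(ny)]

# Hypothetical example: a 10 by 10 m plot with a 0.5 m evaluation raster.
nodes = raster(10.0, 10.0, 0.5)
clustered = [(1.0, 1.0), (1.5, 1.0), (1.0, 1.5), (1.5, 1.5)]  # all in one corner
spread = [(2.5, 2.5), (7.5, 2.5), (2.5, 7.5), (7.5, 7.5)]     # one per quadrant

print("clustered:", round(mmsd(clustered, nodes), 2))
print("spread:   ", round(mmsd(spread, nodes), 2))
```

The evenly spread scheme gives the lower criterion value, which is the behavior the MMSD criterion is meant to reward.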
The optimization process starts with an initial sampling scheme S0, consisting of randomly drawn locations over A. Subsequently, an alternative sampling scheme S1 is derived from S0 by a transformation of one of the locations over a random vector. The probability of S1 being accepted as a basis for further optimization depends on the Metropolis criterion (Kirkpatrick et al., 1983)
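$$P_c(S_0 \rightarrow S_1) = \begin{cases} 1, & \text{if } \phi(S_1) \le \phi(S_0) \\ \exp\!\left(\dfrac{\phi(S_0) - \phi(S_1)}{c}\right), & \text{if } \phi(S_1) > \phi(S_0) \end{cases} \qquad (3)$$
where $\phi(\cdot)$ denotes the minimization function (here the MMSD criterion) and c is a positive control parameter. This criterion ensures that occasionally intermediate inferior sampling schemes are also accepted, thereby avoiding premature ending of the optimization process in local minima. As the process continues, parameter c decreases. In this way, the sampling scheme freezes into its optimal configuration.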
For an extensive discussion on SSA and the implementation of the MMSD criterion, see Van Groenigen and Stein (1998).
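The optimization loop described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the SSA implementation of Van Groenigen and Stein (1998): the square plot, the geometric cooling schedule, the shrinking move size, the starting value of c, and the bookkeeping of the best scheme encountered are all choices made for this example.

```python
import math
import random

def mmsd(scheme, nodes):
    """Eq. 2: mean distance from each raster node to its nearest sampling point."""
    return sum(min(math.hypot(gx - sx, gy - sy) for sx, sy in scheme)
               for gx, gy in nodes) / len(nodes)

def anneal(n_points, size, nodes, c=0.2, cooling=0.85, levels=40, moves=30, seed=1):
    rng = random.Random(seed)
    # S0: randomly drawn locations over a square area of side `size`
    scheme = [(rng.uniform(0, size), rng.uniform(0, size)) for _ in range(n_points)]
    phi = phi_start = mmsd(scheme, nodes)
    best, phi_best = list(scheme), phi
    max_shift = size / 2
    for _ in range(levels):
        for _ in range(moves):
            # Derive S1 from S0 by shifting one location over a random vector
            i = rng.randrange(n_points)
            x = scheme[i][0] + rng.uniform(-max_shift, max_shift)
            y = scheme[i][1] + rng.uniform(-max_shift, max_shift)
            if not (0 <= x <= size and 0 <= y <= size):
                continue  # boundary constraint: discard moves that leave the area
            candidate = scheme[:i] + [(x, y)] + scheme[i + 1:]
            phi_new = mmsd(candidate, nodes)
            # Metropolis criterion (Eq. 3): always accept improvements,
            # accept deteriorations with probability exp((phi - phi_new) / c)
            if phi_new <= phi or rng.random() < math.exp((phi - phi_new) / c):
                scheme, phi = candidate, phi_new
                if phi < phi_best:
                    best, phi_best = list(scheme), phi
        c *= cooling          # assumed geometric cooling of parameter c
        max_shift *= cooling  # assumption: moves also shrink as c decreases
    return best, phi_best, phi_start

nodes = [(i + 0.5, j + 0.5) for i in range(10) for j in range(10)]  # 1 m raster
best, phi_best, phi_start = anneal(9, 10.0, nodes)
print(round(phi_start, 2), "->", round(phi_best, 2))
```

Returning the best scheme encountered, rather than the last accepted one, is a safeguard added for this sketch; with a sufficiently slow cooling schedule the two should nearly coincide.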
Soil Sampling
Within the field, three plots of 10 by 10 m each were selected for soil sampling. The plots were selected across the field, spaced 100 m apart.
Three different soil sampling schemes were applied in order to test the benefits of the different approaches. Samples were taken at three depths (0-0.1, 0.1-0.2, and 0.2-0.4 m) and were analyzed for pH, texture, P-Bray, CEC, and organic matter (OM). We measured the pH in a 1:2.5 water suspension. In order to make the case study realistic in terms of financial constraints usually encountered in precision agriculture, each sampling scheme consisted of a total of 27 observations across the three plots. The sampling schemes are listed below:
- The first sampling scheme (S1) aimed at even spreading of the sampling points across the total area (the three plots). For nine locations per plot, SSA yielded a regular square grid.
- Sampling scheme S2 stratified the area according to yield. Nine yield ranges were defined, from zero to maximum yield, and the area was partitioned according to yield range. In each stratum, three observations were distributed optimally using SSA.
- Sampling scheme S3 focused specifically on low-producing areas. This may be especially useful for precision agriculture purposes when a detailed survey of low-spots is desired for remedial action. To define low-producing areas, use was made of the threshold value of 250 kg ha-1 yr-1. This threshold was mentioned in Stein et al. (1997) as the minimum yield required for a family of 10 persons with 8 ha of cropping land. Using yield data for the three plots, the area that produced below this threshold value was delineated, and the 27 observations were distributed across this area using the MMSD criterion.
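The partitioning steps behind S2 and S3 can be sketched in Python. The text does not state how the nine yield ranges were bounded, so equal-width classes between zero and the maximum yield are assumed here; the function names and the example yields are hypothetical.

```python
def yield_strata(yields, n_strata=9):
    """Assign each location to one of n_strata yield ranges (assumed equal width)."""
    width = max(yields) / n_strata
    strata = []
    for y in yields:
        # The maximum yield falls on the upper boundary; keep it in the last class.
        k = min(int(y / width), n_strata - 1) if width > 0 else 0
        strata.append(k)
    return strata

def below_threshold(yields, threshold=250.0):
    """Indices of locations whose predicted yield (kg/ha) is below the threshold."""
    return [i for i, y in enumerate(yields) if y < threshold]

# Hypothetical predicted yields (kg/ha) at six locations.
predicted = [0.0, 100.0, 300.0, 900.0, 1800.0, 2500.0]
print(yield_strata(predicted))     # stratum index per location, for S2
print(below_threshold(predicted))  # locations eligible for S3
```

In this setup, SSA with the MMSD criterion would then be run separately within each stratum (three points each for S2) or within the delineated low-yield area (27 points for S3).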
In addition to these samples, a detailed relief map was produced, as former studies showed a strong influence of relief on yield (Stein et al., 1997; Brouwer et al., 1993). Using a level, relief was measured on a square 1 by 1 m grid.
Since earlier studies suggested an influence of shrubs on yield variability, the variable "distance to nearest shrub" was included.
Scoring
In order to get a fast and inexpensive prediction of millet yield, a semiquantitative scoring technique as presented in Buerkert et al. (1995) was used. Scoring was performed on 4 July, 8 August, and 4 Sept. 1997 in the three plots. Scoring values were assigned to individual hills, ranging from 0 (no plants at all) to 8 (maximal aerial biomass in the field, estimated by observer). For the first two scoring dates, scoring ranges were 0 to 4 and 0 to 6, respectively, because of a lack of discriminative features. The millet was harvested around 7 November.
After the last scoring, several hill samples were collected to establish the relationship between scoring and yield. For each of the nine scoring classes of 4 September, eight hills were harvested, and the heads were threshed in a traditional manner and weighed separately.
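One simple way to use such calibration samples is a lookup of the mean harvested yield per scoring class. The Python sketch below assumes this class-mean approach, which the text does not specify; the data values and function names are hypothetical.

```python
from statistics import mean

def class_means(calibration):
    """calibration: list of (score, yield_per_hill) pairs from harvested hills."""
    by_score = {}
    for score, y in calibration:
        by_score.setdefault(score, []).append(y)
    return {s: mean(v) for s, v in by_score.items()}

def scores_to_yield(scores, lookup):
    """Replace each hill score by the mean yield of its scoring class."""
    return [lookup[s] for s in scores]

# Hypothetical calibration data: (score, grain yield per hill in g).
calibration = [(0, 0.0), (0, 0.0), (4, 20.0), (4, 30.0), (8, 90.0), (8, 110.0)]
lookup = class_means(calibration)
print(scores_to_yield([8, 0, 4, 4], lookup))
```

A lookup of this kind does not assume a linear relation between score and yield, which matters when low scoring classes all correspond to near-zero yield.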
RESULTS
Yield Maps
Figure 1 gives as an example the scoring results of Plot 3 at the three scoring dates. Scoring data are represented using Thiessen polygons, as the semiquantitative character prohibits direct interpolation using kriging.

Fig. 1. Results of a semiquantitative scoring at three dates for Plot 3. Scoring values ranged from 0 (no plants at all) to 8 (maximal aerial biomass). Results are presented using Thiessen polygons.
The results of the scoring calibration are shown in Fig. 2. Based on these data, scoring should not be used directly as a (semi-)quantitative measure that is linearly related to yield, but should be calibrated using real yield measurements. Scoring values of 0 to 3 all resulted in a mean yield very close or equal to 0. Scoring classes 5 to 7 did not differ significantly from each other.
Using these calibration data, the scoring data per hill were transformed into estimated yield data. These were interpolated using ordinary kriging, resulting in the predicted yield maps shown in Fig. 3a through 3c. Table 1 shows the variograms of the yield. These can be characterized by a relatively high nugget effect (54 and 48% for Plots 2 and 3, respectively). Furthermore, ranges of 3.4 and 4.2 m confirm suppositions of high variability at very short distances.
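For readers unfamiliar with variograms, the experimental semivariogram to which such models are fitted can be computed as in the following Python sketch. The lag binning, function name, and three-point example are illustrative assumptions; the model fitting that yields the nugget and range values in Table 1 is not shown.

```python
import math

def semivariogram(points, values, lag_width, n_lags):
    """Experimental semivariance per distance class [k*lag_width, (k+1)*lag_width)."""
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            h = math.hypot(points[i][0] - points[j][0],
                           points[i][1] - points[j][1])
            k = int(h / lag_width)
            if k < n_lags:
                # Each pair contributes half its squared difference
                sums[k] += 0.5 * (values[i] - values[j]) ** 2
                counts[k] += 1
    # None marks distance classes that contain no point pairs
    return [s / c if c else None for s, c in zip(sums, counts)]

# Hypothetical transect of three hills, 1 m apart, with a linear yield trend.
pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
vals = [0.0, 1.0, 2.0]
gamma = semivariogram(pts, vals, lag_width=1.0, n_lags=3)
print(gamma)
```

The nugget effect reported in the text corresponds to the semivariance extrapolated to zero distance, expressed as a percentage of the sill of the fitted model.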
A survey of any shrubs on the plots was made, as previous studies showed that shrubs can have a considerable influence on soil and yield variability (Brouwer et al., 1993). The position of the shrubs is shown in Fig. 3a through 3c.
Soil Data
Table 1 shows fitted variogram models for relief, measured on the 1 by 1 m grid. Starting at the boundaries of the plots, this resulted in a data set of 11 x 11 = 121 observations for each plot. Although the variograms differ much more in character than those of the yield predictions, ranges are still shorter than 5 m. Figures 3d through 3f show the interpolated microtopography for the three plots using ordinary kriging. The maximum difference in relief within a plot is 0.13 m (measured at Plot 1).
The three sampling schemes were established as follows:
- Sampling scheme S1 is shown in Fig. 4a through 4c. As it was not designed using the yield data, it missed several yield extremes, notably all hot spots of Plot 2. Table 2 summarizes the soil data of S1. Texture varies from sand in the upper soil to loamy sand (due to clay illuviation) in the subsoil. This resulted in an increasing CEC with depth. Furthermore, organic matter content and P-Bray decreased with depth. Coefficients of variation were greatest for distance to the nearest shrub (56%).
- Sampling scheme S2 is shown in Fig. 4d through 4f. As this scheme aimed at covering a wide range of yields, most of the hot spots were sampled. All low-producing areas (e.g., the lower left corner of Plot 3) were covered. Most descriptive statistics of S2 were close to those of S1. However, the standard deviation of relief was much higher in S2, and the mean distance to shrubs was much smaller.
- The area with a predicted yield lower than the threshold of 250 kg ha-1 yr-1 was delineated in the three plots, and sampling points of S3 were evenly distributed. Figures 4g through 4i show the resulting sampling scheme. As the predicted yield in Plot 3 was considerably lower than in Plots 1 and 2, most of the observations (15) were taken there. Descriptive statistics are presented in Table 2. The pH was lower at all depths than in S1. Furthermore, texture of S3 was slightly heavier, organic matter content was lower, and distance to shrubs was greater.

Fig. 4. The three sampling schemes for the three plots, superimposed on yield estimates. (a-c) Scheme 1 optimized uniform spreading of the 27 observations over the plot, (d-f) Scheme 2 optimized coverage of the yield range, and (g-i) Scheme 3 optimized coverage of the low-producing areas.
Correlation and Regression Analysis
To test the performances of S1 and S2 in assessing soil-yield relations, we calculated correlation coefficients between yield and soil variables. Yield at sampling locations was predicted using ordinary kriging. For S1, only distance to shrubs was significant, with
, and no other variables show a significant correlation with yield (Table 3)
. The negative correlation of yield with distance to shrubs can be explained by the relative higher fertility (in terms of OM and P-Bray) closer to the shrubs.
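As an illustration of how such correlation coefficients can be judged with 27 observations, the Python sketch below computes Pearson's r and the corresponding t statistic with n - 2 degrees of freedom. That this particular test was used is an assumption; the text does not name the significance test, and the function names are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t value for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))

# With n = 27, the two-sided 5% critical t (25 df) is about 2.06,
# which corresponds to |r| of roughly 0.38.
print(round(t_statistic(0.381, 27), 2))
```

With only 27 observations per scheme, a correlation therefore has to be fairly strong (|r| above about 0.38) before it is declared significant at the 5% level under this test.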
A stepwise linear regression was performed to find significant relations for S1 and S2. The entrance level for variables in the model was 0.05. For yield, the Lilliefors test for normality was not rejected (with
). The selected model for S1 was:
 | (4) |
where Silt[0.1-0.2 m] denotes the silt content in the second layer. This model (with
) yielded an adjusted r2 of 0.372. These results are only slightly better than those from Stein et al. (1997), who found an unadjusted r2 of 0.377 in a much less detailed survey.
For S2, one correlation coefficient was highly significant
, one significant at
, and three more significant with
. The significant correlation coefficient for distance to shrubs confirms the findings of S1. Relief and OM were positively correlated with yield. This is probably caused by the higher relief around shrubs due to accumulation of wind-deposited material. In the second layer, silt was significantly correlated with yield. Organic matter was highly significantly correlated.
The selected regression model for S2 was
 | (5) |
This model yielded an adjusted r2 of 0.705, which is much higher than that obtained by S1. In this model distance to shrubs was not included, despite its highly significant correlation coefficient (Table 3). This might be explained by its highly significant correlation with relief. In fact, univariate regression of yield gave a highly significant model for distance to shrubs:
 | (6) |
Although this model is highly significant, it only explains 25% of the yield variation. Therefore, the multivariate regression was preferred.
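The adjusted r2 values quoted for the regression models penalize the ordinary r2 for the number of predictors, which matters with only 27 observations. A minimal Python sketch of the standard adjustment follows; the input values in the example are hypothetical and are not the models of Eq. 4 through 6.

```python
def adjusted_r2(r2, n, p):
    """Adjusted r2 for a linear model with p predictors fitted to n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

# Hypothetical example: an unadjusted r2 of 0.25 from a one-predictor model, n = 27.
print(round(adjusted_r2(0.25, 27, 1), 3))
# The same unadjusted r2 with four predictors is penalized more strongly.
print(round(adjusted_r2(0.25, 27, 4), 3))
```

Because the penalty grows with the number of predictors, adjusted values are the safer basis for comparing models of different size fitted to the same small data set.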
Student t Test
To find the main soil variables responsible for low yield, a Student t test was performed to show significant differences in soil variables between S1 and S3. Table 4 shows the probabilities that the means of the two populations were similar. Three variables (pH[0-0.1 m], pH[0.1-0.2 m], and CEC[0-0.1 m]) showed highly significant differences, whereas four additional ones (Relief, Distance, pH[0.2-0.4 m], and OM[0.2-0.4 m]) showed significant differences. The significance of distance to shrubs and relief coincides with the findings of the regression analysis. The CEC[0-0.1 m] was significantly higher for S3. Finally, pH differed significantly at all depths, reflecting the general depletion of the low-producing area.
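The comparison of S1 and S3 can be illustrated with a two-sample t statistic. The Python sketch below assumes the pooled-variance (equal variances) form of the Student t test, which the text does not specify; the function name and the example values are hypothetical.

```python
import math
from statistics import mean, variance

def pooled_t(a, b):
    """Two-sample Student t statistic with pooled variance, and its degrees of freedom."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(sp2 * (1 / na + 1 / nb))
    return t, na + nb - 2

# Hypothetical topsoil pH values for two small samples.
ph_s1 = [5.4, 5.6, 5.5, 5.7]
ph_s3 = [5.1, 5.2, 5.0, 5.3]
t, df = pooled_t(ph_s1, ph_s3)
print(round(t, 2), df)
```

With 27 observations in each scheme, the test would have 52 degrees of freedom.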
DISCUSSION AND CONCLUSIONS
Using the SSA algorithm, three different sampling schemes of limited size were constructed. Scheme S1, which aimed at uniform spreading of the sampling points over the area, yielded a significant (negative) correlation of distance to shrubs with yield. Scheme S2, which directed sampling to low and high extremes in yield, showed significant correlations for four additional variables. Moreover, using multivariate regression analyses, the explained yield variation increased from 37% using S1 to 70% using S2. Therefore, we concluded that the approach taken for constructing S2 should be preferred for relating yield to soil variables in case of limited observations.
Differences between S3, which aimed specifically at low-producing areas, and S1 were used to detect the main limiting variables for yield. Distance to shrubs and relief were probably the most important factors. Other significant variables (higher CEC and lower pH in S3) were probably related to the effects of soil erosion due to absence of shrubs.
These findings are in line with existing management techniques practiced by local farmers. Shrubs are seen as a valuable asset, useful in locally improving soil fertility by catchment of airborne eroded particles and by enrichment with organic matter from the shrub. At the start of the growing season, shrubs are trimmed to limit competition with the millet crops. Since rainfall at the beginning of the 1997 growing season was relatively good, this is also in line with the yield-stabilizing effect of soil variability, as reported by Brouwer (1996). In drier years, the yield pattern may be reversed, and distance to shrubs may have a positive correlation with yield. However, in order to optimize the yield-stabilizing effect of the shrubs, planting of shrubs should be considered as a low-tech soil management practice. More research should be dedicated to optimal placement of such shrubs.
The sampling schemes used in this study were deliberately kept small in order to stay close to financial constraints generally met in precision agriculture research. Although costs of laboratory analysis are still out of reach for the marginal farmers in the study area, we think that the techniques developed can be of great value to researchers dealing with the highly variable soils of the Sahelian Zone. Moreover, the type of auxiliary data used to optimize sampling is typical for precision agriculture, and these techniques may be applied in other types of precision agriculture. As an example, remote-sensed data related to crop yield might be treated in the same way as the predicted yield maps in this study.
Since the number of observations was relatively small, relatively simple statistical techniques were used to relate yield to soil variables. In particular, the selection of points for Scheme 2 may lead to overestimation of the effects of the regressed variables. In our opinion, this constrained approach represents the practice of precision agriculture rather well. If a higher number of observations can be collected, geostatistics can be applied for interpolation of soil variables. Auxiliary data like yield maps may then serve as covariable in cokriging.
ACKNOWLEDGMENTS
The authors are very grateful to Joost Brouwer and Alfred Stein for their advice on many aspects of this study. Furthermore, we would like to thank Niek van Duivenbooden and Charles Bielders for their assistance at ICRISAT Sahelian Centre and Tjerk Kuster for his part in the scoring. We are very grateful to Hassan Ousmane for assistance during fieldwork. Finally, we would like to thank three anonymous reviewers for some very helpful comments and suggestions.
Received for publication December 4, 1998.
REFERENCES
- Booltink, H.W.G., and J. Verhagen. 1997. Integration of remote sensing, modeling and field measurements towards an operational decision support system for precision agriculture. p. 921-929. In J.V. Stafford (ed.) Precision agriculture '97. First European Conference on Precision Agriculture. Warwick, UK. 7-10 Sept. 1997. BIOS, Oxford, UK.
- Bouma, J. 1997. Precision agriculture: introduction to the spatial and temporal variability of environmental quality. p. 5-13. In CIBA Foundation Symposium 210. John Wiley and Sons, New York.
- Bouma, J., J. Verhagen, J. Brouwer, and J.M. Powell. 1996. Using systems approaches for targetting site specific management on field level. p. 25-36. In M.J. Kropff et al. (ed.) Applications of systems approaches at the field level. Kluwer Academic Press, Dordrecht, the Netherlands.
- Brouwer, J. 1996. Water and nutrients alternate in limiting agricultural production in the Sahel. In Proc. of the 1st Int. Conf. of the West and Central African Soil Science Assoc. Ouagadougou, Burkina Faso. 6-12 Dec. 1993. Centre National de la Recherche Scientifique et de la Technologie, Ouagadougou, Burkina Faso.
- Brouwer, J., L.K. Fussel, and L. Herrmann. 1993. Soil and crop growth micro-variability in the West African semi-arid tropics: a possible risk reducing factor for subsistence farmers. Agric. Ecosyst. Environ. 45:229-238.
- Buerkert, A., R.D. Stern, and H. Marchner. 1995. Post stratification clarifies treatment effects on pearl millet growth in the Sahel. Agron. J. 87:752-763.
- Franzen, D.W., L.J. Cihacek, V.L. Hofman, and L.J. Swenson. 1998. Topography-based sampling compared with grid sampling in the northern great plains. J. Prod. Agric. 11:364-370.
- Gandah, M., J. Bouma, J. Brouwer, and N. van Duivenbooden. 1998. Use of a scoring technique to assess the effect of field variability on yield of pearl millet grown on three alfisols in Niger. Neth. J. Agric. Sci. 46:39-52.
- Gaze, S.R., L.P. Simmonds, J. Brouwer, and J. Bouma. 1997. Measurement of surface redistribution of rainfall and modeling its effect on water balance calculations for a millet field on sandy soil in Niger. J. Hydrol. 188-189:267-284.
- Gómez-Hernández, J., and R. Srivastava. 1990. ISIM3d: an Ansi-c three dimensional multiple indicator conditional simulation program. Comput. Geosci. 16:395-440.
- Goovaerts, P. 1997. Geostatistics for natural resources evaluation. Oxford Univ. Press, New York.
- Kirkpatrick, S., C.D. Gelatt, and M.P. Vecchi. 1983. Optimisation by simulated annealing. Sci. 220:671-680.
- Sivakumar, M.V.K., A. Maidoukia, and R.D. Stern. 1993. Agroclimatology of West-Africa: Niger. ICRISAT Info. Bull. 5. ICRISAT, Niamey, Niger.
- Soil Survey Staff. 1996. Keys to soil taxonomy. 7th ed. USDA Soil Conservation Service. Pocahontas Press, Blacksburg, VA.
- Stein, A., J. Brouwer, and J. Bouma. 1997. Methods for comparing spatial variability patterns of millet yield and soil data. Soil Sci. Soc. Am. J. 61:861-870.
- Stoorvogel, J.J. 1995. Linking GIS and models: Structure and operationalisation for a Costa Rican case study. Geoderma 43:19-29.
- Van Bergeijk, J., and D. Goense. 1997. Soil tillage resistance as a tool to map soil type differences. p. 605-616. In J.V. Stafford (ed.) Precision agriculture '97. First European Conference on Precision Agriculture. 7-10 Sept. 1997. BIOS, Oxford, UK.
- Van Groenigen, J.W., W. Siderius, and A. Stein. 1998. Constrained optimisation of soil sampling for minimisation of the kriging variance. Geoderma 87:239-259.
- Van Groenigen, J.W., and A. Stein. 1998. Constrained optimisation of spatial sampling using continuous simulated annealing. J. Environ. Qual. 27:1078-1086.
- Van Uffelen, C.G.R., J. Verhagen, and J. Bouma. 1997. Comparison of simulated crop yield patterns for site specific management. Agric. Syst. 54:207-222.