ANNEX 5: formulae for calculating optimal allocation of survey effort among strata.
Optimal allocation formulae
Buckland et al. (1993; chapter 7.2.3) provide formulae for optimal sample allocation for line transect surveys. They assume
- the squared coefficient of variation of total count in a stratum, [cv(n)]^{2}, is inversely proportional to expected total count, E(n), and this relationship is the same in all strata. Put another way, var(n) is proportional to E(n).
- the squared coefficient of variation of estimated effective strip width, [cv(\hat{e})]^{2}, is also inversely proportional to E(n), and this relationship is the same in all strata.
- effective strip width, e, is the same over strata, but is estimated separately in each stratum (so that the estimates, \hat{e}_{v}, are independent).
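Given that
\hat{D}_{v} = \frac{n_{v}}{2 \hat{e}_{v} L_{v}}    (1)
where \hat{D}_{v} is the estimate of density in stratum v, n_{v} is the total count in stratum v, \hat{e}_{v} is the estimate of effective strip width for stratum v and L_{v} is the line length in stratum v. Under the above assumptions,
var(\hat{N}) = var(\sum_{v} A_{v} \hat{D}_{v}) = \frac{K}{L} \sum_{v} \frac{A_{v}^{2} D_{v}}{\pi_{v}}    (2)
where L is total line length, and A_{v} is the area of stratum v. K is a constant, given by
K = \frac{b_{1} + b_{2}}{2e}
where b_{1} = var(n)/E(n) and b_{2} = E(n) var(\hat{e})/e^{2}.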
For fixed L, var(\hat{N}) is minimized when
\pi_{v} = \frac{A_{v} \sqrt{D_{v}}}{\sum_{v'} A_{v'} \sqrt{D_{v'}}}    (3)
where \pi_{v} = L_{v}/L is the ratio of sample effort in stratum v divided by total sample effort.
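As a minimal sketch of the allocation rule in (3), the following computes the effort proportions for the two surveyed strata, using the areas and densities listed in Table 3. The function name is illustrative only and is not taken from Buckland et al. (1993).

```python
from math import sqrt


def optimal_effort_proportions(areas, densities):
    """Effort proportions proportional to A_v * sqrt(D_v), as in equation (3)."""
    weights = [a * sqrt(d) for a, d in zip(areas, densities)]
    total = sum(weights)
    return [w / total for w in weights]


# Medium/High and Low strata: areas in km^2, densities in elephants/km^2 (Table 3).
areas = [521500, 876445]
densities = [0.35, 0.07]

pi = optimal_effort_proportions(areas, densities)
print([round(p, 2) for p in pi])  # [0.57, 0.43]
```

The printed proportions agree with the effort column of Table 3.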
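Since we will be systematically placing sample locations on a grid, a useful quantity is the optimal relative density of sample locations in a stratum, that is, the number of sample locations per unit area relative to a reference stratum. This is given by
\frac{\pi_{v}/A_{v}}{\pi_{r}/A_{r}} = \sqrt{\frac{D_{v}}{D_{r}}}    (4)
where D_{r} is the density in the reference stratum r (usually the one with the highest density).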
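The relative density of sample locations can be sketched in a few lines. The helper names are hypothetical, and the grid-spacing conversion assumes a square grid (spacing proportional to the inverse square root of location density), which is an assumption of this sketch rather than something stated in the text.

```python
from math import sqrt


def relative_location_density(density, reference_density):
    """Sample locations per unit area, relative to the reference stratum (equation 4)."""
    return sqrt(density / reference_density)


def relative_grid_spacing(density, reference_density):
    """Grid spacing relative to the reference stratum, assuming a square grid."""
    return 1.0 / sqrt(relative_location_density(density, reference_density))


# Low stratum (0.07 elephants/km^2) relative to Medium/High (0.35 elephants/km^2).
rel_density = relative_location_density(0.07, 0.35)
rel_spacing = relative_grid_spacing(0.07, 0.35)
print(round(rel_density, 3), round(rel_spacing, 3))
```

With the Table 3 densities the relative density comes out at roughly 0.45, close to the 0.44 reported in that table.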
We originally considered stratifying both by density and by ease of access (i.e., by cost of sampling). In this case, Lagrange multipliers can be used to show that, under the same assumptions as outlined for (3), the optimal allocation is
\pi_{v} = \frac{A_{v} \sqrt{D_{v}/C_{v}}}{\sum_{v'} A_{v'} \sqrt{D_{v'}/C_{v'}}}    (5)
where C_{v} is the cost of visiting a sample unit in stratum v. (Since the formula is scale-invariant, it does not matter how C_{v} is measured.)
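The cost-weighted rule can be sketched as below. The stratum areas, densities and costs are hypothetical values chosen for illustration (the costs loosely echo the 2-week and 4-week site visits assumed in Annex 6); they are not results from the pilot data.

```python
from math import sqrt


def cost_weighted_proportions(areas, densities, costs):
    """Effort proportions proportional to A_v * sqrt(D_v / C_v), as in equation (5)."""
    weights = [a * sqrt(d / c) for a, d, c in zip(areas, densities, costs)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical strata: two density classes crossed with two access classes.
areas = [300000, 200000, 500000, 400000]   # km^2 (illustrative)
densities = [0.35, 0.35, 0.07, 0.07]       # elephants/km^2 (illustrative)
costs_weeks = [2, 4, 2, 4]                 # cost per sample unit, in weeks

pi_weeks = cost_weighted_proportions(areas, densities, costs_weeks)
# Measuring the same costs in days leaves the allocation unchanged.
pi_days = cost_weighted_proportions(areas, densities, [7 * c for c in costs_weeks])
# With equal costs the rule reduces to A_v * sqrt(D_v) weighting.
pi_equal = cost_weighted_proportions(areas, densities, [1, 1, 1, 1])
print([round(p, 3) for p in pi_weeks])
```

Comparing pi_weeks with pi_days illustrates the scale invariance noted above: only the ratios between the C_{v} matter.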
Assumption 1, above, implies the relationship var(n) = b_{1} n, that is, that total count follows an overdispersed Poisson distribution. This has been found to apply to a wide variety of wildlife abundance studies (Buckland et al. 1993). However, other relationships are possible. For example, one may suspect that counts follow a negative binomial distribution, leading to var(n) \propto n^{2}. In an analysis of elephant dung recce-transect data from the Gamba, Walsh et al. (2001) tested three related models. They did not work directly with n or e, so the results are not exactly comparable with those presented here. However, they found that a "Taylor's power law" relationship of the form var(n) = b_{3} n^{b_{4}} was the best fit to their data, with the exponent estimated from their data.
The optimal allocation equation (5) can be further generalized to allow for any of the above forms for var(n). Assume var(n) = b_{3} n^{b_{4}}. Assumptions 2 and 3 are as before. Again using Lagrange multipliers, it can be shown that
(6)
where l is the survey effort per unit length of line at a sample location and \lambda is the Lagrange multiplier. In general, (3) and (4) no longer hold; however, when b_{4} is in the plausible range of 1 to 2, the optima should not be too far from these. (This statement needs to be confirmed, using simulation!) Specifically, when b_{4} = 2 (as in, for example, the negative binomial distribution), the first term of (6) disappears, leaving the optima of (3) and (4). (Trivially, (3) and (4) also hold when b_{4} = 0 or 1.)
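One rough way to explore how the optimum moves with b_{4} is a direct grid search over the split of effort between two strata. The sketch below does not implement (6); it assumes equal costs, var(n) = b_{3} n^{b_{4}}, [cv(\hat{e})]^{2} = b_{2}/n, and an expected count of n_{v} = 2 e L \pi_{v} D_{v}. The object densities, total line length and the reuse of a single b_{3} value across exponents are all hypothetical choices made for illustration.

```python
def variance_of_total(pi, areas, densities, b2, b3, b4, esw_km, line_km):
    """Sum over strata of (A_v D_v)^2 * [b3 * n_v^(b4 - 2) + b2 / n_v]."""
    total = 0.0
    for p, a, d in zip(pi, areas, densities):
        n = 2.0 * esw_km * line_km * p * d  # expected count in the stratum
        total += (a * d) ** 2 * (b3 * n ** (b4 - 2.0) + b2 / n)
    return total


def best_split(areas, densities, b2, b3, b4, esw_km, line_km, steps=10000):
    """Grid search for the proportion of effort in the first of two strata."""
    best_p, best_var = None, float("inf")
    for i in range(1, steps):
        p = i / steps
        v = variance_of_total([p, 1.0 - p], areas, densities,
                              b2, b3, b4, esw_km, line_km)
        if v < best_var:
            best_p, best_var = p, v
    return best_p


areas = [521500.0, 876445.0]      # km^2, Medium/High and Low strata
object_density = [2000.0, 400.0]  # hypothetical object densities per km^2 (5:1 ratio)
common = dict(b2=0.54, b3=7.34, esw_km=0.00225, line_km=100.0)  # illustrative values

for b4 in (1.0, 1.5, 2.0):
    print(b4, round(best_split(areas, object_density, b4=b4, **common), 3))
```

At b_{4} = 1 and b_{4} = 2 the search should return the split given by (3); intermediate exponents can then be compared against it for whatever parameter values are judged plausible.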
Assumption 2 is a mild one, given that [cv(\hat{e})]^{2} is likely to make up only a small proportion of the total variance of the final density estimate, for any feasible extensive program. At the three pilot sites, [cv(\hat{e})]^{2} made up between 5% and 14% of [cv(\hat{D})]^{2} (Thomas and Buckland, unpublished). Assumption 3 is unrealistic, but again violation of it is unlikely to have much effect on the optimal allocation. Buckland et al. (1993) show that if this assumption is moderated, to assume that a common e is estimated across strata, then the optimal allocation falls somewhere between (3) and
(7)
Given that cv(\hat{e}) will make up a small proportion of the total CV, whether pooled by stratum or not, we expect the optimal allocation to be much closer to (3) than (7).
Trend sensitivity formulae
Walsh and White (1999) provide a simple method for assessing the sensitivity of a survey program to detect a change in abundance between two time points. Their method assumes that the difference between density estimates is normally distributed, that the coefficient of variation of density is the same in the two years, and that the test for trend is being performed at \alpha = 0.05. Under these circumstances, the percent change in abundance detectable is given by
(8)
Optimal allocation
Here, we use the pilot survey data to estimate b_{1} and b_{2}, from (2), and therefore the optimal allocations.
We first examine the assumption that var(n) = b_{1} n ("Model 1"). Unfortunately, with only 3 pilot sites, it is difficult to adequately judge the fit of the data. Table 1 gives the data, and Figure 1 shows a scatterplot of var(n) against n for the pilot sites, with a fitted linear regression line constrained to go through the origin. A plausible alternative distribution for n is the negative binomial, which implies the relationship var(n) \propto n^{2} ("Model 2", the dashed line in Figure 1). With so few data, it is not possible to distinguish between the alternatives: Figure 2a), using n_{t}, appears to favour Model 1, while Figure 2b), using n_{c}, appears to favour Model 2.
It would be informative to combine these pilot data with those of Walsh et al. (2001) in a joint analysis. Meanwhile, as both Model 1 and Model 2 produce the same optima (see above), we will proceed as if Model 1 is the correct model. Choosing the incorrect model will, however, affect the predicted variances.
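Buckland et al. (1993; section 6.3.2) give two methods for estimating b_{1} under Model 1:
\hat{b}_{1} = \frac{\sum_{v} k_{v} \hat{b}_{1v}}{\sum_{v} k_{v}}    (9)
where k_{v} is the number of sample locations at site v (v = 1, 2, 3), and \hat{b}_{1v} = \widehat{var}(n_{v})/n_{v}, and
\hat{b}_{1} = \frac{\sum_{v} \widehat{var}(n_{v})}{\sum_{v} n_{v}}    (10)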
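The two estimators can be sketched as follows, using the values tabulated in Table 1. The per-site ratios are taken as they appear in that table rather than recomputed, and the function names are illustrative only.

```python
def b1_weighted_mean(k, site_ratios):
    """Equation (9): mean of per-site var(n)/n ratios, weighted by sample locations k."""
    return sum(kv * r for kv, r in zip(k, site_ratios)) / sum(k)


def b1_pooled_ratio(variances, counts):
    """Equation (10): summed variances divided by summed counts."""
    return sum(variances) / sum(counts)


# Table 1 values, in the order Odzala, Lope, Ituri.
k = [44, 44, 14]
n_t = [418, 153, 70]
var_t = [3666.2, 1014.0, 1225.5]
ratio_t = [8.77, 6.63, 5.00]
n_c = [1599.4, 614.5, 248.3]
var_c = [55885.0, 9214.9, 7349.7]
ratio_c = [34.94, 14.99, 29.61]

estimates = {
    "transect (9)": b1_weighted_mean(k, ratio_t),
    "transect (10)": b1_pooled_ratio(var_t, n_t),
    "combined (9)": b1_weighted_mean(k, ratio_c),
    "combined (10)": b1_pooled_ratio(var_c, n_c),
}
for name, value in estimates.items():
    print(name, round(value, 2))
```

The results agree with the estimates reported in the text to within rounding of the tabulated inputs.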
From the pilot data, the two estimates are 7.34 and 9.21 for transect data, and 25.6 and 29.4 for the combined data.
Estimates of b_{2v} are given in Table 2. They are not at all similar between sites but, as stated earlier, this is unlikely to affect the optimization. These site estimates can be combined to form a common estimate in the same way as the b_{1v} in equation (9). The combined estimates are b_{2t} = 0.54 and b_{2c} = 2.06.
An estimate of e, pooled in a similar way, is 2.25 m. (One could argue that this estimate should be weighted by n rather than k.) The estimates of K, using the estimate of b_{1} from equation (9), are therefore K_{t} = 8.87 and K_{c} = 31.11.
The optimal allocation, calculated according to (3) and (4), is shown in Table 3. The expected variance at differing sampling intensities and the predicted trend sensitivity are presented in the main text.
Table 1. Total count and estimated count variances from the pilot sites. Total count for transect data alone, n_{t} (after truncation at 4 m), and total combined adjusted count, n_{c}, are shown. k is the number of sample locations at each site; line lengths per site are 1 km for transect data alone and 4 km for the combined adjusted count. For more information about how these were calculated, see Thomas and Buckland (unpublished).
Site | k | n_{t} | var(n_{t}) | var(n_{t})/n_{t} | n_{c} | var(n_{c}) | var(n_{c})/n_{c} |
Odzala | 44 | 418 | 3666.2 | 8.77 | 1599.4 | 55885.0 | 34.94 |
Lope | 44 | 153 | 1014.0 | 6.63 | 614.5 | 9214.9 | 14.99 |
Ituri | 14 | 70 | 1225.5 | 5.00 | 248.3 | 7349.7 | 29.61 |
Table 2. Estimates of the constant b_{2v} for each site, using either n_{t} or n_{c}.
Site | b_{2t} | b_{2c} |
Odzala | 0.04 | 0.16 |
Lope | 0.77 | 3.09 |
Ituri | 1.40 | 4.96 |
Table 3. Optimal allocation of effort to each stratum, using equations (3) and (4). \pi_{v} is the proportion of effort to allocate to each stratum, and the final column is the density of sample locations in each stratum relative to the Medium/High stratum.
Stratum | Area (km^{2}) | D (elephants/km^{2}) | N | \pi_{v} | Relative density of sample locations |
Medium/High | 521500 | 0.35 | 182528 | 0.57 | 1.00 |
Low | 876445 | 0.07 | 61351 | 0.43 | 0.44 |
None | 879544 | 0.00 | 0 | - | - |
MIKE sites | 127855 | 0.35 | 44749 | - | - |
Total | 2405352 | - | 281014 | 1.0 | - |
ANNEX 6: accessibility of survey sites in Central Africa
An initial map showing accessibility is given in the following figure (figure 1). The survey region has been divided into three strata: accessible, difficult to access and not accessible. Areas that are not accessible are excluded from the survey programme altogether. Here, we will assume that accessible sites take on average 1 week to reach (total travel time, not including the 1 week surveying at the site), and difficult to access sites take 3 weeks travel time. This means that total survey time, including the 1 week surveying at each site, is 2 weeks for accessible sites and 4 weeks for difficult to access sites.
When the whole study area cannot be surveyed in a given year, we propose that survey blocks be assembled. These would be large contiguous areas (at least 50,000 km^{2}) in which no part is inaccessible. These areas would also contain a wide range of conditions within the range of elephant occurrence, including projected high and low-density zones, and both protected and unprotected areas. The survey blocks would then constitute the study area. Figure 2 gives an example of three selected survey blocks in the Central African Forest region and figure 3 gives an example of allocated sampling points in one survey block with two strata (high and medium density, and low density of elephants).
Exclusion of areas of swamp. Swamp-land is not accessible by survey teams, and should be excluded from the survey. Large areas of swamp can be added to the not-accessible stratum from remote sensing data, and this should be done before the final survey design is made. Small areas of swamp will not be visible in remotely sensed data, but will be encountered by field crews on the ground. Suitable instructions must be included in the field protocol for relocating transects or foregoing surveys on impassable ground. If a record of the amount of such ground is kept, then this can be incorporated into the estimation.
Figure 2: example of survey blocks in the Central African Forest region
Figure 3: example of survey effort allocation within a survey block
ANNEX 7: list of consultants' technical reports
These reports have been produced by consultants to the MIKE Pilot Project and are available upon request.
Buckland, S.T. 2000. MIKE Central African Pilot Project: proposals for survey design.
Buckland, S.T. and F.M. Underwood. 2000. Analysis of data and survey design for the MIKE central African pilot project: first interim report.
Thomas, L. and S.T. Buckland. 2001. Analysis of data and survey design for the MIKE central African pilot project. Third Report - Part 1: Analysis of Pilot Data.
Thomas, L. and S.T. Buckland. 2001. Analysis of data and survey design for the MIKE central African pilot project. Third Report - Part 2: Recommended Sampling Design.
Thomas, L. and S.T. Buckland. 2001. Analysis of data and survey design for the MIKE central African pilot project. Third Report - Part 3: Analysis of Ituri Transects.
Thomas, L. and S.T. Buckland. 2001. Analysis of data and survey design for the MIKE central African pilot project. Third Report - Part 4: Spatial modelling.
Thomas, L. 2001. Predicting variation in encounter rates at pilot sites.
Vanleeuwe, H. 2000. Training of MIKE Field-team, testing, and implementing of MIKE survey methods, Odzala – Congo.