SAMPLING THEORY USING VIRTUAL FORESTS
Warren G. Abrahamson¹ & Michael R. Weaver²
¹Department of Biology and ²Department of Information Services & Resources
Bucknell University, Lewisburg, PA 17837
This week’s exercise explores the fundamentals of sampling theory using virtual forests. Two Pennsylvania old-growth forests were digitized, enabling you to employ haphazard, systematic, and random sampling methods while investigating sampling protocols based on sampling areas versus distances. As you sample these virtual forests, you will discover that the efficiencies of sampling methods vary for different types of natural communities and that species differ in spatial distribution. Finally, you will determine the species diversity of both virtual forests and make predictions about their successional trends. The obvious advantage of this simulation model is that you can quickly sample a forest community using different sampling criteria and methodologies.
Ecologists and conservation biologists frequently need to know how a community of organisms is structured. That is, what species compose the community; how abundant is each species; how do the species interact; and are some species increasing in abundance while others are decreasing in abundance over time? Such information is invaluable when biologists develop conservation plans for natural areas or recovery plans for threatened or endangered species. Furthermore, measures of species abundances within a community taken at one point in time provide a baseline against which future measures of species abundances within that community can be compared. Such timelines of community data allow ecologists to measure species changes within communities and to better understand succession within a given natural community or the impacts of specific land-management plans.
The need to assess community structure has generated a number of quantitative field methods as well as an appreciation of which methodology works best in a given situation. These methods are designed to generate reliable estimates of the abundance and distribution of each species within a community. Such data make it possible to compare species or groups of species within a community or to contrast species composition and abundance among communities.
Sampling methods are invaluable for numerous biological investigations. Such methods are used to determine the efficacy of new medicines, the responses of cells to various treatments, and the structure of a natural community. It would be extremely time consuming, for example, to count and measure every individual of each species within a community in order to determine the abundance and distribution of each species. Sampling methods enable us to obtain reliable estimates from a subset of the whole. However, it is critical that the samples be taken without bias and in sufficiently large number so that the resulting data can be summarized to give valid estimates of the desired parameters.
The specific sampling method used to assess community structure depends on the nature of the organisms and the community to be sampled. For example, we would use “mark and recapture” or “capture per unit effort” methods to determine abundance estimates for mobile or secretive animals (e.g., deer, mice), whereas we would employ area, distance, or line-transect methods for sessile or sedentary animals (e.g., corals) and for plant communities (e.g., forest trees).
In this exercise, you will investigate the application of two quantitative field methods best suited to study communities of sessile or sedentary animals and most types of vegetation: (1) the area-sample method and (2) the distance-sample method. As you apply these two methods to simulated forest communities, you will become familiar with the subtleties of each method and will come to understand when one or the other method would be best used. Because you will sample known forest communities, you will compare your abundance estimates against actual abundance values for each tree species. As a consequence, you will use the degree of deviation of your estimates to compare sampling methods and to determine the reliability of your estimates for common species versus rare species.
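One simple way to express the "degree of deviation" of an estimate from the known value is a signed percent deviation. The sketch below is only an illustration: the function name and the density values are hypothetical, not EcoSampler output.

```python
def percent_deviation(estimate, actual):
    """Signed deviation of an estimate from the known value, as a percentage."""
    return 100.0 * (estimate - actual) / actual

# Hypothetical densities (trees per hectare): label -> (estimate, actual value)
densities = {
    "common species": (212.0, 200.0),
    "rare species": (3.0, 5.0),
}

deviations = {label: percent_deviation(est, act)
              for label, (est, act) in densities.items()}

for label, dev in deviations.items():
    print(f"{label}: {dev:+.1f}%")
```

In this made-up case the rare species is off by only two trees per hectare, yet its percent deviation is far larger than that of the common species, which is the kind of contrast the exercise asks you to look for.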
When using area samples, ecologists must determine the appropriate quadrat size to use on the basis of the size and density of the individuals within the community being sampled. Quadrats must be large enough to contain a number of individuals, but small enough that the individuals present can be separated, counted, and measured. For example, quadrat sizes might be 1 m² for herbaceous vegetation, 10-20 m² for shrubs, and 100 m² for forest trees.
Quadrat shape is also important as it affects the ease of establishing quadrats and the efficiency of sampling. For example, circular ‘quadrats’ are more easily established than square quadrats; and elongated rectangular ‘quadrats’ capture a greater variety of species than an equal number of square quadrats of the same area. This latter relationship holds because a rectangle encompasses more environmental variety due to environmental gradients (e.g., slopes, soil-moisture variation) than a square of the same area. However, because rectangular quadrats have more perimeter than square quadrats of the same area, accuracy tends to decline as quadrats become more elongated due to edge effect (Barbour et al. 1999). The best quadrat size and shape depend on the application. In our exercise, we will use square quadrats.
Where to Sample
Three approaches could be used to sample within communities of organisms, but as you will learn, these approaches are not equal in their ability to generate reliable estimates of species abundance.
(1) Haphazard or convenience sampling selects samples that are readily available; such samples are almost never random samples. The extent to which community statistics generated from such sampling can be generalized to the community as a whole depends on the degree to which the samples represent the whole. The more homogeneous the community from which our samples are drawn, the more likely haphazard sampling will reliably represent the community. However, the more heterogeneous the community, the more likely such sampling will offer biased, unrepresentative estimates.
(2) Random sampling, in contrast, ensures that all individuals within a community have an equal chance of being sampled. While this approach is likely to generate reliable estimates of community parameters with sufficient sampling, random sampling can be difficult under field conditions. It could require, for example, that each individual or area within a community be assigned a number and that the numbers to be sampled be selected by a truly random process. A variation of random sampling, referred to as stratified-random sampling, subdivides the community into any number of homogeneous regions, each of which is then randomly sampled.
(3) Replicated systematic sampling is often applied under field conditions because it avoids bias better than haphazard sampling and is easier to apply than random sampling. Systematic sampling selects, for example, every 30th individual, or areas to be sampled every 30 meters along equidistant transect lines placed across the sampled community. Systematic sampling is not equivalent to random sampling, however: if the community has a periodic pattern that coincides with the sampling interval, a systematic sample may have a larger error than a random sample.
Measures of Species Abundance: Density, Frequency, Dominance, and Importance
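To make the difference between these placement rules concrete, here is a minimal Python sketch of how square quadrats could be positioned inside a rectangular plot. It is not how EcoSampler works internally; the function names, the plot dimensions, and the random starting offset for the systematic grid are all assumptions made for illustration. Haphazard sampling is omitted because, by definition, it follows no rule that can be written down.

```python
import random

def random_quadrats(width, height, size, n, rng):
    """Pick n quadrat origins uniformly at random so each quadrat fits in the plot."""
    return [(rng.uniform(0, width - size), rng.uniform(0, height - size))
            for _ in range(n)]

def systematic_quadrats(width, height, size, spacing, rng):
    """Place quadrats every `spacing` meters along equidistant transects.
    The random starting offset is an assumption of this sketch."""
    x0 = rng.uniform(0, spacing)
    y0 = rng.uniform(0, spacing)
    origins = []
    y = y0
    while y + size <= height:
        x = x0
        while x + size <= width:
            origins.append((x, y))
            x += spacing
        y += spacing
    return origins

def stratified_random_quadrats(strata, size, n_per_stratum, rng):
    """strata: list of (x_min, y_min, x_max, y_max) rectangles, each assumed
    to be internally homogeneous; each one is sampled at random."""
    origins = []
    for x_min, y_min, x_max, y_max in strata:
        for _ in range(n_per_stratum):
            origins.append((rng.uniform(x_min, x_max - size),
                            rng.uniform(y_min, y_max - size)))
    return origins

# Hypothetical 100 m x 100 m plot sampled with 10 m x 10 m quadrats.
rng = random.Random(42)
random_sites = random_quadrats(100, 100, 10, n=20, rng=rng)
grid_sites = systematic_quadrats(100, 100, 10, spacing=30, rng=rng)
strata_sites = stratified_random_quadrats(
    [(0, 0, 100, 50), (0, 50, 100, 100)], 10, n_per_stratum=10, rng=rng)
```

Note that the systematic grid is fully determined once its starting corner is fixed, which is why a periodic pattern in the forest can line up with it, whereas the random and stratified-random sites are drawn independently.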
Several standard measures of absolute and relative abundance are used to assess the contribution of each species to a community (Barbour et al. 1999). These measures include: density, the number of individuals within a chosen area (e.g., m², hectare); relative density, the density of one species as a percentage of total density; frequency, the percentage of total quadrats or points that contains at least one individual of a given species; relative frequency, the frequency of one species as a percentage of total frequency; dominance, the total basal area of a given species per unit area within the community; relative dominance, the dominance of one species as a percentage of total dominance; and importance, the relative contribution of a species to the entire community, expressed as a combination of relative density, relative frequency, and relative dominance (see the Appendix for mathematical definitions of each measure).
Think carefully about the meaning of each of these measures – each offers a different insight into the abundance of the species composing a community. Saplings, for example, typically have a much higher density but much lower dominance than mature trees. Density tells us the number of individuals per unit area but density is not necessarily proportional to dominance because dominance for a given species expresses the area occupied by the species per unit area (e.g., per m²). A species composed of primarily large individuals can have high dominance but it will likely have low density, and unless regularly distributed, it will also have low frequency. Frequency, which is often independent of density, expresses one measure of the distribution of individuals within the community. A clumped species can have high density but also low frequency because it occurs in a limited portion of the community. In contrast, a species that is individually and regularly distributed over the landscape will have a high frequency but can have low density. Relative importance, as a combination of relative values for density, frequency, and dominance, is used as a summary of the influence that an individual species may have within the community. Recognize that two species with the same relative importance can have markedly different values for relative density, frequency, or dominance as any differences can be overshadowed by the addition process (Barbour et al. 1999).
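The sketch below shows how these measures could be computed from quadrat data in Python. The data layout, the function name, and the tree values are hypothetical, and importance is taken here as the simple sum of the three relative values, which is an assumption consistent with the "addition process" mentioned above.

```python
from collections import defaultdict

def abundance_measures(quadrats, quadrat_area):
    """quadrats: list of quadrats, each a list of (species, basal_area) tuples.
    quadrat_area: area of one quadrat, in the same units as basal_area."""
    total_area = quadrat_area * len(quadrats)
    counts = defaultdict(int)      # individuals per species
    basal = defaultdict(float)     # summed basal area per species
    occupied = defaultdict(int)    # quadrats containing the species
    for trees in quadrats:
        for species, basal_area in trees:
            counts[species] += 1
            basal[species] += basal_area
        for species in {sp for sp, _ in trees}:
            occupied[species] += 1

    density = {sp: counts[sp] / total_area for sp in counts}
    frequency = {sp: 100.0 * occupied[sp] / len(quadrats) for sp in counts}
    dominance = {sp: basal[sp] / total_area for sp in counts}

    def relative(values):
        """Express each species' value as a percentage of the total."""
        total = sum(values.values())
        return {sp: 100.0 * v / total for sp, v in values.items()}

    rel_den = relative(density)
    rel_freq = relative(frequency)
    rel_dom = relative(dominance)
    return {
        sp: {
            "density": density[sp],
            "frequency": frequency[sp],
            "dominance": dominance[sp],
            "relative_density": rel_den[sp],
            "relative_frequency": rel_freq[sp],
            "relative_dominance": rel_dom[sp],
            # Assumed: importance as the sum of the three relative values.
            "importance": rel_den[sp] + rel_freq[sp] + rel_dom[sp],
        }
        for sp in counts
    }

# Hypothetical data: four 100 m2 quadrats; basal areas in m2.
quadrats = [
    [("hemlock", 0.30), ("hemlock", 0.25)],
    [("hemlock", 0.40), ("red maple", 0.05)],
    [("red maple", 0.04), ("red maple", 0.06), ("red maple", 0.05)],
    [],
]
result = abundance_measures(quadrats, quadrat_area=100.0)
```

In this invented sample, red maple has the higher relative density while hemlock has the higher relative dominance, so the two measures rank the species differently even though both occur in half of the quadrats.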
Measures of Distribution
Individuals of a species can be randomly distributed across a community (i.e., the location of one individual of a given species has no relationship with the location of other individuals of that species). Individuals of other species might be singly and regularly distributed throughout the community (an extreme example is the uniform spacing of orchard trees), while the individuals of still other species could be clumped (i.e., the presence of one individual of a given species increases the probability of finding another individual of that species nearby). Thus, ecologists recognize three primary patterns of distribution: (1) random, (2) regular (uniform) or hyperdispersed, and (3) clumped (aggregated) or underdispersed (Barbour et al. 1999).
There are a number of reasons why plants show clumped distributions. Many plants are highly clonal (i.e., they can propagate by vegetative means as do goldenrods and aspens) so once a seedling establishes at a given site, the plant spreads to produce numerous, spatially separated (but genetically identical), aboveground stems. In addition, environmental gradients are common in nature so that a site that is good for one individual of a given species is likely to be good for other individuals of that species. Yet there are forces in nature that counteract clumping. Competition among individuals for water in deserts or light in forests can favor regular spacing. Similarly, plants that are clumped are more likely to be found by their herbivores or pathogens (Barbour et al. 1999).
Measures of Richness, Evenness, and (Species) Diversity
Species richness [the number of species occurring within a specific area or community], species evenness or equitability [the distribution of individuals among species], and species diversity [typically measured as a combination of species richness and species evenness; that is, species richness weighted by species evenness, see Appendix] are measures unique to the community level of ecological organization (Barbour et al. 1999). These statistics reflect the biological structure of a community. A community with high species richness and diversity, for example, will likely have a complex network of trophic pathways. In contrast, a community with low species richness and diversity will likely have fewer species and trophic interactions. Interactions among species (e.g., energy transfer, predation, competition) within the food webs of communities with high species diversity are theoretically more complex and varied than in communities of low species diversity. Indices of species richness and species diversity are often used in a comparative manner, that is, to compare communities growing under different environmental conditions or to contrast seral stages of a succession.
Information about Communities
Mohn Mill Natural Area
The 154-ha Mohn Mill area (N41°4’, W77°8’) straddles the boundary of Union and Lycoming Counties at their intersection with Clinton County. Elevations in the Mohn Mill area range from approximately 420 to 570 m above mean sea level, and the terrain includes bottomlands and gentle to steep slopes. The sandstone-derived soils include loams, sandy loams, and stony to very stony loams.
The Mohn Mill area has experienced many natural disturbances during recent decades, including canopy-damaging windstorms as well as ice and wet-snow events (W. G. Abrahamson, personal observation). In addition, chestnut blight occurred in the region during the 1930s, eliminating chestnut from the forest canopy. Gypsy moth outbreaks occurred within the Mohn Mill site from 1979 to 1982 and again during 1996. The site was logged approximately 100 years ago, between 1904 and 1912, during the period identified as the “clear-cut or hemlock-chemical wood” era (Abrams and Ruffner 1995). Currently, there is evidence of considerable browsing by white-tailed deer, which likely inhibits the regeneration of oaks.
The Mohn Mill area is a Pennsylvania Department of Conservation and Natural Resources (DCNR) proposed wild-plant sanctuary primarily because of the presence of the federally endangered northeastern bulrush (Scirpus ancistrochaetus Schuyler). The small, seasonal ponds that harbor the northeastern bulrush occur within an oak-canopied forest matrix that is likely crucial to the long-term survival of this plant. Although the Mohn Mill site is protected from logging, a recent study of the successional trends at Mohn Mill showed that the site is experiencing a replacement of oaks by more shade-tolerant red maple (Abrahamson and Gohn 2004).
Snyder-Middleswarth Natural Area
The 200-ha Snyder-Middleswarth Natural Area (N40°48’, W77°19’), located in Snyder County, includes one of the few stands of old-growth hemlock-yellow birch forest remaining in Pennsylvania and is among the largest such stands existing within Pennsylvania state forests. Thanks originally to inaccessibility and in 1965 to its preservation as a National Natural Landmark, a 135-ha portion of this forest has never been logged. Old-growth forests have become increasingly rare in North America since the time of European colonization and are particularly rare in central Pennsylvania because of intensive logging for timber and charcoal production during the past 150 years. As a consequence, eastern old-growth forests exist in small stands that are isolated from other old-growth forests by an intervening matrix of successional forests.
The Snyder-Middleswarth old-growth forest is located in a narrow and steep ravine between two ridges that run east to west; Buck Mountain lies to the north and Thick Mountain to the south. The ravine, created by Swift Run, has well-developed north-facing and south-facing slopes as well as a bottomland. Elevations in the area range from 450 m to 550 m, with slopes varying in steepness from 1 to 68%. The predominant soils are extremely stony and sandy well-drained loams that have weathered from sandstone and shale. As a consequence, these soils have low to moderate available water capacity and have little pH buffering capacity.
There are recurrent natural disturbances within this forest. Windstorms, especially those associated with snow or ice events, have toppled a number of the larger hemlock and yellow birch during the past three decades (W. G. Abrahamson, personal observation). The crowns of slope and ridge top trees frequently show evidence of wind and/or ice damage.
Humans are also having impacts on the old-growth forest. Acid precipitation has seriously impacted the area by enhancing the acidity of soils and of Swift Run. The water that enters Swift Run moves through Tuscarora sandstone and soils derived from this hard sandstone. Because these substrates are unable to buffer the strongly acidic precipitation, the portion of Swift Run within the old-growth forest area has a pH too low for fish (e.g., native brook trout) to survive (low pH releases aluminum, which in turn is toxic to fish and other aquatic organisms). The unnamed creek that joins Swift Run near the parking area has a substantially higher pH because its waters percolate through Juniata sandstone and its derived soil, which has greater buffering ability. As a result, fish such as brook trout do occur as far upstream as the confluence of these creeks.
Humans have introduced several herbivorous insects to North America that threaten the stability of the Snyder-Middleswarth Natural Area. The continued domination of the old-growth forest by hemlock could be appreciably impacted by the hemlock woolly adelgid. This exotic herbivore was first reported in southeastern Pennsylvania in the late 1960s and it has been observed in the Snyder-Middleswarth old-growth forest since 2003 (W. G. Abrahamson, personal observation). Gypsy moth outbreaks have occurred periodically within central Pennsylvania since the mid-1970s and have impacted the oak canopies of the south-facing slopes and ridge tops during multiple growing seasons. There is evidence of browsing by white-tailed deer, which likely inhibits regeneration. A recent study of the Snyder-Middleswarth Natural Area detailed the patterns of vegetation and succession within the site (Zawadzkas and Abrahamson 2003).
We will examine a number of questions in the following exercises using a computer simulation model, EcoSampler, to sample these two communities. EcoSampler can be accessed using a Windows-platform computer running Internet Explorer or Mozilla, or a Macintosh computer running Safari or Internet Explorer.
Assignment 1: Haphazard, Random, and Systematic Sampling
This assignment will investigate the differences among haphazard, random, and systematic sampling. Click ‘Begin.’
Questions to Discuss in Your Write Up of Assignment 1
Assignment 2: Distance-sampling Versus Area-sampling Methods
This assignment investigates distance sampling and compares it to area sampling, which you used in Assignment 1. There are a number of so-called plot-less sampling techniques, including the random-pairs and point-quarter techniques, that utilize measurements of distances between individuals, or measurements of distances from randomly or systematically chosen points to the nearest individuals, instead of sampling within prescribed quadrats. We will use the point-quarter technique because it is easier and more efficient. This technique is well suited for sampling communities with widely spaced individuals or communities in which individuals are large in size (e.g., trees). The technique is easily adapted to sample animal populations such as nest densities (e.g., wood rat nests) or populations of sessile or sedentary animals (e.g., sea anemones and barnacles).
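As a rough illustration of the arithmetic behind the point-quarter technique, the Python sketch below uses the common form of the estimate in which the mean area per plant is the square of the mean point-to-plant distance. That form, the function name, and the distances are assumptions for illustration and are not taken from EcoSampler.

```python
from collections import Counter

def point_quarter_density(measurements, unit_area=10_000.0):
    """measurements: list of (species, distance_m) pairs, four per sampling
    point (the nearest tree in each quarter around the point).
    unit_area=10,000 m2 expresses density per hectare."""
    distances = [d for _, d in measurements]
    mean_distance = sum(distances) / len(distances)
    mean_area_per_plant = mean_distance ** 2          # m2 per tree
    total_density = unit_area / mean_area_per_plant   # trees per unit area
    counts = Counter(sp for sp, _ in measurements)
    # Each species' share of the sampled trees scales the total density.
    by_species = {sp: total_density * n / len(measurements)
                  for sp, n in counts.items()}
    return total_density, by_species

# Hypothetical measurements from two sampling points (eight quarters).
measurements = [
    ("hemlock", 4.0), ("hemlock", 6.0), ("red maple", 3.0), ("yellow birch", 5.0),
    ("hemlock", 5.0), ("hemlock", 5.0), ("red maple", 7.0), ("yellow birch", 5.0),
]
total, per_species = point_quarter_density(measurements)
print(total, per_species)
```

Here the mean distance is 5 m, so the mean area per tree is 25 m² and the total density works out to 400 trees per hectare, half of which are hemlock.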
Questions to Discuss in Your Write Up of Assignment 2
There are several calculation methods for dispersion but we will use the Morisita Index of Dispersion for our estimates (Morisita 1959, Brower et al. 1998, see Appendix). If the dispersion is random, then the Morisita Index is approximately 1.0; if perfectly uniform, the Morisita Index approximates 0; and if maximally aggregated (i.e., all individuals in one plot), the Morisita Index = the number of quadrats or points sampled. The distribution of organisms in nature is seldom uniform as in an orchard or a cornfield. Instead, the dispersion of organisms is frequently aggregated. A random dispersion (one in which the position of an individual is completely independent of the position of any other individual in its population) can be approached in some species.
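A short Python sketch of the Morisita Index, using the number of quadrats, the total count, and the sum of squared counts per quadrat as described above, makes these limiting cases easy to check. The function name and the quadrat counts are hypothetical.

```python
def morisita_index(counts):
    """counts: number of individuals of one species found in each quadrat."""
    n = len(counts)                        # number of quadrats
    total = sum(counts)                    # total individuals counted
    if total < 2:
        raise ValueError("at least two individuals are needed")
    sum_sq = sum(x * x for x in counts)    # sum of squared counts
    return n * (sum_sq - total) / (total * (total - 1))

# Hypothetical counts from four quadrats.
aggregated = morisita_index([12, 0, 0, 0])   # every individual in one quadrat
even = morisita_index([3, 3, 3, 3])          # identical counts everywhere
sparse_uniform = morisita_index([1, 1, 1, 1])
print(aggregated, even, sparse_uniform)
```

With all twelve individuals in one of four quadrats the index equals 4, the number of quadrats. Perfectly even counts give a value below 1, and the index reaches 0 when no quadrat holds more than one individual.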
Succession refers to the replacement of one community by another and biologists contrast two types: (1) primary succession, which describes development starting from a new site never before colonized by living organisms and (2) secondary succession, which applies to the plant succession that takes place on sites that have already supported life. In a forest, secondary succession occurs each time a canopy tree falls and new plants compete for the resources (e.g., light, nutrients) previously used by the fallen tree. Secondary succession can also be observed after logging or when an agricultural field is fallowed.
Although tree sizes do not precisely represent tree ages, tree-size data can provide insight into the successional status of tree species. The following graphic models illustrate the expected size-class distributions for hypothetical species with stable, successfully invading, unsuccessfully invading, and senile size-class structures. Actual size-class data are compared to these expectations to understand the history and to predict future success or failure of species in a given forest stand.
Questions to Discuss in Your Write Up of Assignment 3
Assignment 4: Species Diversity
While dispersion is a characteristic of populations, species diversity is a characteristic unique to the community level of biological organization and is an expression of community structure. For example, a community has high species diversity if it is composed of many equally abundant species. On the other hand, if a community is composed of a very few species, or if only a few species are abundant, then that community’s species diversity is low. High species diversity potentially indicates a complex community because a greater variety of species likely facilitates more interactions among species. Species interactions involving energy transfer (food webs), predation, competition, and niche partitioning are theoretically more varied in a community of high species diversity. There are many estimates of species diversity but we will use the popular Shannon-Wiener Index to estimate species diversity in our virtual forests.
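The Shannon-Wiener Index sums, over all species, the proportion of individuals belonging to each species multiplied by the natural logarithm of that proportion, and changes the sign of the result. A minimal Python sketch follows; the function name and the tree counts are hypothetical.

```python
import math

def shannon_wiener(counts):
    """counts: number of individuals of each species in the sample."""
    total = sum(counts)
    # Species with zero individuals contribute nothing and are skipped.
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)

# Two hypothetical communities, each with 100 trees of four species.
even_community = shannon_wiener([25, 25, 25, 25])
uneven_community = shannon_wiener([97, 1, 1, 1])
print(even_community, uneven_community)
```

Both communities have the same species richness, but the evenly divided one reaches the maximum possible value for four species, ln 4, while the community dominated by a single species scores much lower.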
Question to Discuss in Your Write Up of Assignment 4
Assignment 5: Variation in Patterns due to Topography and Edaphic Factors
This assignment investigates the variation in vegetative patterns due to topography and edaphic factors. There can be marked differences between adjacent north-facing and south-facing slopes. South-facing slopes in the northern hemisphere receive more solar radiation than north-facing slopes. At the latitude of Pennsylvania, midday insolation on a 20° slope is, on average, 40% greater on a south-facing slope than on a north-facing slope year-round. This difference has a striking effect on heat budget and moisture of the two sites – south-facing slopes are warmer, their evaporation rate is typically 50% higher, and their soil moisture is lower (Smith and Smith 2001). Contrast the vegetation that occurs on the south-facing and north-facing slopes; and compare the vegetation that dominates the Swift Run bottomland with that on the ridge top.
Questions to Discuss in Your Write Up of Assignment 5
Sampling Theory Lab Reports
Your lab reports should include the answers to each of the 18 questions above. Do NOT include the saved output tables. Lab write-ups must be done individually, typed, single-spaced, and printed on both sides of the paper (please help conserve our forest resources). Reports are due in lab next week.
Abrahamson, W. G. and A. C. Gohn. 2004. Classification and successional changes on mixed-oak forests at the Mohn Mill Area,
Abrams, M. D. and C. M. Ruffner. 1995. Physiographic analysis of witness-tree distribution (1765-1798) and present forest cover through north central Pennsylvania. Canadian Journal of Forest Research 25: 659-668.
Barbour, M. G., J. H. Burk, W. D. Pitts, F. S. Gilliam, and M. W. Schwartz. 1999. Terrestrial plant ecology, 3rd edition. Benjamin Cummings.
Brower, J. E., J. H. Zar, and C. N. von Ende. 1998. Field and laboratory methods for general ecology, 4th edition. Wm. C. Brown Co., Publishers,
Curtis, J. T. and G. Cottam. 1962. Plant ecology workbook: laboratory, field and reference manual. Burgess Publishing Company,
Morisita, M. 1959. Measuring the dispersion of individuals and analysis of the distributional patterns. Mem. Fac. Sci. Kyushu Univ., Ser E (Biol.) 2: 215-.
Smith, R. L. and T. M. Smith. 2001. Ecology and field biology, 6th edition. Benjamin Cummings.
Zawadzkas, P. P. and W. G. Abrahamson. 2003. Composition and tree-size distributions of the
We thank Steve Jordan and Matt McTammany for their insightful comments and suggestions throughout the development of EcoSampler. This exercise was stimulated by the methods exercises in Curtis and Cottam's Plant Ecology Workbook (1962).
Contacts for Problems or Questions
Warren G. Abrahamson firstname.lastname@example.org; http://www.facstaff.bucknell.edu/abrahmsn/
Mike R. Weaver email@example.com
Definitions of Abundance Measures
Area Method: (Brower et al. 1998)
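The verbal definitions of the abundance measures given earlier can be written compactly as follows. The symbols are this sketch's own notation: for species i, n_i is the number of individuals counted, b_i is the summed basal area, q_i is the number of quadrats containing the species, A is the total area sampled, and Q is the total number of quadrats. Taking importance as the sum of the three relative values is an assumption based on the "addition process" described in the text.

```latex
\begin{align*}
D_i &= \frac{n_i}{A}, &
RD_i &= 100 \times \frac{D_i}{\sum_j D_j}, \\
F_i &= 100 \times \frac{q_i}{Q}, &
RF_i &= 100 \times \frac{F_i}{\sum_j F_j}, \\
\mathit{Dom}_i &= \frac{b_i}{A}, &
\mathit{RDom}_i &= 100 \times \frac{\mathit{Dom}_i}{\sum_j \mathit{Dom}_j}, \\
IV_i &= RD_i + RF_i + \mathit{RDom}_i .
\end{align*}
```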
In the density formula for the distance (point-quarter) method, the term ‘unit area’ refers to the size of the area, in the same units as those for the mean area per plant, on the basis of which density will be expressed. For example, if density is to be expressed per hectare, but the mean area per plant is in units of m², the unit area value would be 10,000 (the number of m² in a hectare).
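A common form of the point-quarter density estimate that matches the terms described here is sketched below. The notation is an assumption of this sketch: d_{jk} is the distance from sampling point j to the nearest plant in quarter k, P is the number of sampling points, and u is the unit area.

```latex
\begin{align*}
\bar{d} &= \frac{1}{4P} \sum_{j=1}^{P} \sum_{k=1}^{4} d_{jk}
  && \text{(mean point-to-plant distance)} \\
\text{mean area per plant} &= \bar{d}^{\,2} \\
\text{total density} &= \frac{u}{\bar{d}^{\,2}}
\end{align*}
```

For example, a mean distance of 5 m gives a mean area per plant of 25 m²; with u = 10,000 m² the total density is 10,000 / 25 = 400 plants per hectare.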
Definitions of Species Diversity and Dispersion (Brower et al. 1998)
Shannon-Wiener Index ($H'$)
$$H' = -\sum_{i=1}^{s} p_i \ln p_i$$
Where: $p_i$ is the number of individuals of one species divided by the total number of all individuals in the sample ($= n_i/N$), ln = natural logarithm, and $s$ = total number of species in the community.
Morisita’s Index of Dispersion ($I_d$)
$$I_d = n \, \frac{\sum x^2 - N}{N(N - 1)}$$
Where: $n$ is the number of quadrats, $N$ is the total number of individuals counted in all quadrats, and $\sum x^2$ is the squares of the numbers of individuals per quadrat, summed over all quadrats.