Title: | Distance Sampling Detection Function and Abundance Estimation |
---|---|
Description: | A simple way of fitting detection functions to distance sampling data for both line and point transects. Adjustment term selection, left and right truncation as well as monotonicity constraints and binning are supported. Abundance and density estimates can also be calculated (via a Horvitz-Thompson-like estimator) if survey area information is provided. See Miller et al. (2019) <doi:10.18637/jss.v089.i01> for more information on methods and <https://examples.distancesampling.org/> for example analyses. |
Authors: | Laura Marshall [cre], David Miller [aut], T.J. Clark-Wolf [aut], Len Thomas [ctb], Jeff Laake [ctb], Eric Rexstad [rev] |
Maintainer: | Laura Marshall <[email protected]> |
License: | GPL (>= 2) |
Version: | 2.0.0 |
Built: | 2024-11-06 12:36:45 UTC |
Source: | https://github.com/distanceDevelopment/Distance |
Distance is a simple way to fit detection functions and estimate abundance using distance sampling methodology.
Underlying Distance is the package mrds. For more advanced analyses (such as those involving double observer surveys) one may find it necessary to use mrds directly.
Examples of distance sampling analyses are available at http://examples.distancesampling.org/.
For help with distance sampling and this package, there is a Google Group https://groups.google.com/forum/#!forum/distance-sampling.
Bugs can be reported at https://github.com/DistanceDevelopment/Distance/issues.
David L. Miller [email protected]
Key References:
Miller D.L., E. Rexstad, L. Thomas, L. Marshall and J.L. Laake. 2019. Distance Sampling in R. Journal of Statistical Software, 89(1), 1-28. doi:10.18637/jss.v089.i01
Background References:
Laake, J.L. and D.L. Borchers. 2004. Methods for incomplete detection at distance zero. In: Advanced Distance Sampling, eds. S.T. Buckland, D.R.Anderson, K.P. Burnham, J.L. Laake, D.L. Borchers, and L. Thomas. Oxford University Press.
Marques, F.F.C. and S.T. Buckland. 2004. Covariate models for the detection function. In: Advanced Distance Sampling, eds. S.T. Buckland, D.R.Anderson, K.P. Burnham, J.L. Laake, D.L. Borchers, and L. Thomas. Oxford University Press.
Add a line or lines to a plot of the detection function which correspond to a given covariate combination. These can be particularly useful when there is a small number of factor levels or if quantiles of a continuous covariate are specified.
ddf |
a fitted detection function object. |
data |
a |
... |
extra arguments to give to lines |
ndist |
number of distances at which to evaluate the detection function. |
pdf |
should the line be drawn on the probability density scale; ignored for line transects |
breaks |
required to ensure that PDF lines are the right size, should
match what is supplied to original |
All covariates must be specified in data. Plots can become quite busy when this approach is used. It may be useful to fix some covariates at their median level and plot set values of a covariate of interest. For example, set weather (e.g., Beaufort) to its median and plot levels of observer, then create a second plot for a fixed observer with levels of weather.
Arguments to lines are supplied in ... and aesthetics like line type (lty), line width (lwd) and colour (col) are recycled. By default lty is used to distinguish between the lines. It may be useful to add a legend to the plot (lines are plotted in the order of data).
invisibly, the values of detectability over the truncation range.
This function is located in the mrds package but the documentation is provided here for easy access.
David L Miller
## Not run:
# example using a model for the minke data
data(minke)
# fit a model
result <- ds(minke, formula=~Region.Label)

# make a base plot, showpoints=FALSE makes the plot less busy
plot(result, showpoints=FALSE)

# add lines for each region one at a time
add_df_covar_line(result, data.frame(Region.Label="South"), lty=2)
add_df_covar_line(result, data.frame(Region.Label="North"), lty=3)

# add a legend
legend(1.5, 1, c("Average", "South", "North"), lty=1:3)

# point transect example
data(amakihi)
result <- ds(amakihi, truncation=150, transect="point", formula=~OBs)
plot(result, showpoints=FALSE, pdf=TRUE)
add_df_covar_line(result,
                  data.frame(OBs=na.omit(unique(amakihi$OBs))),
                  pdf=TRUE)
## End(Not run)
Extract the AIC from a fitted detection function.
## S3 method for class 'dsmodel'
AIC(object, ..., k = 2)
object |
a fitted detection function object |
... |
optionally more fitted model objects. |
k |
penalty per parameter to be used; the default k = 2 is the classical AIC |
David L Miller
## Not run:
library(Distance)
data(minke)
model <- ds(minke, truncation=4)
model_hr <- ds(minke, truncation=4, key="hr")
# extract the AIC for 2 models
AIC(model, model_hr)
## End(Not run)
Also known as the Common 'Amakihi, a type of Hawaiian honeycreeper
A data.frame with 1487 rows and 12 variables
Region.Label
strata names (seven strata)
Area
size of study area (set to 0)
Sample.Label
transect ID
Effort
number of visits to point
object
object ID
distance
radial distance (m)
Month
month survey conducted (not used)
OBs
observer ID (note capitalisation of variable name)
Sp
species code (COAM) for all detections
MAS
Time after sunrise (min)
HAS
Time after sunrise (hours)
Study.Area
name of study area
Example for investigating covariates in the detection function. Note the high collinearity between the two measures of time since sunrise. Convergence problems can result from models with several factor covariates.
Marques, T.A., L. Thomas, S.G. Fancy and S.T. Buckland. (2007) Improving estimates of bird density using multiple-covariate distance sampling. The Auk 124 (4): 1229–1243. doi:10.1642/0004-8038(2007)124[1229:IEOBDU]2.0.CO;2
Performs a bootstrap for simple distance sampling models using the same data structures as dht. Note that only geographical stratification as supported in dht is allowed.
bootdht(
  model,
  flatfile,
  resample_strata = FALSE,
  resample_obs = FALSE,
  resample_transects = TRUE,
  nboot = 100,
  summary_fun = bootdht_Nhat_summarize,
  convert_units = 1,
  select_adjustments = FALSE,
  sample_fraction = 1,
  multipliers = NULL,
  progress_bar = "base",
  cores = 1,
  convert.units = NULL
)
model |
a model fitted by |
flatfile |
Data provided in the flatfile format. See |
resample_strata |
should resampling happen at the stratum
( |
resample_obs |
should resampling happen at the observation ( |
resample_transects |
should resampling happen at the transect
( |
nboot |
number of bootstrap replicates |
summary_fun |
function that is used to obtain summary statistics from
the bootstrap, see Summary Functions below. By default
|
convert_units |
conversion between units for abundance estimation, see
"Units", below. (Defaults to 1, implying all of the units are "correct"
already.) This takes precedence over any unit conversion stored in |
select_adjustments |
select the number of adjustments in each
bootstrap, when |
sample_fraction |
what proportion of the transects was covered (e.g., 0.5 for one-sided line transects). |
multipliers |
|
progress_bar |
which progress bar should be used? Default "base" uses
|
cores |
number of CPU cores to use to compute the estimates. See "Parallelization" below. |
convert.units |
deprecated, see same argument with underscore, above. |
The function summary_fun allows the user to specify what summary statistics should be recorded from each bootstrap. The function should take two arguments, ests and fit. The former is the output from dht2, giving tables of estimates. The latter is the fitted detection function object. The function is called once fitting and estimation have been performed and should return a data.frame. Those data.frames are then concatenated using rbind. One can make these functions return any information within those objects, for example abundance or density estimates or the AIC for each model. See Examples below.
It is often the case that we cannot measure distances to individuals or groups directly, but instead need to estimate distances to something they produce (e.g., for whales, their blows; for elephants their dung) – this is referred to as indirect sampling. We may need to use estimates of production rate and decay rate for these estimates (in the case of dung or nests) or just production rates (in the case of songbird calls or whale blows). We refer to these conversions between "number of cues" and "number of animals" as "multipliers".
The multipliers argument is a list, with 2 possible elements (creation and decay). Each element is either:
a data.frame, which must have at least a column named rate, which abundance estimates will be divided by (the term "multiplier" is a misnomer, but kept for compatibility with Distance for Windows). Additional columns can be added to give the standard error and degrees of freedom for the rate, if known, as SE and df, respectively. You can use a multirow data.frame to have different rates for different geographical areas (for example). In this case the rows need to have a column (or columns) to merge with the data (for example Region.Label).
a function which will return a single estimate of the relevant multiplier. See make_activity_fn for a helper function for use with the activity package.
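The data.frame form of the multipliers list might be assembled as in the sketch below. All rates, standard errors and degrees of freedom are invented for illustration, and the stratum names are only an example of a merge column.

```r
# Cue production rate with its standard error and degrees of freedom.
# All numbers here are made up for illustration.
creation <- data.frame(rate = 25, SE = 5, df = 10)

# A multirow data.frame giving a different decay rate per stratum; the
# Region.Label column is what would be merged with the survey data.
decay <- data.frame(Region.Label = c("North", "South"),
                    rate = c(160, 180),
                    SE   = c(12, 15))

# The list element names are the two documented ones
multipliers <- list(creation = creation, decay = decay)
str(multipliers)
```

Leaving out the SE column is allowed by the description above; in that case only the rate itself enters the calculation.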
Model selection can be performed on a per-replicate basis within the bootstrap. This has three variations:
when select_adjustments is TRUE, adjustment terms are selected by AIC within each bootstrap replicate (provided that model had the order and adjustment options set to non-NULL).
if model is a list of fitted detection functions, each of these is fitted to each replicate and results are generated from the one with the lowest AIC.
when select_adjustments is TRUE and model is a list of fitted detection functions, each model is fitted to each replicate and the number of adjustments is selected via AIC.
This last option can be extremely time consuming.
If cores>1 then the parallel/doParallel/foreach/doRNG packages will be used to run the computation over multiple cores of the computer. To use this component you need to install those packages using:
install.packages(c("foreach", "doParallel", "doRNG"))
It is advised that you do not set cores to be greater than one less than the number of cores on your machine. The doRNG package is required to make analyses reproducible (set.seed can be used to ensure the same answers).
It is also hard to debug any issues in summary_fun, so it is best to run a small number of bootstraps first in parallel to check that things work. On Windows systems summary_fun does not have access to the global environment when running in parallel, so all computations must be made using only its ests and fit arguments (i.e., you cannot use R objects from elsewhere in that function, even if they are available to you from the console).
Another consequence of the global environment being unavailable inside parallel bootstraps is that any starting values in the model object passed in to bootdht must be hard coded (otherwise you get back 0 successful bootstraps). For a worked example showing this, see the camera trap distance sampling online example at https://examples.distancesampling.org/Distance-cameratraps/camera-distill.html.
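One way to follow the advice on the number of cores is sketched below using parallel::detectCores() from the parallel package that ships with R. The variable names are illustrative, and the seed value is arbitrary.

```r
# Leave one core free, but never ask for fewer than one.
# detectCores() can return NA on some platforms, hence the fallback.
available <- parallel::detectCores()
if (is.na(available)) available <- 1L
n_cores <- max(1L, available - 1L)

# Set the seed before bootstrapping so that results can be reproduced
set.seed(1234)
n_cores
```

The resulting n_cores could then be supplied as the cores argument.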
summary.dht_bootstrap for how to summarize the results, bootdht_Nhat_summarize and bootdht_Dhat_summarize for examples of summary functions.
## Not run:
# fit a model to the minke data
data(minke)
mod1 <- ds(minke)

# summary function to save the abundance estimate
Nhat_summarize <- function(ests, fit) {
  return(data.frame(Nhat=ests$individuals$N$Estimate))
}

# perform 5 bootstraps
bootout <- bootdht(mod1, flatfile=minke, summary_fun=Nhat_summarize, nboot=5)

# obtain basic summary information
summary(bootout)
## End(Not run)
When using bootdht one needs to use a summary function to extract results from the resulting models per replicate. This function is the simplest possible example of such a function, that just extracts the estimated density (with stratum labels).
bootdht_Dhat_summarize(ests, fit)
ests |
output from |
fit |
fitted detection function object (unused). |
Further examples of such functions can be found at http://examples.distancesampling.org.
data.frame with two columns ("Dhat" and "Label"), giving the estimate(s) of density of individuals per stratum from each bootstrap replicate. This data.frame can be examined, for example, with quantile to compute confidence intervals.
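A percentile interval of the kind mentioned above could be computed as in this sketch. The boot_res data.frame is simulated to stand in for real bootstrap output, so the numbers carry no meaning; only the column names follow the description above.

```r
# Mock bootstrap output: 200 replicates of a density estimate for one stratum.
# Real output would come from bootdht(); these values are simulated.
set.seed(42)
boot_res <- data.frame(Label = "Total",
                       Dhat  = rlnorm(200, meanlog = log(0.05), sdlog = 0.2))

# Percentile interval: 2.5% and 97.5% quantiles of the replicate estimates
ci <- quantile(boot_res$Dhat, probs = c(0.025, 0.975), na.rm = TRUE)
ci
```

With several strata, the same call would be applied separately to the rows for each Label.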
bootdht, which this function is to be used with, and bootdht_Nhat_summarize, which does the same job but returns abundance results.
When using bootdht one needs to use a summary function to extract results from the resulting models per replicate. This function is the simplest possible example of such a function, that just extracts the estimated abundance (with stratum labels).
bootdht_Nhat_summarize(ests, fit)
ests |
output from |
fit |
fitted detection function object (unused). |
Further examples of such functions can be found at http://examples.distancesampling.org.
data.frame with two columns ("Nhat" and "Label"), giving the estimate(s) of abundance of individuals per stratum from each bootstrap replicate. This data.frame can be examined, for example, with quantile to compute confidence intervals.
bootdht, which this function is to be used with, and bootdht_Dhat_summarize, which does the same job but returns density results.
Data from a line transect survey of capercaillie in Monaughty Forest, Moray, Scotland.
A data.frame with 112 observations on the following 9 variables.
Sample.Label
name of single transect
Effort
transect length (km)
distance
perpendicular distance (m)
object
object ID
size
only individual birds detected
detected
whether detected
observer
single observer data
Region.Label
stratum name
Area
size of Monaughty Forest (ha)
Check that the data supplied to ds is correct
This is an internal function that checks the data.frames supplied to ds are "correct".
checkdata(
  data,
  region.table = NULL,
  sample.table = NULL,
  obs.table = NULL,
  formula = ~1
)
data |
as in |
region.table |
as in |
sample.table |
as in |
obs.table |
as in |
formula |
formula for the covariates |
Throws an error if something goes wrong, otherwise returns a data.frame.
David L. Miller
Data simulated from models fitted to 1992/1993 Southern Hemisphere minke whale data collected by the International Whaling Commission. See Branch and Butterworth (2001) for survey details (survey design is shown in figure 1(e)). Data simulated by David Borchers.
data.frame with 99 observations of 9 variables:
Region.Label
stratum label ("North" or "South")
Area
stratum area (square nautical mile)
Sample.Label
transect identifier
Effort
transect length (nautical mile)
object
unique object ID
distance
observed distance (nautical mile)
Cluster.strat
strata based on cluster size: 1, 2 and 3+
size
cluster size
Study.Area
name of study area
Branch, T.A. and D.S. Butterworth. (2001) Southern Hemisphere minke whales: standardised abundance estimates from the 1978/79 to 1997/98 IDCR-SOWER surveys. Journal of Cetacean Research and Management 3(2): 143-174
Hedley, S.L., and S.T. Buckland. (2004) Spatial models for line transect sampling. Journal of Agricultural, Biological, and Environmental Statistics 9: 181-199. doi:10.1198/1085711043578.
It is often the case that effort, distances and prediction area are collected in different units in the field. Functions in Distance allow for an argument to convert between these and provide an answer that makes sense. This function calculates that conversion factor, given knowledge of the units of the quantities used.
convert_units(distance_units, effort_units, area_units)
distance_units |
units distances were measured in. |
effort_units |
units that effort was measured in. Set as NULL for point transects. |
area_units |
units for the prediction area. |
convert_units expects particular names for its inputs; these should be singular names of the unit (e.g., "metre" rather than "metres"). You can view possible options with units_table. Both UK and US spellings are acceptable, and case does not matter. For density estimation, area must still be provided ("objects per square ???"). Note that for cue counts (or other multiplier-based methods) one will still have to ensure that the rates are in the correct units for the survey.
David L Miller
# distances measured in metres, effort in kilometres and
# abundance over an area measured in hectares:
convert_units("Metre", "Kilometre", "Hectare")

# all SI units, so the result is 1
convert_units("Metre", "metre", "square metre")

# for points ignore effort
convert_units("Metre", NULL, "Hectare")
This is an internal routine and shouldn't be necessary in normal analyses.
create_bins(data, cutpoints)
data |
|
cutpoints |
vector of cutpoints for the bins |
argument data with two extra columns distbegin and distend.
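To illustrate what the two extra columns contain, here is a base R sketch of the binning idea. The helper bin_distances is hypothetical and is not the package's implementation; in particular, returning NA for distances outside the cutpoints is an assumption made for this example.

```r
# Hypothetical helper, NOT the package's create_bins(): a base R sketch of
# attaching bin start/end points to each observed distance.
bin_distances <- function(data, cutpoints) {
  # index of the bin each distance falls in; rightmost.closed=TRUE puts a
  # distance equal to the last cutpoint into the final bin
  idx <- findInterval(data$distance, cutpoints, rightmost.closed = TRUE)
  # distances outside the cutpoints get no bin
  idx[idx < 1 | idx >= length(cutpoints)] <- NA
  data$distbegin <- cutpoints[idx]
  data$distend   <- cutpoints[idx + 1]
  data
}

obs <- data.frame(object = 1:5, distance = c(0.2, 0.9, 1.0, 2.5, 3.0))
binned <- bin_distances(obs, c(0, 1, 2, 3))
binned
```

In this sketch a distance that falls exactly on an interior cutpoint goes into the bin that starts there.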
David L. Miller
## Not run:
library(Distance)
data(minke)
# put the minke data into bins 0-1, 1-2, 2-3 km
minke_cuts <- create_bins(minke[!is.na(minke$distance),], c(0,1,2,3))
## End(Not run)
create.bins is now deprecated, please use create_bins.
create.bins(data, cutpoints)
data |
|
cutpoints |
vector of cutpoints for the bins |
argument data with two extra columns distbegin and distend.
David L. Miller
Cues are treated as an indirect count, requiring the use of multipliers.
A data.frame with 109 rows and 15 variables.
Region.Label
stratum labels
Area
size (km^2) of each stratum
Sample.Label
transect labels
Cue.rate
rate of blows per animal per hour
Cue.rate.SE
variability in cue rate
Cue.rate.df
degrees of freedom (number of animals sampled for cues)
object
object ID
distance
perpendicular distance (km)
Sample.Fraction
proportion of full circle scanned (radians)
Sample.Fraction.SE
variability in sampling fraction (0)
Search.time
Duration of scanning effort (hr)
bss
Beaufort sea state
sp
Species detected (all observations W in these data)
size
Number of animals in group (all 1 in these data)
Study.Area
study area name
Because whale blows disappear instantaneously, there is no need to measure a decay rate. However a cue production rate (blows per individual per unit time) is required, as is a measure of variability of that rate.
There are two other nuances in this survey. Even though the survey is taking place on a moving ship, effort is measured as the amount of time spent scanning for blows. In some instances, it is not possible for the observer to scan the sea all around them as the view may be restricted by the ship's superstructure. Here a sampling fraction multiplier is employed to deal with restricted vision. The units of measure of Cue.rate and Search.time must be equal.
Once a detection function is fitted to data, this function can be used to compute abundance estimates over required areas. The function also allows for stratification and variance estimation via various schemes (see below).
dht2(
  ddf,
  observations = NULL,
  transects = NULL,
  geo_strat = NULL,
  flatfile = NULL,
  strat_formula,
  convert_units = 1,
  er_est = c("R2", "P2"),
  multipliers = NULL,
  sample_fraction = 1,
  ci_width = 0.95,
  innes = FALSE,
  stratification = "geographical",
  total_area = NULL,
  binomial_var = FALSE
)
ddf |
model fitted by |
observations |
|
transects |
|
geo_strat |
|
flatfile |
data in the flatfile format, see |
strat_formula |
a formula giving the stratification structure (see "Stratification" below). Currently only one level of stratification is supported. |
convert_units |
conversion factor between units for the distances,
effort and area. See "Units" below. Can supply one per detection function in
|
er_est |
encounter rate variance estimator to be used. See "Variance"
below and |
multipliers |
|
sample_fraction |
proportion of the transect covered (e.g., 0.5 for
one-sided line transects). May be specified as either a single number or a
|
ci_width |
for use with confidence interval calculation (defined as 1-alpha, so the default 0.95 will give a 95% confidence interval). |
innes |
logical flag for computing encounter rate variance using either
the method of Innes et al (2002) where estimated abundance per transect
divided by effort is used as the encounter rate, vs. (when |
stratification |
what do strata represent, see "Stratification" below. |
total_area |
for options |
binomial_var |
if we wish to estimate abundance for the covered area
only (i.e., study area = surveyed area) then this must be set to be
|
a data.frame (of class dht_result for pretty printing) with estimates and attributes containing additional information, see "Outputs" for information on column names.
The data format allows for complex stratification schemes to be set-up. Three objects are always required:
ddf the detection function (see ds or ddf for information on the format of their inputs).
observations has one row per observation and links the observations to the transects. Required columns:
object (unique ID for the observation, which must match with the data in the detection function)
Sample.Label (unique ID for the transect).
Additional columns for strata which are not included in the detection function are required (stratification covariates that are included in the detection function do not need to be included here). The important case here is group size, which must have column name size (but does not need to be in the detection function).
transects has one row per sample (point or line transect). At least one row is required. Required columns: Sample.Label (unique ID for the transect), Effort (line length for line transects, number of visits for point transects), if there is more than one geographical stratum.
With only these three arguments, abundance can only be calculated for the covered area. Including additional information on the area we wish to extrapolate to (i.e., the study area), we can obtain abundance estimates:
geo_strat has one row for each stratum that we wish to estimate abundance for. For abundance in the study area, at least one row is required. Required columns: Area (the area of that stratum). If there is >1 row, then additional columns, named in strat_formula, are required.
Note that if the Area column is set to all 0, then only density estimates will be returned.
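The three tables described above might look like the following toy example. All values are invented, and placing a Region.Label column in transects to link samples to strata is an assumption made here for illustration.

```r
# Toy survey with two strata; all values are invented for illustration.
# geo_strat: one row per stratum, with its area
geo_strat <- data.frame(Region.Label = c("North", "South"),
                        Area = c(630, 84))

# transects: one row per sample, with effort and (assumed) stratum column
transects <- data.frame(Sample.Label = c("T1", "T2", "T3"),
                        Effort = c(10, 12, 8),
                        Region.Label = c("North", "North", "South"))

# observations: links each detected object to a transect; size is group size
observations <- data.frame(object = 1:4,
                           Sample.Label = c("T1", "T1", "T2", "T3"),
                           size = c(1, 3, 2, 1))

# Basic consistency checks before passing these to an estimation function
stopifnot(all(observations$Sample.Label %in% transects$Sample.Label),
          all(transects$Region.Label %in% geo_strat$Region.Label),
          !anyDuplicated(observations$object))
```

Checks like these catch mismatched labels early, before any model fitting or estimation is attempted.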
It is often the case that we cannot measure distances to individuals or groups directly, but instead need to estimate distances to something they produce (e.g., for whales, their blows; for elephants their dung) – this is referred to as indirect sampling. We may need to use estimates of production rate and decay rate for these estimates (in the case of dung or nests) or just production rates (in the case of songbird calls or whale blows). We refer to these conversions between "number of cues" and "number of animals" as "multipliers".
The multipliers argument is a list, with 2 possible elements (creation and decay). Each element is a data.frame and must have at least a column named rate, which abundance estimates will be divided by (the term "multiplier" is a misnomer, but kept for compatibility with Distance for Windows). Additional columns can be added to give the standard error and degrees of freedom for the rate, if known, as SE and df, respectively. You can use a multirow data.frame to have different rates for different geographical areas (for example). In this case the rows need to have a column (or columns) to merge with the data (for example Region.Label).
The strat_formula argument is used to specify a column to use to stratify the results, using the form ~column.name where column.name is the column name you wish to use.
The stratification argument is used to specify which of four types of stratification are intended:
"geographical" if each stratum represents a different geographical area and you want the total over all the areas
"effort_sum" if your strata are in fact from replicate surveys (perhaps using different designs) but you don't have many replicates and/or want an estimate of "average variance"
"replicate" if you have replicate surveys but have many of them; this calculates the average abundance and the variance between those many surveys (think of a population of surveys)
"object" if the stratification is really about the type of object observed, for example sex, species or life stage, and what you want is the total number of individuals across all the classes of objects. For example, if you have stratified by sex and have males and females, but also want a total number of animals, you should use this option.
A simple example of using stratification="geographical" is given below. Further examples can be found at http://examples.distancesampling.org/ (see, e.g., the deer pellet survey).
Variance in the estimated abundance comes from multiple sources. Depending on the data used to fit the model and estimate abundance, different components will be included in the estimated variances. In the simplest case, the detection function and encounter rate variance need to be combined. If group size varies, then this too must be included. Finally, if multipliers are used and have corresponding standard errors given, these are also included. Variances are combined by assuming independence between the measures and adding variances. A brief summary of how each component is calculated is given here, though see references for more details.
detection function: variance from the detection function parameters is transformed to variance about the abundance via a sandwich estimator (see e.g., Appendix C of Borchers et al (2002)).
encounter rate: for strata with >1 transect in them, the encounter rate estimators given in Fewster et al (2009) can be specified via the er_est argument. If the argument innes=TRUE then calculations use the estimated number of individuals in the transect (rather than the observed), which was given by Innes et al (2002) as a superior estimator. When there is only one transect in a stratum, Poisson variance is assumed. Information on the Fewster encounter rate variance estimators is given in varn.
group size: if objects occur in groups (sometimes "clusters"), then the empirical variance of the group sizes is added to the total variance.
multipliers: if multipliers with standard errors are given, their corresponding variances are added. If no standard errors are supplied, then their contribution to variance is assumed to be 0.
It is often the case that distances are recorded in one convenient set of units, whereas the study area and effort are recorded in some other units. To ensure that the results from this function are in the expected units, we use the convert_units argument to supply a single number to convert the units of the covered area to those of the study/stratification area (results are always returned in the units of the study area). For line transects, the covered area is calculated as 2 * width * length where width is the effective (half)width of the transect (often referred to as w in the literature) and length is the line length (referred to as L). If width and length are measured in kilometres and the study area in square kilometres, then all is fine and convert_units is 1 (and can be ignored). If, for example, line length and distances were measured in metres, we instead need to convert this to be kilometres, by dividing by 1000 for each of distance and length, hence convert_units=1e-6. For point transects, this is slightly easier as we only have the radius and study area to consider, so the conversion is just such that the units of the truncation radius are the square root of the study area units.
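The metres-to-kilometres case described above can be checked with a little arithmetic. The width and length values below are invented; only the factor 1e-6 comes from the text.

```r
# Invented survey: distances and line length recorded in metres,
# study area recorded in square kilometres.
width_m  <- 500      # (half)width w, in metres
length_m <- 20000    # total line length L, in metres

# Covered area in the units of the measurements (square metres)
covered_m2 <- 2 * width_m * length_m

# Dividing both width and length by 1000 is a factor of 1e-6 overall
convert_units <- 1e-6
covered_km2 <- covered_m2 * convert_units

# Same answer as converting each measurement to kilometres first
all.equal(covered_km2, 2 * (width_m / 1000) * (length_m / 1000))
covered_km2
```

The single conversion factor works because the covered area is a product of two lengths, each of which is rescaled once.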
On printing the output from a call to dht2, three tables are produced. Below is a guide to the output column names, per table.
Summary statistics table
Region.Label
Stratum name (this first column name depends on the formula supplied)
Area
Size of stratum
CoveredArea
Surveyed area in stratum (2 x w x L)
Effort
Transect length or number of point visits per stratum
n
Number of detections
k
Number of replicate transects
ER
Encounter rate
se.ER
Standard error of encounter rate
cv.ER
Coefficient of variation of encounter rate
Abundance or density estimates table:
Region.Label
As above
Estimate
Point estimate of abundance or density
se
Standard error
cv
Coefficient of variation
LCI
Lower confidence bound
UCI
Upper confidence bound
df
Degrees of freedom used for confidence interval computation
Components percentage of variance:
Region.Label
As above
Detection
Percent of variance in abundance/density associated with detection function uncertainty
ER
Percent of variance in abundance/density associated with variability in encounter rate
Multipliers
Percent of variance in abundance/density associated with uncertainty in multipliers
Borchers, D.L., S.T. Buckland, P.W. Goedhart, E.D. Clarke, and S.L. Hedley. 1998. Horvitz-Thompson estimators for double-platform line transect surveys. Biometrics 54: 1221-1237.
Borchers, D.L., S.T. Buckland, and W. Zucchini. 2002. Estimating Animal Abundance: Closed Populations. Statistics for Biology and Health. Springer London.
Buckland, S.T., E.A. Rexstad, T.A. Marques, and C.S. Oedekoven. 2015. Distance Sampling: Methods and Applications. Methods in Statistical Ecology. Springer International Publishing.
Buckland, S.T., D.R. Anderson, K.P. Burnham, J.L. Laake, D.L. Borchers, and L. Thomas. 2001. Introduction to Distance Sampling: Estimating Abundance of Biological Populations. Oxford University Press.
Innes, S., M.P. Heide-Jorgensen, J.L. Laake, K.L. Laidre, H.J. Cleator, P. Richard, and R.E.A. Stewart. 2002. Surveys of belugas and narwhals in the Canadian high arctic in 1996. NAMMCO Scientific Publications 4, 169-190.
## Not run: 
# example of simple geographical stratification
# minke whale data, with 2 strata: North and South
data(minke)
# first fitting the detection function
minke_df <- ds(minke, truncation=1.5, adjustment=NULL)
# now estimate abundance using dht2
# stratum labels are in the Region.Label column
minke_dht2 <- dht2(minke_df, flatfile=minke, stratification="geographical",
                   strat_formula=~Region.Label)
# could compare this to minke_df$dht and see the same results
minke_dht2
# can alternatively report density
print(minke_dht2, report="density")
## End(Not run)
This function fits detection functions to line or point transect data and
then (provided that survey information is supplied) calculates abundance and
density estimates. The examples below illustrate some basic types of
analysis using ds().
ds(
  data,
  truncation = ifelse(is.null(cutpoints),
                      ifelse(is.null(data$distend),
                             max(data$distance), max(data$distend)),
                      max(cutpoints)),
  transect = "line",
  formula = ~1,
  key = c("hn", "hr", "unif"),
  adjustment = c("cos", "herm", "poly"),
  nadj = NULL,
  order = NULL,
  scale = c("width", "scale"),
  cutpoints = NULL,
  dht_group = FALSE,
  monotonicity = ifelse(formula == ~1, "strict", "none"),
  region_table = NULL,
  sample_table = NULL,
  obs_table = NULL,
  convert_units = 1,
  er_var = ifelse(transect == "line", "R2", "P2"),
  method = "nlminb",
  mono_method = "slsqp",
  quiet = FALSE,
  debug_level = 0,
  initial_values = NULL,
  max_adjustments = 5,
  er_method = 2,
  dht_se = TRUE,
  optimizer = "both",
  winebin = NULL,
  dht.group,
  region.table,
  sample.table,
  obs.table,
  convert.units,
  er.var,
  debug.level,
  initial.values,
  max.adjustments
)
data |
a |
truncation |
either truncation distance (numeric, e.g. 5) or percentage
(as a string, e.g. "15%"). Can be supplied as a |
transect |
indicates transect type "line" (default) or "point". |
formula |
formula for the scale parameter. For a CDS analysis leave
this as its default |
key |
key function to use; |
adjustment |
adjustment terms to use; |
nadj |
the number of adjustment terms to fit. In the absence of
covariates in the formula, the default value ( |
order |
order of adjustment terms to fit. The default value ( |
scale |
the scale by which the distances in the adjustment terms are
divided. Defaults to |
cutpoints |
if the data are binned, this vector gives the cutpoints of
the bins. Supplying a distance column in your data and specifying cutpoints
is the recommended approach for all standard binned analyses.
Ensure that the first element is 0 (or the left truncation
distance) and the last is the distance to the end of the furthest bin.
(Default |
dht_group |
should density abundance estimates consider all groups to
be size 1 (abundance of groups) |
monotonicity |
should the detection function be constrained for
monotonicity weakly ( |
region_table |
|
sample_table |
|
obs_table |
|
convert_units |
conversion between units for abundance estimation, see "Units", below. (Defaults to 1, implying all of the units are "correct" already.) |
er_var |
encounter rate variance estimator to use when abundance
estimates are required. Defaults to "R2" for line transects and "P2" for
point transects (>= 1.0.9, earlier versions <= 1.0.8 used the "P3" estimator
by default for points). See |
method |
optimization method to use (any method usable by
|
mono_method |
optimization method to use when monotonicity is enforced.
Can be either |
quiet |
suppress non-essential messages (useful for bootstraps etc).
Default value |
debug_level |
print debugging output. |
initial_values |
a |
max_adjustments |
maximum number of adjustments to try (default 5) only
used when |
er_method |
encounter rate variance calculation: default = 2 gives the
method of Innes et al, using expected counts in the encounter rate. Setting
to 1 gives observed counts (which matches Distance for Windows) and 0 uses
binomial variance (only useful in the rare situation where study area =
surveyed area). See |
dht_se |
should uncertainty be calculated when using |
optimizer |
By default this is set to 'both'. In this case
the R optimizer will be used and if present the MCDS optimizer will also
be used. The result with the best likelihood value will be selected. To
run only a specified optimizer set this value to either 'R' or 'MCDS'.
See |
winebin |
If you are trying to use our MCDS.exe optimizer on a
non-windows system then you may need to specify the winebin. Please
see |
dht.group |
deprecated, see same argument with underscore, above. |
region.table |
deprecated, see same argument with underscore, above. |
sample.table |
deprecated, see same argument with underscore, above. |
obs.table |
deprecated, see same argument with underscore, above. |
convert.units |
deprecated, see same argument with underscore, above. |
er.var |
deprecated, see same argument with underscore, above. |
debug.level |
deprecated, see same argument with underscore, above. |
initial.values |
deprecated, see same argument with underscore, above. |
max.adjustments |
deprecated, see same argument with underscore, above. |
a list with elements:
ddf
a detection function model object.
dht
abundance/density information (if survey region data was supplied,
else NULL)
If abundance estimates are required then the data.frames region_table
and sample_table must be supplied. If data does not contain the columns
Region.Label and Sample.Label then the data.frame obs_table must
also be supplied. Note that stratification only applies to abundance
estimates and not at the detection function level. Density and abundance
estimates, and corresponding estimates of variance and confidence intervals,
are calculated using the methods described in Buckland et al. (2001)
sections 3.6.1 and 3.7.1 (further details can be found in the documentation
for dht).
For more advanced abundance/density estimation please see the
dht and dht2 functions.
Examples of distance sampling analyses are available at http://examples.distancesampling.org/.
Hints and tips on fitting (particularly optimisation issues) are on the
mrds_opt manual page.
Note that if the data contains a column named size, cluster size will be
estimated and density/abundance will be based on a clustered analysis of
the data. Setting this column to be NULL will perform a non-clustered
analysis (for example if "size" means something else in your dataset).
The right truncation point is by default set to be the largest observed distance or bin end point. This default will not be appropriate for all data and can often be the cause of model convergence failures. It is recommended that one plots a histogram of the observed distances prior to model fitting so as to get a feel for an appropriate truncation distance. (Similar arguments apply to left truncation, if appropriate.) Buckland et al (2001) provide guidelines on truncation.
When specified as a percentage, the largest right and smallest left
percent distances are discarded. Percentages cannot be supplied when using
binned data.
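To make the percentage idea concrete, here is a minimal base R sketch that finds the distance below which 90% of observations fall, i.e. what a right truncation of "10%" amounts to. The distances are invented and the use of quantile() is an assumption about how a percentage maps to a distance, so the cut-off used by ds may differ slightly.

```r
# Hypothetical perpendicular distances (metres)
distance <- 1:100

# "10%" right truncation: discard the largest 10% of distances
# (quantile-based cut-off assumed for illustration)
w <- quantile(distance, probs = 0.90, names = FALSE)
kept <- distance[distance <= w]

# Tabulate the distances to see where detections thin out;
# drop plot = FALSE to draw the histogram instead
h <- hist(distance, breaks = 20, plot = FALSE)
print(data.frame(bin_start = head(h$breaks, -1), count = h$counts))
```

Looking at the histogram before choosing w is usually more informative than picking a percentage blindly.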
For left truncation, there are two options: (1) fit a detection function to
the truncated data as is (this is what happens when you set left). This
does not assume that g(x)=1 at the truncation point. (2) manually remove
data with distances less than the left truncation distance, effectively
moving the centre line out to be the truncation distance (this needs to be
done before calling ds). This then assumes that detection is certain at
the left truncation distance. The former strategy has a weaker assumption,
but will give higher variance as the detection function close to the line
has no data to tell it where to fit; it will be relying on the data from
after the left truncation point and the assumed shape of the detection
function. The latter is most appropriate in the case of aerial surveys,
where some area under the plane is not visible to the observers, but their
probability of detection is certain at the smallest distance.
Note that binning is performed such that bin 1 is all distances greater or equal to cutpoint 1 (>=0 or left truncation distance) and less than cutpoint 2. Bin 2 is then distances greater or equal to cutpoint 2 and less than cutpoint 3 and so on.
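The binning rule can be mimicked in base R with cut() and right = FALSE, which gives intervals that are closed on the left and open on the right. This is only a sketch with invented cutpoints and distances; it is not the code ds uses internally.

```r
cutpoints <- c(0, 1, 2, 3, 4)
distance <- c(0, 0.5, 1, 1.99, 2, 3.7)

# right = FALSE gives bins [0,1), [1,2), [2,3), [3,4): a distance equal to
# a cutpoint goes into the bin that starts at that cutpoint
bin <- cut(distance, breaks = cutpoints, right = FALSE, labels = FALSE)

# Number of observations per bin
counts <- tabulate(bin, nbins = length(cutpoints) - 1)

print(data.frame(distance, bin))
print(counts)
```

Note that with this plain cut() call a distance exactly equal to the last cutpoint would fall outside every bin, so the final cutpoint needs separate care in this sketch.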
When adjustment terms are used, it is possible for the detection function to not always decrease with increasing distance. This is unrealistic and can lead to bias. To avoid this, the detection function can be constrained for monotonicity (and is by default for detection functions without covariates).
Monotonicity constraints are supported in a similar way to that described
in Buckland et al (2001). 20 equally spaced points over the range of the
detection function (left to right truncation) are evaluated at each round
of the optimisation and the function is constrained to be either always
less than its value at zero ("weak") or such that each value is
less than or equal to the previous point (monotonically decreasing;
"strict"). See also check.mono.
Even with no monotonicity constraints, checks are still made that the
detection function is monotonic, see check.mono.
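The two kinds of constraint can be illustrated with a small base R sketch that evaluates a detection function at 20 equally spaced points and tests the "weak" and "strict" conditions. The function detfct and the helpers is_weak and is_strict are hypothetical names, the parameter values are invented, and the unscaled half-normal with one cosine term is a simplification of the models fitted by ds.

```r
# Unscaled half-normal key with one order-2 cosine adjustment term
# (sigma, a and w are hypothetical values chosen for illustration)
detfct <- function(x, sigma, a, w) {
  exp(-x^2 / (2 * sigma^2)) * (1 + a * cos(2 * pi * x / w))
}

w <- 4
x <- seq(0, w, length.out = 20)   # 20 equally spaced evaluation points

# "weak": never above the value at the first point (distance zero here)
is_weak <- function(g, tol = 1e-8) all(g <= g[1] + tol)
# "strict": each value no larger than the previous one
is_strict <- function(g, tol = 1e-8) all(diff(g) <= tol)

g_ok <- detfct(x, sigma = 1.5, a = 0.1, w = w)   # small adjustment
g_bad <- detfct(x, sigma = 10, a = 0.5, w = w)   # adjustment dominates

print(c(weak = is_weak(g_ok), strict = is_strict(g_ok)))
print(c(weak = is_weak(g_bad), strict = is_strict(g_bad)))
```

The second set of parameters gives a curve that rises again at larger distances while staying below its value at zero, so it passes the weak check but fails the strict one.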
In extrapolating to the entire survey region it is important that the unit
measurements be consistent or converted for consistency. A conversion
factor can be specified with the convert_units argument. The values of
Area in region_table must be made consistent with the units for
Effort in sample_table and the units of distance in the data.frame
that was analyzed. It is easiest if the units of Area are the square of
the units of Effort and then it is only necessary to convert the units of
distance to the units of Effort. For example, if Effort was entered
in kilometres and Area in square kilometres and distance in metres then
using convert_units=0.001 would convert metres to kilometres, and density
would be expressed per square kilometre, which would then be consistent with
units for Area. However, they can all be in different units as long as
the appropriate composite value for convert_units is chosen. Abundance
for a survey region can be expressed as: A*N/a where A is Area for
the survey region, N is the abundance in the covered (sampled) region,
and a is the area of the sampled region and is in units of
Effort * distance. The sampled region a is multiplied by convert_units, so it
should be chosen such that the result is in the same units as Area. For
example, if Effort was entered in kilometres, Area in hectares (100m x
100m) and distance in metres, then using convert_units=0.1 will convert
a to units of hectares (0.01 to convert metres to 100m units for distance
and 10 to convert kilometres to 100m units for Effort).
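The first case described above (Effort in kilometres, distance in metres, Area in square kilometres) can be checked with a few lines of arithmetic. This is a sketch with invented numbers; the factor of 2 assumes a line transect surveyed on both sides, and the variable names are hypothetical.

```r
# Hypothetical survey: Effort in km, distance in m, Area in km^2
effort_km <- 25
width_m <- 150           # truncation distance
convert_units <- 0.001   # metres -> kilometres

# Covered area of a line transect (both sides of the line), in km^2
a <- 2 * width_m * effort_km * convert_units

# Scale abundance in the covered region up to the survey region: A * N / a
A <- 300      # Area of the survey region (km^2)
N_cov <- 45   # estimated abundance in the covered region (invented)
N_hat <- A * N_cov / a

print(c(covered_area_km2 = a, abundance = N_hat))
```

If a and A end up in different units the ratio A/a is off by a constant factor, which is exactly the error convert_units is there to prevent.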
One can supply data only to simply fit a detection function. However, if
abundance/density estimates are necessary further information is required.
Either the region_table, sample_table and obs_table data.frames can
be supplied or all data can be supplied as a "flat file" in the data
argument. In this format each row in data has additional information that
would ordinarily be in the other tables. This usually means that there are
additional columns named: Sample.Label, Region.Label, Effort and
Area for each observation. See flatfile for an example.
If column Area is omitted, a density estimate is generated but note that
the degrees of freedom/standard errors/confidence intervals will not match
density estimates made with the Area column present.
David L. Miller
Buckland, S.T., Anderson, D.R., Burnham, K.P., Laake, J.L., Borchers, D.L., and Thomas, L. (2001). Distance Sampling. Oxford University Press. Oxford, UK.
Buckland, S.T., Anderson, D.R., Burnham, K.P., Laake, J.L., Borchers, D.L., and Thomas, L. (2004). Advanced Distance Sampling. Oxford University Press. Oxford, UK.
flatfile, AIC.ds, ds.gof, p_dist_table, plot.ds, add_df_covar_line
# An example from mrds, the golf tee data.
library(Distance)
data(book.tee.data)
tee.data <- subset(book.tee.data$book.tee.dataframe, observer==1)
ds.model <- ds(tee.data, 4)
summary(ds.model)
plot(ds.model)

## Not run: 
# same model, but calculating abundance
# need to supply the region, sample and observation tables
region <- book.tee.data$book.tee.region
samples <- book.tee.data$book.tee.samples
obs <- book.tee.data$book.tee.obs

ds.dht.model <- ds(tee.data, 4, region_table=region,
                   sample_table=samples, obs_table=obs)
summary(ds.dht.model)

# specify order 2 cosine adjustments
ds.model.cos2 <- ds(tee.data, 4, adjustment="cos", order=2)
summary(ds.model.cos2)

# specify order 2 and 3 cosine adjustments, turning monotonicity
# constraints off
ds.model.cos23 <- ds(tee.data, 4, adjustment="cos", order=c(2, 3),
                     monotonicity=FALSE)
# check for non-monotonicity -- actually no problems
check.mono(ds.model.cos23$ddf, plot=TRUE, n.pts=100)

# include both a covariate and adjustment terms in the model
ds.model.cos2.sex <- ds(tee.data, 4, adjustment="cos", order=2,
                        monotonicity=FALSE, formula=~as.factor(sex))
# check for non-monotonicity -- actually no problems
check.mono(ds.model.cos2.sex$ddf, plot=TRUE, n.pts=100)

# truncate the largest 10% of the data and fit only a hazard-rate
# detection function
ds.model.hr.trunc <- ds(tee.data, truncation="10%", key="hr",
                        adjustment=NULL)
summary(ds.model.hr.trunc)

# compare AICs between these models:
AIC(ds.model)
AIC(ds.model.cos2)
AIC(ds.model.cos23)
## End(Not run)
This function is deprecated, please see gof_ds.
ds.gof(model, breaks = NULL, nc = NULL, qq = TRUE, ks = FALSE, ...)
model |
deprecated. |
breaks |
deprecated. |
nc |
deprecated. |
qq |
deprecated. |
ks |
deprecated. |
... |
deprecated. |
Nothing, deprecated.
David L Miller
Simulated line transect survey of duck nests, designed to reproduce the data of Figure 2 in Anderson and Pospahala (1970).
A data.frame with 534 rows and 7 variables
Region.Label
strata names (single stratum in this instance)
Area
size of refuge (0 in this case, actual size 60km^2)
Sample.Label
transect ID
Effort
length of transects (km)
object
nest ID
distance
perpendicular distance (m)
Study.Area
name of wildlife refuge
The Monte Vista National Wildlife Refuge is in southern Colorado in the USA at an altitude of roughly 2400m.
Simulated data, from the distance sampling introductory course, Centre for Research into Ecological & Environmental Modelling, University of St Andrews.
Anderson, D. R., and R. S. Pospahala. 1970. Correction of bias in belt transect studies of immotile objects. The Journal of Wildlife Management 34 (1): 141–146. doi:10.2307/3799501
The study took place in Tai National Park, Cote d'Ivoire, in 2014. Filmed Maxwell's duikers (Philantomba maxwellii) were assigned to distance intervals; recorded distances are the midpoints of the intervals. These data include only observations recorded at times of peak activity.
A data.frame with 6277 rows and 6 variables
Region.Label
strata names (single stratum)
Area
size of study area (40.37 km^2)
multiplier
spatial effort, as the proportion of a circle covered by
the angle of view of the camera (42 degrees for these cameras)
Sample.Label
camera station identifier (21 functioning cameras in
this data set)
Effort
temporal effort, i.e. the number of 2-second time-steps over
which the camera operated
object
unique object ID
distance
radial distance (m) to interval midpoint
Howe, E.J., Buckland, S.T., Després-Einspenner, M.-L. and Kühl, H.S. (2017), Distance sampling with camera traps. Methods Ecol Evol, 8: 1558-1565. doi:10.1111/2041-210X.12790
Howe, Eric J. et al. (2018), Data from: Distance sampling with camera traps, Dryad, Dataset, doi:10.5061/dryad.b4c70
Create a detection function object for strip/plot surveys for use with
dht2.
dummy_ddf(data, width, left = 0, transect = "line")
data |
as specified for |
width |
right truncation |
left |
left truncation (default 0, no left truncation) |
transect |
|
David L Miller
Observers aboard tuna vessels detecting dolphin schools along with a number of possibly useful covariates for modelling the detection function.
A data.frame with 1090 rows and 13 variables:
Region.Label
stratum labels (only one)
Area
size (nmi) of each stratum
Sample.Label
transect labels
Effort
transect length (nmi)
object
object ID
distance
perpendicular distance (nmi)
LnCluster
natural log of cluster size
Month
month of detection
Beauf.class
Beaufort sea state
Cue.type
initial cue triggering detection
Search.method
observer method making the detection
size
cluster size
Study.Area
study area name
Several different search methods are included in these data.
0
binoculars from crows nest
2
binoculars from elsewhere on ship
3
helicopter searching ahead of ship
5
radar detections of seabirds above dolphin schools
Several cue types were also recorded by observers.
1
seabirds above the school
2
water splashes
3
unspecified
4
floating objects such as logs
Inter-American Tropical Tuna Commission
Distance allows loading data as a "flat file" and analysing the data (and
obtaining abundance estimates) straight away, provided that the format of the
flat file is correct. One can provide the file as, for example, an Excel
spreadsheet using readxl::read_xls or a CSV file using read.csv.
Each row of the data table corresponds to either: (1) an observation or (2) a sample (transect) without observations. In either case the following columns must be present:
distance
observed distance to object
object
a unique identifier for each observation (only required when
using dht2)
Sample.Label
identifier for the sample (transect id)
Effort
effort for this transect (e.g. line transect length or number
of times point transect was visited)
Region.Label
label for a given stratum (see below)
Area
area of the strata
When the row represents a transect without observations,
distance and any other observation-specific covariates (including
size and detection function covariates) take the value NA.
Note that in the simplest case (one area surveyed only once) there is only
one Region.Label and a single corresponding Area duplicated for each
observation.
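As a small illustration of the layout described above, the following base R sketch builds a toy flat file by hand, including one transect without detections. All values are invented, and setting object to NA on the empty-transect row is an assumption made here for illustration.

```r
# Hypothetical flat file: two transects in one stratum; transect "T2" had
# no detections, so its observation-level columns are NA
flat <- data.frame(
  Region.Label = "StudyArea",
  Area         = 100,
  Sample.Label = c("T1", "T1", "T2"),
  Effort       = c(5, 5, 4),
  object       = c(1, 2, NA),
  distance     = c(0.12, 0.40, NA),
  size         = c(3, 1, NA)
)

# Check that the required columns are present
required <- c("distance", "Sample.Label", "Effort", "Region.Label", "Area")
stopifnot(all(required %in% names(flat)))

# Effort is repeated on every row of a transect, so total effort has to be
# computed from one row per Sample.Label
total_effort <- sum(flat$Effort[!duplicated(flat$Sample.Label)])
print(total_effort)
```

Keeping the empty transect in the table matters because its effort still counts towards the encounter rate.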
The example given below was provided by Eric Rexstad. Additional examples can be found at http://examples.distancesampling.org/.
## Not run: 
library(Distance)
# Need to have the readxl package installed from CRAN
require(readxl)

# Need to get the file path first
minke.filepath <- system.file("minke.xlsx", package="Distance")

# Load the Excel file, note that col_names=FALSE and we add column names after
minke <- read_xlsx(minke.filepath, col_names=FALSE)
names(minke) <- c("Region.Label", "Area", "Sample.Label", "Effort",
                  "distance")
# One may want to call edit(minke) or head(minke) at this point
# to examine the data format

## perform an analysis using the exact distances
pooled.exact <- ds(minke, truncation=1.5, key="hr", order=0)
summary(pooled.exact)

## Try a binned analysis
# first define the bins
dist.bins <- c(0,.214, .428,.643,.857,1.071,1.286,1.5)
pooled.binned <- ds(minke, truncation=1.5, cutpoints=dist.bins, key="hr",
                    order=0)

# binned with stratum as a covariate
minke$stratum <- ifelse(minke$Region.Label=="North", "N", "S")
strat.covar.binned <- ds(minke, truncation=1.5, key="hr",
                         formula=~as.factor(stratum), cutpoints=dist.bins)

# Stratified by North/South
full.strat.binned.North <- ds(minke[minke$Region.Label=="North",],
                              truncation=1.5, key="hr", order=0,
                              cutpoints=dist.bins)
full.strat.binned.South <- ds(minke[minke$Region.Label=="South",],
                              truncation=1.5, key="hr", order=0,
                              cutpoints=dist.bins)

## model summaries
model.sel.bin <- data.frame(name=c("Pooled f(0)", "Stratum covariate",
                                   "Full stratification"),
                            aic=c(pooled.binned$ddf$criterion,
                                  strat.covar.binned$ddf$criterion,
                                  full.strat.binned.North$ddf$criterion+
                                  full.strat.binned.South$ddf$criterion))

# Note model with stratum as covariate is most parsimonious
print(model.sel.bin)
## End(Not run)
Goodness of fit testing for detection function models. For continuous
distances Kolmogorov-Smirnov and Cramer-von Mises tests can be used, when
binned or continuous distances are used a chi-squared test can be used.
gof_ds( model, plot = TRUE, chisq = FALSE, nboot = 100, ks = FALSE, nc = NULL, breaks = NULL, ... )
model |
a fitted detection function. |
plot |
if |
chisq |
if |
nboot |
number of replicates to use to calculate p-values for the Kolmogorov-Smirnov goodness of fit test statistics |
ks |
perform the Kolmogorov-Smirnov test (this involves many bootstraps so can take a while) |
nc |
number of evenly-spaced distance classes for chi-squared test, if
|
breaks |
vector of cutpoints to use for binning, if |
... |
other arguments to be passed to |
Kolmogorov-Smirnov and Cramer-von Mises tests are based on looking at the
quantile-quantile plot produced by qqplot.ddf and
deviations from the line y=x.
The Kolmogorov-Smirnov test asks the question "what's the largest vertical
distance between a point and the y=x line?" It uses this distance as a
statistic to test the null hypothesis that the samples (EDF and CDF in our
case) are from the same distribution (and hence our model fits well). If the
deviation between the y=x line and the points is too large we reject
the null hypothesis and say the model doesn't have a good fit.
Rather than looking at the single biggest difference between the y=x line and the points in the Q-Q plot, we might prefer to think about all the differences between line and points, since there may be many smaller differences that we want to take into account rather than looking for one large deviation. This is what the Cramer-von Mises test does. Its null hypothesis is the same, but the statistic it uses is the sum of the deviations from each of the points to the line.
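The difference between the two statistics can be seen in a short base R sketch. It uses the usual textbook formulas applied to fitted CDF values (the Cramer-von Mises form below sums squared deviations); the input values are invented and this is not the code used by gof_ds, which may define the EDF slightly differently.

```r
# u: fitted CDF evaluated at each observed distance (hypothetical values);
# if the model fits, these behave like a Uniform(0, 1) sample
u <- sort(c(0.05, 0.18, 0.22, 0.41, 0.47, 0.63, 0.70, 0.88))
n <- length(u)
i <- seq_len(n)

# Kolmogorov-Smirnov: largest vertical gap between EDF and fitted CDF
ks_stat <- max(pmax(i / n - u, u - (i - 1) / n))

# Cramer-von Mises: accumulates squared gaps over all points
cvm_stat <- 1 / (12 * n) + sum((u - (2 * i - 1) / (2 * n))^2)

print(c(KS = ks_stat, CvM = cvm_stat))
```

A single large gap drives the Kolmogorov-Smirnov statistic, whereas many moderate gaps add up in the Cramer-von Mises statistic.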
A chi-squared test is also run if chisq=TRUE. In this case binning of
distances is required if distance data are continuous. This can be specified
as a number of equally-spaced bins (using the argument nc=) or the
cutpoints of bins (using breaks=). The test compares the number of
observations in a given bin to the number predicted under the fitted
detection function.
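A minimal sketch of the comparison of observed and expected counts, assuming a line transect with a half-normal detection function and no adjustments. The scale parameter, cutpoints and counts are all invented, and the degrees of freedom follow the usual convention (bins minus one minus the number of estimated parameters), which is assumed rather than taken from the package source.

```r
# Hypothetical fitted half-normal detection function, line transect
sigma <- 1.2                # assumed fitted scale parameter
w <- 3                      # right truncation distance
breaks <- c(0, 1, 2, 3)     # cutpoints, as would be given via breaks=
observed <- c(52, 30, 8)    # invented counts per bin

# Probability that a detected object lies within distance x under the model
cdf <- function(x) (pnorm(x / sigma) - 0.5) / (pnorm(w / sigma) - 0.5)
expected <- sum(observed) * diff(cdf(breaks))

chisq <- sum((observed - expected)^2 / expected)
# df = bins - 1 - number of estimated parameters (one here: sigma)
df <- length(observed) - 1 - 1
p_value <- pchisq(chisq, df = df, lower.tail = FALSE)

print(round(rbind(observed, expected), 1))
print(c(chisq = chisq, df = df, p = p_value))
```

With few bins the test has little power, so it is worth choosing nc= or breaks= with the sample size in mind.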
Note that a bootstrap procedure is required for the Kolmogorov-Smirnov test
to ensure that the p-values from the procedure are correct as we are
comparing the cumulative distribution function (CDF) and empirical
distribution function (EDF) and we have estimated the parameters of the
detection function. The nboot parameter controls the number of bootstraps
to use. Set to 0 to avoid computing bootstraps (much faster but with no
Kolmogorov-Smirnov results, of course).
## Not run: 
# fit and test a simple model for the golf tee data
library(Distance)
data(book.tee.data)
tee.data <- subset(book.tee.data$book.tee.dataframe, observer==1)
ds.model <- ds(tee.data,4)
# don't make plot
gof_ds(ds.model, plot=FALSE)
## End(Not run)
The data are from independent surveys by eight observers of a population of 250 groups (760 individuals) of golf tees. The tees, of two colours, were placed in groups of between 1 and 8 in a survey region of 1680 m^2, either exposed above the surrounding grass, or at least partially hidden by it. They were surveyed by the 1999 statistics honours class at the University of St Andrews.
Data is a list with 4 elements each of which is a data.frame:
book.tee.dataframe
object
object ID
observer
observer ID
detected
detected or not detected
distance
perpendicular distance
size
group size
sex
sex of the group, encoded by tee colour (yellow tees are male, green are female)
exposure
tee height above ground
book.tee.region
Region.Label
stratum name
Area
stratum size
book.tee.samples
Sample.Label
transect label
Region.Label
stratum name
Effort
transect length
book.tee.obs
object
object ID
Region.Label
stratum in which it was detected
Sample.Label
transect on which it was detected
We treat each group of golf tees as a single animal with size equal to the number of tees in the group; yellow tees are male, green are female; tees exposed above the surrounding grass are classified as exposed, others as unexposed. We are grateful to Miguel Bernal for making these data available; they were collected by him as part of a masters project.
Borchers, D. L., S.T. Buckland, and W. Zucchini. 2002. Estimating Animal Abundance: Closed Populations. Statistics for Biology and Health. London: Springer-Verlag. https://link.springer.com/book/10.1007/978-1-4471-3708-5
Buckland, S.T., D.R. Anderson, K.P. Burnham, J.L. Laake, D.L. Borchers, and L. Thomas. Advanced Distance Sampling: Estimating Abundance of Biological Populations. Oxford University Press. Oxford, 2004.
Extract the log-likelihood from a fitted detection function.
## S3 method for class 'dsmodel'
logLik(object, ...)
object |
a fitted detection function model object |
... |
included for S3 completeness, but ignored |
a numeric value giving the log-likelihood with two attributes:
"df", the degrees of freedom for the model (number of parameters), and
"nobs", the number of observations used to fit the model
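To show how the two attributes are typically used, the sketch below builds a stand-in object by hand and computes AIC from it. The numbers are invented, and giving the stand-in the class "logLik" is an assumption made so that stats::AIC() can be applied; a real fitted model would be passed to logLik() instead.

```r
# Stand-in for the value returned by logLik() on a fitted model:
# a number carrying "df" and "nobs" attributes (values are invented)
ll <- structure(-150.2, df = 2, nobs = 99, class = "logLik")

# AIC by hand: -2 * log-likelihood + 2 * number of parameters
k <- attr(ll, "df")
aic_by_hand <- -2 * as.numeric(ll) + 2 * k

# stats::AIC() reads the same "df" attribute
aic_generic <- AIC(ll)

print(c(by_hand = aic_by_hand, generic = aic_generic))
```

Comparing AIC values is only meaningful between models fitted to the same data with the same truncation, since "nobs" then agrees.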
David L Miller
## Not run: 
library(Distance)
data(minke)
model <- ds(minke, truncation=4)
# extract the log likelihood
logLik(model)
## End(Not run)
Simulated line transect survey. Twelve transects, detection function is half-normal. True object density is 79.8 animals per km^2.
A data.frame with 106 rows and 7 variables
Region.Label
strata names (single stratum)
Area
size of study area (1 in this case, making abundance and density
equal)
Sample.Label
transect ID
Effort
length of transects (km)
object
object ID
distance
perpendicular distance (m)
Study.Area
name of study area
There is no unit object associated with this dataset
Simulated data, from the distance sampling introductory course, Centre for Research into Ecological & Environmental Modelling, University of St Andrews.
Helper to use a model specified using activity::fitact to fit an
activity model and generate single realisations for bootstrapping with
bootdht.
make_activity_fn(..., detector_daily_duration = 24)
... |
parameters specified by activity::fitact |
detector_daily_duration |
by default we assume that detectors were able to detect animals for 24 hours, if they were only able to do this for some proportion of the day (say daylight hours), then adjust this argument accordingly |
Uses activity::fitact to generate single possible availability estimates
based on bootstraps. The function returns another function, which can be
passed to bootdht. It is recommended that you try out the function before
passing it to bootdht. See examples for a template for use.
a function which generates a single bootstrap estimate of availability
David L Miller
Data simulated from models fitted to 1992/1993 Southern Hemisphere minke whale data collected by the International Whaling Commission. See Branch and Butterworth (2001) for survey details (survey design is shown in figure 1(e)). Data simulated by David Borchers.
data.frame with 99 observations of 5 variables:
Region.Label
stratum label ("North" or "South")
Area
stratum area
Sample.Label
transect identifier
Effort
transect length
distance
observed distance
object
unique object ID
Data are included here both as R data and as an Excel spreadsheet to illustrate the "flat file" input method. See flatfile for how to load these data and for an example analysis.
Shipped with Distance for Windows.
Branch, T.A. and D.S. Butterworth (2001) Southern Hemisphere minke whales: standardised abundance estimates from the 1978/79 to 1997/98 IDCR-SOWER surveys. Journal of Cetacean Research and Management 3(2): 143-174
Hedley, S.L., and S.T. Buckland. Spatial Models for Line Transect Sampling. Journal of Agricultural, Biological, and Environmental Statistics 9, no. 2 (2004): 181-199. doi:10.1198/1085711043578.
data(minke)
head(minke)
Generate a table of frequencies of probability of detection from a detection function model. This is particularly useful when employing covariates, as it can indicate if there are detections with very small detection probabilities that can be unduly influential when calculating abundance estimates.
object |
fitted detection function |
bins |
how the results should be binned |
proportion |
should proportions be returned as well as counts? |
Because dht uses a Horvitz-Thompson-like estimator, abundance estimates can be sensitive to errors in the estimated probabilities. The estimator is based on $\sum_i 1/\hat{P}_a(z_i)$, where $\hat{P}_a(z_i)$ is the estimated detection probability for observation $i$ with covariates $z_i$, which means that the sensitivity is greater for smaller detection probabilities. As a rough guide, we recommend that the method not be used if more than, say, 5% of the $\hat{P}_a(z_i)$ are less than 0.2, or if any are less than 0.1. If these conditions are violated, the truncation distance w can be reduced. This causes some loss of precision relative to standard distance sampling without covariates.
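The rough guide above can be checked by hand. The following is a minimal base-R sketch, not the mrds implementation: the vector `p_hat` holds made-up detection probabilities and `check_small_p` is a hypothetical helper introduced only for illustration.

```r
# Hypothetical estimated detection probabilities, one per detection
# (made-up values for illustration only)
p_hat <- c(0.85, 0.62, 0.47, 0.33, 0.28, 0.15, 0.71, 0.55, 0.09, 0.40)

# Horvitz-Thompson-like estimate for the covered region:
# each detection contributes 1 / p, so small p values dominate the sum
n_hat <- sum(1 / p_hat)
contribution <- (1 / p_hat) / n_hat

# Frequency table of the probabilities in bins of width 0.1
bins <- seq(0, 1, by = 0.1)
p_table <- table(cut(p_hat, breaks = bins, include.lowest = TRUE))

# Rough guide: more than 5% below 0.2, or any below 0.1, is a warning sign
check_small_p <- function(p) {
  mean(p < 0.2) > 0.05 || any(p < 0.1)
}
flagged <- check_small_p(p_hat)
```

Here the single detection with probability 0.09 makes the largest contribution to the estimate, which is the kind of undue influence the table is meant to reveal.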
a data.frame with probability bins, counts and (optionally) proportions. The object has an attribute p_range which contains the range of estimated detection probabilities.
This function is located in the mrds package but the documentation is provided here for easy access.
David L Miller
Marques, F.F.C. and S.T. Buckland. 2004. Covariate models for the detection function. In: Advanced Distance Sampling, eds. S.T. Buckland, D.R. Anderson, K.P. Burnham, J.L. Laake, D.L. Borchers, and L. Thomas. Oxford University Press.
## Not run:
# example using a model for the minke data
data(minke)
# fit a model
result <- ds(minke, formula=~Region.Label)
# print table
p_dist_table(result)
# with proportions
p_dist_table(result, proportion=TRUE)
## End(Not run)
This is just a simple wrapper around plot.ds. See the manual page for that function for more information.
## S3 method for class 'dsmodel'
plot(x, pl.den = 0, ...)
x |
an object of class |
pl.den |
shading density for histogram (default |
... |
extra arguments to be passed to |
NULL, just produces a plot.
David L. Miller
Predict detection probabilities (or effective strip widths/effective areas of detection) from a fitted distance sampling model using either the original data (i.e., "fitted" values) or using new data.
## S3 method for class 'dsmodel'
predict(
  object,
  newdata = NULL,
  compute = FALSE,
  esw = FALSE,
  se.fit = FALSE,
  ...
)
object |
|
newdata |
new |
compute |
if |
esw |
if |
se.fit |
should standard errors on the predicted probabilities of
detection (or ESW if |
... |
for S3 consistency |
For line transects, the effective strip half-width (esw=TRUE) is the integral of the fitted detection function over either 0 to W or the specified int.range. The predicted detection probability is the average probability, which is simply the integral divided by the distance range. For point transect models, esw=TRUE calculates the effective area of detection (commonly referred to as "nu"); this is the integral of 2/width^2 * r * g(r).
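The integrals described above can be evaluated numerically. This is a base-R sketch under stated assumptions, not the package's code: it assumes a half-normal detection function with a made-up scale `sigma`, a truncation distance `w` with no left truncation, and uses closed-form results only as a cross-check.

```r
# Half-normal detection function with scale sigma (an assumed example form)
g <- function(x, sigma) exp(-x^2 / (2 * sigma^2))

sigma <- 1.2   # hypothetical scale parameter
w <- 4         # truncation distance; integration range is 0 to w

# Line transect: effective strip half-width is the integral of g from 0 to w
esw <- integrate(g, lower = 0, upper = w, sigma = sigma)$value

# Average detection probability is the integral divided by the distance range
p_line <- esw / w

# Closed form for the half-normal, used here only as a cross-check
esw_exact <- sigma * sqrt(2 * pi) * (pnorm(w / sigma) - 0.5)

# Point transect: the integral of 2 / w^2 * r * g(r) quoted in the text;
# multiplying it by pi * w^2 gives 2 * pi * integral of r * g(r), an area
point_integral <- integrate(function(r) 2 / w^2 * r * g(r, sigma), 0, w)$value
point_exact <- 2 * sigma^2 / w^2 * (1 - exp(-w^2 / (2 * sigma^2)))
point_area <- pi * w^2 * point_integral
```

With covariates in the model, the same integration would be repeated for each observation's own scale parameter, which is why predict returns one value per observation.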
Fitted detection probabilities are stored in the model object and these are returned unless compute=TRUE or newdata is specified. compute=TRUE is used to estimate numerical derivatives for use in delta method approximations to the variance.
Note that the ordering of the returned results when no new data is supplied (the "fitted" values) will not necessarily be the same as the data supplied to ddf: the data (and hence the results from predict) will be sorted by object ID (object).
a list with a single element: fitted, a vector of average detection probabilities or esw values for each observation in the original data or newdata. If se.fit=TRUE there is an additional element $se.fit, which contains the standard errors of the probabilities of detection or ESW.
David L Miller
Prediction function for dummy detection functions. The function returns as many 1s as there are rows in newdata. If esw=TRUE then the strip width is returned.
## S3 method for class 'fake_ddf'
predict(
  object,
  newdata = NULL,
  compute = FALSE,
  int.range = NULL,
  esw = FALSE,
  ...
)
object |
model object |
newdata |
how many 1s should we return? |
compute |
unused, compatibility with |
int.range |
unused, compatibility with |
esw |
should the strip width be returned? |
... |
for S3 consistency |
David L Miller
See dht2 for information on printed column names.
## S3 method for class 'dht_result'
print(x, report = "abundance", groups = FALSE, ...)
x |
object of class |
report |
should |
groups |
should abundance/density of groups be produced? |
... |
unused |
Simply prints out a brief description of the model which was fitted. For more detailed information use summary.
## S3 method for class 'dsmodel'
print(x, ...)
x |
a distance sampling analysis (result from calling |
... |
not passed through, just for S3 compatibility. |
David L. Miller
Provides a brief summary of a distance sampling analysis, including detection function parameters, model selection criterion, and optionally abundance in the covered (sampled) region and its standard error.
## S3 method for class 'summary.dsmodel'
print(x, ...)
x |
a summary of distance sampling analysis |
... |
unspecified and unused arguments for S3 consistency |
Nothing, just prints the summary.
David L. Miller and Jeff Laake
Simulated point transect survey. Thirty point transects, detection function is half-normal. True object density is 79.6 animals per hectare.
A data.frame with 144 rows and 7 variables
Region.Label
strata names (single stratum)
Area
size of study area (0 in this case)
Sample.Label
transect ID
Effort
number of visits to point
object
object ID
distance
radial distance (m)
Study.Area
name of study area
Simulated data, from the distance sampling introductory course, Centre for Research into Ecological & Environmental Modelling, University of St Andrews.
Overdispersion causes AIC to select overly-complex models, so analysts should specify the number/order of adjustment terms manually when fitting distance sampling models to data from camera traps, rather than allowing automated selection using AIC. Howe et al (2019) described a two-step method for selecting among models of the detection function in the face of overdispersion.
QAIC(object, ..., chat = NULL, k = 2)
chi2_select(object, ...)
object |
a fitted detection function object |
... |
additional fitted model objects. |
chat |
a value of |
k |
penalty per parameter to be used; default 2 |
In step 1, an overdispersion factor ($\hat{c}$) is computed either (1) for each key function family, from the most complex model in each family, as the chi-square goodness of fit test statistic divided by its degrees of freedom ($\hat{c}_1$), or (2) for all models in the candidate set, from the raw data ($\hat{c}_2$). In camera trap surveys of solitary animals, $\hat{c}_2$ would be the mean number of distance observations recorded during a single pass by an animal in front of a trap. In surveys of social animals employing human observers, $\hat{c}_2$ would be the mean number of detected animals per detected group, and in camera trap surveys of social animals $\hat{c}_2$ would be the mean number of distance observations recorded during an encounter between a group of animals and a camera trap (CT). In step 2, the chi-square goodness of fit statistic divided by its degrees of freedom is calculated for the QAIC-minimizing model within each key function, and the model with the lowest value is selected for estimation.
The QAIC() function should only be used to select among models with the same key function (step 1). QAIC() uses $\hat{c}_1$ by default, computing it from the model with the most parameters. Alternatively, $\hat{c}_2$ can be calculated from the raw data and included in the call to QAIC(). Users must identify the QAIC-minimizing model within key functions from the resulting data.frame, and provide these models as arguments in chi2_select(). chi2_select() then computes and reports the chi-square goodness of fit statistic divided by its degrees of freedom for each of those models. The model with the lowest value is recommended for estimation.
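The two steps can be sketched in base R with made-up numbers. This is an illustration only: the log-likelihoods, chi-square statistics and degrees of freedom are invented, the helper `qaic` is hypothetical, and it assumes the textbook quasi-likelihood form in which the estimated overdispersion factor counts as one extra parameter, which may differ in detail from what QAIC() computes.

```r
# Hypothetical fitted-model summaries for one key function family
# (made-up numbers; a real analysis would take these from fitted models)
models <- data.frame(
  name   = c("hr0", "hr1", "hr2"),
  logLik = c(-3450.2, -3440.0, -3439.5),
  npar   = c(2, 3, 4)
)

# Step 1: c-hat from the most complex model in the family, as the
# chi-square goodness of fit statistic divided by its degrees of freedom
chisq_stat <- 96.4   # hypothetical statistic for "hr2"
chisq_df   <- 12     # hypothetical degrees of freedom
chat <- chisq_stat / chisq_df

# Textbook quasi-likelihood AIC: -2 * logLik / c-hat + k * K, where K
# counts the estimated parameters plus one for c-hat (an assumption here)
qaic <- function(logLik, npar, chat, k = 2) {
  -2 * logLik / chat + k * (npar + 1)
}
models$QAIC <- qaic(models$logLik, models$npar, chat)
best <- models$name[which.min(models$QAIC)]

# Step 2: compare chi-square / df across the QAIC-minimizing model of
# each key function family (again, made-up values)
gof <- data.frame(
  name  = c("hr1", "u2", "hn0"),
  chisq = c(90.1, 110.3, 130.8),
  df    = c(13, 14, 15)
)
gof$ratio <- gof$chisq / gof$df
selected <- gof$name[which.min(gof$ratio)]
```

Dividing the log-likelihood by an overdispersion factor greater than 1 shrinks the differences in fit between models, so the parameter penalty carries more weight and simpler models are favoured.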
a data.frame with one row per model supplied, in the same order as given
David L Miller, based on code from Eric Rexstad and explanation from Eric Howe.
Howe, E. J., Buckland, S. T., Després-Einspenner, M.-L., & Kühl, H. S. (2019). Model selection with overdispersed distance sampling data. Methods in Ecology and Evolution, 10(1), 38–47. doi:10.1111/2041-210X.13082
## Not run:
library(Distance)
data("wren_cuecount")

# fit hazard-rate key models
w3.hr0 <- ds(wren_cuecount, transect="point", key="hr", adjustment=NULL,
             truncation=92.5)
w3.hr1 <- ds(wren_cuecount, transect="point", key="hr", adjustment="cos",
             order=2, truncation=92.5)
w3.hr2 <- ds(wren_cuecount, transect="point", key="hr", adjustment="cos",
             order=c(2, 4), truncation=92.5)

# fit uniform key models
w3.u1 <- ds(wren_cuecount, transect="point", key="unif", adjustment="cos",
            order=1, truncation=92.5)
w3.u2 <- ds(wren_cuecount, transect="point", key="unif", adjustment="cos",
            order=c(1,2), monotonicity="none", truncation=92.5)
w3.u3 <- ds(wren_cuecount, transect="point", key="unif", adjustment="cos",
            order=c(1,2,3), monotonicity="none", truncation=92.5)

# fit half-normal key functions
w3.hn0 <- ds(wren_cuecount, transect="point", key="hn", adjustment=NULL,
             truncation=92.5)
w3.hn1 <- ds(wren_cuecount, transect="point", key="hn", adjustment="herm",
             order=2, truncation=92.5)

# stage 1: calculate QAIC per model set
QAIC(w3.hr0, w3.hr1, w3.hr2)  # no adjustments smallest
QAIC(w3.u1, w3.u2, w3.u3)     # 2 adjustment terms (by 0.07)
QAIC(w3.hn0, w3.hn1)          # no adjustments smallest

# stage 2: select using chi^2/degrees of freedom between sets
chi2_select(w3.hr0, w3.u2, w3.hn0)

# example using a pre-calculated chat
chat <- attr(QAIC(w3.hr0, w3.hr1, w3.hr2), "chat")
QAIC(w3.hr0, chat=chat)
## End(Not run)
Point transect data collected in Colorado 1980/81 to examine effect of agricultural practices upon avian community.
data.frame with 468 observations (1980) and 448 observations (1981) of 7 variables:
Region.Label
stratum label (pasture ID)
Area
stratum area (set to 1 so density is reported)
Sample.Label
transect identifier
Effort
number of visits
object
object ID
distance
radial distance (m)
Study.Area
name of study area
Design consisted of point transects placed in multiple pastures (3 in 1980 and 4 in 1981). While many species were observed, only data for Savannah sparrows (Passerculus sandwichensis) are included here.
Data given here are different from the Distance for Windows example project. Here each individual sighting is treated as an independent observation. This corresponds to the analysis in Buckland et al. (2001) Section 8.7. In the Distance for Windows project objects are clusters of individuals. This should not affect the results too greatly as most clusters were of size 1, and so the results obtained should not be too far out.
Knopf, F.L., J.A. Sedgwick, and R.W. Cannon. (1988) Guild structure of a riparian avifauna relative to seasonal cattle grazing. The Journal of Wildlife Management 52 (2): 280–290. doi:10.2307/3801235
Because sika deer spend most of their time in woodland areas, abundance estimates are based on pellet group counts. Line transect methods were applied to estimate deer pellet group density by geographic block.
A data.frame with 1923 rows and 11 variables.
Region.Label
stratum labels
Area
size (ha) of each stratum
Sample.Label
transect labels
Defecation.rate
rate of dung production per individual per day
Defecation.rate.SE
variability in defecation rate
Decay.rate
time (days) for dung to become undetectable
Decay.rate.SE
variability in decay rate
Effort
transect length (km)
object
object ID
distance
perpendicular distance (cm)
Study.Area
study area name
Data presented here are from the Peeblesshire portion of the study described by Marques et al. (2001).
Marques, F.F.C., S.T. Buckland, D. Goffin, C.E. Dixon, D.L. Borchers, B.A. Mayle, and A.J. Peace. (2001). Estimating deer abundance from line transect surveys of dung: sika deer in southern Scotland. Journal of Applied Ecology 38 (2): 349–363. doi:10.1046/j.1365-2664.2001.00584.x
Data simulated from models fitted to 1992/1993 Southern Hemisphere minke whale data collected by the International Whaling Commission. See Branch and Butterworth (2001) for survey details (survey design is shown in figure 1(e)). Data simulated by David Borchers.
data.frame with 99 observations of 7 variables:
Region.Label
stratum label ("North" or "South")
Area
stratum area (square nautical mile)
Sample.Label
transect identifier
Effort
transect length (nautical mile)
object
object ID
distance
observed distance (nautical mile)
Study.Area
name of study area
Branch, T.A. and D.S. Butterworth. (2001) Southern Hemisphere minke whales: standardised abundance estimates from the 1978/79 to 1997/98 IDCR-SOWER surveys. Journal of Cetacean Research and Management 3(2): 143-174
Hedley, S.L., and S.T. Buckland. (2004) Spatial models for line transect sampling. Journal of Agricultural, Biological, and Environmental Statistics 9: 181-199. doi:10.1198/1085711043578.
Provide a summary table of useful information about fitted detection functions. This can be useful when paired with knitr's kable function. By default models are sorted by AIC, so models with different truncations or distance binning are not allowed.
summarize_ds_models(..., sort = "AIC", output = "latex", delta_only = TRUE)
... |
models to be summarised |
sort |
column to sort by (default |
output |
should the output be given in |
delta_only |
only output AIC differences (default |
Note that the column names are in LaTeX format, so if you plan to manipulate the resulting data.frame in R, you may wish to rename the columns for ease of access.
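As a rough illustration of sorting by AIC and of why renaming helps, here is a base-R sketch. The AIC values are made up and the LaTeX-style column names are illustrative assumptions, not necessarily the exact names the function produces.

```r
# Hypothetical AIC values for three models fitted to the same data with
# the same truncation (made-up numbers)
aic_table <- data.frame(
  model = c("hn", "hr", "unif_cos"),
  AIC   = c(311.1, 313.6, 310.4)
)

# Sort by AIC and report differences from the best model
aic_table <- aic_table[order(aic_table$AIC), ]
rownames(aic_table) <- NULL
aic_table$delta_AIC <- aic_table$AIC - min(aic_table$AIC)

# Stand-in LaTeX-style column names (assumed for illustration)
names(aic_table) <- c("Model", "AIC", "$\\Delta$AIC")

# Non-syntactic names need [[ ]] or backticks to access
delta <- aic_table[["$\\Delta$AIC"]]

# Renaming to syntactic names makes `$` access straightforward
names(aic_table) <- c("model", "aic", "delta_aic")
```

After renaming, `aic_table$delta_aic` can be used directly in further calculations or filtering.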
David L Miller
## Not run:
# fit some models to the golf tee data
library(Distance)
data(book.tee.data)
tee.data <- subset(book.tee.data$book.tee.dataframe, observer==1)
model_hn <- ds(tee.data,4)
model_hr <- ds(tee.data,4, key="hr")
summarize_ds_models(model_hr, model_hn, output="plain")
## End(Not run)
A simple function to calculate summaries of bootstrap output generated by bootdht.
## S3 method for class 'dht_bootstrap'
summary(object, alpha = 0.05, ...)
object |
output from |
alpha |
value to use in confidence interval calculation (to obtain
|
... |
for S3 compatibility, unused. |
Summaries are only made for numeric outputs. Both median and mean are reported to allow assessment of bias. The coefficient of variation reported (in column cv) is based on the median calculated from the bootstraps.
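The kind of summary described above can be sketched in base R. This is not the package's code: `boot_N` is a made-up vector of bootstrap replicates, `summarise_boot` is a hypothetical helper, and both the percentile interval and the standard-deviation-over-median form of the coefficient of variation are assumptions.

```r
# Hypothetical bootstrap replicates of an abundance estimate
# (made-up values standing in for one numeric column of bootstrap output)
boot_N <- c(412, 388, 455, 430, 401, 520, 397, 444, 468, 415)

summarise_boot <- function(x, alpha = 0.05) {
  data.frame(
    median = median(x),
    mean   = mean(x),
    se     = sd(x),
    # percentile interval with nominal coverage 1 - alpha (assumed form)
    lcl    = unname(quantile(x, alpha / 2)),
    ucl    = unname(quantile(x, 1 - alpha / 2)),
    # coefficient of variation relative to the median (assumed form)
    cv     = sd(x) / median(x)
  )
}

boot_summary <- summarise_boot(boot_N)
```

A mean noticeably larger than the median, as in these made-up replicates, suggests a right-skewed bootstrap distribution.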
a data.frame of summary statistics
Provides a brief summary of a distance sampling analysis. This includes parameters, model selection criterion, and optionally abundance in the covered (sampled) region and its standard error.
## S3 method for class 'dsmodel'
summary(object, ...)
object |
a distance analysis |
... |
unspecified and unused arguments for S3 consistency |
list of extracted and summarized objects
This function just calls summary.ds and dht, then collates and prints the results in a nice way.
David L. Miller
systematic_var_1 consists of simulated line transect data with large differences in transect length. In systematic_var_2 that transect length gradient is coupled with a strong animal gradient, exaggerating encounter rate variance between transects.
data.frame with 253 observations (systematic_var_1) or 256 observations (systematic_var_2) of 7 variables:
Region.Label
stratum label (default)
Area
stratum area (0.5 km^2)
Sample.Label
transect identifier
Effort
transect length (km)
object
object ID
distance
perpendicular distance (m)
Study.Area
name of study area
True population size is 1000 objects in the study area of size 0.5 km^2, so that true density is 2000 objects per km^2.
Fewster, R.M., S.T. Buckland, K.P. Burnham, D.L. Borchers, P.E. Jupp, J.L. Laake and L. Thomas. (2009) Estimating the encounter rate variance in distance sampling. Biometrics 65 (1): 225–236. doi:10.1111/j.1541-0420.2008.01018.x
Sometimes data is provided in the flatfile format, but we really want it in mrds format (that is, as distance data, observation table, sample table and region table format). This function undoes the flattening, assuming that the data have the correct columns.
unflatten(data)
data |
data in flatfile format (a |
list of four data.frames: distance data, observation table, sample table, region table.
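The idea of undoing the flattening can be sketched in base R. This is a simplified illustration and not the package's implementation: `flat` is a made-up data set, `split_flatfile` is a hypothetical helper, the element names of the returned list are assumptions, and transects without detections (which would carry missing object and distance values) are not handled.

```r
# A tiny flatfile-style data.frame (made-up values); one row per detection
flat <- data.frame(
  Region.Label = c("North", "North", "South"),
  Area         = c(100, 100, 250),
  Sample.Label = c("T1", "T2", "T3"),
  Effort       = c(10, 12, 8),
  object       = 1:3,
  distance     = c(0.4, 1.1, 0.7)
)

# Hypothetical helper mirroring the idea of unflatten(); not the package code
split_flatfile <- function(data) {
  list(
    # one row per detection: the distance data
    data = data[, c("object", "distance")],
    # links each detection to its transect and stratum
    obs.table = data[, c("object", "Region.Label", "Sample.Label")],
    # one row per transect, with its effort
    sample.table = unique(data[, c("Sample.Label", "Region.Label", "Effort")]),
    # one row per stratum, with its area
    region.table = unique(data[, c("Region.Label", "Area")])
  )
}

tables <- split_flatfile(flat)
```

Each of the four tables repeats only the columns it needs, so stratum areas and transect effort are stored once rather than on every detection row.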
David L Miller
Simulated line transect survey. Only eight line transects, detection function is half-normal.
A data.frame with 60 rows and 9 variables
Region.Label
strata names (single stratum)
Area
size of study area (mi^2)
Sample.Label
transect ID
Effort
transect length (mi)
object
object ID
distance
perpendicular distance (km)
MSTDO
time since medication taken by observer (min)
Hour
time of day of sighting (hour)
Study.Area
name of study area
Hour is a covariate that has no effect on the detection function, while MSTDO does affect the detection function. Examine the ability of model selection to choose the correct model.
Simulated data, from the distance sampling introductory course, Centre for Research into Ecological & Environmental Modelling, University of St Andrews.
Returns a table of conversions between the units used in Distance for Windows. This is extracted from the DistIni.mdb default database.
units_table()
David L Miller
Observations of winter wren (Troglodytes troglodytes L.) collected by Steve Buckland in woodland/parkland at Montrave Estate near Leven, Fife, Scotland.
Four different surveys were carried out:
wren_5min
5-minute point count
wren_snapshot
snapshot method
wren_cuecount
cue count
wren_lt
line transect survey
wren_5min: 134 observations of 8 variables
Region.Label
stratum name (single stratum)
Area
size (ha) of Montrave study area
Sample.Label
point label
Effort
Number of visits to point
object
Object ID
distance
radial distance (m)
direction
direction of detection from point
Study.Area
Montrave Estate
wren_snapshot: 119 observations of 7 variables
Region.Label
stratum name (single stratum)
Area
size (ha) of Montrave study area
Sample.Label
point label
Effort
Number of visits to point
object
Object ID
distance
radial distance (m)
Study.Area
Montrave Estate
wren_cuecount: 774 observations of 9 variables
Region.Label
stratum name (single stratum)
Area
size (ha) of Montrave study area
Sample.Label
point label
Cue.rate
Production rate (per min) of cues
Cue.rate.SE
SE of cue production rate
object
Object ID
distance
radial distance (m)
Search.time
Time (min) listening for cues
Study.Area
Montrave Estate
wren_lt: 156 observations of 8 variables
Region.Label
stratum name (single stratum)
Area
size (ha) of Montrave study area
Sample.Label
transect label
Effort
transect length (km)
object
Object ID
distance
perpendicular distance (m)
Study.Area
Montrave Estate
Steve Buckland
Buckland, S. T. (2006) Point-transect surveys for songbirds: robust methodologies. The Auk 123 (2): 345–357.