| empinf {boot} | R Documentation | 
This function calculates the empirical influence values for a statistic applied to a data set. It allows four types of calculation, namely the infinitesimal jackknife (using numerical differentiation), the usual jackknife estimates, the "positive" jackknife estimates and a method which estimates the empirical influence values using regression of bootstrap replicates of the statistic. All methods can be used with one or more samples.
empinf(boot.out=NULL, data=NULL, statistic=NULL, 
       type=NULL, stype="w", index=1, t=NULL,
       strata=rep(1, n), eps=0.001, ...)
| boot.out | A bootstrap object created by the function boot.  If type is "reg" then 
this argument is required.  For any of the other types it is 
an optional argument.  If it is included when optional then the values of data, statistic, stype, and strata are taken from the components of boot.out and any values passed to empinf directly are ignored. | 
| data | A vector, matrix or data frame containing
the data for which empirical influence values are required.  It is a required
argument if boot.out is not supplied.  If boot.out is supplied then data is set to boot.out$data and any value supplied is ignored. | 
| statistic | The statistic for which empirical influence values are required.  It must be
a function of at least two arguments, the data set and 
a vector of weights, frequencies or indices.  The nature of the second
argument is given by the value of stype.  Any other arguments that it 
takes must be supplied to empinf and will be passed to statistic unchanged.
This is a required argument if boot.out is not supplied, otherwise its value
is taken from boot.out and any value supplied here will be ignored. | 
| type | The calculation type to be used for the empirical influence values.  
Possible values of type are "inf" (infinitesimal jackknife), "jack" (usual jackknife), "pos" (positive jackknife), and "reg" (regression
estimation).  The default value depends on the other arguments.  If t is supplied then the default value of type is "reg" and boot.out should be present so that its frequency array can be found.  If t is not 
supplied then if stype is "w", the default value of type is "inf"; otherwise, if boot.out is present the default is "reg".  If 
none of these conditions applies then the default is "jack".  
Note that it is an error for type to be "reg" if boot.out is missing or to be "inf" if stype is not "w". | 
| stype | A character variable giving the nature of the second argument to statistic.
It can take on three values: "w" (weights), "f" (frequencies), or "i" (indices).  If boot.out is supplied the value of stype is set to boot.out$stype and any value supplied here is ignored.  Otherwise it is an
optional argument which defaults to "w".  If type is "inf" then stype MUST be "w". | 
| index | An integer giving the position of the variable of interest in the output of statistic. | 
| t | A vector of length boot.out$R which gives the bootstrap replicates of the
statistic of interest.  t is used only when type is "reg" and it defaults
to boot.out$t[,index]. | 
| strata | An integer vector or a factor specifying the strata for multi-sample problems.
If boot.out is supplied the value of strata is set to boot.out$strata.  
Otherwise it is an optional argument which has default corresponding to the 
single sample situation. | 
| eps | This argument is used only if type is "inf".  In that case the value of
epsilon to be used for numerical differentiation will be eps divided by
the number of observations in data. | 
| ... | Any other arguments that statistic takes.  They will be passed unchanged to statistic every time that it is called. | 
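The default rules for type described in the arguments above can be summarised as a small decision function. This is only a sketch of the documented behaviour, not the code used inside empinf; the helper name default_type and its arguments t_supplied and has_boot_out are hypothetical.

```r
# Hypothetical helper mirroring the documented defaults for `type`;
# it is not part of the boot package.
default_type <- function(t_supplied, stype = "w", has_boot_out = FALSE) {
  if (t_supplied) {
    # Replicates were passed via `t`, so regression is the default and a
    # bootstrap object is needed for its frequency array.
    if (!has_boot_out) stop("type \"reg\" requires boot.out")
    "reg"
  } else if (stype == "w") {
    "inf"
  } else if (has_boot_out) {
    "reg"
  } else {
    "jack"
  }
}

default_type(FALSE, stype = "w")                       # "inf"
default_type(FALSE, stype = "i", has_boot_out = TRUE)  # "reg"
default_type(FALSE, stype = "f")                       # "jack"
default_type(TRUE,  stype = "w", has_boot_out = TRUE)  # "reg"
```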
If type is "inf" then numerical differentiation is used to approximate the
empirical influence values.  This makes sense only for statistics which
are written in weighted form (i.e. stype is "w").  If type is "jack"
then the
usual leave-one-out jackknife estimates of the empirical influence are 
returned.  If type is "pos" then the positive (include-one-twice) jackknife
values are used.  If type is "reg" then a bootstrap object must be supplied.
The regression method then works by regressing the bootstrap replicates of
statistic on the frequency array from which they were derived.  The 
bootstrap frequency array is obtained through a call to boot.array.  Further
details of the methods are given in Section 2.7 of Davison and Hinkley (1997).
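The four calculations can be illustrated by hand on a single sample. The sketch below uses only base R and the textbook definitions of each estimate; it assumes, without checking the package source, that empinf follows the same formulas, and all object names (wmean, l_inf, l_jack, l_pos, l_reg) are made up for the illustration. The statistic is a weighted mean, whose exact empirical influence values are x - mean(x), so every column of the final table should agree.

```r
# Weighted mean written in the two-argument form used with stype = "w".
wmean <- function(x, w) sum(x * w) / sum(w)

x <- c(2, 4, 4, 7, 13)
n <- length(x)
w0 <- rep(1 / n, n)
t0 <- wmean(x, w0)

# Infinitesimal jackknife: move a small mass e onto observation j and take
# a forward difference (e = eps / n, as described for the eps argument).
e <- 0.001 / n
l_inf <- sapply(seq_len(n), function(j) {
  w <- (1 - e) * w0
  w[j] <- w[j] + e
  (wmean(x, w) - t0) / e
})

# Usual jackknife: leave observation j out.
l_jack <- sapply(seq_len(n), function(j) (n - 1) * (t0 - mean(x[-j])))

# Positive jackknife: include observation j twice.
l_pos <- sapply(seq_len(n), function(j) (n + 1) * (mean(c(x, x[j])) - t0))

# Regression: regress the replicates on the proportions f/n, dropping the
# last column because each row sums to one, then centre the coefficients.
set.seed(1)
R <- 200
f <- t(replicate(R, tabulate(sample.int(n, n, replace = TRUE), nbins = n)))
tstar <- as.vector(f %*% x) / n
X <- f[, -n] / n
beta <- unname(coef(lm(tstar ~ X))[-1])
l_reg <- c(beta, 0)
l_reg <- l_reg - mean(l_reg)

cbind(l_inf, l_jack, l_pos, l_reg, exact = x - mean(x))
```

The methods agree here only because the mean is linear in the weights. For a nonlinear statistic such as the ratio in the examples, they will generally give slightly different values.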
Empirical influence values are used frequently in nonparametric bootstrap
applications.  For this reason many other functions call empinf when they
are required.  Some examples of their use are for nonparametric delta estimates
of variance, BCa intervals and finding linear approximations to statistics for 
use as control variates.  They are also used for antithetic bootstrap 
resampling.
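As an illustration of the first of these uses, the nonparametric delta method estimates the variance of a single-sample statistic by the sum of squared empirical influence values divided by n^2 (Davison and Hinkley, 1997). The following base R sketch applies this to the sample mean, whose influence values are x - mean(x); the variable names are illustrative and the code does not call the boot package.

```r
x <- c(2, 4, 4, 7, 13)
n <- length(x)

l <- x - mean(x)            # empirical influence values of the sample mean
v_delta <- sum(l^2) / n^2   # nonparametric delta method variance estimate

# For the mean this is the usual estimate var(x)/n scaled by (n - 1)/n.
all.equal(v_delta, (n - 1) / n * var(x) / n)
```

With influence values returned by empinf in place of l, the same formula applies to other single-sample statistics.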
A vector of the empirical influence values of statistic applied to data.
The values will be in the same order as the observations in data.
All arguments to empinf must be passed using the name=value convention.  If
this is not followed then unpredictable errors can occur.
Davison, A.C. and Hinkley, D.V. (1997) Bootstrap Methods and Their Application. Cambridge University Press.
Efron, B. (1982) The Jackknife, the Bootstrap and Other Resampling Plans. CBMS-NSF Regional Conference Series in Applied Mathematics, 38, SIAM.
Fernholz, L.T. (1983) von Mises Calculus for Statistical Functionals. Lecture Notes in Statistics, 19, Springer-Verlag.
boot, boot.array, boot.ci, control, jack.after.boot, linear.approx, var.linear
# The empirical influence values for the ratio of means in 
# the city data.
library(boot)  # provides empinf, boot and the city and gravity data sets
ratio <- function(d, w) sum(d$x * w)/sum(d$u * w)
empinf(data=city, statistic=ratio)
city.boot <- boot(city, ratio, 499, stype="w")
empinf(boot.out=city.boot, type="reg")
# A statistic that may be of interest in the difference of means
# problem is the t-statistic for testing equality of means.  In 
# the bootstrap we get replicates of the difference of means and 
# the variance of that statistic and then want to use this output
# to get the empirical influence values of the t-statistic.
grav1 <- gravity[as.numeric(gravity[,2])>=7,]
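grav.fun <- function(dat, w)
{    strata <- tapply(dat[, 2], as.numeric(dat[, 2]))  # stratum index of each row
     d <- dat[, 1]
     ns <- tabulate(strata)                 # stratum sizes
     w <- w/tapply(w, strata, sum)[strata]  # normalise weights within strata
     mns <- tapply(d * w, strata, sum)      # weighted stratum means
     mn2 <- tapply(d * d * w, strata, sum)  # weighted second moments
     s2hat <- sum((mn2 - mns^2)/ns)         # variance of the difference of means
     c(mns[2] - mns[1], s2hat)
}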
grav.boot <- boot(grav1, grav.fun, R=499, stype="w", strata=grav1[,2])
# Since the statistic of interest is a function of the bootstrap
# statistics, we must calculate the bootstrap replicates and pass
# them to empinf using the t argument.
grav.z <- (grav.boot$t[,1]-grav.boot$t0[1])/sqrt(grav.boot$t[,2])
empinf(boot.out=grav.boot,t=grav.z)