Robustness

Stanford University, Spring 2016, STATS 205

Robustness of Estimators

Robustness properties of the three estimators the mean, the median, and the Hodges–Lehmann estimate.
Main measures of robustness:
- efficiency,
- influence,
- and breakdown
Today, we focus on influence and breakdown

Sensitivity Curve

Function of the observations
The parameter \(\theta\) is a function of the cdf \(F\)
The observations \(x_1,x_2,\dots,x_n\) are drawn from \(F\)
Denote \(\widehat{\theta}\) as the estimator of \(\theta\) in the sample
Measure change in estimator \(\widehat{\theta}_n\) when outlier \(x\) is added to a sample \(x_n\) \[ x_n = ( x_1,x_2,\dots,x_n )^T \] \[ x_{n+1} = ( x_1,x_2,\dots,x_n,x )^T \]
The sensitivity curve of an estimator is \[ S(x,\widehat{\theta}) = \frac{\widehat{\theta}_{n+1} - \widehat{\theta}_n}{1/(n+1)} \]

Sensitivity Curve Example

x_n = c(1.85,2.35,-3.85,-5.25,-0.15,2.15,0.15,-0.25,-0.55,2.65)
mean(x_n)

## [1] -0.09

median(x_n)

## [1] 0

hl.loc(x_n)

## [1] 0

Sensitivity Curve Example

Mean is unbounded
Median and Hodges–Lehmann is bounded (estimators change slightly and changes soon become constant)

Influence Function

Sensitivity curve depends on sample \(x_1,x_2,\dots,x_n\)
Influence function depends on the population distribution \(F\)
It's a functional: a function that takes another function as its argument
Denote functional as \(T(F)\), e.g. median \(T(F) = F^{-1}(1/2)\)
Then using Gateaux derivatives (directional derivatives for functionals) \[ L_F(x) = lim_{\epsilon\to 0} \frac{T((1-\epsilon)F + \epsilon \delta_x)-T(F)}{\epsilon} \] with the point mass \(\delta_x\) corresponding to adding an outlier \(x\)

Influence Function

An estimator is robust if its influence function is bounded
The influence function for our estimators are (up to constant of proportionality and center)
- Mean: \(x\),
- Median: \(\operatorname{sign}(x)\), and
- HL: \(F(x) − 0.5\)
Hence, mean is not robust, but median and HL are robust

Influence Function Example

Breakdown Point of an Estimator

Suppose we contaminate \(n-m\) points in our sample \[ \boldsymbol{x}_n^* = (x_1,\dots,x_m,x_{m+1}^*,\dots,x_n^*) \]
Consider the contamination to be as very large (close to \(\infty\))
Breaking point: the smallest value \(n-m\) so that \(\widehat{\theta}(\boldsymbol{x}_n^*)\) is meaningless
Finite sample breakdown point: The ratio \(\frac{n-m}{n}\)
Asymptotic breakdown point: If it converges to a finite value as \(n \to \infty\)

Breakdown Point of an Estimator

Sample mean:
- Finite sample BP: \(\frac{1}{n}\)
- Asymptotic BP: \(0\)
Sample median:
- Finite sample BP: \(\operatorname{floor}\left( \frac{n − 1}{2} \right)\)
- Asymptotic BP: \(0.5\)
Hodges–Lehmann estimator:
- Asymptotic BP: \(0.29\)

Breakdown Point of an Estimator

The median of the Walsh averages, stays with the majority.

\[ \begin{align} \# \text{good points} & > \# \text{bad points} \\ \frac{m(m+1)}{2} & > \frac{n(n+1) - m(m+1)}{2} \\ m(m+1) & > \frac{n(n+1)}{2} \end{align} \]

Breakdown Point of an Estimator

Obtaining finite sample BP is messy, asymptotic is easier: \[m(m+1) > \frac{n(n+1)}{2} \Leftrightarrow \frac{m(m+1)}{n(n+1)} > \frac{1}{2}\]
let \(x = m/n\) and \(x \approx (m+1)/(n+1)\)
then \(x^2 > \frac{1}{2} \Leftrightarrow x > \frac{1}{\sqrt{2}}\)
convert to BP \(\frac{1}{\#bad} = 1 - \frac{1}{\#good}\): \[1-\frac{1}{\sqrt{2}} \approx 0.29\]
More details are here

Breakdown Point of an Estimator

Instead of testing for normality to decide whether to use a \(t\)-test or the Wilcoxon test
It is easier to use the notion of robustness of an estimator and their associated procedures
There are too many ways that a distribution can deviate from normality
How much robustness do we need? Is \(0.29\) enough or \(0.5\) really necessary?
Sample median preferred over Hodges–Lehmann estimator?
- Based on breakdown point: yes
- This however ignores estimator efficiency (we talk about this later)