Stanford University, Spring 2016, STATS 205

Previous Lectures

  • One-sample tests:
    • Sign test
    • Signed-Rank Wilcoxon
  • Estimators, confidence intervals, and robustness to outliers
  • Bootstrap
    • Error of estimators and significance for hypothesis testing
    • Complete enumerations
    • Tail probability
  • Observations from continuous distributions

Today

  • Observations from discrete distributions
  • Proportion problems
  • \(\chi^2\) Tests

Proportion Problems

  • Discrete variables
  • The random variable \(X\) consists of categories
  • For now, we focus on binary categories failure (0) and success (1)
  • \(X\) is a random variable distributed according to the Bernoulli distribution
    • with success probability \(p\) and
    • failure probability \(1-p\)
  • We know that \(\operatorname{E}(X) = p\) and \(\operatorname{Var}(X) = p(1-p)\)

Proportion Problems

  • Statistical problems can be
    • estimating \(p\)
    • forming confidence intervals around the estimate \(\widehat{p}\)
    • and testing the hypothesis \[H_0: p = p_0 \text{ versus } H_A: p \ne p_0\]
  • Let \(X_1,\dots,X_n\) be iid Bernoulli with success probability \(p\) and let \(S\) be the total number of successes
  • Then \(S\) follows a binomial distribution with \(n\) trials and success probability \(p\)
  • If \(p\) is unknown, we estimate \(p\) with \(\widehat{p} = \frac{S}{n}\)
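
The estimator \(\widehat{p} = S/n\) can be checked by simulation. The following is a minimal sketch, not part of the original slides: the seed and the values of `n`, `p`, and `nsim` are illustrative assumptions.

```r
set.seed(205)                        # arbitrary seed, only for reproducibility
n = 100; p = 1/2; nsim = 10000       # illustrative (assumed) settings
S = rbinom(nsim, size = n, prob = p) # nsim draws of the number of successes
phat = S/n                           # one estimate of p per simulated sample
mean_phat = mean(phat)               # should be close to p
var_phat = var(phat)                 # should be close to p*(1-p)/n
c(mean_phat, var_phat, p*(1-p)/n)
```

The simulated mean and variance of \(\widehat{p}\) should land near \(p\) and \(p(1-p)/n\), which is what the Bernoulli moments above imply.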

Proportion Problems

n = 10; p = 1/2; nsim = 10000; obsv = rbinom(nsim, size = n, prob = p)

Proportion Problems

n = 100; p = 1/2; nsim = 10000; obsv = rbinom(nsim, size = n, prob = p)

Proportion Problems

  • As \(n \to \infty\) while \(p\) is fixed:
    • The de Moivre–Laplace theorem (a special case of the central limit theorem) says that the distribution of \(S\) approaches a normal distribution
    • Easy confidence intervals (just evaluate quantiles of the normal)
    • Approximate the binomial distribution with \(n\) trials and success probability \(p\) by \(N(np,np(1-p))\)
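
The quality of the normal approximation can be inspected directly by comparing the binomial cdf with the cdf of \(N(np,np(1-p))\). This is a sketch under assumed settings (`n = 100`, `p = 1/2`); the continuity correction (`s + 0.5`) is an addition that does not appear in the slides.

```r
n = 100; p = 1/2                       # assumed settings
s = 0:n
exact = pbinom(s, size = n, prob = p)  # exact binomial cdf
mu = n*p; sigma = sqrt(n*p*(1-p))      # mean and sd of the approximating normal
plain = pnorm(s, mean = mu, sd = sigma)            # N(np, np(1-p)) as on the slide
corrected = pnorm(s + 0.5, mean = mu, sd = sigma)  # with continuity correction (an addition)
err_plain = max(abs(exact - plain))
err_cc = max(abs(exact - corrected))
c(err_plain, err_cc)
```

The largest error of the plain approximation is about half the probability of the most likely outcome; the continuity correction removes most of it.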

Example: Squeaky Hip Replacements

143 subjects with ceramic hip replacements

Ten report that their hip replacements squeaked

phat = 10/143
phat
## [1] 0.06993007
zcv = qnorm(0.975)
phat+c(-1,1)*zcv*sqrt(phat*(1-phat)/143)
## [1] 0.02813069 0.11172945

We estimate that between roughly 3% and 11% of patients who receive ceramic hip replacements will report squeaking
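
The interval computation above can be wrapped in a small helper. The function name `wald_ci` and its arguments are hypothetical, introduced only for illustration; the formula is the one used in the example.

```r
# Normal-approximation confidence interval for a proportion.
# wald_ci is a hypothetical helper name, not a function from any package.
wald_ci = function(s, n, level = 0.95) {
  phat = s/n                              # estimated proportion
  zcv = qnorm(1 - (1 - level)/2)          # normal critical value
  phat + c(-1, 1)*zcv*sqrt(phat*(1 - phat)/n)
}
ci = wald_ci(10, 143)                     # squeaky hip data: 10 of 143
ci
```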

Hypothesis Testing

  • Reject the null hypothesis if \(|z|\) is large \[z = \frac{\widehat{p}-p_0}{\sqrt{p_0(1-p_0)/n}}\]
  • Under \(H_0\), \(z\) is asymptotically standard normal, \(N(0,1)\)
  • A test statistic equivalent to \(|z|\) is \(z^2\)
  • The square of a standard normal follows a \(\chi^2\) distribution with 1 degree of freedom
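
The equivalence of the two test statistics can be confirmed numerically: the two-sided normal \(p\)-value for \(z\) equals the upper-tail \(\chi^2_1\) \(p\)-value for \(z^2\). The value of `z` below is an arbitrary, assumed example.

```r
z = 2.24                                            # arbitrary (assumed) value of the statistic
p_norm = 2*pnorm(-abs(z))                           # two-sided p-value from N(0,1)
p_chi2 = pchisq(z^2, df = 1, lower.tail = FALSE)    # upper tail of chi-squared with 1 df
c(p_norm, p_chi2)
```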

Example: Left-Handed Professional Ball Players

  • Theory:
    • Professional baseball players have a different proportion of left-handed players than the general population
    • From a previous study, we know that the proportion of left-handed people in the general public is \(p_0 = 0.15\)
  • Hypothesis testing:
    • \(H_0: p = 0.15 \text{ versus } H_A: p \ne 0.15\)

Example: Left-Handed Professional Ball Players

library(Rfit)
head(baseball)
##   height weight bat throw field average
## 1     74    218   R     L     0   3.330
## 2     75    185   R     R     1   0.286
## 3     77    219   L     L     0   3.040
## 4     73    185   R     R     1   0.271
## 5     69    160   S     R     1   0.242
## 6     73    222   R     R     0   3.920

Example: Left-Handed Professional Ball Players

ind = with(baseball,throw=='L')
n = length(ind)
phat = sum(ind)/n
phat
## [1] 0.2542373
p0 = 0.15
z = (phat-p0)/(sqrt(p0*(1-p0)/n))
pvalue = 1-pchisq(z^2,df=1)
pvalue
## [1] 0.02494189
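
As a cross-check, the same test is available through `prop.test` in base R when the continuity correction is turned off. The sketch below enters the counts directly (15 left-handed throwers among 59 players, as reported in the exact binomial test output) so that it runs without the `Rfit` package.

```r
# One-sample proportion test without continuity correction;
# its X-squared statistic is z^2 from the manual computation.
fit = prop.test(x = 15, n = 59, p = 0.15, correct = FALSE)
z2 = unname(fit$statistic)   # equals z^2
pvalue_prop = fit$p.value    # should match the manual chi-squared p-value
c(z2, pvalue_prop)
```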

What is Nonparametric Statistics Again?

  • In all three nonparametric tests (sign, signed-rank, \(\chi^2\)), no assumption is made on the variances of the observations
  • In contrast, in the \(t\)-test the variances are estimated

Why Not Use Finite Sample Binomial Test?

Since we know that \(S\) follows a binomial distribution, why don't we use it?

binom.test(sum(ind), n, p = p0)
## 
##  Exact binomial test
## 
## data:  sum(ind) and n
## number of successes = 15, number of trials = 59, p-value = 0.04192
## alternative hypothesis: true probability of success is not equal to 0.15
## 95 percent confidence interval:
##  0.1498208 0.3844241
## sample estimates:
## probability of success 
##              0.2542373

  • Exact (finite-sample) tests have a type I error rate bounded above by the significance level \(\alpha\)
  • Asymptotic tests may have a type I error rate above \(\alpha\)

Why Not Use Finite Sample Binomial Test?

Problem is due to discreteness

Example: \(n = 5\) and test \(H_0: p = 0.5\) versus \(H_A: p \ne 0.5\)

Null distribution of \(S\) is binomial with \(n = 5\) and \(p = 0.5\)

Suppose outcome is \(S = 5\) (most extreme observation)

n = 5; S = 5; p = 0.5
phat = S/n
pvalue = 2*p^S; pvalue
## [1] 0.0625

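Problem is that the null hypothesis can never be rejected at level \(\alpha = 0.05\), because the smallest attainable \(p\)-value is 0.0625

So in this case, the nominal level \(\alpha = 0.05\) cannot be attained: the actual type I error rate is 0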
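
The discreteness can be made explicit by enumerating every possible outcome and its exact two-sided \(p\)-value. This sketch uses `binom.test` from base R; the variable names are illustrative.

```r
n = 5; p0 = 0.5
s = 0:n                                                        # all possible outcomes
pv = sapply(s, function(k) binom.test(k, n, p = p0)$p.value)   # exact two-sided p-values
rbind(s, pv)
min_pv = min(pv)   # smallest attainable p-value
min_pv
```

Only a handful of \(p\)-values are attainable, and none of them falls below 0.05.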

Discrete Random Variable (RV)

  • Extension from two categories to multiple categories
  • Consider a discrete RV \(X\) with categories \(1,2,\dots,c\)
  • Let \(p(j) = P(X = j)\) define the probability mass function
  • We wish to test: \[H_0: p(j) = p_0(j), j = 1,\dots,c\] \[H_A: p(j) \ne p_0(j), \text{ for some } j\]

Discrete RV

  • \(X_1,\dots,X_n\) is a random sample on \(X\)
  • Let \(O_j = \#\{ X_i = j \}\)
  • Observed frequencies are constrained \(\sum_{j=1}^c O_j = n\)
  • The expected frequency for category \(j\) is \(E_j = \operatorname{E}_{H_0}(O_j)\)
  • Two cases for \(H_0\)

Discrete RV

  • Case 1:
    • All \(p_0(j)\) are specified
    • So we get \(E_j = np_0(j)\)
    • Test statistic is \[\chi^2 = \sum_{j=1}^c \frac{(O_j-E_j)^2}{E_j}\]
  • Hypothesis \(H_0\) is rejected in favor of \(H_A\) for large values of \(\chi^2\)
  • The observed frequencies \((O_1,\dots,O_c)^T\) have a multinomial distribution, so the exact distribution of the test statistic can be obtained
  • Asymptotically, the test statistic has a \(\chi^2\) distribution with \(c-1\) degrees of freedom
  • One degree of freedom is lost because, if we know \(c-1\) frequencies, we can calculate the \(c\)th from the total \(n\)

Discrete RV Example

Roll a die \(n = 370\) times

Observe frequencies

O = c(58,55,62,68,66,61)
n = sum(O); n
## [1] 370

Test whether the die is fair, \(p(j) \equiv 1/6\)

p0 = 1/6
E = rep(n*p0,6)
Chi2_0 = sum((O-E)^2/E); Chi2_0
## [1] 1.902703

Discrete RV Example

Asymptotically, the test statistic is distributed as \(\chi^2\) with \(c-1\) degrees of freedom

pvalue = 1-pchisq(Chi2_0,df=6-1); pvalue
## [1] 0.8624375

Thus there is no evidence to support the die being unfair
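
The same test is available as `chisq.test` in base R. The sketch below reproduces the asymptotic result and, as an addition not in the slides, approximates the finite-sample null distribution by Monte Carlo via `simulate.p.value`; the seed and `B` are arbitrary assumptions.

```r
O = c(58, 55, 62, 68, 66, 61)
fit = chisq.test(O, p = rep(1/6, 6))   # asymptotic chi-squared test, df = 6 - 1
fit$statistic; fit$parameter; fit$p.value

set.seed(205)                          # arbitrary seed for the Monte Carlo p-value
fit_mc = chisq.test(O, p = rep(1/6, 6),
                    simulate.p.value = TRUE, B = 10000)
fit_mc$p.value                         # simulated under the multinomial null
```

With \(n = 370\) the simulated and asymptotic \(p\)-values should agree closely.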

Discrete RV

  • Case 2:
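    • Only the form of the pmf is known
    • Have to estimate its parameter \(p\)
    • Same test statistic, but now with \(E_j\) computed from the estimate \(\widehat{p}\) \[\chi^2 = \sum_{j=1}^c \frac{(O_j-E_j)^2}{E_j}\]
  • Hypothesis \(H_0\) is rejected in favor of \(H_A\) for large values of \(\chi^2\)
  • Asymptotically \(\chi^2\) with \(c-1-1\) degrees of freedom: one more degree of freedom is lost for the estimated parameter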

Discrete RV Example

Number of males among the first seven children of \(n = 1334\) Swedish ministers of religion

males = 0:7
ministers = c(6,57,206,362,365,256,69,13)
n = sum(ministers); n
## [1] 1334
df = data.frame(ministers=ministers,males=males); t(df)
##           [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## ministers    6   57  206  362  365  256   69   13
## males        0    1    2    3    4    5    6    7

For example, 206 ministers had 2 sons and 5 daughters in their first 7 children

Discrete RV Example

The maximum likelihood estimator of \(p\) is

nChildren = n*7
nMale = sum(df$ministers*df$males)
phat = nMale/nChildren; phat
## [1] 0.5140287
p0 = dbinom(males,7,phat)
E = n*p0
##           [,1] [,2]  [,3]  [,4]  [,5]  [,6] [,7] [,8]
## E          8.5 63.2 200.6 353.7 374.1 237.4 83.7 12.6
## ministers  6.0 57.0 206.0 362.0 365.0 256.0 69.0 13.0
## males      0.0  1.0   2.0   3.0   4.0   5.0  6.0  7.0

Discrete RV Example

Chi2_0 = sum((df$ministers-E)^2/E)
pvalue = 1-pchisq(Chi2_0,df=8-1-1); pvalue
## [1] 0.4257546

No evidence to refute a binomial probability model for the number of sons in the first seven children of a Swedish minister
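
The statistic can also be obtained from `chisq.test`, but its reported degrees of freedom do not account for the estimated parameter, so the \(p\)-value has to be recomputed by hand. A self-contained sketch of that adjustment (variable names such as `stat` and `pvalue_adj` are illustrative):

```r
males = 0:7
ministers = c(6, 57, 206, 362, 365, 256, 69, 13)
n = sum(ministers)
phat = sum(ministers*males)/(7*n)            # MLE of the probability of a son
fit = chisq.test(ministers, p = dbinom(males, 7, phat))
stat = unname(fit$statistic)                 # same statistic as the manual sum
fit$parameter                                # 7: ignores that phat was estimated
pvalue_adj = pchisq(stat, df = length(males) - 1 - 1, lower.tail = FALSE)
pvalue_adj                                   # p-value with the corrected df = 6
```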

Discrete RV

  • As an alternative to testing for deviation from a model,
  • we can form a confidence interval for a pairwise difference in proportions \(p_j - p_k\), estimated by \(\widehat{p}_j - \widehat{p}_k\)
  • These confidence intervals are again easy because we assume asymptotic normality

Discrete RV Example

Difference between the probabilities of all daughters and all sons

##           [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## ministers    6   57  206  362  365  256   69   13
## males        0    1    2    3    4    5    6    7

6 ministers had no sons, and 13 ministers had all sons

n = 1334; p0 = 6/n; p7 = 13/n
se = sqrt((p0+p7-(p0-p7)^2)/n)
zcv = qnorm(0.975)
lb = p0-p7 - zcv*se; ub = p0-p7 + zcv*se; res = c(p0-p7,lb,ub); res
## [1] -0.005247376 -0.011645444  0.001150692

Confidence interval covers 0, thus no significant difference in the proportions
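
The standard error used above follows from the multinomial covariance between two category proportions, \(\operatorname{Var}(\widehat{p}_j - \widehat{p}_k) = [p_j + p_k - (p_j - p_k)^2]/n\). A sketch of a reusable helper; the name `diff_ci` and its arguments are hypothetical.

```r
# Normal-approximation CI for p_j - p_k within a single multinomial sample.
# diff_ci is a hypothetical helper name, not a function from any package.
diff_ci = function(oj, ok, n, level = 0.95) {
  pj = oj/n; pk = ok/n
  se = sqrt((pj + pk - (pj - pk)^2)/n)    # accounts for the negative covariance
  zcv = qnorm(1 - (1 - level)/2)
  c(estimate = pj - pk, lower = pj - pk - zcv*se, upper = pj - pk + zcv*se)
}
res = diff_ci(6, 13, 1334)                # no sons versus all sons
res
```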

Several Discrete RVs

  • Goal is to compare several discrete RVs, which have the same range \(\{ 1,2,\dots,c \}\)
  • Consider hypothesis test:
    • \(H_0:\) \(X_1,\dots,X_r\) have the same distribution
    • \(H_A:\) Distributions of \(X_i\) and \(X_j\) differ for some \(i \ne j\)
  • Total number of samples \(n = \sum_{i=1}^r n_i\)
  • Observed frequencies: \[O_{ij} = \#\{ \text{sample items in sample drawn on } X_i \text{ such that } X_i = j\},\]
  • for \(i = 1,\dots,r\) and \(j = 1,\dots,c\)
  • The \(O_{ij}\) form an \(r \times c\) matrix of observed frequencies
  • Such a matrix is called a contingency table

Several Discrete RVs

  • Compare observed frequencies to the expected frequencies under \(H_0\)
  • Estimate the common distribution \((p_1,\dots,p_c)^T\), where \(p_j\) is the probability that category \(j\) occurs
  • Estimate probability of category \(j\) overall \[ \widehat{p}_j = \frac{\sum_{i=1}^r O_{ij}}{n}, j = 1,\dots,c\]
  • Estimate expected frequencies \(\widehat{E}_{ij} = n_i \widehat{p}_j\)
  • Notice that the sample sizes \(n_i\) can differ between the variables

Several Discrete RVs

  • Test statistic \[\chi^2 = \sum_{i=1}^r \sum_{j=1}^c \frac{(O_{ij}-\widehat{E}_{ij})^2}{\widehat{E}_{ij}}\]
  • Asymptotically \(\chi^2\) with \((r-1)(c-1)\) degrees of freedom
  • This is called the test for homogeneity

Several Discrete RVs Example

Is the distribution of alcoholic status the same for different types of crime?

Contingency table with frequencies of
criminals by type of crime committed (6 RVs) and
their alcoholic status (categories: alcoholic and non-alcoholic)

##          Alcoholic Non-Alcoholic
## Arson           50            43
## Rape            88            62
## Violence       155           110
## Theft          379           300
## Coining         18            14
## Fraud           63           144

Several Discrete RVs Example

chifit = chisq.test(ct); chifit
## 
##  Pearson's Chi-squared test
## 
## data:  ct
## X-squared = 49.731, df = 5, p-value = 1.573e-09
Chi2_0 = (chifit$observed-chifit$expected)^2/chifit$expected; Chi2_0
##            Alcoholic Non-Alcoholic
## Arson     0.01617684    0.01809979
## Rape      0.97600214    1.09202023
## Violence  1.62222220    1.81505693
## Theft     1.16680759    1.30550686
## Coining   0.07191850    0.08046750
## Fraud    19.61720859   21.94912045

Several Discrete RVs Example

Most of the contribution to the test statistic comes from fraud

Eliminate fraud and retest

chisq.test(ct[-6, ])
## 
##  Pearson's Chi-squared test
## 
## data:  ct[-6, ]
## X-squared = 1.1219, df = 4, p-value = 0.8908

Conclusion:
Conditional on the criminal not committing fraud,
we cannot reject the hypothesis that alcoholic status has the same distribution for all types of crime
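
The homogeneity test can be reproduced step by step from the estimates \(\widehat{p}_j\) and \(\widehat{E}_{ij} = n_i \widehat{p}_j\). The sketch below rebuilds the contingency table from the printed counts; how `ct` was constructed in the original analysis is not shown, so the `matrix` call is an assumption.

```r
# Contingency table rebuilt from the printed counts (construction is assumed)
ct = matrix(c(50, 43,  88, 62,  155, 110,  379, 300,  18, 14,  63, 144),
            ncol = 2, byrow = TRUE,
            dimnames = list(c("Arson", "Rape", "Violence", "Theft", "Coining", "Fraud"),
                            c("Alcoholic", "Non-Alcoholic")))
n_i = rowSums(ct)                   # sample size for each crime
phat = colSums(ct)/sum(ct)          # pooled estimate of the common distribution
E = outer(n_i, phat)                # expected frequencies n_i * phat_j
Chi2_0 = sum((ct - E)^2/E)          # test statistic
dof = (nrow(ct) - 1)*(ncol(ct) - 1) # (r-1)(c-1) degrees of freedom
pvalue = pchisq(Chi2_0, df = dof, lower.tail = FALSE)
c(Chi2_0, dof, pvalue)

chifit = chisq.test(ct)             # built-in version, all six crimes
chifit_nofraud = chisq.test(ct[-6, ])  # retest without the fraud row
```

The manual statistic should agree with `chisq.test`, and dropping the fraud row reduces the degrees of freedom from 5 to 4.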