Stanford University, Spring 2016, STATS 205

## Previous Lectures

• One-sample tests:
• Sign test
• Signed-Rank Wilcoxon
• Estimators, confidence intervals, and robustness to outliers
• Bootstrap
• Error of estimators and significance for hypothesis testing
• Complete enumerations
• Tail probability
• Observations from continuous distributions

## Today

• Observations from discrete distributions
• Proportion problems
• $$\chi^2$$ Tests

## Proportion Problems

• Discrete variables
• The random variable $$X$$ consists of categories
• For now, we focus on binary categories failure (0) and success (1)
• $$X$$ is a random variable distributed according to the Bernoulli distribution
• with success probability $$p$$ and
• failure probability $$1-p$$
• We know that $$\operatorname{E}(X) = p$$ and $$\operatorname{Var}(X) = p(1-p)$$
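
The mean and variance of a Bernoulli variable can be checked by simulation. The sketch below is only an illustration: the value `p = 0.3`, the seed, and the number of draws are arbitrary assumptions, not values from the lecture.

```r
set.seed(205)                              # arbitrary seed, for reproducibility only
p = 0.3                                    # hypothetical success probability
x = rbinom(100000, size = 1, prob = p)     # Bernoulli draws are binomial draws with size 1
c(mean = mean(x), var = var(x))            # empirical mean and variance
c(mean = p, var = p*(1-p))                 # theoretical values p and p(1-p)
```

The empirical values should be close to the theoretical ones, with the gap shrinking as the number of draws grows.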

## Proportion Problems

• Statistical problems can be
• estimating $$p$$
• forming confidence intervals around estimate $$\widehat{p}$$
• and testing hypothesis $H_0: p = p_0 \text{ versus } H_A: p \ne p_0$
• Let $$X_1,\dots,X_n$$ be iid Bernoulli with success probability $$p$$ and let $$S$$ be the total number of successes
• Then $$S$$ follows a binomial distribution with $$n$$ trials and success probability $$p$$
• If $$p$$ is unknown, we estimate $$p$$ with $$\widehat{p} = \frac{S}{n}$$

## Proportion Problems

n = 10; p = 1/2; nsim = 10000; obsv = rbinom(nsim, size = n, prob = p)

## Proportion Problems

n = 100; p = 1/2; nsim = 10000; obsv = rbinom(nsim, size = n, prob = p)

## Proportion Problems

• As $$n \to \infty$$ while $$p$$ is fixed:
• The de Moivre–Laplace theorem (a special case of the central limit theorem) says that the distribution of $$S$$ approaches a normal distribution
• Easy confidence interval (just evaluate cdf of the normal)
• Approximation of the binomial distribution with $$n$$ trials and success probability $$p$$ by $$N(np,np(1-p))$$
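
The quality of the normal approximation can be checked numerically. This is a minimal sketch for $$n = 100$$ and $$p = 1/2$$; the range `40:60` and the continuity correction (evaluating the normal cdf at $$s + 0.5$$) are additions made here for illustration and do not appear in the slides. Note that `pnorm` takes the standard deviation, while $$N(np,np(1-p))$$ is written with the variance.

```r
n = 100; p = 1/2
s = 40:60                                         # values of S around the mean np = 50
exact = pbinom(s, size = n, prob = p)             # exact binomial cdf
sdS = sqrt(n*p*(1-p))                             # standard deviation of S
plain = pnorm(s, mean = n*p, sd = sdS)            # normal cdf evaluated at s
corrected = pnorm(s + 0.5, mean = n*p, sd = sdS)  # normal cdf with continuity correction
err_plain = max(abs(exact - plain))
err_corrected = max(abs(exact - corrected))
c(plain = err_plain, corrected = err_corrected)   # largest absolute cdf error
```

The corrected version accounts for $$S$$ taking only integer values, which is why its error is smaller.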

## Example: Squeaky Hip Replacements

143 subjects with ceramic hip replacements

Ten report that their hip replacements squeaked

phat = 10/143
phat
## [1] 0.06993007
zcv = qnorm(0.975)
phat+c(-1,1)*zcv*sqrt(phat*(1-phat)/143)
## [1] 0.02813069 0.11172945

We estimate between roughly 3 and 11% of patients who receive ceramic hip replacements will report squeaky replacements

## Hypothesis Testing

• Reject the null hypothesis if $$|z|$$ is large, where $z = \frac{\widehat{p}-p_0}{\sqrt{p_0(1-p_0)/n}} \sim N(0,1)$
• $$z$$ is asymptotically standard normal under $$H_0$$
• A test statistic equivalent to $$|z|$$ is $$z^2$$
• The square of a standard normal follows a $$\chi^2$$ distribution with 1 degree of freedom
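
The equivalence of the two test statistics can be verified directly: the two-sided normal p-value for $$|z|$$ equals the upper tail of the $$\chi^2$$ distribution with 1 degree of freedom at $$z^2$$. The values of `z` below are hypothetical and chosen only for illustration.

```r
z = c(0.5, 1.96, 2.24, 3)              # hypothetical z statistics
p_normal = 2*(1 - pnorm(abs(z)))       # two-sided p-value from N(0,1)
p_chisq = 1 - pchisq(z^2, df = 1)      # p-value from chi-squared with 1 df
all.equal(p_normal, p_chisq)           # the two agree up to floating-point error
```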

## Example: Left-Handed Professional Ball Players

• Theory:
• Professional baseball players have a different proportion of left-handed players than the proportion of left-handed people in the general population
• From a previous study, we know that the proportion of left-handed people in the general population is $$p_0 = 0.15$$
• Hypothesis testing:
• $$H_0: p = 0.15 \text{ versus } H_A: p \ne 0.15$$

## Example: Left-Handed Professional Ball Players

library(Rfit)
head(baseball)
##   height weight bat throw field average
## 1     74    218   R     L     0   3.330
## 2     75    185   R     R     1   0.286
## 3     77    219   L     L     0   3.040
## 4     73    185   R     R     1   0.271
## 5     69    160   S     R     1   0.242
## 6     73    222   R     R     0   3.920

## Example: Left-Handed Professional Ball Players

ind = with(baseball,throw=='L')
n = length(ind)
phat = sum(ind)/n
phat
## [1] 0.2542373
p0 = 0.15
z = (phat-p0)/(sqrt(p0*(1-p0)/n))
pvalue = 1-pchisq(z^2,df=1)
pvalue
## [1] 0.02494189
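
The same asymptotic test is available through `prop.test` in base R. The sketch below assumes the counts reported elsewhere in this lecture (15 left-handed throwers among 59 players) rather than loading the `baseball` data, and it switches off the continuity correction so that the statistic equals $$z^2$$.

```r
# Counts assumed from the lecture output: 15 successes in 59 trials
pt = prop.test(15, 59, p = 0.15, correct = FALSE)  # correct = FALSE: no continuity correction
pt$statistic                                       # X-squared, equal to z^2
pt$p.value                                         # same as 1-pchisq(z^2,df=1)
```

With the default `correct = TRUE`, `prop.test` applies a continuity correction and the p-value would differ slightly from the one computed by hand.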

## What is Nonparametric Statistics Again?

• In all three nonparametric tests (sign, signed-rank, $$\chi^2$$), no assumption is made about the variances of the observations
• In contrast, in the $$t$$-test the variances are estimated

## Why Not Use Finite Sample Binomial Test?

Since we know that $$S$$ follows a binomial distribution, why don't we use it?

##
##  Exact binomial test
##
## data:  sum(ind) and n
## number of successes = 15, number of trials = 59, p-value = 0.04192
## alternative hypothesis: true probability of success is not equal to 0.15
## 95 percent confidence interval:
##  0.1498208 0.3844241
## sample estimates:
## probability of success
##              0.2542373
• Tests based on finite-sample (exact) $$p$$-values have a true significance level bounded above by $$\alpha$$
• Tests based on asymptotic $$p$$-values may have a true significance level above $$\alpha$$
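
The call that produced the exact test output is not shown in the slides. A plausible reconstruction, assuming it was `binom.test` applied to the counts printed in the output (15 successes in 59 trials), is:

```r
# Counts taken from the output above; the exact call is an assumption
bt = binom.test(15, 59, p = 0.15)   # two-sided exact binomial test of H0: p = 0.15
bt$p.value                          # exact p-value
bt$conf.int                         # 95 percent confidence interval
```

The exact p-value is larger than the asymptotic one obtained from $$z^2$$, which is consistent with the exact test being the more conservative of the two here.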

## Why Not Use Finite Sample Binomial Test?

Problem is due to discreteness

Example: $$n = 5$$ and test $$H_0: p = 0.5$$ versus $$H_A: p \ne 0.5$$

Null distribution of $$S$$ is binomial with $$n = 5$$ and $$p = 0.5$$

Suppose outcome is $$S = 5$$ (most extreme observation)

n = 5; S = 5; p = 0.5
phat = S/n
pvalue = 2*p^S; pvalue
## [1] 0.0625

Problem is that the null hypothesis can never be rejected at $$\alpha = 0.05$$: even the most extreme outcome has a $$p$$-value of 0.0625

So in this case, the level $$\alpha = 0.05$$ has no meaning
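
To see the discreteness directly, one can enumerate every attainable two-sided p-value for $$n = 5$$. The sketch below assumes the usual definition of the exact two-sided p-value (total null probability of all outcomes no more likely than the observed one); the small tolerance factor guards against floating-point ties.

```r
n = 5; p = 0.5
s = 0:n                                    # all possible outcomes of S
probs = dbinom(s, size = n, prob = p)      # null pmf of S
# two-sided p-value for each outcome k: sum of probabilities not exceeding P(S = k)
pvals = sapply(s, function(k) sum(probs[probs <= probs[k + 1]*(1 + 1e-7)]))
round(pvals, 4)
min(pvals)                                 # smallest attainable p-value
```

Only three distinct p-values are possible, and the smallest one is attained at $$S = 0$$ and $$S = 5$$.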

## Discrete Random Variable (RV)

• Extension from two categories to multiple categories
• Consider a discrete RV $$X$$ with categories $$1,2,\dots,c$$
• Let $$p(j) = P(X = j)$$ define the probability mass function
• We wish to test: $H_0: p(j) = p_0(j), j = 1,\dots,c$ $H_A: p(j) \ne p_0(j), \text{ for some } j$

## Discrete RV

• $$X_1,\dots,X_n$$ is a random sample on $$X$$
• Let $$O_j = \#\{ X_i = j \}$$
• Observed frequencies are constrained $$\sum_{j=1}^c O_j = n$$
• The expected frequency for category $$j$$ is $$E_j = \operatorname{E}_{H_0}(O_j)$$
• Two cases for $$H_0$$

## Discrete RV

• Case 1:
• All $$p_0(j)$$ are specified
• So we get $$E_j = np_0(j)$$
• Test statistic is $\chi^2 = \sum_{j=1}^c \frac{(O_j-E_j)^2}{E_j}$
• Hypothesis $$H_0$$ is rejected in favor of $$H_A$$ for large values of $$\chi^2$$
• The observed frequencies $$(O_1,\dots,O_c)^T$$ have a multinomial distribution, so the exact distribution of the test statistic can be obtained
• Asymptotically, the test statistic has a $$\chi^2$$ distribution with $$c-1$$ degrees of freedom
• One degree of freedom is lost because if we know $$c-1$$ frequencies, we can calculate the $$c$$th from the total $$n$$

## Discrete RV Example

Roll a die $$n = 370$$ times

Observe frequencies

O = c(58,55,62,68,66,61)
n = sum(O); n
## [1] 370

Test whether the die is fair, $$p(j) \equiv 1/6$$

p0 = 1/6
E = rep(n*p0,6)
Chi2_0 = sum((O-E)^2/E); Chi2_0
## [1] 1.902703

## Discrete RV Example

Asymptotically distributed as $$\chi^2$$ with $$c-1$$ degrees of freedom

pvalue = 1-pchisq(Chi2_0,df=6-1); pvalue
## [1] 0.8624375

Thus there is no evidence to support the die being unfair
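
The hand computation can be cross-checked with `chisq.test` from base R, which performs the same goodness-of-fit test when the null probabilities are passed through its `p` argument. The object name `fit` is introduced here only for illustration.

```r
O = c(58,55,62,68,66,61)                 # observed frequencies from the example
fit = chisq.test(O, p = rep(1/6, 6))     # goodness-of-fit test with all p0(j) specified
fit$statistic                            # same value as Chi2_0
fit$parameter                            # degrees of freedom, c - 1 = 5
fit$p.value                              # same value as pvalue
```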

## Discrete RV

• Case 2:
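• Only the form of the pmf is known
• Have to estimate the unknown parameter $$p$$
• Same test statistic but now with $$E_j$$ computed from the estimate $$\widehat{p}$$ $\chi^2 = \sum_{j=1}^c \frac{(O_j-E_j)^2}{E_j}$
• Hypothesis $$H_0$$ is rejected in favor of $$H_A$$ for large values of $$\chi^2$$
• The asymptotic $$\chi^2$$ distribution loses one further degree of freedom for each estimated parameter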

## Discrete RV Example

Number of males in the first seven children for $$n = 1334$$ Swedish ministers of religion

males = 0:7
ministers = c(6,57,206,362,365,256,69,13)
n = sum(ministers); n
## [1] 1334
df = data.frame(ministers=ministers,males=males); t(df)
##           [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## ministers    6   57  206  362  365  256   69   13
## males        0    1    2    3    4    5    6    7

For example, 206 ministers had 2 sons and 5 daughters in their first 7 children

## Discrete RV Example

The maximum likelihood estimator of $$p$$ is

nChildren = n*7
nMale = sum(df$ministers*df$males)
phat = nMale/nChildren; phat
## [1] 0.5140287
p0 = dbinom(males,7,phat)
E = n*p0
##           [,1] [,2]  [,3]  [,4]  [,5]  [,6] [,7] [,8]
## E          8.5 63.2 200.6 353.7 374.1 237.4 83.7 12.6
## ministers  6.0 57.0 206.0 362.0 365.0 256.0 69.0 13.0
## males      0.0  1.0   2.0   3.0   4.0   5.0  6.0  7.0

Chi2_0 = sum((df$ministers-E)^2/E)
pvalue = 1-pchisq(Chi2_0,df=8-1-1); pvalue
## [1] 0.4257546

No evidence to refute a binomial probability model for the number of sons in the first seven children of a Swedish minister

## Discrete RV

• As an alternative to testing for deviation from a model
• We can get a confidence interval on the pairwise difference in proportions $$\widehat{p}_j - \widehat{p}_k$$
• The confidence intervals are easy again because we assume asymptotic normality

## Discrete RV Example

Difference in the probabilities of all daughters or all sons

##           [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## ministers    6   57  206  362  365  256   69   13
## males        0    1    2    3    4    5    6    7

6 ministers had no sons, and 13 ministers had all sons

n = 1334; p0 = 6/n; p7 = 13/n
se = sqrt((p0+p7-(p0-p7)^2)/n)
zcv = qnorm(0.975)
lb = p0-p7 - zcv*se; ub = p0-p7 + zcv*se; res = c(p0-p7,lb,ub); res
## [1] -0.005247376 -0.011645444  0.001150692

Confidence interval covers 0, thus no significant difference in the proportions

## Several Discrete RVs

• Goal is to compare several discrete RVs, which have the same range $$\{ 1,2,\dots,c \}$$
• Consider hypothesis test:
• $$H_0:$$ $$X_1,\dots,X_r$$ have the same distribution
• $$H_A:$$ Distributions of $$X_i$$ and $$X_j$$ differ for some $$i \ne j$$
• Total sample size $$n = \sum_{i=1}^r n_i$$, where $$n_i$$ is the size of the sample drawn on $$X_i$$
• Observed frequencies: $O_{ij} = \#\{ \text{items in the sample drawn on } X_i \text{ such that } X_i = j\},$
• for $$i = 1,\dots,r$$ and $$j = 1,\dots,c$$
• The $$O_{ij}$$ form an $$r \times c$$ matrix of observed frequencies
• Such a matrix is called a contingency table

## Several Discrete RVs

• Compare observed frequencies to the expected frequencies under $$H_0$$
• Estimate the common distribution $$(p_1,\dots,p_c)^T$$, where $$p_j$$ is the probability that category $$j$$ occurs
• Estimate probability of category $$j$$ overall $\widehat{p}_j = \frac{\sum_{i=1}^r O_{ij}}{n}, j = 1,\dots,c$
• Estimate expected frequencies $$\widehat{E}_{ij} = n_i \widehat{p}_j$$
• Notice that the sample size can vary between variables

## Several Discrete RVs

• Test statistic $\chi^2 = \sum_{i=1}^r \sum_{j=1}^c \frac{(O_{ij}-\widehat{E}_{ij})^2}{\widehat{E}_{ij}}$
• Asymptotically $$\chi^2$$ with $$(r-1)(c-1)$$ degrees of freedom
• This is called the test for homogeneity

## Several Discrete RVs Example

Is the distribution of alcoholic status the same for different types of crime?

Contingency table with frequencies of criminals who committed crimes (6 RVs) and their alcoholic status (categories: alcoholic and non-alcoholic)

##          Alcoholic Non-Alcoholic
## Arson           50            43
## Rape            88            62
## Violence       155           110
## Theft          379           300
## Coining         18            14
## Fraud           63           144

## Several Discrete RVs Example

##
##  Pearson's Chi-squared test
##
## data:  ct
## X-squared = 49.731, df = 5, p-value = 1.573e-09

Chi2_0 = (chifit$observed-chifit$expected)^2/chifit$expected; Chi2_0
##            Alcoholic Non-Alcoholic
## Arson     0.01617684    0.01809979
## Rape      0.97600214    1.09202023
## Violence  1.62222220    1.81505693
## Theft     1.16680759    1.30550686
## Coining   0.07191850    0.08046750
## Fraud    19.61720859   21.94912045

## Several Discrete RVs Example

Most of the contribution to the test statistic comes from the crime fraud

Eliminate fraud and retest

##
##  Pearson's Chi-squared test
##
## data:  ct[-6, ]
## X-squared = 1.1219, df = 4, p-value = 0.8908

Conclusion:
Conditional on the criminal not committing fraud,
we cannot reject the hypothesis that alcoholic status has the same distribution for all crimes
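
The slides do not show how `ct` and `chifit` were created. The sketch below is one way to reproduce both tests, assuming `ct` is a matrix holding the counts of the contingency table and `chifit` is the result of `chisq.test(ct)`; the object names `E` and `refit` are introduced here only for illustration.

```r
# Counts copied from the contingency table; this construction of ct is an assumption
ct = matrix(c(50, 88, 155, 379, 18, 63,
              43, 62, 110, 300, 14, 144),
            ncol = 2,
            dimnames = list(c("Arson","Rape","Violence","Theft","Coining","Fraud"),
                            c("Alcoholic","Non-Alcoholic")))
chifit = chisq.test(ct)                        # test for homogeneity, df = (6-1)*(2-1) = 5
chifit$statistic
# expected frequencies by hand: n_i * phat_j = row total * column total / n
E = outer(rowSums(ct), colSums(ct))/sum(ct)
all.equal(E, chifit$expected, check.attributes = FALSE)
refit = chisq.test(ct[-6, ])                   # retest with the fraud row removed
refit$statistic; refit$p.value
```

Dropping the sixth row removes fraud, so the second test has $$(5-1)(2-1) = 4$$ degrees of freedom.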