Introduction to Nonparametric Statistics

Description

Nonparametric analogs of the one- and two-sample t-tests and analysis of variance: the sign test, median test, Wilcoxon’s tests, the Kruskal-Wallis and Friedman tests, and tests of independence. Nonparametric regression, nonparametric density estimation, modern nonparametric techniques, and nonparametric confidence interval estimates.

Textbooks

Required

KM: Kloke and McKean (2015). Nonparametric Statistical Methods Using R

Optional

ET: Efron and Tibshirani (1994). An Introduction to the Bootstrap
FM: Friendly and Meyer (2015). Discrete Data Analysis with R
G: Good (2005). Permutation, Parametric, and Bootstrap Tests of Hypotheses
Ha: Hall (1992). The Bootstrap and Edgeworth Expansion
HP: Henderson and Parmeter (2015). Applied Nonparametric Econometrics
HMT: Hoaglin, Mosteller, and Tukey (1983). Understanding Robust and Exploratory Data Analysis
HWC: Hollander, Wolfe, and Chicken (2013). Nonparametric Statistical Methods
Lo: Lovasz (2012). Large Networks and Graph Limits
L: Lehmann (2006). Nonparametrics: Statistical Methods Based on Ranks
MQJH: Müller, Quintana, Jara, and Hanson (2015). Bayesian Nonparametric Data Analysis
PE: Patrangenaru and Ellingson (2015). Nonparametric Statistics on Manifolds and Their Applications to Object Data Analysis
RW: Rasmussen and Williams (2006). Gaussian Processes for Machine Learning
S: Silverman (1986). Density Estimation for Statistics and Data Analysis
T: Tsybakov (2009). Introduction to Nonparametric Estimation
W: Wasserman (2006). All of Nonparametric Statistics

Instructor

Christof Seiler, Sequoia Hall 116 (christof.seiler [at] stanford [dot] edu)
Office hours: Wednesdays from 10:00 to 11:30 am in 105 at Sequoia

TAs

Nan Bi (nbi [at] stanford [dot] edu)
Office hours:
- Wednesdays from 2:30 to 3:30 pm in 420-147
- Thursdays from 10:30 to 11:30 am in Fishbowl at Sequoia
Lexi Guan (lguan [at] stanford [dot] edu)
Office hours:
- Mondays from 10:00 to 11:00 am in Bowker at Sequoia
- Fridays from 4:00 to 5:00 pm in Fishbowl at Sequoia

Grading

Midterm Project Proposal (3 pages with references, 10%, due by April 29th)
Final Project (12 pages plus references, 40%, due by June 3rd)
Projects can be done alone or in pairs
Weekly homework (40%)
Class participation (10%)

Midterm Project Content

Some optional guidelines:

State the problem
Describe the data
Review what statistical methods are available to analyze your data
List their advantages and disadvantages; in particular, compare nonparametric to parametric methods
Propose a solution using nonparametric methods
List all the tasks that you plan to do: collecting data, programming, simulating data, estimating, testing, etc.

Final Project

Example of an excellent final project on using kernel density estimation to predict conflicts in the Congo:

https://github.com/jmoore523/STATS205-DRCConflict

Slides

Lecture	Topic(s)	Background Material
1	Logistics and Introduction	KM Chapter 1
2	Sign Test and Signed-Rank Wilcoxon	KM Chapters 2.1, 2.2, and 2.3
3	Robustness	KM Chapter 2.5
4	Bootstrap (Part 1) and Bootstrap (Example)	KM Chapter 2.4
5	Bootstrap (Part 2)	KM Chapter 2.4
6	Proportion Problems and \( \chi^2 \) Tests (Part 1)	KM Chapters 2.6 and 2.7
7	\( \chi^2 \) Tests (Part 2)	KM Chapter 2.7
8	Two-Sample Problems (Part 1)	KM Chapters 3.1 and 3.2
9	Two-Sample Problems (Part 2)	KM Chapters 3.2 and 3.4
10	Permutation Tests (Part 1)	G Chapter 1
11	Permutation Tests (Part 2) and Neuroimaging (Example)	WRR and NH
12	Rank-Based Linear Regression	KM Chapters 4.1, 4.2, 4.3, 4.4, and 4.8
13	Nonlinear Regression (Part 1)	W Chapter 4
14	Nonlinear Regression (Part 2)	W Chapter 5
15	Nonlinear Regression (Part 3)	W Chapter 5
16	Bayesian Nonparametrics (Part 1)	W16
17	Bayesian Nonparametrics (Part 2) and BNP in Practice	W16
18	ANOVA	KM Chapters 5.1, 5.2, 8.1, 8.2, HMT, and G16
19	Survival Analysis (Part 1)	KM Chapters 6.1 and 6.2
20	Survival Analysis (Part 2) and Midterm Proposal Discussion	KM Chapters 6.1 and 6.2
21	Ranked Set Sampling	HWC Chapter 15 and NWC
22	Wavelets	HWC Chapter 13 and W Chapter 9
23	Graph Limits or Graphons	Lo Part 1 and Ch
24	Inference for Data Visualization	D83, BHLLSW, and JWH
25	Multivariate Nonparametric Tests	Tu, Ho, RH, and FR
26	Bootstrap (Part 3)	ET Chapter 12, Ha, and Lo
27	Bootstrap (Part 4)	ET Chapter 14 and Ha
28	Wrapup

Homework

Assignment	Deadline	Solution
Homework 1	April 7th at 1:30 pm	Solution 1
Homework 2	April 15th at 1:30 pm	Solution 2
Homework 3	April 25th at 1:30 pm	Solution 3
Homework 4	May 10th at 1:30 pm	Solution 4
Homework 5	May 19th at 1:30 pm	Solution 5
Homework 6	May 27th at 1:30 pm	Solution 6

Late Homework Policy

We will deduct 20% of the maximum score for each late day
Each student can hand in one homework late (within two days after the deadline)
Please contact me in case of emergencies

Christof Seiler