Introduction to Nonparametric Statistics
Description
Nonparametric analogs of the one- and two-sample t-tests and analysis of variance; the sign test, median test, Wilcoxon’s tests, and the Kruskal-Wallis and Friedman tests; tests of independence. Nonparametric regression and nonparametric density estimation; modern nonparametric techniques; nonparametric confidence interval estimates.
Textbooks
Required
- KM: Kloke and McKean (2015). Nonparametric Statistical Methods Using R
Optional
- ET: Efron and Tibshirani (1994). An Introduction to the Bootstrap
- FM: Friendly and Meyer (2015). Discrete Data Analysis with R
- G: Good (2005). Permutation, Parametric, and Bootstrap Tests of Hypotheses
- Ha: Hall (1992). The Bootstrap and Edgeworth Expansion
- HP: Henderson and Parmeter (2015). Applied Nonparametric Econometrics
- HMT: Hoaglin, Mosteller, and Tukey (1983). Understanding Robust and Exploratory Data Analysis
- HWC: Hollander, Wolfe, and Chicken (2013). Nonparametric Statistical Methods
- Lo: Lovasz (2012). Large Networks and Graph Limits
- L: Lehmann (2006). Nonparametrics: Statistical Methods Based on Ranks
- MQJH: Müller, Quintana, Jara, and Hanson (2015). Bayesian Nonparametric Data Analysis
- PE: Patrangenaru and Ellingson (2015). Nonparametric Statistics on Manifolds and Their Applications to Object Data Analysis
- RW: Rasmussen and Williams (2006). Gaussian Processes for Machine Learning
- S: Silverman (1986). Density Estimation for Statistics and Data Analysis
- T: Tsybakov (2009). Introduction to Nonparametric Estimation
- W: Wasserman (2006). All of Nonparametric Statistics
Suggested Journal Articles
- B: Basu (1980). Randomization Analysis of Experimental Data: The Fisher Randomization Test
- Bh: Bhattacharya (2015). Power of Graph-Based Two-Sample Tests
- BHLLSW: Buja, Cook, Hofmann, Lawrence, Lee, Swayne, and Wickham (2009). Statistical Inference for Exploratory Data Analysis and Model Diagnostics
- CB: Carpenter and Bithell (2000). Bootstrap Confidence Intervals: When, Which, What? A Practical Guide for Medical Statisticians
- Ch: Chatterjee (2012). Matrix Estimation by Universal Singular Value Thresholding
- C: Critchlow (1986). A Unified Approach to Constructing Nonparametric Rank Tests
- D83: Diaconis (1983). Theories of Data Analysis: From Magical Thinking Through Classical Statistics
- DE: Diaconis and Efron (1983). Computer Intensive Methods in Statistics
- DHa: Diaconis and Holmes (1994). Gray Codes for Randomization Procedures
- DHb: Diaconis and Holmes (1994). Three Examples of the Markov Chain Monte Carlo Method
- Ef: Efron (1987). Better Bootstrap Confidence Intervals
- Ef14: Efron (2014). Estimation and Accuracy after Model Selection
- ENK: Eklund, Nichols, and Knutsson (2015). Can Parametric Statistical Methods Be Trusted for fMRI Based Group Studies?
- FR: Friedman and Rafsky (1979). Multivariate Generalizations of the Wolfowitz and Smirnov Two-Sample Tests
- F: Friendly (1994). Mosaic Displays for Multi-Way Contingency Tables
- GH: Greenacre and Hastie (1987). The Geometric Interpretation of Correspondence Analysis
- H: Holmes (2008). Multivariate Data Analysis: The French Way
- JWH: Josse, Wager, and Husson (2014). Confidence Areas for Fixed-Effects PCA
- NWC: Nahhas, Wolfe, and Chen (2002). Ranked Set Sampling: Cost and Optimal Set Size
- MRR: Marin, Pudlo, Robert, and Ryder (2012). Approximate Bayesian Computational Methods
- MW: Milan and Whittaker (1995). Application of the Parametric Bootstrap to Models that Incorporate a Singular Value Decomposition
- NH: Nichols and Holmes (2001). Nonparametric Permutation Tests For Functional Neuroimaging: A Primer with Examples
- PT: Pagano and Tritchler (1983). On Obtaining Permutation Distributions in Polynomial Time
- RH: Rousseeuw and Hubert (2015). Statistical Depth Meets Computational Geometry: A Short Survey
- SMN: Silver, Montana, and Nichols (2011). False Positives in Neuroimaging Genetics Using Voxel-Based Morphometry Data
- Tu: Tukey (1974). Mathematics and the Picturing of Data
- WHE: Wager, Hastie, and Efron (2014). Confidence Intervals for Random Forests: The Jackknife and the Infinitesimal Jackknife
- WRR: Witztum, Rips, and Rosenberg (1994). Equidistant Letter Sequences in the Book of Genesis
Suggested Links
Instructor
Christof Seiler, Sequoia Hall 116 (christof.seiler [at] stanford [dot] edu)
Office hours: Wednesdays from 10:00 to 11:30 am in 105 at Sequoia
TAs
- Nan Bi (nbi [at] stanford [dot] edu)
Office hours:
- Wednesdays from 2:30 to 3:30 pm in 420-147
- Thursdays from 10:30 to 11:30 am in Fishbowl at Sequoia
- Lexi Guan (lguan [at] stanford [dot] edu)
Office hours:
- Mondays from 10:00 to 11:00 am in Bowker at Sequoia
- Fridays from 4:00 to 5:00 pm in Fishbowl at Sequoia
Grading
- Midterm Project Proposal (3 pages with references, 10%, due by April 29th)
- Final Project (12 pages plus references, 40%, due by June 3rd)
- Projects can be done alone or in pairs
- Weekly homework (40%)
- Class participation (10%)
Midterm Project Content
Some optional guidelines:
- State the problem
- Describe the data
- Review what statistical methods are available to analyze your data
- List their advantages and disadvantages; in particular, compare nonparametric to parametric methods
- Propose a solution using nonparametric methods
- List all the tasks that you plan to do: collecting data, programming, simulating data, estimating, testing, etc.
Final Project
Example of an excellent final project on using kernel density estimation to predict conflicts in the Congo:
Slides
| Lecture | Topic(s) | Background Material |
| --- | --- | --- |
| 1 | Logistics and Introduction | KM Chapter 1 |
| 2 | Sign Test and Signed-Rank Wilcoxon | KM Chapters 2.1, 2.2, and 2.3 |
| 3 | Robustness | KM Chapter 2.5 |
| 4 | Bootstrap (Part 1) and Bootstrap (Example) | KM Chapter 2.4 |
| 5 | Bootstrap (Part 2) | KM Chapter 2.4 |
| 6 | Proportion Problems and \( \chi^2 \) Tests (Part 1) | KM Chapters 2.6 and 2.7 |
| 7 | \( \chi^2 \) Tests (Part 2) | KM Chapter 2.7 |
| 8 | Two-Sample Problems (Part 1) | KM Chapters 3.1 and 3.2 |
| 9 | Two-Sample Problems (Part 2) | KM Chapters 3.2 and 3.4 |
| 10 | Permutation Tests (Part 1) | G Chapter 1 |
| 11 | Permutation Tests (Part 2) and Neuroimaging (Example) | WRR and NH |
| 12 | Rank-Based Linear Regression | KM Chapters 4.1, 4.2, 4.3, 4.4, and 4.8 |
| 13 | Nonlinear Regression (Part 1) | W Chapter 4 |
| 14 | Nonlinear Regression (Part 2) | W Chapter 5 |
| 15 | Nonlinear Regression (Part 3) | W Chapter 5 |
| 16 | Bayesian Nonparametrics (Part 1) | W16 |
| 17 | Bayesian Nonparametrics (Part 2) and BNP in Practice | W16 |
| 18 | ANOVA | KM Chapters 5.1, 5.2, 8.1, 8.2, HMT, and G16 |
| 19 | Survival Analysis (Part 1) | KM Chapters 6.1 and 6.2 |
| 20 | Survival Analysis (Part 2) and Midterm Proposal Discussion | KM Chapters 6.1 and 6.2 |
| 21 | Ranked Set Sampling | HWC Chapter 15 and NWC |
| 22 | Wavelets | HWC Chapter 13 and W Chapter 9 |
| 23 | Graph Limits or Graphons | Lo Part 1 and Ch |
| 24 | Inference for Data Visualization | D83, BHLLSW, and JWH |
| 25 | Multivariate Nonparametric Tests | Tu, Ho, RH, and FR |
| 26 | Bootstrap (Part 3) | ET Chapter 12, Ha, and Lo |
| 27 | Bootstrap (Part 4) | ET Chapter 14 and Ha |
| 28 | Wrap-up | |
Homework
Late Homework Policy
- We will deduct 20% of the maximum score for each late day
- Each student can hand in one homework late (within two days after the deadline)
- Please contact me in case of emergencies