Categorical Data Analysis

This course develops a critical approach to the analysis of categorical data, which are often encountered in health sciences research. It covers both the technical side (the mechanics of the statistical methodology behind the ideas) and the application of these methods to health-related data. The main objectives of the course are:

  • To develop an intuitive and critical approach to the analysis of frequency tables.

  • To examine the basic ideas and methods of generalized linear models (e.g. logistic regression, multinomial logistic regression, ordinal logistic regression, and loglinear models).

  • To gain experience in categorical data analysis using statistical software packages (SPSS, OpenEpi, etc.).

Quick review:

Basic sampling techniques, variable types, probability distributions relevant to the course (binomial, multinomial, Poisson, etc.), expectation, the concept of likelihood, and tests for one-way tables.

Contingency tables: Review of 2 x 2 tables and r x c tables, tests for independence and homogeneity of proportions, Fisher's exact test, McNemar's test. Introduction to three-way tables, full and conditional independence, collapsing. (Contingency Table Analysis)
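
As an illustration of the test for independence listed above, here is a minimal sketch of the Pearson chi-square statistic for an r x c table, written in plain Python. The function names and the counts are hypothetical and chosen only for illustration; in practice a statistical package would be used instead.

```python
import math

def pearson_chi_square(table):
    """Return (statistic, degrees of freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: (row total * column total) / n
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df

def p_value_one_df(stat):
    """Upper-tail chi-square probability, valid for 1 degree of freedom only."""
    return math.erfc(math.sqrt(stat / 2))

# Hypothetical 2 x 2 table: rows = exposed / unexposed, columns = disease / no disease
counts = [[20, 80], [10, 90]]
stat, df = pearson_chi_square(counts)
p = p_value_one_df(stat)
print(f"chi-square = {stat:.3f}, df = {df}, p = {p:.4f}")
```

The chi-square approximation assumes reasonably large expected counts; for sparse 2 x 2 tables, Fisher's exact test is the usual alternative.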

Rate and ratio formulas
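
The outline does not say which formulas are covered. As a sketch, assuming the usual 2 x 2 measures (risk ratio, odds ratio, risk difference) are among them, they can be computed as follows. The cell labels a, b, c, d, the function names, and the counts are hypothetical, and the confidence interval uses the standard large-sample (Woolf) approximation on the log odds scale.

```python
import math

def two_by_two_measures(a, b, c, d):
    """Measures of association for a 2 x 2 table.

    a = exposed with disease,   b = exposed without disease
    c = unexposed with disease, d = unexposed without disease
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return {
        "risk_ratio": risk_exposed / risk_unexposed,
        "odds_ratio": (a * d) / (b * c),
        "risk_difference": risk_exposed - risk_unexposed,
    }

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Approximate 95% confidence interval for the odds ratio (Woolf method)."""
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)

# Hypothetical counts
measures = two_by_two_measures(20, 80, 10, 90)
lower, upper = odds_ratio_ci(20, 80, 10, 90)
print(measures)
print(f"95% CI for the odds ratio: ({lower:.2f}, {upper:.2f})")
```

Both functions require all four cells to be non-zero; a zero cell needs a continuity correction or an exact method.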

Introduction to generalized linear models: Logistic regression, interpretation of coefficients, model selection, diagnostics, goodness of fit. Introduction to multinomial regression, polytomous regression, and Poisson regression.
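
To make the interpretation of logistic regression coefficients concrete, here is a small sketch for the simplest case: a single binary predictor x. In this case the maximum likelihood estimates have a closed form, and exp(b1) equals the sample odds ratio. The function names and counts are hypothetical; models with several or continuous predictors need iterative fitting in a statistical package.

```python
import math

def logit(p):
    """Log odds of a probability p."""
    return math.log(p / (1 - p))

def fit_binary_predictor(a, b, c, d):
    """Closed-form ML estimates for logit P(Y=1 | x) = b0 + b1 * x, x in {0, 1}.

    a, b = events / non-events when x = 1
    c, d = events / non-events when x = 0
    """
    b0 = logit(c / (c + d))        # log odds of the event when x = 0
    b1 = logit(a / (a + b)) - b0   # change in log odds from x = 0 to x = 1
    return b0, b1

def predicted_probability(b0, b1, x):
    """Inverse logit of the linear predictor."""
    return 1 / (1 + math.exp(-(b0 + b1 * x)))

# Hypothetical counts
b0, b1 = fit_binary_predictor(20, 80, 10, 90)
odds_ratio = math.exp(b1)
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, exp(b1) = {odds_ratio:.2f}")
print(f"P(Y=1 | x=1) = {predicted_probability(b0, b1, 1):.2f}")
```

The fitted probabilities reproduce the observed proportions because the model with one binary predictor is saturated for a 2 x 2 table.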

Loglinear models for multi-way tables

Special topics:

Books:

Agresti, Alan (2002) Categorical Data Analysis, Second Edition, Wiley

Agresti, Alan (2007) An Introduction to Categorical Data Analysis, Wiley

Hosmer, D.W. and Lemeshow, S. (2000) Applied Logistic Regression, Second Edition, Wiley