In previous __posts__, we discussed the challenges of accounting for weights in stratified random samples. While the calculation of population estimates is relatively standard, there is no universally accepted norm for statistical inference on weighted data. However, some methods are more appropriate than others. This post examines three methods for analyzing weighted data and discusses which is most appropriate to use, given the information available.

Three common methods for testing stratified random samples (weighted data) are:

- The **Simple Random Sample (SRS) Method** assumes that the sample is an unweighted sample that is representative of the population, and does not include adjustments based on the weights that are assigned to each entry in the data set. This is the basic chi-square test taught in most introductory statistics classes.
- The **Raw Weight (RW) Method** multiplies each entry by its respective weight and runs the analysis on this adjusted weighted sample.
- The **Rao-Scott Method** takes into account both sampling variability and variability among the assigned weights to adjust the chi-square statistic from the RW method.
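To make the difference between the first two methods in the list above concrete, here is a minimal Python sketch. The counts and the constant weight of 1000 are made up for illustration and are not taken from any survey; `pearson_chi_square` is a helper written for this example, not a library function.

```python
def pearson_chi_square(table):
    """Return the Pearson chi-square statistic for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical sample counts (rows: two regions; columns: user, non-user).
sample_counts = [[30, 70], [45, 55]]

# Assumption for illustration: every respondent carries the same weight.
weight = 1000
weighted_counts = [[cell * weight for cell in row] for row in sample_counts]

srs_statistic = pearson_chi_square(sample_counts)   # SRS method: ignores weights
rw_statistic = pearson_chi_square(weighted_counts)  # RW method: weighted counts

print(f"SRS chi-square: {srs_statistic:.2f}")
print(f"RW chi-square:  {rw_statistic:.2f}")
```

With a constant weight, the RW statistic is simply the weight times the SRS statistic, because the test treats the weighted total as if it were the number of people actually sampled. Real survey weights vary from person to person, but the same inflation occurs.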

One example of a data set that incorporates a weight variable is the **Complementary and Alternative Medicine (CAM) Survey**, which was conducted by the National Center for Health Statistics (NCHS) in 2012. For the CAM survey, NCHS researchers gathered information on numerous variables such as race, sex, region, employment, marital status, and whether each individual surveyed used various types of CAM. In this data set, weights were assigned based on race, sex, and age.

Among African Americans who used CAM for wellness, we conducted a chi-square test to determine whether the proportion of physical therapy users differed significantly across regions. Below is a table comparing the test statistics and p-values for each of the three statistical tests:

The SRS method assumes that we are analyzing data collected from a simple random sample instead of a stratified random sample. Since the proportions in our sample do not represent the population, this method is inappropriate. The RW method multiplies each entry by its weight, giving a slightly more representative sample. While this method is useful for producing population estimates, multiplying by the weights inflates the apparent sample size and tends to give p-values that are much too small. Thus, both the SRS and RW methods are inappropriate for testing this data set. The __Rao-Scott method__ adjusts for non-SRS sample designs and accounts for the weights, resulting in a better representation of the population.
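The idea of scaling a weighted chi-square statistic back down can be sketched in Python as follows. This is not the Rao-Scott correction itself, which relies on design effects estimated from the full survey design. As a rough stand-in, the sketch uses Kish's approximate design effect for unequal weights, n·Σw² / (Σw)², and divides the statistic by it. All records, weights, and function names below are hypothetical.

```python
def pearson_chi_square(table):
    """Return the Pearson chi-square statistic for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


def kish_design_effect(weights):
    """Kish's approximation: n * sum(w^2) / (sum(w))^2, equal to 1 for equal weights."""
    n = len(weights)
    return n * sum(w * w for w in weights) / sum(weights) ** 2


def deff_adjusted_chi_square(records, row_levels, col_levels):
    """Chi-square on weighted counts rescaled to the sample size, divided by Kish's deff."""
    weights = [weight for _, _, weight in records]
    n = len(records)
    total_weight = sum(weights)
    table = [[0.0 for _ in col_levels] for _ in row_levels]
    for row_value, col_value, weight in records:
        # Rescale so the table sums to n, the number of people actually sampled.
        cell_share = weight * n / total_weight
        table[row_levels.index(row_value)][col_levels.index(col_value)] += cell_share
    return pearson_chi_square(table) / kish_design_effect(weights)


row_levels = ["A", "B"]            # hypothetical regions
col_levels = ["user", "non-user"]  # hypothetical response

# Hypothetical records of (region, response, weight); the weights are made up.
records = (
    [("A", "user", 500.0)] * 30
    + [("A", "non-user", 1500.0)] * 70
    + [("B", "user", 800.0)] * 45
    + [("B", "non-user", 1200.0)] * 55
)

deff = kish_design_effect([weight for _, _, weight in records])
adjusted = deff_adjusted_chi_square(records, row_levels, col_levels)
print(f"Kish design effect: {deff:.3f}")
print(f"Adjusted chi-square: {adjusted:.3f}")
```

When all weights are equal, the design effect is 1 and the adjusted statistic matches the SRS statistic. The more the weights vary, the more the statistic is reduced. For a real analysis, survey software that implements the Rao-Scott correction from the actual design is the better choice.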

Try it on your own!

Through a summer MAP with Pam Fellers and Shonda Kuiper, we created a __CAM Data shiny app__. Go to this app and compare how population estimates and test statistics can change depending on the statistical method used. For example, set the **X Axis Variable** to *Sex* and the **Color By** variable to *Surgery*. Examine the chi-square values from each of the three types of tests. Which test gives the most extreme p-value? The least extreme? You can also find multiple data sets and student lab activities giving details on how to properly analyze weighted data __here__.