When a researcher is interested in examining distinct subgroups within a population, it is common to use a stratified random sample to better represent the entire population. This method involves dividing the population of interest into several subgroups (called strata) based on specific variables of interest and then taking a simple random sample from each stratum. Because strata are often sampled at different rates, each observation is given a weight so that estimates of population parameters reflect the population rather than the makeup of the sample.
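To make the role of weights concrete, here is a minimal sketch in Python. It assumes the population size of each stratum is known and gives every sampled unit the weight N_h / n_h (stratum population size divided by stratum sample size). That is one common choice, not necessarily the weighting scheme used by any particular survey. The stratum names, sizes, and responses are made up for illustration.

```python
# Toy illustration: strata, sizes, and responses are all made up.
population_sizes = {"urban": 8000, "rural": 2000}  # N_h for each stratum
sample = {
    "urban": [1, 0, 1, 1],  # 1 = "yes", 0 = "no"; n_h = 4
    "rural": [0, 0, 1, 0],  # n_h = 4
}


def design_weights(population_sizes, sample):
    """Weight for each stratum: N_h / n_h (assumed weighting scheme)."""
    return {h: population_sizes[h] / len(values) for h, values in sample.items()}


def weighted_mean(sample, weights):
    """Estimate the population proportion of "yes" using the weights."""
    weighted_total = sum(weights[h] * v for h, values in sample.items() for v in values)
    weight_sum = sum(weights[h] * len(values) for h, values in sample.items())
    return weighted_total / weight_sum


def unweighted_mean(sample):
    """Treat the data as if it were a simple random sample."""
    values = [v for stratum_values in sample.values() for v in stratum_values]
    return sum(values) / len(values)


weights = design_weights(population_sizes, sample)
print("unweighted:", unweighted_mean(sample))
print("weighted:  ", weighted_mean(sample, weights))
```

In this toy data the two strata are sampled equally even though the urban stratum is four times larger, so the unweighted estimate gives rural responses too much influence.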

Many people fail to recognize that data from a stratified random sample should not be treated as a simple random sample (SRS), as Kathy Kamp, Professor of Anthropology, mentions in an __earlier blog post__. The following example explains why it is important to treat stratified random samples and SRS differently.

In 2010, CBS and the New York Times conducted a national phone survey (a stratified random sample) of 1,087 subjects as part of “a continuing series of monthly surveys that solicit[ed] public opinion on a range of political and social issues” (ICPSR 33183, 2012 March 15). In addition to political preference, they gathered information on race, sex, age, and region of residence.

The figure below demonstrates how population estimates vary depending on the use of weights. The unweighted graph overestimates female support for the Democratic Party (52% Democrat and 40% Republican), which in turn leads to an overestimate of the share of Democrats in the nation. When weights are properly incorporated into the analysis, however, the proportions are much closer (46% Democrat and 45% Republican).

As demonstrated above, the weighted and unweighted graphs, and the resulting proportions, differ. Specifically, the number and percent of Republican supporters increase when we take the weights into account. The weighted graph and proportions give a more accurate estimate of Political Preference by Sex in the population than the unweighted graph does.
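The same comparison can be sketched in a few lines of Python. The records below are hypothetical and are not the CBS/New York Times data; each one carries a made-up survey weight. The `party_shares` helper is an illustrative name, not part of any library. It computes party proportions within one group either by counting respondents or by summing their weights.

```python
from collections import defaultdict

# Hypothetical respondents: (sex, party, survey weight). All values are made up.
respondents = [
    ("F", "Democrat", 0.8),
    ("F", "Democrat", 0.7),
    ("F", "Republican", 1.5),
    ("M", "Democrat", 1.0),
    ("M", "Republican", 1.2),
    ("M", "Republican", 0.8),
]


def party_shares(records, group, use_weights=True):
    """Proportion of each party within `group`, weighted or unweighted."""
    totals = defaultdict(float)
    for sex, party, weight in records:
        if sex == group:
            # Unweighted: every respondent counts as 1; weighted: counts as its weight.
            totals[party] += weight if use_weights else 1.0
    grand_total = sum(totals.values())
    return {party: total / grand_total for party, total in totals.items()}


print("unweighted:", party_shares(respondents, "F", use_weights=False))
print("weighted:  ", party_shares(respondents, "F", use_weights=True))
```

With these made-up weights, the Republican respondent in the female group stands in for more of the population than either Democratic respondent, so the weighted shares move closer together than the raw counts suggest.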

Try it on your own!

Through a summer MAP with Pam Fellers and Shonda Kuiper, we have created a Political Data app using this dataset. Follow __this link__ to view the influence of weights on the population estimates for all the subgroups within this dataset. For example, select the X Axis Variable to be “Region” and the Y Axis Variable to be “Political Preference”. What do you notice about the weighted graph in comparison to the unweighted graph? You can also find datasets and several student lab activities that give details on proper estimation and testing for survey (weighted) data at __this website__.