Assignment 2-4

In previous assignments I’ve looked into the association between union membership and political participation among paid employees, using the Outlook On Life surveys dataset. I found that respondents who have a union member in their household are more likely to have engaged in political participation over the past 2 years. This was consistent with what I expected on the basis of a study by Kerrissey and Schofer.

In the present assignment we’re to check for a potential moderator. The study by Kerrissey and Schofer found that the association between union membership and political participation is stronger for lower educated respondents, possibly because they have fewer other sources of political capital at their disposal.

Against this background I decided to test the association between union membership and political participation for different subgroups based on education. The OOL dataset has a variable with four education levels (less than high school; high school; some college; bachelor’s degree or higher). Since there are relatively few respondents with less than high school, I decided to lump together the first two categories.
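The lumping step can be sketched roughly as follows. This is only an illustration: the numeric codes 1 to 4 and the function name are assumptions, and only the group labels ("low", "medium", "high") are taken from the output further down.

```python
# Hypothetical numeric codes for the four education levels described above.
# The real coding in the OOL dataset may differ.
EDUCATION_GROUPS = {
    1: "low",     # less than high school
    2: "low",     # high school (lumped together with the first category)
    3: "medium",  # some college
    4: "high",    # bachelor's degree or higher
}


def education_group(code):
    """Return 'low', 'medium' or 'high', or None for a missing or unknown code."""
    return EDUCATION_GROUPS.get(code)


codes = [1, 2, 3, 4, None]
groups = [education_group(code) for code in codes]
print(groups)
```

Returning None for anything outside the mapping keeps respondents with missing education data out of all three subgroups rather than silently assigning them to one.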

First of all, here’s a grouped bar chart showing what percentage of respondents have engaged in political participation, by union membership (at household level) and by education level. Political participation levels appear higher for higher educated respondents, which will not come as a surprise. More surprisingly, the association between union membership and political participation appears stronger for higher educated respondents.

So let’s take a look at the chi-square tests for the different education levels. The entire Python script for my analysis can be found here. Below I copy some of the output from the code:

measure: "political_participation", group: "employees"
 
 
Results for "low"
union                     No  Yes
political_participation          
0.0                      158   36
1.0                       87   24
 
chi-square value, p value, expected counts
(0.24815891922850686, 0.61837443272471648, 1, array([[ 155.83606557,   38.16393443],
       [  89.16393443,   21.83606557]]))
 
 
Results for "medium"
union                     No  Yes
political_participation          
0.0                      130   26
1.0                      124   41
 
chi-square value, p value, expected counts
(2.7736422284672679, 0.095827887556796373, 1, array([[ 123.43925234,   32.56074766],
       [ 130.56074766,   34.43925234]]))
 
 
Results for "high"
union                     No  Yes
political_participation          
0.0                      157   27
1.0                      154   60
 
chi-square value, p value, expected counts
(9.5760783080978147, 0.0019712903653131314, 1, array([[ 143.77889447,   40.22110553],
       [ 167.22110553,   46.77889447]]))

The results show that the chi-square value is smallest for the lowest education group and largest for the highest education group, and that the association is only significant for the highest education group (p ≈ 0.002). Note that a post-hoc test is not required because the explanatory variable has only two levels.
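For readers who want to check the numbers, here is a minimal sketch that recomputes the chi-square test for each 2x2 table using only the standard library. The reported values appear consistent with Yates' continuity correction, which `scipy.stats.chi2_contingency` applies by default to 2x2 tables; that the original script used that function is an assumption. The third element in each printed tuple above would then be the degrees of freedom (1). The function name `chi_square_2x2` is hypothetical.

```python
import math


def chi_square_2x2(table, yates=True):
    """Chi-square test of independence for a 2x2 table of counts.

    Returns (chi-square value, p value). With yates=True, each
    |observed - expected| is reduced by 0.5 (but not below zero).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                diff = max(diff - 0.5, 0.0)
            chi2 += diff ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Counts copied from the crosstabs for paid employees
# (rows: participation 0/1, columns: union No/Yes).
tables = {
    "low": [[158, 36], [87, 24]],
    "medium": [[130, 26], [124, 41]],
    "high": [[157, 27], [154, 60]],
}

results = {level: chi_square_2x2(table) for level, table in tables.items()}
for level, (chi2, p_value) in results.items():
    print(level, round(chi2, 4), round(p_value, 4))
```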

This comes as a surprise. Based on the study by Kerrissey and Schofer, I expected that the association between union membership and political participation would be stronger for the lower educated respondents. However, using the OOL data, the association is only significant for the highest education level.
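A chi-square value depends on the size of a subgroup as well as on the strength of the association, so it can help to put an effect size next to it. The sketch below computes the odds ratio per education level from the same counts. The function name is hypothetical, and this calculation is an addition for illustration only.

```python
def odds_ratio(table):
    """Odds of participating for union households divided by the odds
    for non-union households.

    table rows: participation 0, 1; table columns: union No, Yes.
    """
    (no_part_nonunion, no_part_union), (part_nonunion, part_union) = table
    odds_union = part_union / no_part_union
    odds_nonunion = part_nonunion / no_part_nonunion
    return odds_union / odds_nonunion


# Counts copied from the crosstabs for paid employees.
employee_tables = {
    "low": [[158, 36], [87, 24]],
    "medium": [[130, 26], [124, 41]],
    "high": [[157, 27], [154, 60]],
}

odds_ratios = {level: odds_ratio(t) for level, t in employee_tables.items()}
for level, value in odds_ratios.items():
    print(level, round(value, 2))
```

The odds ratios come out at about 1.2, 1.7 and 2.3 for the low, medium and high groups, which points in the same direction as the chi-square values.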

Note for students reviewing this assignment: the elaboration below isn’t strictly speaking part of the assignment. I wouldn’t want to waste your time so feel free to skip the rest of the article and make your assessment based on the text above.

I can’t really explain why my analysis leads to a result that seems at odds with the Kerrissey and Schofer study, but here are some considerations.

First of all, it’s entirely possible that I made some silly mistake in my analysis. And if that’s not the case, the method applied by Kerrissey and Schofer is different in a number of ways from my analysis. For example, they did regression analyses taking a number of relevant background variables into account. Further, they found a significant interaction between union membership and education in two different datasets. One could argue that Kerrissey and Schofer’s analysis is superior and their finding therefore more credible. Even so, it would be nice to be able to explain why a simpler model results in an opposite outcome.
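Separate chi-square tests per subgroup don't directly test an interaction. One simple way to get closer to that question without a full regression is a test of whether the odds ratios are homogeneous across the education groups. The sketch below uses Woolf's test, which is a swapped-in technique and not the method used by Kerrissey and Schofer. The function name is hypothetical, and the p value shortcut only holds for three groups (2 degrees of freedom).

```python
import math


def woolf_homogeneity(tables):
    """Woolf's test for homogeneity of odds ratios across 2x2 tables.

    Returns (test statistic, degrees of freedom). Assumes no zero cells.
    """
    log_ors, weights = [], []
    for (a, b), (c, d) in tables:
        log_ors.append(math.log((a * d) / (b * c)))
        # Inverse of the estimated variance of the log odds ratio.
        weights.append(1.0 / (1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d))
    pooled = sum(w * l for w, l in zip(weights, log_ors)) / sum(weights)
    statistic = sum(w * (l - pooled) ** 2 for w, l in zip(weights, log_ors))
    return statistic, len(tables) - 1


# Counts copied from the crosstabs for paid employees (low, medium, high).
strata = [
    [[158, 36], [87, 24]],
    [[130, 26], [124, 41]],
    [[157, 27], [154, 60]],
]

statistic, df = woolf_homogeneity(strata)
# For exactly 2 degrees of freedom the chi-square survival function is exp(-x / 2).
p_value = math.exp(-statistic / 2.0)
print(round(statistic, 3), df, round(p_value, 3))
```

A small p value would suggest that the strength of the association really differs between education levels; a large one would mean that the differences between the subgroups could be due to chance.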

Second, characteristics of respondents might play a role. I have the impression that union members may be overrepresented in the OOL dataset, but I don’t immediately see how that would explain the different outcome. More importantly, I did my analysis on a subset consisting of respondents with paid employment. It’s entirely possible that paid employees tend to be higher educated than unemployed and retired respondents. I guess it wouldn’t hurt to rerun the analysis on the entire group of respondents.

Third, it may matter how you define and measure political participation. I used a measure that includes contacting an official, participating in a protest or march and signing a petition. Kerrissey and Schofer found an interaction for voting, protest and membership. It would be interesting to see what happens if I use just the protest variable instead of the composite measure.
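To make the composite measure concrete, here is one way such a variable could be built from three yes/no items. This is a sketch under assumptions: the argument names are hypothetical, and the rule for missing answers (the result is missing unless at least one activity is reported or all three are answered with no) is just one possible choice.

```python
def composite_participation(contacted_official, protested, signed_petition):
    """Combine three 0/1 items into one participation indicator.

    Returns 1 if any activity is reported, 0 if all three are answered
    with 0, and None if that can't be determined because of missing answers.
    """
    answers = (contacted_official, protested, signed_petition)
    if any(answer == 1 for answer in answers):
        return 1
    if all(answer == 0 for answer in answers):
        return 0
    return None


examples = [
    composite_participation(0, 1, 0),
    composite_participation(0, 0, 0),
    composite_participation(0, None, 0),
    composite_participation(None, 1, None),
]
print(examples)
```

How missing answers are treated affects who ends up in the 0 category, so it's worth checking when comparing a composite measure with a single item such as protest.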

All respondents, composite measure

When I run my analysis on the entire group of respondents rather than just paid employees, the outcome changes in that there’s now a significant association between union membership and participation, not just for the highest education group, but also the medium education group. For the lowest education group, there’s still no significant association. So this doesn’t really explain the difference.

measure: "political_participation", group: "all_respondents"
 
 
Results for "low"
union                     No  Yes
political_participation          
0.0                      466   73
1.0                      279   60
 
chi-square value, p value, expected counts
(2.4819670432296759, 0.11515814971338957, 1, array([[ 457.35193622,   81.64806378],
       [ 287.64806378,   51.35193622]]))
 
 
Results for "medium"
union                     No  Yes
political_participation          
0.0                      262   32
1.0                      279   82
 
chi-square value, p value, expected counts
(14.963425544693942, 0.00010961532600357433, 1, array([[ 242.83053435,   51.16946565],
       [ 298.16946565,   62.83053435]]))
 
 
Results for "high"
union                     No  Yes
political_participation          
0.0                      237   34
1.0                      307   95
 
chi-square value, p value, expected counts
(12.134008533047874, 0.00049510583539001945, 1, array([[ 219.05497771,   51.94502229],
       [ 324.94502229,   77.05497771]]))

All respondents, protest measure

Using the protest measure rather than the composite participation measure, the association is once again only significant for the highest educated group.

measure: "protest_demo", group: "all_respondents"
 
 
Results for "low"
union          No  Yes
protest_demo          
0.0           695  119
1.0            49   15
 
chi-square value, p value, expected counts
(2.9184647984510526, 0.087571143400387977, 1, array([[ 689.76765376,  124.23234624],
       [  54.23234624,    9.76765376]]))
 
 
Results for "medium"
union          No  Yes
protest_demo          
0.0           499   99
1.0            43   15
 
chi-square value, p value, expected counts
(2.5743436452143076, 0.10860914232973193, 1, array([[ 494.07926829,  103.92073171],
       [  47.92073171,   10.07926829]]))
 
 
Results for "high"
union          No  Yes
protest_demo          
0.0           487  104
1.0            52   23
 
chi-square value, p value, expected counts
(6.5436232099967668, 0.010526075473228987, 1, array([[ 478.3018018,  112.6981982],
       [  60.6981982,   14.3018018]]))

After these additional analyses, it’s clear that it makes a difference whether you include respondents who are not paid employees, but I don’t think that fully accounts for the difference between the analysis using the OOL dataset and Kerrissey and Schofer’s analyses. Using a ‘protest’ variable instead of a broader composite measure of political participation also didn’t help clear things up. I’m afraid I still don’t really have an explanation for the different outcomes.