In previous assignments I looked into the association between union membership and political participation (both categorical variables), using the Outlook on Life surveys. For the present assignment we are to generate a correlation coefficient, which requires two quantitative variables, so I had to use other variables. I decided to test whether younger respondents tend to have more positive views of Occupy Wall Street (OWS).
Here’s the code:
# Import relevant libraries
import numpy
import pandas
import scipy.stats
import seaborn
# Read data
data = pandas.read_csv('../../Data Management and Visualization/Data/ool_pds.csv', low_memory=False)
# Only variable W1_D16 contains missing values that need to be recoded
data['W1_D16'] = data['W1_D16'].replace(-1, numpy.nan).replace(998, numpy.nan)
sub = data[['PPAGE', 'W1_D16']].dropna()
scat = seaborn.regplot(x="PPAGE", y="W1_D16", fit_reg=True, data=sub)
print ('Association between age and opinion of OWS')
print (scipy.stats.pearsonr(sub['PPAGE'], sub['W1_D16']))
And here’s the output:
Association between age and opinion of OWS
There’s a negative and statistically significant correlation between age and opinions on OWS, so yes, younger respondents do tend to have a somewhat more positive view of OWS. However, the correlation coefficient is very small, -.05. Squaring it gives .0025, which means that age could explain a mere 0.25% of the variation in opinions on OWS.
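To make the step from the coefficient to the "0.25%" figure explicit, here is a minimal sketch in plain Python. The helper names `pearson_r` and `variance_explained` are my own, not part of any library, and the toy numbers are invented purely for illustration; they are not from the survey. The first function computes the same quantity that `scipy.stats.pearsonr` returns as its first value, and the second simply squares it.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def variance_explained(r):
    """Share of variance in one variable accounted for by the other (r squared)."""
    return r ** 2


# Toy, made-up values for illustration only (NOT survey data)
toy_age = [20, 30, 40, 50, 60]
toy_rating = [70, 40, 65, 35, 60]

r = pearson_r(toy_age, toy_rating)
print('r = %.3f, r squared = %.4f' % (r, variance_explained(r)))

# The coefficient reported above, rounded to -.05
print('%.2f%%' % (variance_explained(-0.05) * 100))  # prints 0.25%
```

The sign of r drops out when squaring, so the share of variance explained says nothing about direction. That is why the negative sign and the tiny magnitude are reported separately above.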