Assignment 2-3

21 April 2016

In previous assignments I’ve been looking into the association between union membership and political participation (both categorical variables), using the Outlook On Life surveys. For the present assignment we’re to generate a correlation coefficient, which requires two quantitative variables, so I had to use other variables. I decided to test whether younger respondents tend to have more positive views of Occupy Wall Street (OWS).

Here’s the code:


# Import relevant libraries
import pandas
import numpy
import seaborn
import scipy.stats  # import the submodule explicitly; a bare "import scipy" may not load scipy.stats

# Read data & print size of dataframe
data = pandas.read_csv('../../Data Management and Visualization/Data/ool_pds.csv', low_memory = False)
print (data.shape)

# Only variable W1_D16 contains missing values that need to be recoded
data['W1_D16'] = data['W1_D16'].replace(-1, numpy.nan).replace(998, numpy.nan)

sub = data[['PPAGE', 'W1_D16']].dropna()
scat = seaborn.regplot(x="PPAGE", y="W1_D16", fit_reg=True, data=sub)

print ('Association between age and opinion of OWS')
print (scipy.stats.pearsonr(sub['PPAGE'], sub['W1_D16']))

And here’s the output:


Association between age and opinion of OWS
(-0.050642104468121348, 0.030560880228248256)

There’s a negative and statistically significant (p < .05) correlation between age and opinions of OWS, so yes, younger respondents do seem to have a somewhat more positive view of OWS. However, the correlation coefficient is very small (r = -.05), which implies that age explains only about 0.26% of the variation in opinions of OWS (r² ≈ .0026).
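
The share of variation explained is simply the square of the correlation coefficient. Here’s a minimal sketch of that arithmetic, using the coefficient from the output above; the variable names are mine and purely illustrative. Squaring the unrounded coefficient rather than -.05 gives a figure just above a quarter of a percent.

```python
# Pearson correlation coefficient copied from the output above
r = -0.050642104468121348

# Squaring r gives the coefficient of determination: the share of
# variation in one variable linearly accounted for by the other
r_squared = r ** 2

print('r squared: {:.4f}'.format(r_squared))
print('Variation explained: {:.2f}%'.format(r_squared * 100))
```

Either way the conclusion is the same: the association is statistically detectable in a sample this large, but age tells us very little about any one respondent’s opinion.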
