Python 3.5.1 |Anaconda 4.0.0 (x86_64)| (default, Dec 7 2015, 11:24:55)
Type "copyright", "credits" or "license" for more information.

IPython 4.1.2 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.
%guiref   -> A brief reference about the graphical user interface.


In [1]: runfile('/Users/dirkkloosterboer/Coursera/Data Analysis and Interpretation/Data Analysis Tools/Code/ool_2_1.py', wdir='/Users/dirkkloosterboer/Coursera/Data Analysis and Interpretation/Data Analysis Tools/Code')

(2294, 436)

ANOVA to compare means by MSA [metro] status

                            OLS Regression Results
==============================================================================
Dep. Variable:                 W1_N1H   R-squared:                       0.004
Model:                            OLS   Adj. R-squared:                  0.003
Method:                 Least Squares   F-statistic:                     7.079
Date:                Tue, 05 Apr 2016   Prob (F-statistic):            0.00786
Time:                        20:40:16   Log-Likelihood:                -9214.0
No. Observations:                1973   AIC:                         1.843e+04
Df Residuals:                    1971   BIC:                         1.844e+04
Df Model:                           1
Covariance Type:            nonrobust
====================================================================================
                       coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------------
Intercept           56.8739      1.734     32.805      0.000        53.474    60.274
C(PPMSACAT)[T.1]     4.8965      1.840      2.661      0.008         1.287     8.506
==============================================================================
Omnibus:                       52.822   Durbin-Watson:                   1.906
Prob(Omnibus):                  0.000   Jarque-Bera (JB):               53.770
Skew:                          -0.380   Prob(JB):                     2.11e-12
Kurtosis:                       2.726   Cond. No.                         5.80
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

          median       mean        std     len
PPMSACAT
0           50.0  56.873874  28.022007   222.0
1           60.0  61.770417  25.541425  1751.0
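The F-statistic in the ANOVA table can be cross-checked from the group summary alone. The sketch below is a minimal illustration, not the script that produced this log: the helper name `anova_f_from_summary` is hypothetical, and it assumes the `std` column is the sample standard deviation (ddof=1, the pandas default).

```python
def anova_f_from_summary(groups):
    """One-way ANOVA F from (n, mean, sample_std) tuples, std with ddof=1."""
    n_total = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / n_total
    # Between-group sum of squares: spread of group means around the grand mean
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    # Within-group sum of squares, recovered from each group's sample variance
    ss_within = sum((n - 1) * s ** 2 for n, _, s in groups)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# (len, mean, std) for PPMSACAT 0 and 1, copied from the summary table
msa_groups = [(222, 56.873874, 28.022007), (1751, 61.770417, 25.541425)]
f_stat = anova_f_from_summary(msa_groups)
print(f_stat)
```

The result should land very close to the reported F-statistic of 7.079, with degrees of freedom 1 and 1971 matching `Df Model` and `Df Residuals`.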


Explore state-level data

          median       mean        std    len
PPSTATEN
11          45.0  38.125000  17.512750    8.0
12          15.0  15.000000        NaN    1.0
13          47.5  54.750000  20.661962    4.0
14          50.0  59.000000  23.880832   35.0
15          50.0  54.000000  11.401754    5.0
16          70.0  64.473684  27.380789   19.0
21          70.0  65.650000  23.055523  120.0
22          50.0  61.032258  24.728007   62.0
23          60.0  59.600000  28.085129   95.0
31          60.0  62.126437  26.683243   87.0
32          50.0  56.875000  24.878738   32.0
33          70.0  66.785714  25.140123   98.0
34          70.0  68.072464  24.620244   69.0
35          60.0  64.142857  23.840330   35.0
41          60.0  61.521739  28.541628   23.0
42          50.0  50.071429  26.934291   14.0
43          75.0  69.333333  28.435319   48.0
44          45.0  50.000000  14.142136    4.0
45          50.0  39.000000  15.968719    5.0
46          50.0  55.909091  21.888145   11.0
47          45.0  43.750000  24.320306   12.0
51          77.5  65.000000  36.055513    6.0
52          70.0  65.070423  24.355887   71.0
53          77.5  76.000000  18.826695   10.0
54          70.0  64.689655  22.309000   58.0
55          70.0  72.000000  13.581033   10.0
56          60.0  61.428571  28.252819   77.0
57          50.0  56.333333  25.954906   36.0
58          60.0  63.194175  24.506925  103.0
59          50.0  53.898148  28.088625  108.0
61          70.0  64.650000  31.860345   20.0
62          60.0  62.975610  23.168392   41.0
63          70.0  65.230769  24.919962   52.0
64          65.0  60.833333  25.735556   24.0
71          50.0  55.000000  19.525624    9.0
72          70.0  62.068966  28.105580   29.0
73          50.0  48.888889  28.104613   18.0
74          50.0  56.617834  25.707560  157.0
81          60.0  60.000000   0.000000    2.0
82          50.0  65.000000  20.615528    5.0
83          35.0  34.000000  12.942179    5.0
84          50.0  46.520000  25.351068   25.0
85          60.0  53.333333  50.332230    3.0
86          50.0  49.814815  23.837651   27.0
87          50.0  55.625000  30.640484    8.0
88          60.0  64.130435  21.513853   23.0
91          70.0  66.475000  27.124582   40.0
92          60.0  54.411765  29.255467   17.0
93          60.0  63.405128  25.050361  195.0
94          27.5  27.500000  17.677670    2.0
95          65.0  69.000000  15.572412    5.0


ANOVA to compare means by state for states with at least 50 respondents

                            OLS Regression Results
==============================================================================
Dep. Variable:                 W1_N1H   R-squared:                       0.024
Model:                            OLS   Adj. R-squared:                  0.014
Method:                 Least Squares   F-statistic:                     2.366
Date:                Tue, 05 Apr 2016   Prob (F-statistic):            0.00229
Time:                        20:40:16   Log-Likelihood:                -6706.7
No. Observations:                1441   AIC:                         1.345e+04
Df Residuals:                    1425   BIC:                         1.353e+04
Df Model:                          15
Covariance Type:            nonrobust
=====================================================================================
                        coef    std err          t      P>|t|      [95.0% Conf. Int.]
-------------------------------------------------------------------------------------
Intercept            65.2308      3.544     18.406      0.000        58.279    72.183
C(PPSTATEN)[T.CA]    -1.8256      3.989     -0.458      0.647        -9.650     5.999
C(PPSTATEN)[T.FL]   -11.3326      4.314     -2.627      0.009       -19.794    -2.871
C(PPSTATEN)[T.GA]    -2.0366      4.347     -0.468      0.640       -10.565     6.492
C(PPSTATEN)[T.IL]     1.5549      4.385      0.355      0.723        -7.046    10.156
C(PPSTATEN)[T.MD]    -0.1603      4.665     -0.034      0.973        -9.311     8.990
C(PPSTATEN)[T.MI]     2.8417      4.693      0.606      0.545        -6.364    12.048
C(PPSTATEN)[T.MO]     4.1026      5.115      0.802      0.423        -5.932    14.137
C(PPSTATEN)[T.NC]    -3.8022      4.587     -0.829      0.407       -12.800     5.196
C(PPSTATEN)[T.NJ]    -4.1985      4.806     -0.874      0.382       -13.625     5.228
C(PPSTATEN)[T.NY]     0.4192      4.243      0.099      0.921        -7.904     8.742
C(PPSTATEN)[T.OH]    -3.1043      4.480     -0.693      0.488       -11.892     5.683
C(PPSTATEN)[T.PA]    -5.6308      4.408     -1.277      0.202       -14.279     3.017
C(PPSTATEN)[T.TN]    -2.2552      5.338     -0.423      0.673       -12.725     8.215
C(PPSTATEN)[T.TX]    -8.6129      4.089     -2.106      0.035       -16.634    -0.592
C(PPSTATEN)[T.VA]    -0.5411      4.881     -0.111      0.912       -10.115     9.033
==============================================================================
Omnibus:                       40.008   Durbin-Watson:                   1.933
Prob(Omnibus):                  0.000   Jarque-Bera (JB):               42.284
Skew:                          -0.408   Prob(JB):                     6.58e-10
Kurtosis:                       2.805   Cond. No.                         22.3
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
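The header figures of this summary are tied together by standard identities, so they can be sanity-checked against each other. The following is a small sketch using only numbers printed above; the function names are hypothetical helpers, not statsmodels API.

```python
def r_squared_from_f(f_stat, df_model, df_resid):
    """Recover R-squared from the overall F-statistic and its degrees of freedom."""
    return (f_stat * df_model) / (f_stat * df_model + df_resid)


def adjusted_r_squared(r2, n_obs, df_resid):
    """Adjusted R-squared for a model with an intercept."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / df_resid


# Values copied from the state-level OLS summary
n_obs, df_model, df_resid, f_stat = 1441, 15, 1425, 2.366

n_groups = df_model + 1                      # 16 states, one used as reference
r2 = r_squared_from_f(f_stat, df_model, df_resid)
adj_r2 = adjusted_r_squared(r2, n_obs, df_resid)
print(n_groups, round(r2, 3), round(adj_r2, 3))
```

Rounded to three decimals, the recovered values agree with the reported R-squared of 0.024 and adjusted R-squared of 0.014. Both are small: state membership accounts for only a few percent of the variance in `W1_N1H`, even though the overall F-test is significant.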


Post-hoc test [HSD] for state means

Multiple Comparison of Means - Tukey HSD,FWER=0.05
==============================================
group1 group2 meandiff    lower    upper reject
----------------------------------------------
AL     CA      -1.8256 -15.5161 11.8649  False
AL     FL     -11.3326 -26.1386  3.4733  False
AL     GA      -2.0366 -16.9589 12.8857  False
AL     IL       1.5549 -13.4945 16.6044  False
AL     MD      -0.1603 -16.1711 15.8504  False
AL     MI       2.8417 -13.2668 18.9502  False
AL     MO       4.1026 -13.4551 21.6603  False
AL     NC      -3.8022  -19.547 11.9426  False
AL     NJ      -4.1985 -20.6932 12.2962  False
AL     NY       0.4192 -14.1441 14.9826  False
AL     OH      -3.1043 -18.4801 12.2714  False
AL     PA      -5.6308 -20.7624  9.5008  False
AL     TN      -2.2552 -20.5757 16.0654  False
AL     TX      -8.6129 -22.6479   5.422  False
AL     VA      -0.5411 -17.2933  16.211  False
CA     FL       -9.507 -20.0286  1.0146  False
CA     GA       -0.211 -10.8956 10.4737  False
CA     IL       3.3806   -7.481 14.2422  False
CA     MD       1.6653 -10.4933 13.8239  False
CA     MI       4.6673  -7.6198 16.9544  False
CA     MO       5.9282  -8.2055 20.0619  False
CA     NC      -1.9766 -13.7828  9.8297  False
CA     NJ      -2.3729 -15.1621 10.4163  False
CA     NY       2.2449  -7.9325 12.4223  False
CA     OH      -1.2787  -12.588 10.0307  False
CA     PA      -3.8051 -14.7803    7.17  False
CA     TN      -0.4295 -15.5003 14.6413  False
CA     TX      -6.7873 -16.1931  2.6185  False
CA     VA       1.2845  -11.835 14.4041  False
FL     GA        9.296  -2.7849  21.377  False
FL     IL      12.8876   0.6499 25.1252   True
FL     MD      11.1723  -2.2299 24.5744  False
FL     MI      14.1743   0.6555 27.6932   True
FL     MO      15.4352   0.2185 30.6519   True
FL     NC       7.5304  -5.5529 20.6138  False
FL     NJ       7.1341  -6.8426 21.1109  False
FL     NY      11.7519   0.1172 23.3865   True
FL     OH       8.2283  -4.4085  20.865  False
FL     PA       5.7019  -6.6367 18.0404  False
FL     TN       9.0775  -7.0134 25.1683  False
FL     TX       2.7197  -8.2464 13.6858  False
FL     VA      10.7915  -3.4882 25.0712  False
GA     IL       3.5915  -8.7866 15.9697  False
GA     MD       1.8762 -11.6543 15.4068  False
GA     MI       4.8783  -8.7679 18.5245  False
GA     MO       6.1392  -9.1907  21.469  False
GA     NC      -1.7656 -14.9804 11.4492  False
GA     NJ      -2.1619 -16.2619  11.938  False
GA     NY       2.4558  -9.3266 14.2382  False
GA     OH      -1.0677 -13.8406 11.7051  False
GA     PA      -3.5942 -16.0721  8.8837  False
GA     TN      -0.2186 -16.4165 15.9794  False
GA     TX      -6.5763  -17.699  4.5463  False
GA     VA       1.4955 -12.9048 15.8957  False
IL     MD      -1.7153  -15.386 11.9554  False
IL     MI       1.2867 -12.4984 15.0719  False
IL     MO       2.5476 -12.9061 18.0013  False
IL     NC      -5.3571 -18.7154  8.0011  False
IL     NJ      -5.7535 -19.9879   8.481  False
IL     NY      -1.1357 -13.0787 10.8073  False
IL     OH      -4.6593 -17.5805  8.2619  False
IL     PA      -7.1857 -19.8154   5.444  False
IL     TN      -3.8101 -20.1253 12.5051  False
IL     TX     -10.1679 -21.4606  1.1248  False
IL     VA      -2.0961  -16.628 12.4359  False
MD     MI        3.002 -11.8266 17.8306  False
MD     MO       4.2629 -12.1284 20.6542  False
MD     NC      -3.6419 -18.0745 10.7908  False
MD     NJ      -4.0382 -19.2854 11.2091  False
MD     NY       0.5796 -12.5541 13.7133  False
MD     OH       -2.944 -16.9731 11.0851  False
MD     PA      -5.4704 -19.2315  8.2907  False
MD     TN      -2.0948 -19.3007 15.1111  False
MD     TX      -8.4526 -20.9978  4.0926  False
MD     VA      -0.3808 -15.9061 15.1446  False
MI     MO       1.2609  -15.226 17.7477  False
MI     NC      -6.6439  -21.185  7.8972  False
MI     NJ      -7.0402 -22.3901  8.3097  False
MI     NY      -2.4225 -15.6752 10.8303  False
MI     OH       -5.946 -20.0866  8.1946  False
MI     PA      -8.4725 -22.3472  5.4023  False
MI     TN      -5.0969 -22.3938 12.2001  False
MI     TX     -11.4546 -24.1244  1.2152  False
MI     VA      -3.3828  -19.009 12.2434  False
MO     NC      -7.9048 -24.0364  8.2269  False
MO     NJ      -8.3011 -25.1654  8.5633  False
MO     NY      -3.6833 -18.6641 11.2974  False
MO     OH      -7.2069 -22.9785  8.5647  False
MO     PA      -9.7333  -25.267  5.8004  False
MO     TN      -6.3577 -25.0117 12.2963  False
MO     TX     -12.7155 -27.1831  1.7521  False
MO     VA      -4.6437 -21.7599 12.4725  False
NC     NJ      -0.3963  -15.364 14.5714  False
NC     NY       4.2214  -8.5867 17.0296  False
NC     OH       0.6979  -13.027 14.4227  False
NC     PA      -1.8286 -15.2793 11.6222  False
NC     TN        1.547 -15.4117 18.5058  False
NC     TX      -4.8107 -17.0147  7.3933  False
NC     VA       3.2611 -11.9899  18.512  False
NJ     NY       4.6177  -9.1018 18.3373  False
NJ     OH       1.0942 -13.4848 15.6732  False
NJ     PA      -1.4323 -15.7535  12.889  False
NJ     TN       1.9434 -15.7138 19.6005  False
NJ     TX      -4.4144 -17.5717  8.7429  False
NJ     VA       3.6574 -12.3666 19.6814  False
NY     OH      -3.5236 -15.8752  8.8281  False
NY     PA        -6.05 -18.0964  5.9964  False
NY     TN      -2.6744 -18.5423 13.1935  False
NY     TX      -9.0322 -19.6684  1.6041  False
NY     VA      -0.9603 -14.9883 13.0676  False
OH     PA      -2.5264 -15.5432 10.4904  False
OH     TN       0.8492 -15.7675 17.4658  False
OH     TX      -5.5086 -17.2326  6.2154  False
OH     VA       2.5632 -12.3064 17.4328  False
PA     TN       3.3756 -13.0154 19.7666  False
PA     TX      -2.9822 -14.3841  8.4198  False
PA     VA       5.0897  -9.5274 19.7067  False
TN     TX      -6.3578 -21.7422  9.0266  False
TN     VA        1.714 -16.1838 19.6119  False
TX     VA       8.0718  -5.4068 21.5504  False
----------------------------------------------
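To read the Tukey table: with 16 states there are 120 pairwise comparisons, and `reject` is True exactly when the family-wise 95% interval for the mean difference excludes zero. The sketch below illustrates that reading on a handful of rows copied from the table; the helper `excludes_zero` is a hypothetical name, and the row subset is chosen only for illustration.

```python
from itertools import combinations

states = ["AL", "CA", "FL", "GA", "IL", "MD", "MI", "MO",
          "NC", "NJ", "NY", "OH", "PA", "TN", "TX", "VA"]
# Number of unordered state pairs, i.e. rows in the Tukey table
n_pairs = len(list(combinations(states, 2)))

# (group1, group2, meandiff, lower, upper, reject) copied from the table
rows = [
    ("AL", "FL", -11.3326, -26.1386, 3.4733, False),
    ("FL", "IL", 12.8876, 0.6499, 25.1252, True),
    ("FL", "MI", 14.1743, 0.6555, 27.6932, True),
    ("FL", "MO", 15.4352, 0.2185, 30.6519, True),
    ("FL", "NY", 11.7519, 0.1172, 23.3865, True),
    ("FL", "TX", 2.7197, -8.2464, 13.6858, False),
]


def excludes_zero(lower, upper):
    """True when the confidence interval lies entirely on one side of zero."""
    return lower > 0 or upper < 0


consistent = all(excludes_zero(lo, hi) == reject
                 for _, _, _, lo, hi, reject in rows)
rejected = [(g1, g2) for g1, g2, _, _, _, reject in rows if reject]
print(n_pairs, consistent, rejected)
```

Only four pairs are rejected in the full table, all involving FL (against IL, MI, MO and NY). The AL vs FL row shows the cost of the adjustment: its unadjusted p-value in the OLS summary is 0.009, yet the HSD interval spans zero because the test controls the error rate across all 120 comparisons.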


Print summaries per state

          median       mean        std    len
PPSTATEN
AL          70.0  65.230769  24.919962   52.0
CA          60.0  63.405128  25.050361  195.0
FL          50.0  53.898148  28.088625  108.0
GA          60.0  63.194175  24.506925  103.0
IL          70.0  66.785714  25.140123   98.0
MD          70.0  65.070423  24.355887   71.0
MI          70.0  68.072464  24.620244   69.0
MO          75.0  69.333333  28.435319   48.0
NC          60.0  61.428571  28.252819   77.0
NJ          50.0  61.032258  24.728007   62.0
NY          70.0  65.650000  23.055523  120.0
OH          60.0  62.126437  26.683243   87.0
PA          60.0  59.600000  28.085129   95.0
TN          60.0  62.975610  23.168392   41.0
TX          50.0  56.617834  25.707560  157.0
VA          70.0  64.689655  22.309000   58.0
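These per-state means line up with the state-level regression output. The sketch below assumes treatment (dummy) coding with AL as the reference level, which is inferred from the output (there is no `[T.AL]` row and the intercept equals the AL mean) rather than stated in the log. Under that assumption each coefficient is a state's mean minus the AL mean.

```python
# (len, mean) per state, copied from the per-state summary table
state_summary = {
    "AL": (52, 65.230769), "CA": (195, 63.405128), "FL": (108, 53.898148),
    "GA": (103, 63.194175), "IL": (98, 66.785714), "MD": (71, 65.070423),
    "MI": (69, 68.072464), "MO": (48, 69.333333), "NC": (77, 61.428571),
    "NJ": (62, 61.032258), "NY": (120, 65.650000), "OH": (87, 62.126437),
    "PA": (95, 59.600000), "TN": (41, 62.975610), "TX": (157, 56.617834),
    "VA": (58, 64.689655),
}

reference = "AL"  # assumed reference level, see note above
intercept = state_summary[reference][1]

# Treatment-coded coefficient = state mean minus reference mean
coefs = {state: mean - intercept
         for state, (_, mean) in state_summary.items() if state != reference}

total_n = sum(n for n, _ in state_summary.values())
print(total_n, round(intercept, 4), round(coefs["FL"], 4), round(coefs["TX"], 4))
```

The group sizes sum to the 1441 observations in the regression, and the differences reproduce the `coef` column (for example FL and TX) to the four decimals shown there.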



In [2]: