karl_lembke (karl_lembke) wrote,

Playing with statistics

A while ago, I calculated the correlation coefficient between congresscritters' ratings awarded by the Americans for Democratic Action (ADA) and the American Conservative Union (ACU). Not surprisingly, the ratings are strongly anti-correlated – when one is high, the other tends to be very low. Specifically, the correlation coefficient works out to about -0.95.
Now, a correlation coefficient that strong is highly significant, but the data don't fall on quite the straight line you might expect. A lot of points stray from it.
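For anyone who'd rather not fire up Excel, here is a rough Python sketch of the same kind of calculation: the Pearson correlation coefficient, written out by hand. The function name `pearson_r` and the two short lists of ratings are my own inventions for illustration; the numbers are made up and are not the actual ACU/ADA scores.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance, up to a factor of n)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (variances, up to the same factor)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up ratings for illustration only; NOT the real ACU/ADA data.
acu = [95, 90, 85, 80, 20, 15, 10, 5]
ada = [5, 15, 10, 25, 75, 90, 85, 95]

r = pearson_r(acu, ada)
print(f"r = {r:.2f}")
```

A value near -1 means the points hug a downward-sloping line, and a value near 0 means there's no linear relationship to speak of.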
Here's a scatter plot, done in Excel:

Firstly, as you can see, there are quite a few points that fall well away from the dashed line representing the best linear fit. Some fall quite a distance away, with one fellow earning an ACU rating of 70 and an ADA rating of 88, and another couple near zero ACU and 30 ADA. However, as you'd expect with that high correlation coefficient, most scores do cluster around that line.
Secondly, you see a distinct "dumbbell"* distribution along the best-fit line, with most of the scores clustering at the extreme ends of the line, and relatively few in the middle. By eyeball, I'd estimate the centroids of the two clusters to fall at ACU 87 and ADA 21, and ACU 10 and ADA 77. If the 50/50 point represents some center of public opinion, it would seem to follow that the two clusters sit at roughly equal distances from this center (about 47 units for the conservative cluster and 48 for the liberal, measured as straight-line distance on the plot).
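The distances from the center can be checked with a few lines of Python. This is a minimal sketch that assumes "units" means ordinary straight-line (Euclidean) distance on the plot, with ACU on one axis and ADA on the other; the names `CENTER`, `clusters`, and `distance_from_center` are just labels I made up for the example.

```python
import math

# (ACU, ADA) coordinates; the 50/50 point is the assumed center of opinion
CENTER = (50, 50)

# Eyeballed cluster centroids from the scatter plot
clusters = {
    "conservative": (87, 21),
    "liberal": (10, 77),
}

def distance_from_center(point, center=CENTER):
    """Straight-line (Euclidean) distance between a point and the center."""
    return math.hypot(point[0] - center[0], point[1] - center[1])

distances = {name: distance_from_center(p) for name, p in clusters.items()}
for name, d in distances.items():
    print(f"{name}: {d:.1f}")
```

Rounded to whole units, this gives 47 for the conservative cluster and 48 for the liberal one.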
I have no idea whether this means a darned thing, but it's a fun way to waste time with Excel.

* Feel free to use this as a straight line.
Tags: politics
