Pur Autre Vie

I'm not wrong, I'm just an asshole

Tuesday, May 05, 2015

Self-Driving Data

You may have seen a feature on the New York Times website, part of its "Upshot" section.  Based on recent research, the feature allows you to select a county and then see the effect of that county on the children who grow up there.  My own county, Kings County in New York, is apparently fairly bad for poor children and fairly good for rich children.  (This isn't universally true, as you might expect.  There are counties that are far better for poor children than for rich children.  A good example is Montgomery County, Maryland, which is part of the D.C. metro area.)

But you've got to be very careful when you are dealing with "data-driven" analysis.  And in this case, the data are highly misleading.  The key is something that was divulged in an accompanying article, which assesses why the data seem to indicate that Manhattan is a bad place to grow up at all income levels (but especially for affluent children):

A third factor is marriage, which clearly plays a role in the Manhattan effect. Children who grow up there are less likely to marry, at least by age 30 and probably over all, than similar children elsewhere. About one-third of the income penalty stems from the fact that Manhattan children are more likely to be living without another adult in their late 20s, and of course a second adult often brings a second income. (Our analysis measures household income.)

Aha!  The study looks at household income, and so the timing of marriage affects the conclusion.  Imagine two counties.  In one, children grow up to have average per capita income of $25,000.  In the other, they grow up to have average per capita income of $45,000.  But in the first county, virtually all children are married by the age of 25.  In the second county, virtually no one marries by the age of 25.  The first county, the one in which children end up with vastly lower per capita income, will appear to bestow somewhat better economic outcomes on children who grow up there (at least, by age 25).  Its married couples, each pooling two incomes of $25,000, will have household income of ~$50,000, whereas in the other county, where nearly everyone is still single, the comparable number will be ~$45,000.
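For anyone who wants to poke at the arithmetic, here is a rough sketch of the two-county example in Python.  The function name and the simplifying assumptions are mine, not the study's:  every married person has a spouse earning the same average income, and everyone else lives alone.

```python
def household_income(per_capita_income, share_married):
    """Household income of the average person in a hypothetical county.

    Assumptions (for illustration only): a married household pools two
    equal incomes, and an unmarried person is a one-income household.
    """
    married_household = 2 * per_capita_income
    single_household = per_capita_income
    return (share_married * married_household
            + (1 - share_married) * single_household)


# County 1: low earnings, virtually everyone married by 25.
early_marriage_county = household_income(25_000, share_married=1.0)

# County 2: high earnings, virtually no one married by 25.
late_marriage_county = household_income(45_000, share_married=0.0)

print(early_marriage_county)  # 50000.0
print(late_marriage_county)   # 45000.0

# Share of County 1 that must be married for it to merely tie
# an entirely unmarried County 2: 25,000 * (1 + s) = 45,000.
break_even_share = 45_000 / 25_000 - 1
print(round(break_even_share, 2))  # 0.8
```

Under these made-up assumptions, the poorer county comes out ahead on household income whenever more than about 80% of its young adults are married and the richer county's are not.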

This may go a long way toward explaining something odd about the data:  some of the best counties seem to be extremely rural.  There is a big swath of blue (which is the "good" color) through the plains states, out in counties with minuscule populations.  But the big cities out there (Omaha, Minneapolis, and so forth) are not particularly "good."  There's something about being extremely rural that seems to help.  And the answer may very well be that people get married much younger in those rural counties.

And this is one of the things that gets me about the fetishization of "data-driven" approaches to public policy.  How many people who play around with that map will fail to understand that the results are being driven to a significant degree by marriage rates?  And how many of them will reach false conclusions about the world based on that ignorance?  And (here is the crucial part) how many of them will congratulate themselves for being conversant with the data, for basing their views on "scientific evidence"?  Put something in the right format, and publish it on a reputable website, and just about everyone will believe it.  And not just believe it—attribute a high degree of confidence to it, because quantitative approaches are good and qualitative approaches are bad!  It's frustrating and obnoxious beyond measure.