Yep, Stuart's right about GSP/Cap...VMT <i>does</i> matter, even multivariately...

Stuart and I have been emailing back and forth regarding his post on modeling state gsp using vmt as an independent variable.

Paula, both here and on her blog, correctly suggested that education should also be considered as an independent variable. So, what I did was pull together some data on % college educated for each state and include it in with the data Stuart had already collected...and then I conducted a multivariate regression on the data, which is presented below.  That regression allows us to find out what the effects are for all of these independent variables on gsp/cap after controlling for the effects of the other variables.  This allows us to get a better picture of what's going on (though we lose the visual facility that Stuart had with his bivariate graphs).  Much more under the fold.

So, what's the takeaway?  Stuart's right: states with higher vmts have lower gsp/cap, even after controlling for education and population density.

It should also be noted that I also agreed with much of the criticism in the comments regarding the logarithmic transforms that need to be done on the variables...however, if after taking the ln of each variable, the comparative magnitude of the coefficients remains present, then we can go back to the pre-transformed data and make easier inferences using the coefficients that are present.

So, the dependent variable is gsp/cap, the independent variables are vmt/cap, population density, and education.  The unit of analysis is state, (however we are dropping DE and WY for reasons mentioned earlier.)  (fyi: including them in the analysis weakens the case a bit for VMT, but the education result is still present).

Here are the multivariate regression results (from Stata, using robust standard errors, just because I like overkill):


. regress  gspcap educ popden vmtcap, robust beta plus

Regression with robust standard errors                 Number of obs =      48
                                                       F(  3,    44) =   38.49
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.6664
                                                       Root MSE      =  3555.4

             |               Robust
      gspcap |      Coef.   Std. Err.      t    P>|t|                     Beta
        educ |   659.0628   107.6447     6.12   0.000                 .5328718
     popdens |  -1.443011   7.739639    -0.19   0.853                -.0235832
      vmtcap |   -1.79709   .5305359    -3.39   0.001                -.4813086
       _cons |   38455.69   7496.157     5.13   0.000                        .

What does this gobbledygook mean?  Well, each coefficient is the change in the dependent variable resulting from a one unit change in that independent variable, controlling for the other variables present in the equation.  So, for a one percentage point change in percent education of a state, gsp/cap goes up $659.  A one unit change in vmtcap results in a -1.79 unit change in gspcap even after controlling for education.  The effects for education and vmtcap are statistically significant at roughly the p=.001 level or better (see the P>|t| column).  (I can explain that more if you all want me to).
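
To make the "one unit change" reading concrete, here is a minimal Python sketch that plugs the coefficients from the table above into the fitted equation. The coefficients are copied from the Stata output; the state values fed in are made up for illustration (they are not from the dataset), and the function name is just a label I picked.

```python
# Coefficients copied from the untransformed regression output above.
COEF = {
    "_cons": 38455.69,
    "educ": 659.0628,
    "popdens": -1.443011,
    "vmtcap": -1.79709,
}


def predict_gspcap(educ, popdens, vmtcap):
    """Fitted gsp/cap for one state, in the units of the original data."""
    return (
        COEF["_cons"]
        + COEF["educ"] * educ
        + COEF["popdens"] * popdens
        + COEF["vmtcap"] * vmtcap
    )


# Hypothetical state: these inputs are invented, NOT taken from the data.
base = predict_gspcap(educ=25.0, popdens=100.0, vmtcap=10000.0)
more_educ = predict_gspcap(educ=26.0, popdens=100.0, vmtcap=10000.0)
more_vmt = predict_gspcap(educ=25.0, popdens=100.0, vmtcap=10001.0)

# Holding the other variables fixed, the change in the prediction
# equals the coefficient of the variable that moved by one unit.
educ_effect = more_educ - base
vmt_effect = more_vmt - base
print(round(educ_effect, 2), round(vmt_effect, 2))
```

Whatever starting values you pick, the two differences come out equal to the educ and vmtcap coefficients, which is all that "controlling for the other variables" means in a linear model.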

The final column out there is what we call a "standardized beta."  It's not the best measure of strength around (trust me, I could explain it, but you don't want me to), but it is an indicator of "standardized explanatory strength."  We cannot directly compare the regression coefficients' magnitudes, but betas can help us do that (caveat: somewhat).  Because the magnitudes of the betas are roughly the same, though in different directions (vmt is an inverse relationship, educ a direct relationship), we can say that these two variables have relatively similar explanatory power.
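
For the curious, a standardized beta is the raw coefficient rescaled by the standard deviations of the predictor and the outcome. The sketch below shows why that makes magnitudes comparable: changing the units of a predictor changes its raw coefficient but not its beta. To keep it short I use a single predictor and made-up numbers (not the state data); in the multivariate case the raw coefficient would come from the full regression instead.

```python
from statistics import mean, stdev


def ols_slope(xs, ys):
    """Slope of a one-predictor least-squares fit of ys on xs."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def standardized_beta(coef, xs, ys):
    """Raw coefficient rescaled to standard-deviation units."""
    return coef * stdev(xs) / stdev(ys)


# Made-up illustrative data, NOT the state dataset.
educ = [20.0, 24.0, 27.0, 30.0, 35.0]
gspcap = [30000.0, 33000.0, 36000.0, 35000.0, 42000.0]

slope = ols_slope(educ, gspcap)
beta = standardized_beta(slope, educ, gspcap)

# Same predictor expressed in different units (x1000).
educ_rescaled = [x * 1000.0 for x in educ]
slope_rescaled = ols_slope(educ_rescaled, gspcap)
beta_rescaled = standardized_beta(slope_rescaled, educ_rescaled, gspcap)

print(slope, slope_rescaled)  # raw slopes differ by a factor of 1000
print(beta, beta_rescaled)    # betas agree
```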

Now, the question of the logarithmic transforms.  Regression is really robust, but it has some assumptions, and one of them is that these variables are normally distributed.  They aren't.  The solution is to attempt to make them more normal by transforming them using some function; in this case, using natural logs of those variables that are "large" (gspcap, vmtcap, and popdens), because it makes them more normal in their distribution.  This is a pretty standard trick.  Here are the results:

. regress  lngspcap educ lnpopden lnvmtcap, robust beta plus

Regression with robust standard errors                 Number of obs =      48
                                                       F(  3,    44) =   36.49
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.6753
                                                       Root MSE      =  .09517

             |               Robust
    lngspcap |      Coef.   Std. Err.      t    P>|t|                     Beta
        educ |   .0188446   .0030129     6.25   0.000                 .5615207
    lnpopden |  -.0079779   .0159703    -0.50   0.620                -.0678694
    lnvmtcap |  -.4890024    .121637    -4.02   0.000                -.4718397
       _cons |   14.55413   1.181904    12.31   0.000                        .

no changes, except in the raw coefficients.  The significance levels stay the same, the betas stay relatively similar.  So, we can say with some confidence that the transforms aren't all that necessary.

It should be noted that, because of the transforms we did, we would have to reinterpret the coefficients (exponentiate them and the dv over e) in order to make direct inferences about prediction (such as the one-point rise in a state's education = $659 in gspcap that I discussed above). However, even if we change them back, we get substantively similar results.
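
Here is a rough sketch of what that reinterpretation looks like, using the coefficients from the logged regression above. It assumes the usual textbook reading: with a logged dependent variable, an unlogged predictor's coefficient converts to a percent change via exp(b) - 1, and a logged predictor's coefficient acts as an elasticity. The helper names are mine, not Stata's.

```python
import math

# Coefficients copied from the log-transformed regression output above.
B_EDUC = 0.0188446     # educ is NOT logged; the dependent variable is
B_LNVMT = -0.4890024   # both vmtcap and the dependent variable are logged


def pct_change_from_level(b, delta=1.0):
    """Percent change in gsp/cap for a `delta`-unit change in an unlogged predictor."""
    return (math.exp(b * delta) - 1.0) * 100.0


def pct_change_from_log(b, pct_increase):
    """Percent change in gsp/cap when a logged predictor rises by `pct_increase` percent."""
    return ((1.0 + pct_increase / 100.0) ** b - 1.0) * 100.0


educ_pct = pct_change_from_level(B_EDUC)     # one more point of college educated
vmt_pct = pct_change_from_log(B_LNVMT, 1.0)  # vmt/cap up by 1%

print(round(educ_pct, 2), round(vmt_pct, 2))
```

Under those assumptions, one more percentage point of college educated goes with about 1.9% higher gsp/cap, and 1% more vmt/cap goes with about 0.49% lower gsp/cap.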

So, what's the takeaway?  Stuart's right.  States with higher vmts have lower gsp/cap, even after controlling for education and population density.  

Now the question is: why?
A possible reason...

Do high VMTs indicate the ruralness of a state?

For example I stay in a pretty rural area (out in the sticks) and have to commute to London pretty much every other week. This is because there's just not that many shops, restaurants and things to do in the area where I live. Commuting to work is easy though.

If you live in the suburbs of a large city you spend a lot of time in traffic - so high hours in vehicle, low actual miles travelled. OTOH if you live in a rural area with low traffic - relatively more miles travelled for each hour spent in a vehicle.

It seems sensible that a large number of miles travelled implies low traffic (because everyone has the same number of hours per day). Low traffic, high miles indicates rural areas? which would have lower gsp/cap values.

Thus if the price of oil rises dramatically - it would tax rural areas more than urban areas - and the price of land in rural areas should fall. This seems contrary to what some economists are predicting!

Well, that was the original idea, but supposedly the population density variable would capture that effect...
A confounding issue is cost of living - salaries are higher in NYC, but living standards may be lower, so gdp/cap must be adjusted to reflect spending power in rural vs urban settings. In this case, gdp/cap is high, vmt/cap is very low.

Otherwise, how about this:
a) The more you drive, the less time and money you have to earn/create wealth.
I guess the following just says it another way..
b) The further you have to drive to work, the less return on your investment of time and money input vs salary output. At some point, it is not worth working.

I think there must be some optimum size of city for (COL adjusted) wealth creation - large enough that jobs are available and work is reasonably close, small enough that commutes are not too long.

What if the extra VMT are a dead weight economic loss? Money that could have been used for investment, consumption, etc just gets literally burned out the tail pipe?
Also, I don't think density is the only variable in predicting value per mile traveled.

New Jersey is the most "dense" state (ha ha!) at 438 people per square km, but it is mostly a consistent low suburban density without much hinterland, thus cars rule the road and traffic is a nightmare in many areas. Contrast that to Alaska at 0.42 people per square km. The density across the state is extremely variable, from vast uninhabited regions to dense urban areas and small isolated self-contained villages. I doubt there is much traffic in Alaska.

Perhaps the difference is the amount of time and fuel simply spent in traffic.

The lost time/productivity argument is the most plausible sounding to me.  Another post for another day....
It does sound plausible, but one supporting inference would be that the excess driving is coming out of the company's time (lower productivity) and not out of family time (lower quality of life). I'm thinking here that vmts are strongly influenced by commuting but have no evidence that's so.
I suspect that part of it is the dead weight economic loss, but on the other hand, if I lived in Montana and could go fly fishing one fine spring day ... why would I go build GDP?
Thank you for including the information for T statistics this time around.  I had meant to e-mail Stuart to ask him to include that in the future.

As to the why, have you considered breaking down the GSP by a couple of gross categories?  Say, the fraction in agriculture and financial services as two indicators?  I'm willing to advance the hypothesis that states with more emphasis on agriculture (and other activities dependent on natural resources) will require more vehicle miles to produce a dollar's output than states that depend more on people sitting at desks.

From Paula's weblog
I think Stuart has the causation backwards. It isn't driving that makes you poor, it's being poor that makes you drive.
And now, PG, you're giving us the "multivariate regression on the data". Sigh. And Stuart's stuff which this follows re: "So, what's the takeaway? Stuart's right: states with higher vmts have lower gsp/cap, even after controlling for education and population density".

What about people? I mean, human beings--they are not statistics or variables like at the University of Chicago (Milton Friedman school of economics)--my alma mater, by the way.

Let's be honest here. Paula's statement (above) is simply correct. Common sense tells us this, not sophisticated modelling and multivariate regressions. This has to do with the real experiences of actual people trying to live their lives. Being poor doesn't mean shorter commutes because they live outside the highly centralized, expensive world in which the rich people live. It means longer commutes because the people providing these services to the well-off can't afford to live where they (the rich) do and so therefore must live in cheaper, more affordable outlying areas. This is a great tragedy and a large (perhaps the biggest) reason why exurban and suburban sprawl has taken place.

Modelling, statistical analysis, differential equations are just fine but this is the real world where people live--as best they can in a ever-more marginalized and impoverished social situation. Less math & modelling and more paying attention to how and where people actually have to live, please?

Do you know WHY this kind of modelling is fucked up? Because economists took individual people out of the equation and started dealing with their notion of aggregates. I suppose that would be OK if there was any reality in their results but there isn't. It's all self-serving and serves the political powers representing the rich & well-off. Do you remember the Laffer curve? (President George the 1st called it voodoo economics.) Now, tax cuts for the rich and benefits cut-offs for the poor are par for the course. Come on, let's get real.
  1.  Dave, I can model Vmt/cap too.  We're testing hypotheses here Dave, providing evidence to make a point.  I saw nowhere in there where I said that I had a market on the correct answer, nor do I even know if the theory/hypothesis we tested there is sound.  These methods provide summaries of the data, that's all.  If you want to reverse the causal arrow, then suggest an hypothesis.  I'll test it.

  2.  If we had individual level data, I would use it.  We don't.  I have already cautioned in Stuart's first post about ecological fallacies and the like.

  3.  I don't get your normative displeasure about all of this.  We can test Stuart's reasonings, we can test Paula's reasonings, and we can test common-fucking-sense, can we not?  Data are good things, are they not?  If I can find an empirical basis for a normative case, isn't it a lot better than a normative case on its own?
Yeah, OK.

Re: #3 "[your] normative displeasure..."

is based on the statistical use of human beings and their behaviour in the aggregate. People are individuals first--at least a relatively few are, but most of them are fully acculturated and predictable --and maybe that's my problem. I can't blame you for that nor will I bring up the term "sheeple" which most people here at TOD probably don't even remember now anyway. This is my own personal annoyance and hardship--it pisses me off sometimes. People can be modelled in the aggregate-- sigh, :( -- except for ones like me who appear to be 2 standard deviations on the wrong side of the "success" bell curve.

Bad mood, lots of shit going wrong... sorry

In any case, that Peak Oil situation appears to be going OK, huh????
No worries, man.  I hope things get better for you...

Also, that's exactly what I am saying re: the ecological fallacy.  All I can make statements about with this data is STATES and trends in those states, not individuals.

it's being poor that makes you drive

I agree with this last causation statement of yours.
Simple economic analysis should explain why.

Some geographic locations (e.g. Manhattan, San Francisco) evolve into centers of commerce.

People compete to be near those centers of commerce.
That is why rents are highest at ground zero and taper off as distance from ground zero increases.

The poor cannot afford ground zero rents. So they live farther away. If there is no decent mass transit (not the case in Manhattan, but fairly true for places like San Francisco), the poor have to drive, and drive, and drive.

It may seem like obvious common sense, but in the world of science, there is no such thing. I mean, it's obvious that the sun, moon and stars revolve around the flat earth, right? This is what we see every day. This "common sense" doesn't fall apart until someone starts doing the calculations.

I could well be wrong in my assertion that Stuart originally had the causation backwards, but unless someone runs the numbers, there's no way to be sure. Personally, I'd rather see the numbers and know for certain I was wrong, than proceed forward without certainty that I'm heading the right direction.


Please stop this madness! "The states of the U.S." are not a collection that it makes sense to study in this fashion. They vary by over two orders of magnitude in land area, and nearly two orders of magnitude in population. There are an infinite number of demographic variables that you could attempt to "control for" (and with only 50 data points, added variables make the whole process suspect).

And what is the point, anyway? How does this help us learn how to live with 5% less energy, or with 50%?

I have to agree.  Even if the correlation exists, there's no proof of its cause.

That being the case, there's no way we can tell if it will hold up in the post-carbon age.

The people who think living out in the country will be a good thing post-peak are worried about a hard crash.  Like, New Orleans after Katrina.  They're worried about food, water, and looters, not about a few points of GDP.  

And they may have a point.  During the Great Depression, many people went to live with relatives on farms, where at least you could grow your own food and cut your own firewood.  

Though I suspect that might be a lot harder these days.  Many of us no longer have relatives who farm.  The U.S. population is roughly three times as large now as it was then.

In Stuart's post you can see that there is a relationship between GSP/Cap and VMT/Cap of 42% using R2.

To check if the reason for this is the distinction between rural states or urban states he uses population density. So a state with low density ought to be rural and a state with high density should be urban. I claimed in his post that using population density would be too simplistic to differentiate between rural and urban.

One alternative would be to take a compensated population density. I will take an example to explain this:
Take Hawaii with a population density of 196 hab/mi2. It has 5 counties:
Hawaii County     40 hab/mi2   with 13% of the pop.
Honolulu County 1500 hab/mi2   with 72% of the pop.
Kalawao County    10 hab/mi2   with  0% of the pop.  
Kauai County     100 hab/mi2   with  5% of the pop.
Maui County      120 hab/mi2   with 10% of the pop.

So it makes sense to do a compensated average:
.13*40 + .72*1500 + 0*10 + .05*100 + .1*120= 1102 hab/mi2

As Stuart noted: "So your idea is that this would allow us to compute the average density that the average resident experiences, rather than the average density of the average square mile of the state"
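
The compensated average above is just a population-weighted mean of county densities. A minimal Python sketch of the calculation, using the Hawaii shares and densities listed above (the function name and the tuple layout are my own choices for illustration):

```python
def compensated_density(counties):
    """Population-weighted mean density.

    `counties` is a list of (population_share, density) pairs. Shares are
    renormalized in case rounding keeps them from summing exactly to 1.
    """
    total_share = sum(share for share, _ in counties)
    weighted = sum(share * density for share, density in counties)
    return weighted / total_share


# (share of state population, hab/mi2), copied from the Hawaii example.
hawaii = [
    (0.13, 40),    # Hawaii County
    (0.72, 1500),  # Honolulu County
    (0.00, 10),    # Kalawao County
    (0.05, 100),   # Kauai County
    (0.10, 120),   # Maui County
]

hawaii_compensated = compensated_density(hawaii)
print(round(hawaii_compensated))  # compare with the plain state density of 196
```

Because 72% of residents live in the one dense county, the weighted figure lands far above the state-wide 196 hab/mi2.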

I did this for all states with the corresponding counties using the following census bureau data.

I took out Delaware and Wyoming because they were not included in the relationship between GSP/Cap and VMT/Cap.

Apart from them I excluded New York, Alaska and Nevada for the following reasons. The compensated population density of New York is huge (15671 hab per square mile) because the most populated counties are tiny; for example, New York County has 1.5 million hab in only 23 square miles, giving a population density of 68000. Alaska and Nevada have the opposite problem: Anchorage Municipality has 272000 hab but is very big (1697 square miles), and Clark County has 1.7 million hab with big cities (Las Vegas and Henderson) but is huge (7877 square miles). So the urbanness of New York is exaggerated by tiny counties and the ruralness of Alaska and Nevada is exaggerated because the counties are huge in area. The population density of Alaska and Nevada is 1 and 21 and the compensated population density is 70 and 169, but it should be much bigger.

Finally, excluding these five states (Delaware, Wyoming, New York, Alaska and Nevada), we obtain a fairly good R2. Taking the compensated density instead of the normal density (like the one Stuart used) has the effect of moving the points to the right, but some points are moved more than others (note that the heights are not changed). And we see an improvement. By excluding other states I saw the same pattern: it is almost always better to take compensated densities (especially if they are bad in the sense of New York, Alaska and Nevada).

First the VMT/Cap versus Density and below VMT/Cap versus compensated density.

Second the GSP/Cap versus Density and below the GSP/Cap versus compensated density

The state sticking out with $53,000 per capita is Connecticut; probably its compensated density is too low.


This is very interesting.  Could you email us your spreadsheet so we can get Prof G. to repeat his regression with this variable, and see whether the effect of VMT goes away once the revised popn density is controlled for?

I am not surprised (as an exiled resident) that Connecticut  is the outlier here.  It is compact and has a pretty good commuter rail network, providing easy access to NYC and Boston.
Interesting analysis.  Here are few thoughts.

1.  I have access to individual US county data.
All 3288 counties.  It would give more data
points.  However, I don't have the GDP

2.  I think all that you are really
seeing in your regression analysis is
that City folks earn more money than
rural folks!!  This is no surprise.
Agriculture has been under tremendous
pressure for decades!  Corn sells for
$2.00/bushel!  in 1927 it sold for $0.50
per bushel!!   Not much change in revenues
for farmers.  Lots of change in cost of
living, though.

Peak oil will improve the lot of farmers.
Eventually food prices will rise faster
than most other goods due to production
declines.  Basically, it will be a catch
up for decades of low farm prices.

3.  If you had good county by county data
you could regress all of the rural
counties and see what you get.
VMT (vehicle miles travelled) might not
even be significant.  If it is, then it
is a more significant finding.

I think Roberto's use of 'compensated density' is a nice refinement to the general analysis.

However, I'm having some difficulty getting a grasp on what we are really trying to accomplish here. Is it to prove that low-income people drive more than high-income people? Is it to prove that people tend to drive less in high-density metropolitan areas relative to low-density rural and exurban areas? I think we already know this, at least intuitively. Or is it to suggest some sort of causal relationship between driving and income?

While I think this is a very interesting effort, I am skeptical that you are ever going to firmly pin anything down as 'proof' that A causes B and B causes C. There are just too many variables, many that we are probably not even aware of.

Furthermore, I would pay some closer attention to the 'outliers' that have been dismissed as 'anomalous'. They might tell you something.

 Both Connecticut and Delaware are physically small states with a moderate to high population density, but without any really large cities actually within the state itself.  However, northern Delaware can be considered to be contiguous with the southern suburbs of Philadelphia, and southern Connecticut can be considered to be contiguous with the northern suburbs of New York City.

 Both DE and CT have a large, white-collar, highly suburbanized population with generally good income. However, what's important to realize is that they might appear to be anomalous outliers only because someone in Colonial times drew a line on a map and called this circumscribed area 'Delaware' and that one 'Connecticut'. In reality neither is much different from almost any area in the US comprised of the well-developed suburbs surrounding a major metropolitan area. I think the same thing is true for, say, Wyoming or Alaska, but at the opposite (rural) end of the scale.

Can I legitimately conclude from all these analyses that if I drive less I will move into a higher socioeconomic class?  :-)  

No :-)  That's the ecological fallacy that Prof G. wrote about (in response to my slightly tongue-in-cheek title to the earlier piece: Why Does Driving Too Much Make You Poorer).  A conclusion that might be valid about states doesn't necessarily apply to individuals.

The point of the original piece was to get some feeling for how much moving to a rural area might hurt you in a "post peak but no collapse" scenario.

Here are some reasons why rural states are poorer. First is the lack of a truly competitive market for the most common agricultural commodities, i.e. corn, wheat, and soybeans. The number of buyers can be counted on one hand, mainly ADM and Cargill. Secondly, most manufactured goods come from out of state. They export low value goods and import high value goods, which is a sure path to poverty. Thirdly, opposition to organized labor is overwhelming in retail (Walmart) and health care (nursing homes and hospitals), which are the major non-farm employers. Add to that the long average distances to jobs and health care services and we see why there is more VMT. The production of biofuels may not improve the situation since ADM and Cargill are the major players in distilling and would probably come to corner the market for biodiesel feedstocks.
deplume -

Some good points!

Many people who live in the more dense suburbs of the Northeast tend to have a rather unrealistic idyllic image of what rural life is like. To many of them Vermont is a picture postcard scene with white churches and covered bridges, and Kansas is like the opening scenes out of the Wizard of Oz.  Some of the worst poverty pockets I've encountered were in deep rural New England, and the backwaters of places like Kansas can also be pretty grim.

Rural poverty and urban poverty are two very different animals, with very different causes. I'm not sure which one is worse. Just a hunch on my part, but it seems that urban poverty has more or less stabilized or at least is not getting much worse, while rural poverty is getting progressively worse. There are so many dead or dying small towns throughout the heartland. And it seems the only way to make any money as a small farmer is to be lucky enough to be near an expanding suburb so you can sell your land off to developers.

I think the point made by others is a good one: come a peak oil meltdown, don't expect that you will be saved by moving to a rural area.