
Do Small Reductions in Caloric Intake Add Up to Big Changes in Weight?

The answer is: probably not.

This is an important question because there are many studies finding that various interventions (from fat taxes to menu labels) have very small (though sometimes statistically significant) effects on caloric intake. Proponents of the policies are often undeterred - and say things like "well, a 20 kcal reduction every day can really add up to big weight loss over time."

As I've already discussed, some of this sort of analysis is based on the faulty logic that 3,500 kcal = 1 lb. But, as was mentioned in that post, our bodies do not respond linearly to caloric changes in the way this formula implies.

Now, there's more on this topic from Trevor Butterworth in a well-written post with a catchy title: Sex And Lies! The Iffy Science Of Measuring Calories. Here is a key excerpt:

Hall was responsible for filling in the crucial measurements that elucidated one of the most widespread myths highlighted by Allison et al.: the idea that small, consistent changes in energy intake or expenditure will, over time, lead to large changes in weight. The assumption appears to have been based on the 1958 calculation by Max Wishnofsky that one pound of body fat gained or lost is equal to 3,500 kilocalories. This seemed to give people a convenient way to estimate weight loss through diet or exercise, while promising extremely convenient results. If you simply knocked off a 100 kilocalories from your energy intake each day—a ten-minute jog, or a mile walk—you'd end up losing over 50 pounds in five years. Little wonder that early proposals for soda and fat taxes promised to save Americans from themselves: pay a little more, consume a little less, watch a lot of weight disappear in a few years.
Hall first heard the claim listening to a dietician make a calculation for an obese patient. His intuition told him that this calculation was incorrect and would lead to exaggerated weight loss predictions. When he asked for a reference, he was pointed to a nutrition and dietetics textbook. "I subsequently found the mistake everywhere I looked." People weren't stopping to think "about the dynamic interaction between energy intake and expenditure, which is complicated," he says. What they failed to take into account was that "the rate of weight loss changes over time and is primarily determined by the imbalance between energy intake and expenditure—a value that also changes over time." To radically simplify his model, this means that cutting calories in your diet leads to a decreasing calorie expenditure, which in turn slows weight loss until weight eventually plateaus after a few years. "Of course," says Hall, "cheating on your diet will cause your weight to plateau much sooner." In the case of soda taxes, Hall and researchers at the US Department of Agriculture showed how static modeling overstated weight loss by 346 percent after five years.
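(For reference, the "over 50 pounds" figure comes straight from running the static 3,500-kcal rule forward: 100 kcal/day x 365 days/year x 5 years = 182,500 kcal, and 182,500 / 3,500 ≈ 52 lbs. It is exactly this kind of linear extrapolation that Hall's dynamic model shows to be far too optimistic.)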

Does Sugar Consumption Drive Diabetes?

A recent article in the journal PLoS ONE by the anti-sugar crusader Robert Lustig and three co-authors has created quite a stir by purporting to show that increased sugar consumption causes diabetes. In the paper, the authors stop just shy of saying "cause," but that is the inference drawn by many in the media (see, for example, this story in Bloomberg, among other places), who say things like:

Excessive sugar consumption may be the main driver of a global rise in diabetes,

Moreover, on Mark Bittman's NYT Blog, the study's author, Lustig, is quoted as saying:

This study is proof enough that sugar is toxic. Now it’s time to do something about it.

There is no way a study like this (comparing differences across countries) can firmly establish causation. So, at a minimum, the study indicates an interesting (and perhaps suggestive) correlation that might warrant a randomized controlled trial. Nonetheless, I was intrigued and wanted to check out the evidence for myself.

The evidence from Lustig and colleagues comes from linking data on diabetes prevalence rates across countries (which I was able to easily find online here) with data from the UN FAO on the availability of calories from different foodstuffs in different countries (after a bit of digging, I was also able to find it online here - go to the "food balance sheets"). After a bit of effort, I downloaded both data sets for the most recent years available, merged them, and checked out the claims made in the paper.

At first blush, I find very similar results to the ones reported in the paper. Holding constant total calories available, a simple linear regression shows that for every 100 kcal increase in sugar availability, the prevalence of diabetes goes up by 1.3 percentage points (say, from 8.5% (the sample mean) to 9.8%). The estimated equation is:

(% with diabetes) = 1.067 + 0.013*(per-capita available sugar kcal) + 0.001*(per-capita total available kcal)

My estimate is a little higher than the one reported in the paper probably because I'm not controlling for other factors (like GDP, kcal intake from meat, etc.) as the authors did.  Moreover, I'm using data on diabetes from 2012 whereas the authors used 2011 and older data (note: I use data from 174 countries in my estimates).  The only coefficient significant at the p=0.05 level in the above equation is the 0.013 estimate associated with sugar.   
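For anyone who wants to tinker with the data themselves, here is a rough sketch of the sort of SAS steps involved (the dataset and variable names below are illustrative placeholders, not the actual file names):

/* Merge the FAO food balance sheet data with the diabetes prevalence data */
/* by country, then fit the simple (unweighted) regression reported above. */
proc sort data=fao_calories; by country; run;
proc sort data=diabetes_prev; by country; run;

data combined;
   merge fao_calories(in=a) diabetes_prev(in=b);
   by country;
   if a and b; /* keep only countries appearing in both data sources */
run;

proc reg data=combined;
   model pct_diabetes = sugar_kcal total_kcal;
run;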

So far so good - the correlation is confirmed.  

But let's get to the nitty gritty of the interpretation. The data are at the country level. So, what this implies is that a country that increases per-capita sugar availability by 100 kcal will tend to have a 1.3 percentage point increase in the percent of the population with diabetes.

But, we don't really care about countries per se. We care about people. There are a lot more people in some countries than others. In the data set, the range runs from a low of 0.00066 million adults (about 660 adults) to 980 million adults. Shouldn't this factor into the analysis? If we care about how many people in the world have diabetes, we'd better pay a lot more attention to China than to Luxembourg.

We know from the mini-scandal associated with the claim that small schools outperform larger ones (see one account here) that outcomes from small schools (or small countries) tend to be a lot more variable (with more outliers) than outcomes from large schools (or large countries). That's just basic statistics.

Intuitively, we should want a larger country to count more than a smaller one. After all, there are many more people in larger countries - so if we want to think about the prevalence of diabetes in the world (rather than the average prevalence rate across countries), we'd want to calculate a weighted average, where larger countries get more weight (because they have more people). The more people, the higher the weight.
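To put that in the same notation as the regression equations, the world prevalence is:

(world % with diabetes) = [pop_1*(% with diabetes in country 1) + pop_2*(% with diabetes in country 2) + ... + pop_N*(% with diabetes in country N)] / (pop_1 + pop_2 + ... + pop_N)

where pop_i is country i's adult population, so each country's rate counts in proportion to how many people live there.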

Likewise, when we want to run analyses like the one above, we want to give more weight to countries with more people. We can do this by running a weighted regression, where each country gets a weight proportional to its population size. This converts the equation from one about how countries differ to one about how individuals differ. Stated differently, the weighted regression places the estimates at the level of the individual (picked at random from any country) rather than the level of the country (picked at random from a group of countries).
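In SAS, this amounts to nothing more than adding a WEIGHT statement to the regression sketched above (again, the names are placeholders):

proc reg data=combined;
   model pct_diabetes = sugar_kcal total_kcal;
   weight adult_pop; /* each country weighted by its adult population */
run;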

Here is the equation I get when I weight by a country's adult population:

(% with diabetes) = 0.692 + 0.002*(per-capita available sugar kcal) + 0.002*(per-capita total available kcal)

Now the effect of sugar falls dramatically (and, most importantly, it is no longer statistically significant at standard levels; the p-value is 0.074). A 100 kcal increase in per-capita sugar availability only increases the % with diabetes by 0.2 percentage points (rather than 1.3 as previously estimated). Moreover, total energy from all sources is now significant and of roughly the same magnitude as sugar. Thus, what matters in this framework is total kcal from any food source; this regression suggests that a sugar calorie is roughly the same as any other calorie insofar as diabetes is concerned.

The paper at PLoS ONE says "regressions are population weighted." But I'm wondering whether that is indeed the case. It could be true - I don't have access to all their data, and I'm not including all their controls.

I'm happy to share the data and SAS code with anybody who cares to see it.

********

Addendum

The nice thing about the web is that you get feedback. Here's an update. The source that reports diabetes prevalence actually reports three measures. In the regressions above, I used national prevalence (the total number with diabetes divided by the total population). However, as indicated at the data source here, they also report an age-adjusted measure that is likely more useful for comparing across countries that might have different mean ages.

When I use this "IGT comparative prevalence" measure, as they call it, I get exactly the opposite of the results mentioned above. When the data are NOT weighted, the sugar coefficient is only 0.0019 (p-value 0.27). But when the data ARE weighted by adult population, the sugar coefficient is 0.01277 (p-value < 0.001).

So, there is an interesting mix of things going on here among the population measure, the weighting, and the age adjustment. Just out of curiosity, and as a robustness check, I did two things. First, I re-ran the "preferred" model with population weighting using the "IGT comparative prevalence" measure of diabetes but included population as an explanatory variable. When I do this, sugar is no longer statistically significant (the estimate is 0.00242 with a p-value of 0.107), but population is (the estimate suggests larger populations have lower diabetes prevalence). I can't quite figure out what is going on here, but there may be something odd about weighting by population when the dependent variable (and independent variables) are per-capita (i.e., divided by population); that might be producing some unexpected results.
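For reference, that robustness check looks something like the following in SAS (placeholder names again), with adult population doing double duty as both the weight and a regressor:

proc reg data=combined;
   model igt_prev = sugar_kcal total_kcal adult_pop; /* population added as a regressor */
   weight adult_pop;                                 /* ...while still serving as the weight */
run;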

Second, I ran a quantile regression to see how the results hold up at the median (rather than the mean, which is more sensitive to outliers). Using IGT comparative prevalence as the dependent variable, adult population as the weight, and only sugar and total calories as explanatory variables, I find that the sugar effect at the median is 0.0148, but the 95% confidence interval is (-0.0191, 0.0217) when using the SAS default rank method of calculating standard errors. The 95% confidence interval changes to (0.0041, 0.0254) when using an alternative resampling method. So, whether the median effect is statistically significant depends on which method of calculating standard errors is used.
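The median (0.5 quantile) regression can be run with PROC QUANTREG; the CI= option on the PROC statement is what switches between the rank-based and resampling-based confidence intervals (the names below are placeholders):

proc quantreg data=combined ci=rank; /* the rank method referred to above */
   model igt_prev = sugar_kcal total_kcal / quantile=0.5; /* median regression */
   weight adult_pop;
run;

proc quantreg data=combined ci=resampling; /* the alternative resampling method */
   model igt_prev = sugar_kcal total_kcal / quantile=0.5;
   weight adult_pop;
run;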

Here are the plots of the "sugar effect" at each quantile. The first shows the 95% confidence intervals determined by the resampling method and the second uses the SAS default (I have to admit that I'm not sure which method is preferred in this case).

[Figure: sugarquantile.JPG - sugar effect by quantile with resampling-based confidence intervals]
[Figure: sugarquantile2.JPG - sugar effect by quantile with rank-based (SAS default) confidence intervals]

Our Research on Menu Labeling

For the past couple of years, I've been working with one of my former graduate students, Brenna Ellison at the University of Illinois, on some papers related to the effects of calorie labels on menus (for those who may not be aware, "Obamacare" mandated that chain restaurants include such menu labels, but the FDA has yet to release the final rules and implementation date).

The first part of that research was finally published last week in the International Journal of Behavioral Nutrition and Physical Activity, where we report on a smaller sample from the larger study: the group of people Brenna interviewed after they ordered. The results have been picked up by a couple of news outlets, including this one from Reuters:

Showing diners how many calories are in restaurant food items may influence how much they eat - especially among the least health-conscious people, a new study suggests.

That's true - but only partially true. We find that the numeric labels mandated by "Obamacare" do not have a statistically significant effect on the number of calories people order. The labels that we find to be (somewhat) effective are stop-light labels, in which we put a red dot next to high-calorie foods, a yellow dot next to medium-calorie foods, and a green dot next to low-calorie foods. As the story suggests, the labels are less influential among people we rate (based on their survey answers) as health conscious. The result isn't terribly surprising - people who are health conscious are probably already familiar with the caloric content of the foods they eat, and as such, adding labels is unlikely to provide new information. Still, we'd want to know something about the cost of the labels to know whether the policy is a net plus (this is an issue we take up in our other papers still in the works).

The result I found most interesting from the whole study, which was only discussed in the conclusions (and was missed altogether in the news story), is the following:

Interestingly, despite the calorie+traffic light label’s effectiveness at reducing calories ordered, it was not the labeling format of choice. When asked which of the three labeling formats was preferred, only 27.5% of respondents said they wanted to see the calorie+traffic light label on their menus. Surprisingly, 42% preferred the calorie-only label which had virtually no influence on ordering behavior. These responses imply diners may want more information on their menus (the number of calories), yet diners do not want to be told what they should or should not consume (i.e., green = good, red = bad).

Summer School on Experimental Auctions

Pardon the public service announcement, but I wanted to let readers know that applications are now being accepted for a summer school on Experimental Auctions that I've co-taught in Italy for the past two years. Experimental auctions are a technique used to measure consumer willingness-to-pay for new food products, which in turn is used to project demand, market share, and the benefits/costs of public policies. We've had a fantastic time the past two years, and I'm looking forward to the third, which was just approved for credit hours by the University of Bologna. The content is mainly targeted toward graduate students and early-career professionals (or marketing researchers interested in learning about a new technique). You can find out more here and register here.

For a little enticement, here are some pictures of the previous years' classes.

[Photo: SS_EA2011_Bertinoro1.JPG]
[Photo: 2012-09-06 20.47.38.jpg]

The danger of making public policy based on epidemiological studies

Scientific American recently ran an interesting story on antioxidants. For a while, it seems, experts promoted antioxidants based on epidemiological studies that seemed to suggest they increased longevity. It is a good thing these experts didn't convince policy makers to subsidize or mandate more vitamins and antioxidants in food years ago (although we do have mandated vitamin D milk and iodine in salt), only to discover this:

Vitamins Kill: Epidemiological studies show that people who eat lots of fruits and vegetables, which are rich in vitamins and other antioxidants, tend to live longer and are less likely to develop cancer compared with those who do not. So it seemed obvious that supplementing diet with antioxidants should lead to better health. But the results of the most rigorously designed studies do not support that assumption. Indeed, the evidence shows that some people who take certain supplements are actually more likely to develop life-threatening illnesses, such as lung cancer and heart disease.

There are many epidemiological studies showing correlations across people between the intake of one food (e.g., meat, chocolate, blueberries, wine) and some undesirable or desirable health outcome (e.g., cancer, heart disease, longevity). But it cannot be repeated enough: correlation is not causation.