Organic Food Consumption and Cancer

A couple of days ago, JAMA Internal Medicine published a paper looking at the relationship between stated levels of organic food consumption and cancer among a sample of 68,946 French consumers.

The paper, and the media coverage of it, is frustrating on many fronts, and it is symptomatic of what is wrong with so many nutritional and epidemiological studies that rely on observational, self-reported data without a clear strategy for identifying causal effects. As I wrote a couple years ago:

Fortunately, economics (at least applied microeconomics) has undergone a bit of a credibility revolution. If you attend a research seminar in virtually any economics department these days, you’re almost certain to hear questions like, “what is your identification strategy?” or “how did you deal with endogeneity or selection?” In short, the question is: how do we know the effects you’re reporting are causal effects and not just correlations?

It’s high time for a credibility revolution in nutrition and epidemiology.

Yes, yes, the title of the paper says “association” not “causation.” But, of course, that didn’t prevent the authors - in the abstract - from concluding, “promoting organic food consumption in the general population could be a promising preventive strategy against cancer” or CNN from running a headline that says, “You can cut your cancer risk by eating organic.”

So, first, how might this be only correlation and not causation? People who consume organic foods are likely to differ from people who do not in all sorts of ways that might also affect health outcomes. As the authors clearly show in their own study, people who say they eat a lot of organic food are higher income, are better educated, are less likely to smoke and drink, eat much less meat, and have overall healthier diets than people who say they never eat organic. The authors try to “control” for these factors in a statistical analysis, but there are two problems with this. First, the devil is in the details: how these confounding factors are measured, and how they interact, can significantly affect the estimates. Second, and more fundamentally, some relevant “controls” are missing altogether - things like overall health consciousness, risk aversion, and social conformity. These unobserved factors are likely to be highly correlated with both organic food consumption and cancer risk, and thus the estimated effect of organic is likely biased. There are many examples of this sort of endogeneity bias, and failure to think carefully about how to handle it can lead to effects that are under- or over-estimated and can even reverse the sign of the effect.

To illustrate, suppose an unmeasured variable like health consciousness is driving both organic purchases and cancer risk. A highly health-conscious person is going to undertake all sorts of activities that might lower cancer risks - seeing the doctor regularly, taking vitamins, being careful about their diet, reading new dietary studies, exercising in certain ways, etc. And, such a person might also eat more organic food, thus the correlation. The point is that even if such a highly health-conscious person weren’t eating organic, they’d still have lower cancer risk. It isn’t the organic causing the lower cancer risk. Or stated differently, if we took a highly health-UNconscious person and forced them to eat a lot of organic, would we expect their cancer risk to fall? If not, this is correlation and not causation.
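This confounding story is easy to demonstrate with a toy simulation (this is not a reanalysis of the JAMA data; all coefficients below are made up for illustration). We build a world where organic consumption has zero true effect on cancer risk, yet an unobserved "health consciousness" variable makes the two strongly correlated:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Unobserved health consciousness drives BOTH behaviors.
health = rng.normal(size=n)

# Organic consumption rises with health consciousness; cancer risk
# falls with it. Organic has NO true causal effect on risk here.
organic = 0.8 * health + rng.normal(size=n)
cancer_risk = -0.5 * health + rng.normal(size=n)

# Naive regression of cancer risk on organic consumption alone.
slope_naive = np.polyfit(organic, cancer_risk, 1)[0]

# Regression that controls for the confounder - possible here only
# because we simulated it; real studies can't observe it directly.
X = np.column_stack([organic, health, np.ones(n)])
slope_controlled = np.linalg.lstsq(X, cancer_risk, rcond=None)[0][0]

print(f"naive slope:      {slope_naive:.3f}")   # spuriously negative
print(f"controlled slope: {slope_controlled:.3f}")  # near zero, the truth
```

The naive regression finds a "protective effect" of organic food that vanishes once the confounder is controlled for - which is exactly the worry when the confounder cannot be measured.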

Ideally, we’d like to conduct a randomized controlled trial (RCT) (randomly feed one group a lot of organic and another group none and compare outcomes), but these types of studies can be very expensive and time-consuming. Fortunately, economists and others have come up with creative ways to try to address the unobserved variable and endogeneity issues that get us closer to the RCT ideal, but I see no effort on the part of these authors to take these issues seriously in their analysis.

Then, there are all sorts of worrying details in the study itself. Organic food consumption is a self-reported variable measured in a very ad-hoc way. People were asked if they consumed organic most of the time (two points), occasionally (one point), or never (no points), and this was summed across 16 different food categories ranging from fruits to meats to vegetable oils. Curiously, when the authors limit their organic food variable to only plant-based sources (presumably because this is where pesticide risks are most acute), the effects for most cancers diminish. It is also curious that there wasn’t always a “dose response” relationship between organic consumption scores and cancer risk. Also, when the authors limit their analysis to particular sub-groups (like men), the relationship between organic consumption and cancer disappears. Tamar Haspel, a food and agricultural writer for the Washington Post, delves into some of these issues and more in a Tweet-storm.

Finally, even if the estimated effects are “true”, how big and consequential are they? The authors studied 68,946 people, 1,340 of whom were diagnosed with cancer at some point during the approximately 6-year study. So, the baseline chance of getting any type of cancer was (1,340/68,946)*100 = 1.9%, or roughly 2 people out of 100. Now, let’s look at the case where the effects seem to be the largest and most consistent across the various specifications, non-Hodgkin lymphoma (NHL). There were 47 cases of NHL, meaning there was a (47/68,946)*100 = 0.068% overall chance of getting NHL in this population over this time period. In the first and second (lowest) quartiles of organic food scores, 15 and 14 people, respectively, had NHL, and 16 people in the third quartile had NHL. In the highest quartile of stated organic food scores, the number of people with NHL dropped to only 2. After making various statistical adjustments, the authors calculate a “hazard ratio” of 0.14 for the highest vs. lowest quartiles of organic food consumption, meaning there was a whopping 86% reduction in risk. But what does that mean relative to the baseline? It means going from a risk of 0.068% to a risk of 0.068%*0.14 = 0.01%, or from about 7 in 10,000 to 1 in 10,000. To put these figures in perspective, the overall likelihood of someone in the population dying in a car accident next year is about 1.25 in 10,000, and about 97 in 10,000 over the course of a lifetime. The one-year and lifetime risks of dying from a fall on stairs or steps are about 0.07 in 10,000 and 5.7 in 10,000, respectively.
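The arithmetic behind these figures can be checked in a few lines (the counts and hazard ratio are those reported above; nothing new is estimated):

```python
# Back-of-the-envelope check of the risk figures quoted above.
participants = 68_946
cancer_cases = 1_340
nhl_cases = 47
hazard_ratio = 0.14  # highest vs. lowest quartile of organic scores

baseline_any = cancer_cases / participants * 100   # percent, any cancer
baseline_nhl = nhl_cases / participants * 100      # percent, NHL
adjusted_nhl = baseline_nhl * hazard_ratio         # percent, NHL at HR 0.14

print(f"any cancer: {baseline_any:.2f}% ({baseline_any * 100:.0f} in 10,000)")
print(f"NHL:        {baseline_nhl:.3f}% ({baseline_nhl * 100:.1f} in 10,000)")
print(f"NHL at HR:  {adjusted_nhl:.4f}% ({adjusted_nhl * 100:.1f} in 10,000)")
```

Even an 86% relative risk reduction moves the absolute NHL risk only from roughly 7 in 10,000 to roughly 1 in 10,000, which is the point of the comparison to car accidents and falls.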

In sum, I’m not arguing that eating more organic food can’t be causally related to reduced cancer risk, especially given the plausible causal mechanisms. Rather, I’m arguing that this particular study doesn’t go very far toward answering that fundamental question. And if we ultimately arrive at better estimates from studies that take causal identification seriously, and those estimates reverse these findings, we will have undermined consumer trust by promoting these types of studies (just ask people whether they think eggs, coffee, chocolate, or blueberries increase or reduce the odds of cancer or heart disease).

Dealing with Lazy Survey Takers

A tweet by @thefarmbabe earlier this week has renewed interest in a survey result of mine from back in January 2015, where we found more than 80% of survey respondents said they wanted mandatory labels on foods containing DNA. For interested readers, see this discussion of the result, a follow-up survey where the question was asked in a different way with essentially the same result, or this peer-reviewed journal article with Brandon McFadden where we found basically the same result in yet another survey sample. No matter how we asked the question, it seems about 80% of survey respondents say they want to label foods because they contain DNA.

All this is probably good motivation for this recent study that Trey Malone and I just published in the journal Economic Inquiry. While there are many possible reasons for the DNA-label results (as I discussed here), one possibility is that survey takers aren’t paying very close attention to the questions being asked.

One method that’s been around a while to control for this problem is to use a “trap question” in a survey. The idea is to “trap” inattentive respondents by making it appear one question is being asked, when in fact - if you read closely - a different question is asked. Here are two of the trap questions we studied.


About 22% missed the first trap question (they did not click “high” for the last item in figure 2A), and about 25% missed the second (they clicked an emotion rather than “none of the above” in question 2B). So far, this isn’t all that new.

Trey’s idea was to prompt people who missed the trap question. Participants who incorrectly responded were given the following prompt, “You appear to have misunderstood the previous question. Please be sure to read all directions clearly before you respond.” The respondent then had the chance to revise their answers to the trap question they missed before proceeding to the rest of the survey. Among the “trapped” respondents, about 44% went back and correctly answered the first question, whereas about 67% went back and correctly answered the second question. Thus, this “nudge” led to an increase in attentiveness among a non-trivial number of respondents.
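Using the figures above, the share of respondents who remain inattentive even after the prompt works out as follows (a quick back-of-the-envelope check, not a reanalysis of the survey data):

```python
# Share of respondents still inattentive after the revision prompt,
# using the figures reported above for each trap question.
missed = [0.22, 0.25]    # initially missed trap questions 1 and 2
revised = [0.44, 0.67]   # of those, share who corrected after the prompt

# Still inattentive = missed initially AND failed to revise correctly.
still_inattentive = [m * (1 - r) for m, r in zip(missed, revised)]

for i, share in enumerate(still_inattentive, start=1):
    print(f"trap question {i}: {share:.1%} of respondents still inattentive")
```

So the prompt cuts the inattentive share from roughly a fifth or a quarter of the sample down to the 8-12% range, depending on the question.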

After the trap questions and potential prompts, respondents subsequently answered several discrete choice questions about which beer brands they’d prefer at different prices. Here are the key findings:

We find that individuals who miss trap questions and do not correctly revise their responses have significantly different choice patterns as compared to individuals who correctly answer the trap question. Adjusting for these inattentive responses has a substantive impact on policy impacts. Results, based on attentive participant responses, indicate that a minimum beer price would have to be substantial to substantially reduce beer demand.

In our policy simulations, we find a counter-intuitive result - a minimum beer price (as implemented in some parts of the UK) might actually increase alcohol consumption as it leads to a substitution from lower to higher alcohol content beers.

In another paper in the European Review of Agricultural Economics that was published back in July, Trey and I proposed a different, yet easy-to-interpret measure of (and way to fix) inattention bias in discrete choice statistical models.

Taken together, these papers show that inattention is a significant problem in surveys, and that adjusting results for inattention can substantively alter one’s results.

We haven’t yet done a study of whether people who say they want DNA labels are more or less likely to miss trap questions or exhibit other forms of inattention bias, but that seems a natural question to ask. Still, inattention can’t be the full explanation for absurd label preferences. We’ve never found inattention bias as high as the level of support for mandatory labels on foods indicating the presence/absence of DNA.

New Published Research

I've had several new papers published in the last month or so that I haven't had a chance to discuss here on the blog.  So, before I forget, here's a short list.

  • What to Eat When Having a Millennial over for Dinner with Kelsey Conley was published in Applied Economic Perspectives and Policy.  We found Millennials have higher demand for cereal, beef, pork, poultry, eggs, and fresh fruit and lower demand for “other” food, and for food away from home relative to what would have been expected from the eating patterns of the young and old 35 years prior.  I'd previously blogged about an earlier version of this paper.
  • A simple diagnostic measure of inattention bias in discrete choice models with Trey Malone in the European Review of Agricultural Economics. Measuring the "fit" of discrete choice models has long been a challenge, and in this paper, we suggest a simple, easy-to-understand measure of inattention bias in discrete choice models. The metric, ranging from 0 to 1, can be compared across studies and samples.
  • Mitigating Overbidding Behavior using Hybrid Auction Mechanisms: Results from an Induced Value Experiment with David Ortega, Rob Shupp, and Rudy Nayga in Agribusiness.  Experimental auctions are a popular and useful tool in understanding demand for food and agricultural products. However, bidding behavior often deviates from theoretical predictions in traditional Vickrey and Becker–DeGroot–Marschak (BDM) auction mechanisms. We propose and explore the bidding behavior and demand revealing properties of a hybrid first price‐Vickrey auction and a hybrid first price‐BDM mechanism. We find the hybrid first price‐Vickrey auction and hybrid first price‐BDM mechanism significantly reduce participants’ likelihood of overbidding, and on average yield bids closer to true valuations. 



Measuring Beef Demand

There has been a lot of negative publicity about the health and environmental impacts of meat eating lately.  Has this reduced consumers' demand for beef?  Commodity organizations like the Beef Board run ads like "Beef. It's What's for Dinner."  Have these ads increased beef demand?  To answer these sorts of questions, one needs a measure of consumer demand for beef.  In my FooDS project, I try to measure this by using consumers' willingness-to-pay for meat cuts over time.  But, there are other ways.

I just ran across this fascinating report Glynn Tonsor and Ted Schroeder wrote on beef demand.  At the outset, they explain their overall approach.

One way to synthesize beef demand is through construction of an index that measures and tracks changes in demand over time. An index is appealing because it provides an easy to understand, single-measure indicator of beef demand change over time. A demand index can be created by inferring the price one would expect to observe if demand was unchanged with that experienced in a base year (Tonsor, 2010). The “inferred” constant-demand price is compared to the beef price actually transpiring in the marketplace to indicate changes in underlying demand. If the realized beef price is higher (lower) than what is expected if demand were constant, economists say demand has increased (decreased) by the percentage difference detected. Applying this approach to publically available annual USDA aggregate beef disappearance and BLS retail price data provides information such as contained in Figure 1 indicating notable demand growth between 2010 and 2015 based upon existing indices currently maintained at Kansas State University.
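The inferred-price logic in the quote can be sketched in a few lines. This assumes a constant-elasticity demand curve; the elasticity value and the price/quantity numbers below are purely illustrative and are not taken from the Tonsor-Schroeder report:

```python
def demand_index_change(price_base, qty_base, price_t, qty_t, elasticity=-0.5):
    """Percent change in demand implied by comparing the actual price
    to the price expected if demand were unchanged from the base year.

    Assumes constant-elasticity demand; the default elasticity is
    illustrative, not a value from the report.
    """
    # Price we'd expect at quantity qty_t if demand hadn't shifted,
    # i.e., a movement along the base-year demand curve.
    inferred_price = price_base * (qty_t / qty_base) ** (1 / elasticity)
    # Actual price above (below) the inferred price => demand grew (fell).
    return (price_t - inferred_price) / inferred_price * 100

# Hypothetical example: per-capita beef disappearance fell 5% while
# the real retail price rose 10% relative to the base year.
change = demand_index_change(5.00, 55.0, 5.50, 52.25)
print(f"implied demand change: {change:+.1f}%")
```

In this made-up example, a 5% quantity decline would have required a price rise of nearly 11% along an unchanged demand curve; the actual 10% rise falls just short, so measured demand declines slightly.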

They then show the beef demand index that Glynn has been updating for several years now based on aggregate USDA data.

In their report, Tonsor and Schroeder show, however, that measures of beef demand depend greatly on: 1) the data source being used, 2) the cut of beef in question, and 3) consumers' region of residence.  For example, here is a different beef demand index based on data from restaurants (or the "food service sector") segmented into different types of beef.  You'll notice the pattern of results below differs quite a bit from the aggregate measure above.  And, whereas demand for steak fell during the recession, demand for ground beef rose.

Another interesting result from their study is that the commonly used retail beef price series reported by the Bureau of Labor Statistics doesn't always mesh well with what we learn from retail scanner data (in their case, data compiled by the company IRI).  Not only are BLS prices a biased estimate of scanner data prices, the bias isn’t constant over time.  In the report, Tonsor and Schroeder speculate a bit on why this is the case.  

In the near future, Glynn and I aim to compare my demand measures from FooDS with these demand measures. 

Worrying Trends with Farm Surveys

Response rates on [USDA-National Agricultural Statistics Survey] crop acreage and production surveys have been falling in recent decades (Ridolfo, Boone, and Dickey, 2013). From response rates of 80-85 percent in the early 1990s, rates have fallen below 60 percent in some cases (Figure 1). Of even greater concern, there appears to be an acceleration in the decline in the last 5 years or so, suggesting the possibility that this decline reflects a long-term permanent change.

That's from an interesting (yet worrying) article by USDA chief economist Robert Johansson, along with Anne Effland and Keith Coble, at farmdocdaily. 

Why does this matter?

Responses to these surveys form the basis of what we think we know about, for example, how much farmland is in production, how much corn vs. soybeans is planted in a given year, the extent to which wheat yields are trending upward, and more.  It's hard to overstate how much of what we think we know about the state of U.S. agriculture stems from these surveys.   For example, I used these data in my article in the New York Times to describe the gains in farm productivity over time;  economists use the data to try to predict the possible effects of climate change on crop yields and farm profitability; the data are used to try to figure out how farmers' planting decisions respond to changes in crop prices (which provide estimates of the elasticity of supply, which feed into various models that inform policy makers); and much more.

The concern with falling response rates is that the farmers who respond may differ from the ones who don't in ways that bias our understanding of crop acreage and production.  The authors write:  

Reduced response rates can potentially introduce bias or error to the estimates released by USDA. For example bias may occur if higher yielding farms drop out. Reduced response will almost assuredly introduce error to the estimates making them noisier and randomly more inaccurate. This will be most noticeable in county estimates.
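A small simulation makes the bias mechanism concrete. The setup is hypothetical: we assume response probability falls with yield (the "higher yielding farms drop out" case the authors mention), with made-up numbers throughout:

```python
import numpy as np

rng = np.random.default_rng(1)
n_farms = 50_000

# Hypothetical true county yields (bushels/acre).
yields = rng.normal(loc=175, scale=20, size=n_farms)

# Assumption: higher-yielding farms are less likely to respond.
response_prob = np.clip(0.9 - 0.002 * (yields - 175), 0.05, 0.95)
responded = rng.random(n_farms) < response_prob

true_mean = yields.mean()
survey_mean = yields[responded].mean()

print(f"true mean yield:   {true_mean:.1f} bu/acre")
print(f"survey mean yield: {survey_mean:.1f} bu/acre")
print(f"nonresponse bias:  {survey_mean - true_mean:+.1f} bu/acre")
```

Even a modest yield-response correlation produces a survey estimate that systematically understates the true mean, and in smaller county-level samples the added noise would be worse still.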

The authors go on to note that some farm program payments depend on county-level yield estimates (which, as the quote above notes, are now less reliable).  As such, this isn't just an academic curiosity, but an issue that could affect millions of taxpayer dollars.    

The problem of declining response rates isn't just with farmers.  This paper, appropriately titled "Household Surveys in Crisis", points out that it is an issue with other government surveys of households as well. These are the surveys that attempt to provide statistics on people's incomes, employment, and so forth.

The solutions to these problems are not obvious or easy.  Here is the authors' take:

Some research suggests that tailoring survey approaches to differing audiences within the survey population could improve response rates (Anseel et al., 2010). Other data sources like remote sensing, weather data, modeling, machine data, or integrated datasets may also be useful in providing additional information. NASS already makes use of some of these other data sources and methods in developing estimates, but as a supplement, not a replacement, for survey data. Further use of such sources is costly. For now, the best approach remains encouraging greater producer response.