Here at Greater Good, we cover research into social and emotional well-being, and we try to help people apply findings to their personal and professional lives. We are well aware that our business is a tricky one.
Summarizing scientific studies and applying them to people’s lives isn’t just difficult for the obvious reasons, like understanding and then explaining scientific jargon or methods to non-specialists. It’s also the case that context gets lost when we translate findings into stories, tips, and tools for a more meaningful life, especially when we push it all through the nuance-squashing machine of the Internet. Many people never read past the headlines, which by their nature tend to overgeneralize in order to provoke interest. Because our articles can never be as comprehensive as the original studies, they almost always omit some crucial caveats, such as limitations acknowledged by the researchers. To get those, you need access to the studies themselves.
And it’s very common for findings to seem to contradict each other. For example, we recently covered an experiment that suggests stress reduces empathy—after having previously discussed other research suggesting that stress-prone people can be more empathic. Some readers asked: Which one is correct? (You’ll find my answer here.)
But probably the most important missing piece is the future. That may sound like a funny thing to say, but, in fact, a new study is not worth the PDF it’s printed on until its findings are replicated and validated by other studies—studies that haven’t yet happened. An experiment is merely interesting until time and testing turn its finding into a fact.
Scientists know this, and they are trained to react very skeptically to every new paper. They also expect to be greeted with skepticism when they present findings. Trust is good, but science isn’t about trust. It’s about verification.
However, journalists like me, and members of the general public, are often prone to treat every new study as though it represents the last word on the question addressed. This particular issue was highlighted last week by—wait for it—a new study that tried to reproduce 100 prior psychological studies to see if their findings held up. The result of the three-year initiative is chilling: The team, led by University of Virginia psychologist Brian Nosek, got the same results in only 36 percent of the experiments they replicated. This has led to some predictably provocative, overgeneralizing headlines implying that we shouldn’t take psychology seriously.
I don’t agree.
Despite all the mistakes and overblown claims and criticism and contradictions and arguments—or perhaps because of them—our knowledge of human brains and minds has expanded dramatically during the past century. Psychology and neuroscience have documented phenomena like cognitive dissonance, identified many of the brain structures that support our emotions, and proved the placebo effect and other dimensions of the mind-body connection, among other findings that have been tested over and over again.
These discoveries have helped us understand and treat the true causes of many illnesses. I’ve heard it argued that rising rates of diagnoses of mental illness constitute evidence that psychology is failing, but in fact, the opposite is true: We’re seeing more and better diagnoses of problems that would have compelled previous generations to dismiss people as “stupid” or “crazy” or “hyper” or “blue.” The important thing to bear in mind is that it took a very, very long time for science to come to these insights and treatments, following much trial and error.
Science isn’t a faith, but rather a method that takes time to unfold. That’s why it’s equally wrong to uncritically embrace everything you read, including what you are reading on this page.
Given the complexities and ambiguities of the scientific endeavor, is it possible for a non-scientist to strike a balance between wholesale dismissal and uncritical belief? Are there red flags to look for when you read about a study on a site like Greater Good or in a popular self-help book? If you do read one of the actual studies, how should you, as a non-scientist, gauge its credibility?
I drew on my own experience as a science journalist, and surveyed my colleagues here at the UC Berkeley Greater Good Science Center. We came up with 10 questions you might ask when you read about the latest scientific findings. These are also questions we ask ourselves before we cover a study.
1. Did the study appear in a peer-reviewed journal?
Peer review—submitting papers to other experts for independent review before acceptance—remains one of the best ways we have for ascertaining the basic seriousness of the study, and many scientists describe peer review as a truly humbling crucible. If a study didn’t go through this process, for whatever reason, it should be taken with a much bigger grain of salt.
2. Who was studied, where?
Animal experiments tell scientists a lot, but their applicability to our daily human lives is limited. Similarly, if researchers only studied men, the conclusions might not be relevant to women, and vice versa.
This was actually a huge problem with Nosek’s effort to replicate other people’s experiments. In trying to replicate one German study, for example, they had to use different maps (ones that would be familiar to University of Virginia students) and change a scale measuring aggression to reflect American norms. This kind of variance could explain the different results. It may also suggest the limits of generalizing the results from one study to other populations not included within that study.
As a matter of approach, readers must remember that many psychological studies rely on WEIRD (Western, educated, industrialized, rich and democratic) samples, mainly college students, which creates an in-built bias in the discipline’s conclusions. Does that mean you should dismiss Western psychology? Of course not. It’s just the equivalent of a “Caution” or “Yield” sign on the road to understanding.
3. How big was the sample?
In general, the more participants in a study, the more valid its results. That said, a large sample is sometimes impossible or even undesirable for certain kinds of studies. This is especially true in expensive neuroscience experiments involving functional magnetic resonance imaging, or fMRI, scans.
And many mindfulness studies have scanned the brains of people with many thousands of hours of meditation experience—a relatively small group. Even in those cases, however, a study that looks at 30 experienced meditators is probably more solid than a similar one that scanned the brains of only 15.
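Why does a bigger sample produce a more solid result? One reason is that the uncertainty around an estimate shrinks with the square root of the number of participants. Here is a minimal simulation sketching that idea; the “effect” and noise levels are made-up numbers for illustration, not values from any study mentioned above.

```python
import random
import statistics

random.seed(42)

def simulate_study(n, true_effect=0.5, noise_sd=1.0):
    """Simulate one study: each of n participants yields the true effect plus noise."""
    scores = [random.gauss(true_effect, noise_sd) for _ in range(n)]
    mean = statistics.mean(scores)
    # The standard error of the mean shrinks with the square root of n,
    # so quadrupling the sample roughly halves the uncertainty.
    sem = statistics.stdev(scores) / n ** 0.5
    return mean, sem

for n in (15, 30, 120):
    mean, sem = simulate_study(n)
    print(f"n={n:>3}: estimated effect = {mean:.2f} +/- {sem:.2f}")
```

Running this shows the estimate tightening as the sample grows, which is why a scan of 30 meditators generally inspires more confidence than a scan of 15.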
4. Did the researchers control for key differences?
Diversity and gender balance aren’t necessarily virtues in a research study; it’s actually a good thing when a study population is as homogeneous as possible, because it allows the researchers to limit the number of differences that might affect the result. A good researcher tries to compare apples to apples, and control for as many differences as possible in her analysis.
5. Was there a control group?
One of the first things to look for in methodology is whether the sample was randomized and whether there was a control group; this is especially important if a study is to suggest that a certain variable might actually cause a specific outcome, rather than just be correlated with it (see next point).
For example, were some in the sample randomly assigned a specific meditation practice while others weren’t? If the sample is large enough, randomized trials can produce solid conclusions. But, sometimes, a study will not have a control group because it’s ethically impossible. (Would people still divert a trolley to kill one person in order to save five lives, if their decision killed a real person, instead of just being a thought experiment? We’ll never know for sure!)
The conclusions may still provide some insight, but they need to be kept in perspective.
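The mechanics of random assignment are simple, which is part of its power. The sketch below uses invented participant IDs, not data from any study discussed here; the point is that shuffling before splitting gives everyone an equal chance of landing in either group, so pre-existing differences spread evenly on average.

```python
import random

random.seed(7)

# Hypothetical participant IDs for illustration only.
participants = [f"participant_{i}" for i in range(1, 41)]

# Shuffle, then split down the middle: every person is equally likely
# to end up in either group, regardless of when they signed up.
random.shuffle(participants)
midpoint = len(participants) // 2
treatment_group = participants[:midpoint]  # e.g., assigned a meditation practice
control_group = participants[midpoint:]    # continues life as usual

print(len(treatment_group), len(control_group))
```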
6. Did the researchers establish causality, correlation, dependence, or some other kind of relationship?
I often hear “Correlation is not causation” shouted as a kind of battle cry, to try to discredit a study. But correlation—the degree to which two or more measurements seem to change at the same time—is important, and is one step in eventually finding causation—that is, establishing that a change in one variable directly triggers a change in another.
The important thing is to correctly identify the relationship.
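A small simulation makes the distinction concrete. In this entirely hypothetical scenario (not drawn from any study cited here), a hidden confounder—say, overall stress level—drives both coffee consumption and anxiety scores. Coffee never causes anxiety in the simulation, yet the two end up correlated.

```python
import random

random.seed(0)

n = 1000
# The hidden confounder: each person's underlying stress level.
stress = [random.gauss(0, 1) for _ in range(n)]
# Both measurements are driven by stress, plus independent noise.
coffee = [s + random.gauss(0, 1) for s in stress]
anxiety = [s + random.gauss(0, 1) for s in stress]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

print(f"correlation(coffee, anxiety) = {pearson(coffee, anxiety):.2f}")
```

The correlation comes out well above zero even though neither variable influences the other, which is exactly why identifying the right kind of relationship matters.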
7. Is the journalist, or even the scientist, overstating the result?
Language that suggests a fact is “proven” by one study or which promotes one solution for all people is most likely overstating the case. Sweeping generalizations of any kind often indicate a lack of humility that should be a red flag to readers. A study may very well “suggest” a certain conclusion but it rarely, if ever, “proves” it.
This is why we use a lot of cautious, hedging language in Greater Good, like “might” or “implies.”
8. Is there any conflict of interest suggested by the funding or the researchers’ affiliations?
A recent study found that you could drink lots of sugary beverages without fear of getting fat, as long as you exercised. The funder? Coca-Cola, which eagerly promoted the results. This doesn’t mean the results are wrong. But it does suggest you should seek a second opinion.
9. Does the researcher seem to have an agenda?
Readers could understandably be skeptical of mindfulness meditation studies promoted by practicing Buddhists or experiments on the value of prayer conducted by Christians. Again, it doesn’t automatically mean that the conclusions are wrong. It does, however, raise the bar for peer review and replication. For example, it took hundreds of experiments before we could begin saying with confidence that mindfulness can indeed reduce stress.
10. Do the researchers acknowledge limitations and entertain alternative explanations?
Is the study focused on only one side of the story or one interpretation of the data? Has it failed to consider or refute alternative explanations? Do the researchers demonstrate awareness of which questions their methods can answer and which they can’t?
I summarize my personal stance as a non-scientist toward scientific findings as this: Curious, but skeptical. I take it all seriously and I take it all with a grain of salt. I judge it against my experience, knowing that my experience creates bias. I try to cultivate humility, doubt, and patience. I don’t always succeed; when I fail, I try to admit fault and forgive myself. My own understanding is imperfect, and I remind myself that one study is only one step in understanding. Above all, I try to bear in mind that science is a process, and that conclusions always raise more questions for us to answer.
– Jeremy Adam Smith is producer and editor of Greater Good, an online magazine based at UC Berkeley that highlights groundbreaking scientific research into the roots of compassion and altruism. He is also the author or coeditor of four books, including The Daddy Shift, Are We Born Racist?, and The Compassionate Instinct. Before joining the GGSC, Jeremy was a 2010-11 John S. Knight Journalism Fellow at Stanford University. Published here by courtesy of Greater Good.