# 2024-08-12 - How To Spot The Truth | |
## 1 INTRODUCTION | |
'Truth' is under attack, more so now than ever before, and for many reasons, one of which is social media. We hear and read remarkable, often preposterous claims from many sources. These may appear in political debate, in the presentation of new products, or in new health-enhancing practices ranging from hot water pools to cold water swimming. Such claims frequently purport to be 'scientific findings', often reported as 'new studies have shown' stories underpinned by 'expert' opinion. They are amplified in the media until the next fad comes along.
This pervasive form of persuasion is a war of beliefs, which in many cases may contradict accepted knowledge. It is always possible, in fact likely, that those making some of the more absurd claims are not engaging with, or even properly aware of, current scientific understanding; in that case the claims may be internally logical, but based on incorrect assumptions or understanding. Flat earthers have a consistent world view, which is probably logical to them; it just is not compatible with other known facts. But truth is the first casualty of war, and now more than ever we must equip ourselves and others with the skills needed to judge the validity of the information we are presented with.
This is not as simple as it might appear. The context is all-important. Interestingly, there are far fewer exact rules, firm guidelines and precise cut-off levels for establishing the truth than people might imagine. Scientific knowledge is rarely expressed in terms of absolute certainty; rather, a finding 'fits', or 'is not inconsistent with', what we know already, or is 'suitable for predicting performance'. For example, we now know that gravity bends space and the paths of light; but Newton's simple straight-line approximation has taken astronauts to the Moon and back (sorry, flat earthers). In addition, although
statisticians use words consistently and exactly, they do not use | |
words such as 'population' and 'sample' in the way they are used in | |
general parlance. Nor is the logic of statistics straightforward. For example, the most commonly used tests of likelihood assume 'if, and only if, these random samples were drawn from a single population, then…'. Logical and consistent, yes, but not well understood, even by some scientists. In one study, trainee doctors, who should be reading this sort of material all the time, were given a simple statement based on such a test; when asked to choose the correct conclusion out of four possibilities, almost half made the wrong choice (Windish et al., 2007).
https://jamanetwork.com/journals/jama/fullarticle/208638 | |
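To see what that conditional statement does and does not promise, here is a minimal simulation sketch of our own in Python (it is not taken from Windish et al.): it repeatedly draws two samples from the same population and counts how often a standard significance test nevertheless flags a 'significant' difference. Under the single-population assumption this should happen about 5% of the time, and that is all the test tells you.

```python
# Illustrative sketch (not from the cited study): draw two samples from the
# SAME population many times and count how often a simple t-test-style check
# reports a 'significant' difference. Under the single-population assumption
# this happens about 5% of the time, which is all the test promises.
import random
import statistics


def t_statistic(a, b):
    """Welch-style t-statistic for two independent samples."""
    va, vb = statistics.variance(a), statistics.variance(b)
    se = (va / len(a) + vb / len(b)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / se


random.seed(1)
trials, false_alarms = 2000, 0
for _ in range(trials):
    a = [random.gauss(0, 1) for _ in range(30)]
    b = [random.gauss(0, 1) for _ in range(30)]
    # |t| > 2.0 corresponds roughly to p < 0.05 for samples of this size
    if abs(t_statistic(a, b)) > 2.0:
        false_alarms += 1

print(f"'Significant' differences in {false_alarms / trials:.1%} of trials")
# Expect roughly 5%, even though every sample came from a single population.
```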
## 2 WHY IS GETTING AS CLOSE AS POSSIBLE TO THE TRUTH IMPORTANT? | |
The truth helps you make 'adequately correct' decisions and act | |
accordingly. Such decisions depend on the situation, and on the consequences of getting the decision right or wrong. Uncertainty doesn't mean we
know nothing, or that anything could be true: it just means you don't | |
bet your house on an outsider. | |
Some years ago, a district court decided that a particular vaccine | |
was responsible for an adverse outcome (which was scientifically | |
doubtful). This triggered a disastrous decrease in child vaccinations | |
for a whole range of diseases. A later analysis showed convincingly that the transmission of the faulty conclusion was related to internet broadband access: the more broadband access, the greater the decrease in vaccinations (Carrieri et al., 2019).
https://onlinelibrary.wiley.com/doi/10.1002/hec.3937 | |
In another case, however, a US court rejected a manufacturer's defence that there were insufficient data to meet the usual scientific criteria for demonstrating a causal link between a drug and a serious, but rare, adverse event, which was why the drug had been marketed without a warning. The court was unwilling to accept this statistical threshold, preferring to heed the reports of infrequent, but important, adverse events after use of the drug, and allowed the claim against the manufacturer to proceed (Matrixx Initiatives, Inc. et al. vs Siracusano et al., 2011).
https://supreme.justia.com/cases/federal/us/563/27/ | |
Here, we shall try to show the reader the processes applied in | |
scientific evaluation, in the hope that you can apply them in your | |
day-to-day decision-making. Facts don't speak for themselves: context is vital. An experienced scientist, one who 'knows the ropes', is more
likely to use their knowledge, experience and judgement to tease out | |
the full story. The central question is not 'can we be certain?', but | |
rather 'can we process this information and adjust our ideas?' | |
Uncertainty is always present, but we may be able to be 'confidently | |
uncertain'. | |
## 3 A CHECKLIST FOR TRUTH | |
(ELEMENTS OF THE CONTEXT AND QUESTIONS THAT SHOULD BE ASKED OF ANY | |
CLAIM) | |
* Who is making the statement, and what is their qualification for | |
making it? | |
* What was the original question? Has it been correctly framed? | |
* What is the underpinning evidence for the statement? What is the | |
provenance of the supporting data? Where has it been published? Are | |
there alternative explanations? Have these been explored, and how plausible are they?
* Has the best measure been used? The best way to express 'typical' is as the median value, as is done by the Office for National Statistics. However, many reports use the average, which can be far from the same thing and can make the 'typical' person appear better off than they are. (If we put incomes in order of size, from the smallest to the largest, the 'median' is the one closest to the halfway point in that order. Many more incomes are small and only a few are whopping, so the median sits towards the bottom. The 'average', or 'mean', is the sum of all the incomes [lots of paltry ones, some whopping ones] divided by the number of incomes in the sample. For example, median UK household disposable income in the financial year ending 2022 was about £32K, whereas the average was about £40K; a small worked example appears after this list.)
* Have basic scientific principles been used: for example, how was | |
the sample of people that was tested obtained? The concept of a | |
'random' sample, scientifically, is that it will contain people | |
from all walks of life, ages and states of health of the target population, so that the results can be applied to that population.
If we study healthy students, then the answer may only apply to | |
healthy students. | |
* Were sufficient people tested to reliably and confidently find an effect? The most reliable, and most frequently used (but rather clumsy), study design is the 'randomised controlled trial', often used to test new drugs against old ones. Such studies often need hundreds of participants if the drugs do not differ much in effect. Smaller studies may not reliably find an effect; if they do find one, by chance, the finding tends to exaggerate the benefit (this is known as the 'winner's curse' [Sidebotham & Barlow, 2024], and attempts to verify or replicate the first observed effect often fail; a small simulation of this appears after this list).
https://associationofanaesthetists-publications.onlinelibrary.wiley.com/doi/10.… | |
* It is not easy to prove that something does not exist, and a large | |
study is needed to reach valid conclusions. This is important if | |
you are investigating a rare but serious complication or a new | |
technique. For example, if a new surgical procedure is carried out | |
20 times without a problem, it is not necessarily safe. If the same | |
procedure were carried out 100 times, and the death risk were | |
randomly distributed in the same way as for the first 20, there is | |
a 95% chance that the number of deaths will be between 0 and 16 | |
(and it is likely that fitter patients were selected first in the original study; see 'bias' below). A back-of-envelope version of this calculation appears after this list.
* Was there a 'control group'? If an intervention is being assessed | |
(e.g., the health benefits of cold-water swimming), then a control | |
group is needed that will carry out the same activities but without | |
the hypothesised 'active ingredient' (e.g., cold). The control | |
group should be matched for all other factors that could be at work, such
as similar locations, similar companions, same food, same exercise, | |
same bedtime and sleep profile, etc. | |
* Humans vary a great deal, so experiments comparing human | |
participants are difficult. This is particularly obvious in | |
responses to medication, and can lead to unexpectedly different | |
results. An elegant way of getting around this is to 'cross-over' a | |
treatment and compare the same individuals, each given both the | |
'control' and the 'active' treatment. However, without care this | |
can also lead to complexities. Ideally half the participants should | |
start with the active treatment, and half with a 'neutral' | |
(control) treatment, but how can we be sure that the active | |
treatment has worn off ('washed out') before testing the control | |
treatment? For example, hormones may have effects that last long | |
after the actual drug has left the body, and some | |
psychophysiological changes can be long-lasting. Indeed, some would | |
argue that, in some studies, with some people, wash out may never | |
fully occur (Tipton & Mekjavic, 2000). | |
https://link.springer.com/article/10.1007/s004210000255 | |
* What measurements are made? Are they objective measurements, like blood pressure or blood levels of hormones, or are they questionnaires? What questions get asked? It is very easy to ask leading questions,
particularly if the person taking part believes something is doing | |
them good. A far better (but far less likely) outcome would be | |
health assessments a year after an intervention! Do the scientists | |
making the measurements know the treatment, and what do they expect | |
to find? In one study of a pain-killer, the assessors (who were kept unaware of which drug was being given) recorded different effects depending on their expectations of the drug's action (Gracely et al., 1985).
https://www.thelancet.com/journals/lancet/article/PIIS0140-6736(85)90984-5/full… | |
* Are tests being used as 'proxy' or 'surrogate' measurements for | |
something that is more important but not as easy to measure? | |
Examples include using exam scores as an index of ability, or body | |
mass index (BMI) for health assessments. How reliable, and exact, | |
are such surrogate assessments? | |
* Does the proponent have any conflict of interest? Does what they | |
argue benefit them? | |
* Is there any 'bias'? Bias can creep in at lots of stages in the | |
process of getting information and presenting it. Scientific | |
publications are very varied: papers in highly regarded journals | |
have met demanding acceptance standards, with stringent peer | |
assessment, compared with some 'open access' journals, where papers | |
are also assessed, but the author pays, or 'vanity journals' where | |
the author only has to pay to get published! However, all journals are looking to attract readers and citations, and there is nothing like controversy for boosting both.
Additionally, presentations at conferences often turn up as | |
'publications' but have had virtually no peer assessment, and such | |
conferences can be international, national or local. | |
* The funding of research affects what gets published. Published | |
research papers funded by companies and dealing with available | |
products are more likely to give a "positive" result than studies | |
independently funded (Bourgeois et al., 2010). Product evaluations can be designed to flatter: in the variables assessed, in avoiding observation of later adverse effects, and in the selection of those tested (by age, sex or race). It is now necessary to register
clinical studies before they start: but lots of studies funded by | |
drug companies are not published. Even trivial effects can be | |
'statistically significant' if the study is large enough. | |
Regulatory oversight of large-scale, urgent studies can be limited,
and poor practice can be concealed (Powell-Smith & Goldacre, 2016). | |
https://www.acpjournals.org/doi/10.7326/0003-4819-153-3-201008030-00006 | |
https://f1000research.com/articles/5-2629/v1 | |
* Survival bias is relevant. Are the data already selected? A | |
salutary application of the study of survivors was the analysis of | |
damage found on aircraft returning to base after combat. Clearly, a returning aircraft could take damage in the areas where hits were observed and still fly well enough to return safely to base. Thus, it would be best, if possible, to protect the areas that were not seen to be damaged on these aircraft: hits there presumably were more crippling, because the aircraft that took them did not come back (Mangel & Samaniego, 1984).
https://www.tandfonline.com/doi/abs/10.1080/01621459.1984.10478038 | |
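To make the 'typical income' point in the checklist concrete, here is a small Python sketch using made-up figures (deliberately chosen to echo, not reproduce, the ONS numbers): a few whopping incomes drag the mean well above the median.

```python
# Illustrative sketch with made-up incomes (not ONS data): a few large
# incomes pull the mean upwards, while the median stays near the middle
# of the ordered list and better reflects the 'typical' household.
from statistics import mean, median

incomes = [18_000, 22_000, 26_000, 29_000, 31_000, 33_000, 38_000,
           45_000, 60_000, 250_000]  # one whopping income at the top

print(f"Median income: £{median(incomes):,.0f}")  # middle of the ordered list
print(f"Mean income:   £{mean(incomes):,.0f}")    # dragged up by the outlier
# The mean (about £55,000 here) paints a rosier picture of the 'typical'
# household than the median (about £32,000).
```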
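The 'winner's curse' mentioned in the checklist can also be illustrated with a toy simulation of our own (it is not from Sidebotham & Barlow): many small, underpowered trials of a modest real effect are run, and only those that happen to reach 'significance' are kept, as a journal might keep them. Those winners overstate the effect.

```python
# Toy simulation (ours, not from Sidebotham & Barlow): run many small,
# underpowered 'trials' of a modest true effect and keep only those that
# happen to reach 'significance'. The kept results overstate the effect.
import random
import statistics

random.seed(2)
true_effect = 0.2        # modest true difference between treatments
n_per_group = 20         # a small, underpowered study
significant_effects = []

for _ in range(5000):
    control = [random.gauss(0.0, 1.0) for _ in range(n_per_group)]
    treated = [random.gauss(true_effect, 1.0) for _ in range(n_per_group)]
    diff = statistics.mean(treated) - statistics.mean(control)
    se = (statistics.variance(treated) / n_per_group
          + statistics.variance(control) / n_per_group) ** 0.5
    if diff / se > 2.0:  # a crude cut-off, roughly 'significant' at p < 0.05
        significant_effects.append(diff)

print(f"True effect: {true_effect}")
print(f"Average effect in the 'significant' studies: "
      f"{statistics.mean(significant_effects):.2f}")
# The published 'winners' suggest an effect several times larger than the
# truth, so replications sized around that estimate tend to disappoint.
```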
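Finally, the '20 operations without a problem' example rests on a simple binomial calculation, often summarised as the 'rule of three': after n procedures with no failures, the data are still consistent with a true risk of up to roughly 3/n. The back-of-envelope sketch below is our own illustration of that reasoning, not a re-analysis of any particular study.

```python
# Back-of-envelope sketch (our illustration): what does "no deaths in 20
# operations" actually tell us about the underlying risk?
n = 20  # operations observed, all without a death

# Exact 95% upper confidence bound on the per-operation death risk: the
# largest risk p for which 20 successes in a row is still plausible,
# i.e. the largest p with (1 - p)**20 >= 0.05.
upper_p = 1 - 0.05 ** (1 / n)
print(f"Zero deaths in {n} operations is consistent with a risk of up to "
      f"{upper_p:.1%} per operation")          # about 14%

# The well-known 'rule of three' approximation gives much the same answer:
print(f"Rule of three: roughly 3/{n} = {3 / n:.0%}")

# At that upper bound, 100 further operations could easily produce a
# double-digit number of deaths.
print(f"Expected deaths in 100 operations at that risk: {100 * upper_p:.0f}")
```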
Overall, as a result of failure to meet some of the requirements | |
listed above, about half of published medical papers are unlikely to | |
be true (Ioannidis, 2005). In 2023, the number of retractions of research articles internationally reached a new record of over 10,000 (Van Noorden, 2023), due to an increase in sham papers and peer-review
fraud. Furthermore, despite a requirement for disclosure, a lot of | |
government research is never released, or is delayed until interest | |
in the topic has declined. | |
https://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.0020124 | |
https://www.nature.com/articles/d41586-023-03974-8 | |
A recent study (Briganti et al., 2023) reviewed the papers published | |
on the health and recovery benefits of cold-water exposure. They | |
found 931 articles, and then carefully weeded out irrelevant studies. | |
The authors were left with 24 papers, and in these the risk of bias | |
was 'high' in 15 and 'gave concern' in four. Thus, only five papers | |
had a 'low' risk of bias: three of these looked at cold water | |
immersion after exercise and two at cognitive function. So, a very | |
small percentage of the studies examined had anything really useful | |
to say. | |
https://onlinelibrary.wiley.com/doi/10.1111/apha.14056 | |
## 4 WHAT ABOUT THE 'FINDINGS' YOU ARE PRESENTED WITH? | |
Watch out for percentages (Bolton, 2023). A simple change is easily | |
understood as a percentage, but 'scientific' studies involving | |
comparisons between groups can require more careful consideration. | |
These comparisons should always trigger the question 'percentage of | |
what, exactly?' The headline, 'New drug/product/intervention cuts | |
mortality by 50%' sounds impressive, and attracts attention, but the | |
reality could be less spectacular. Perhaps using the old drug, the | |
death rate was 20 per 1000 patients, and when the new drug was first | |
used, the rate became 10 per 1000 patients: a 50% reduction. But the | |
absolute risk reduction in death rate was 10 per 1000, or 1%, a less | |
impressive headline. | |
https://commonslibrary.parliament.uk/research-briefings/sn04446/ | |
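The arithmetic is worth setting out explicitly. The sketch below simply restates the hypothetical example above in Python; the figures are illustrative, not from any real trial.

```python
# The example above, restated: a 50% relative risk reduction can correspond
# to a much smaller absolute risk reduction (figures are illustrative).
deaths_old = 20 / 1000   # death rate on the old drug: 2%
deaths_new = 10 / 1000   # death rate on the new drug: 1%

relative_reduction = (deaths_old - deaths_new) / deaths_old
absolute_reduction = deaths_old - deaths_new

print(f"Relative risk reduction: {relative_reduction:.0%}")  # 50%: the headline
print(f"Absolute risk reduction: {absolute_reduction:.0%}")  # 1%: the reality
print(f"Patients treated per death avoided: {1 / absolute_reduction:.0f}")  # 100
```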
Also, beware of correlations. Just because two things relate to each | |
other, for example, a diet and a sense of well-being, does not mean | |
that one causes the other. The world is full of accidental (spurious) | |
correlations (Van Cauwenberge, 2016). One of our favourites is the | |
high correlation between the divorce rate in Maine, USA and the per | |
capita consumption of margarine! Also, ask the question 'how many | |
false positives and negatives will I get if I use this correlation to | |
make a decision?' (Tipton et al., 2012).
https://www.datasciencecentral.com/spurious-correlations-15-examples/ | |
https://link.springer.com/article/10.1007/s004210000255 | |
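Spurious correlations of this sort are easy to manufacture, which is exactly the problem. The sketch below uses entirely made-up numbers (not the Maine divorce or margarine figures): two series that merely drift in the same direction over the same years end up strongly correlated, with no causal link between them.

```python
# Made-up illustration (not the Maine/margarine data): two unrelated series
# that both happen to drift downwards over the same period correlate strongly.
import random
import statistics


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


random.seed(3)
n_years = 20
series_a = [100 - 2 * t + random.gauss(0, 3) for t in range(n_years)]  # e.g. a divorce rate
series_b = [50 - 1 * t + random.gauss(0, 2) for t in range(n_years)]   # e.g. margarine eaten

print(f"Correlation between the two unrelated series: "
      f"r = {pearson(series_a, series_b):.2f}")
# Typically r is around 0.9: a shared downward trend, not a causal link.
```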
For the moment at least, artificial intelligence cannot quantify | |
uncertainty very well. Generally, AI uses stuff from 'out there' as | |
if it were true. Thus, a high proportion of garbage in will give you | |
garbage out (which increases the proportion of garbage that AI uses | |
next time round)! | |
We hope that, armed with the above checklist, you can challenge and | |
interrogate the polarising information, from 'spin' to outright falsehoods, presented to you on a daily basis. We are at risk of being
overwhelmed by an increasing number of dubious, unregulated and | |
disparate sources. The next time you hear phrases like 'they say this | |
is great' or 'this is scientifically proven' start by asking 'who are | |
they?' and 'which scientists, using which methods?' Be cautious and | |
questioning; snake oil and its vendors still exist, and they come in many guises.
From: https://physoc.onlinelibrary.wiley.com/doi/10.1113/EP092160 | |
tags: article,science | |