	# 2024-08-12 - How To Spot The Truth

	## 1 INTRODUCTION

	'Truth' is under attack, more so now than ever before, and for many
	reasons, one of which is social media. We hear and read remarkable,
	often preposterous claims from many sources. This may be in political
	debate, the presentation of new products, or new health-enhancing
	exercises ranging from hot water pools to cold water swimming. These
	frequently claim to be 'scientific findings' often reporting 'new
	studies have shown' stories, underpinned by 'expert' opinion. They are
	amplified in the media until the next fad comes along.

	This pervasive form of persuasion is a war of beliefs, which in many
	cases may contradict accepted knowledge. It is always possible, in
	fact likely, that those making some of the more absurd claims do not
	draw on, or are not even properly aware of, current scientific
	understanding, in which case their claims may be logical, but based
	on incorrect assumptions
	or understanding. Flat earthers have a consistent world view, which
	is probably logical to them; it just is not compatible with other
	known facts. But truth is the first casualty of war, and now more
	than ever, we must equip ourselves and others with the skills needed
	to judge how valid the information we are presented with is.

	This is not as simple as it might appear. The context is
	all-important. Interestingly, there are far fewer exact rules, firm
	guidelines and exact cut-off levels than people might imagine for
	establishing the truth. Scientific knowledge is rarely expressed
	in terms of utter validity, but rather as 'fits' or 'is not
	inconsistent with' what we know already, or 'suitable for predicting
	performance'. For example, we now know that gravity can be bent; but
	Newton's simple straight-line approximation has taken astronauts to
	the moon and back (sorry, flat earthers). In addition, although
	statisticians use words consistently and exactly, they do not use
	words such as 'population' and 'sample' in the way they are used in
	general parlance. Nor is the logic of statistics straightforward. For
	example, the most commonly used tests of likelihood assume 'if, and
	only if, these random samples were drawn from a single population,
	then…' Logical and consistent, yes, but not well understood, even by
	some scientists. For example, in one study, trainee doctors, who
	should be reading this sort of stuff all the time, were given a
	simple statement using this test. When asked to choose the correct
	conclusion out of four possibilities, almost half made a wrong choice
	(Windish et al., 2007).

	https://jamanetwork.com/journals/jama/fullarticle/208638
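
	The 'if these samples were drawn from a single population' logic
	can be made concrete with a small sketch. It uses an exact
	permutation test rather than the commonly used tests the text
	refers to; that swap, the function name
	exact_permutation_p_value and the toy numbers are all assumptions
	made here for illustration. The function pools two groups, tries
	every possible way of re-dividing them, and reports the fraction
	of divisions whose difference in means is at least as large as the
	one observed.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p_value(group_a, group_b):
    """Fraction of all re-divisions of the pooled data whose difference
    in means is at least as large as the observed difference.

    Hypothetical helper: it assumes both groups are small enough to
    enumerate every split.
    """
    pooled = list(group_a) + list(group_b)
    size_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))

    splits = 0
    as_extreme = 0
    for chosen in combinations(range(len(pooled)), size_a):
        chosen_set = set(chosen)
        new_a = [pooled[i] for i in chosen_set]
        new_b = [pooled[i] for i in range(len(pooled)) if i not in chosen_set]
        splits += 1
        # Small tolerance so floating-point rounding cannot hide a tie.
        if abs(mean(new_a) - mean(new_b)) >= observed - 1e-12:
            as_extreme += 1
    return as_extreme / splits


# Toy data, invented for illustration: only 2 of the 20 possible
# splits are as extreme as the observed one, so the result is 0.1.
print(exact_permutation_p_value([1, 2, 3], [4, 5, 6]))
```

	A small value says only that the observed difference would be
	unusual if both groups really came from one population. It is not
	the probability that the claim being tested is true.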

	## 2 WHY IS GETTING AS CLOSE AS POSSIBLE TO THE TRUTH IMPORTANT?

	The truth helps you make 'adequately correct' decisions and act
	accordingly. Such decisions depend on the situation, and the risks of
	making a correct or incorrect decision. Uncertainty doesn't mean we
	know nothing, or that anything could be true: it just means you don't
	bet your house on an outsider.

	Some years ago, a district court decided that a particular vaccine
	was responsible for an adverse outcome (which was scientifically
	doubtful). This triggered a disastrous decrease in child vaccinations
	for a whole range of diseases. Analysis of this episode also showed
	convincingly that the transmission of the faulty conclusion was
	related to internet broadband access: more broadband, greater
	decrease in vaccinations (Carrieri et al., 2019).

	https://onlinelibrary.wiley.com/doi/10.1002/hec.3937

	In another case, however, a US court rejected a manufacturer's
	defence that there were insufficient data to meet the usual
	scientific criteria to demonstrate a causal link between a drug and a
	serious, but rare, adverse event; and this is why the drug was
	marketed without a warning. The court was unwilling to accept this
	statistical threshold, preferring to heed the reports of infrequent,
	but important, adverse events after the use of the drug, and thus
	awarded damages (Matrixx Initiatives, Inc. et al. vs Siracusano et
	al., 2011).

	https://supreme.justia.com/cases/federal/us/563/27/

	Here, we shall try to show the reader the processes applied in
	scientific evaluation, in the hope that you can apply them in your
	day-to-day decision-making. Facts don't speak for themselves--context
	is vital. An experienced scientist, who "knows the ropes", is more
	likely to use their knowledge, experience and judgement to tease out
	the full story. The central question is not 'can we be certain?', but
	rather 'can we process this information and adjust our ideas?'
	Uncertainty is always present, but we may be able to be 'confidently
	uncertain'.

	## 3 A CHECKLIST FOR TRUTH

	(ELEMENTS OF THE CONTEXT AND QUESTIONS THAT SHOULD BE ASKED OF ANY
	CLAIM)

	* Who is making the statement, and what is their qualification for
	making it?

	* What was the original question? Has it been correctly framed?

	* What is the underpinning evidence for the statement? What is the
	provenance of the supporting data? Where has it been published? Are
	there alternative explanations, have these been explored, how
	possible are they?

	* Has the best measure been used? The best way to express 'typical'
	is as the median value, as is done by the Office for National
	Statistics. However, many reports use the average, which could be
	far from the same thing and make, for example, the 'typical' person
	apparently better off. If we put incomes in order of size, from the
	least to the greatest, the 'median' is the one at the halfway point
	in this order. Many more incomes are small, only a few are
	whopping, so the median is closer to the bottom. The 'average' or
	'mean' is the sum of all the money in the incomes (lots of paltry
	ones, some whopping ones) divided by the number of incomes
	considered in the sample. For example, median UK household
	disposable income in the financial year ending 2022 was about £32K,
	and the average was £40K.

	* Have basic scientific principles been used: for example, how was
	the sample of people that was tested obtained? The concept of a
	'random' sample, scientifically, is that it will contain people
	from all walks of life, ages and states of health in the target
	population, so that the results can be applied to that population.
	If we study healthy students, then the answer may only apply to
	healthy students.

	* Were sufficient people tested to reliably and confidently find an
	effect? The most reliable and frequent (but rather clumsy) study
	design is a 'randomised controlled trial', often used to test new
	drugs against old ones. Such studies often need hundreds of
	participants if the drugs aren't that different in effect. Smaller
	studies may not reliably find an effect: if they do, by chance,
	then this chance finding exaggerates the benefit (this is known as the
	'winner's curse' [Sidebotham & Barlow, 2024]--attempts to verify or
	replicate this first observed effect often fail!).
	https://associationofanaesthetists-publications.onlinelibrary.wiley.com/doi/10.…

	* It is not easy to prove that something does not exist, and a large
	study is needed to reach valid conclusions. This is important if
	you are investigating a rare but serious complication or a new
	technique. For example, if a new surgical procedure is carried out
	20 times without a problem, it is not necessarily safe. If the same
	procedure were carried out 100 times, and the death risk were
	randomly distributed in the same way as for the first 20, there is
	a 95% chance that the number of deaths will be between 0 and 16
	(and it is likely that fitter patients were selected first in the
	original study--see 'bias' below).

	* Was there a 'control group'? If an intervention is being assessed
	(e.g., the health benefits of cold-water swimming), then a control
	group is needed that will carry out the same activities but without
	the hypothesised 'active ingredient' (e.g., cold). The control
	group should include all other factors that could be at work, such
	as similar locations, similar companions, same food, same exercise,
	same bedtime and sleep profile, etc.

	* Humans vary a great deal, so experiments comparing human
	participants are difficult. This is particularly obvious in
	responses to medication, and can lead to unexpectedly different
	results. An elegant way of getting around this is to 'cross-over' a
	treatment and compare the same individuals, each given both the
	'control' and the 'active' treatment. However, without care this
	can also lead to complexities. Ideally half the participants should
	start with the active treatment, and half with a 'neutral'
	(control) treatment, but how can we be sure that the active
	treatment has worn off ('washed out') before testing the control
	treatment? For example, hormones may have effects that last long
	after the actual drug has left the body, and some
	psychophysiological changes can be long-lasting. Indeed, some would
	argue that, in some studies, with some people, wash out may never
	fully occur (Tipton & Mekjavic, 2000).
	https://link.springer.com/article/10.1007/s004210000255

	* What measurements are made? Are they physical measurements, such
	as blood pressure or blood levels of hormones, or questionnaires? What
	questions get asked? It is very easy to ask leading questions,
	particularly if the person taking part believes something is doing
	them good. A far better (but far less likely) outcome would be
	health assessments a year after an intervention! Do the scientists
	making the measurements know the treatment, and what do they expect
	to find? In one study, when a pain-killer was tested, the testers
	(who were kept unaware of the drug being tested) found different
	effects if the tester had different expectations of the drug's
	effects (Gracely et al., 1985).
	https://www.thelancet.com/journals/lancet/article/PIIS0140-6736(85)90984-5/full…

	* Are tests being used as 'proxy' or 'surrogate' measurements for
	something that is more important but not as easy to measure?
	Examples include using exam scores as an index of ability, or body
	mass index (BMI) for health assessments. How reliable, and exact,
	are such surrogate assessments?

	* Does the proponent have any conflict of interest? Does what they
	argue benefit them?

	* Is there any 'bias'? Bias can creep in at lots of stages in the
	process of getting information and presenting it. Scientific
	publications are very varied: papers in highly regarded journals
	have met demanding acceptance standards, with stringent peer
	assessment, compared with some 'open access' journals, where papers
	are also assessed, but the author pays, or 'vanity journals' where
	the author only has to pay to get published! However, all journals
	are looking to attract readers and citations, and there is nothing
	better than controversy to boost readership and citations.
	Additionally, presentations at conferences often turn up as
	'publications' but have had virtually no peer assessment, and such
	conferences can be international, national or local.

	* The funding of research affects what gets published. Published
	research papers funded by companies and dealing with available
	products are more likely to give a "positive" result than studies
	independently funded (Bourgeois et al., 2010). Product evaluation
	can be designed to be flattering in terms of the variables
	assessed, avoiding observing later adverse effects, and selecting
	those tested (age, sex, race). It is now necessary to register
	clinical studies before they start: but lots of studies funded by
	drug companies are not published. Even trivial effects can be
	'statistically significant' if the study is large enough.
	Regulatory oversight of large scale, urgent studies can be limited
	and poor practice can be concealed (Powell-Smith & Goldacre, 2016).
	https://www.acpjournals.org/doi/10.7326/0003-4819-153-3-201008030-00006
	https://f1000research.com/articles/5-2629/v1

	* Survival bias is relevant. Are the data already selected? A
	salutary application of the study of survivors was the analysis of
	damage found on aircraft returning to base after combat. Clearly, a
	returning aircraft could take damage in those areas and still fly
	well enough to return safely to base. Thus, it would be best, if
	possible, to protect areas that were not seen to be damaged in
	these aircraft. Hits in undamaged areas presumably were more
	crippling (Mangel & Samaniego, 1984).
	https://www.tandfonline.com/doi/abs/10.1080/01621459.1984.10478038
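
	Two of the checklist items above involve arithmetic that is easy
	to check for yourself. The sketch below is illustrative only: the
	income figures are invented (they are not ONS data), and the
	function name upper_risk_limit is hypothetical. For the surgical
	example it assumes an exact two-sided binomial (Clopper-Pearson)
	interval, which may or may not be how the figure quoted above was
	derived, although it gives a similar answer.

```python
from statistics import mean, median

# Hypothetical incomes in thousands of pounds, invented for
# illustration: many modest incomes and one very large one.
incomes = [18, 22, 25, 28, 32, 35, 40, 48, 200]

print("median:", median(incomes))          # the middle value of the ordered list
print("mean:  ", round(mean(incomes), 1))  # pulled upwards by the single large income


def upper_risk_limit(n_without_event, confidence=0.95):
    """Upper limit of the exact two-sided interval for the true risk
    when no events were seen in n_without_event procedures.

    Assumes independent procedures that all carry the same risk.
    """
    alpha = 1 - confidence
    return 1 - (alpha / 2) ** (1 / n_without_event)


limit = upper_risk_limit(20)
print(f"0 deaths in 20 is compatible with a true risk of up to {limit:.1%}")
```

	Under these assumptions, 20 procedures without a death cannot
	rule out a risk in the region of 16 to 17 deaths per 100, which
	is why a short run of successes says little about safety.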

	Overall, as a result of failure to meet some of the requirements
	listed above, about half of published medical papers are unlikely to
	be true (Ioannidis, 2005). In 2023, the number of retractions for
	research articles internationally reached a new record of over 10,000
	(Van Noorden, 2023) due to an increase in sham papers and peer-review
	fraud. Furthermore, despite a requirement for disclosure, a lot of
	government research is never released, or is delayed until interest
	in the topic has declined.

	https://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.0020124

	https://www.nature.com/articles/d41586-023-03974-8

	A recent study (Briganti et al., 2023) reviewed the papers published
	on the health and recovery benefits of cold-water exposure. They
	found 931 articles, and then carefully weeded out irrelevant studies.
	The authors were left with 24 papers, and in these the risk of bias
	was 'high' in 15 and 'gave concern' in four. Thus, only five papers
	had a 'low' risk of bias: three of these looked at cold water
	immersion after exercise and two at cognitive function. So, a very
	small percentage of the studies examined had anything really useful
	to say.

	https://onlinelibrary.wiley.com/doi/10.1111/apha.14056

	## 4 WHAT ABOUT THE 'FINDINGS' YOU ARE PRESENTED WITH?

	Watch out for percentages (Bolton, 2023). A simple change is easily
	understood as a percentage, but 'scientific' studies involving
	comparisons between groups can require more careful consideration.
	These comparisons should always trigger the question 'percentage of
	what, exactly?' The headline, 'New drug/product/intervention cuts
	mortality by 50%' sounds impressive, and attracts attention, but the
	reality could be less spectacular. Perhaps using the old drug, the
	death rate was 20 per 1000 patients, and when the new drug was first
	used, the rate became 10 per 1000 patients: a 50% reduction. But the
	absolute risk reduction in death rate was 10 per 1000, or 1%, a less
	impressive headline.

	https://commonslibrary.parliament.uk/research-briefings/sn04446/
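
	The 'percentage of what, exactly?' question can be answered with
	two lines of arithmetic. The sketch below uses the figures from
	the example above (20 and 10 deaths per 1000 patients); the
	function name risk_reductions is hypothetical.

```python
def risk_reductions(old_events, old_total, new_events, new_total):
    """Return (relative, absolute) risk reduction as fractions."""
    old_risk = old_events / old_total
    new_risk = new_events / new_total
    absolute = old_risk - new_risk   # difference between the two risks
    relative = absolute / old_risk   # the same difference as a share of the old risk
    return relative, absolute


relative, absolute = risk_reductions(20, 1000, 10, 1000)
print(f"relative risk reduction: {relative:.0%}")
print(f"absolute risk reduction: {absolute:.1%} of patients")
```

	Both numbers describe the same data. The headline quotes the
	relative figure; the absolute figure shows how many patients out
	of every 100 treated are actually affected.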

	Also, beware of correlations. Just because two things relate to each
	other, for example, a diet and a sense of well-being, does not mean
	that one causes the other. The world is full of accidental (spurious)
	correlations (Van Cauwenberge, 2016). One of our favourites is the
	high correlation between the divorce rate in Maine, USA and the per
	capita consumption of margarine! Also, ask the question 'how many
	false positives and negatives will I get if I use this correlation to
	make a decision' (Tipton et al., 2012).

	https://www.datasciencecentral.com/spurious-correlations-15-examples/

	https://link.springer.com/article/10.1007/s004210000255
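
	A minimal sketch of how an accidental correlation can arise. The
	two series below are invented for illustration (they are not the
	Maine divorce or margarine figures) and share nothing except a
	downward drift over time. The helper pearson_r is written out by
	hand so the example is self-contained. Comparing year-to-year
	changes instead of raw values is one simple check, added here as
	an assumption rather than taken from the article.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / sqrt(spread_x * spread_y)


def changes(series):
    """Differences between consecutive values."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


# Invented yearly values: both drift downwards, for unrelated reasons.
series_a = [10.5, 8.5, 8.5, 6.5, 6.5, 4.5, 4.5, 2.5, 2.5, 0.5]
series_b = [3.1, 2.9, 2.5, 2.3, 2.3, 2.1, 1.7, 1.5, 1.5, 1.3]

r_levels = pearson_r(series_a, series_b)
r_changes = pearson_r(changes(series_a), changes(series_b))
print(f"correlation of raw values:           {r_levels:.2f}")
print(f"correlation of year-to-year changes: {abs(r_changes):.2f}")
```

	The raw values correlate strongly because both fall over time,
	yet the year-to-year changes in one series tell you nothing about
	the changes in the other.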

	For the moment at least, artificial intelligence cannot quantify
	uncertainty very well. Generally, AI uses stuff from 'out there' as
	if it were true. Thus, a high proportion of garbage in will give you
	garbage out (which increases the proportion of garbage that AI uses
	next time round)!

	We hope that, armed with the above checklist, you can challenge and
	interrogate the polarising information, from 'spin' to the outright
	falsehoods presented to you on a daily basis. We are at risk of being
	overwhelmed by an increasing number of dubious, unregulated and
	disparate sources. The next time you hear phrases like 'they say this
	is great' or 'this is scientifically proven' start by asking 'who are
	they?' and 'which scientists, using which methods?' Be cautious and
	questioning; snake oil and its vendors still exist, and they come in
	many guises.

	From: https://physoc.onlinelibrary.wiley.com/doi/10.1113/EP092160

	tags: article,science
