Gender Rights, Human Reproduction and Economics: Is there a triangle to square?

I have a confession to make. Sometimes I write a blog to clear up my own confusion, which often happens when my own values run into potentially contradictory data. This is one of those blogs. The three sides of the triangle I’m trying to square are

  1. Gender equality. If I were a woman, I’d be happy to be called feminist, but I’m male, and any discussion of equality must involve both genders.
  2. Human reproduction. We’re good at this, it’s part of us, and it defines areas of complementarity rather than equality. 
  3. Economics. This is the creation and distribution of value. Unfortunately, value is not infinite, and is hard to grow. 

Let’s start by reviewing the evidence against gender equality. This comes in three flavours, ranging from the silly to the plausible. I’ll start at the silly end, with the idea that women are inferior to men. 

If you weren’t thinking hard, you might want to cite the chart above as partial proof of this point. However, what it also shows is that a) some women are stronger than some men, and b) as we’re judging strength, not gender, we’re judging apples as if they were oranges. We might as well complain that a submarine can’t fly. 

The second argument is captured in the pop-psychology best seller “Men are from Mars, Women are from Venus”. 

This conceptual horror has blighted the public lives of a whole generation of cognitive psychologists, who have to explain that what the evidence really shows is that men have higher degrees of variability and subtle differences in brain connectivity, which translate into rather small differences in some cognitive abilities, of dubious importance. 

For example, the famous linguistic superiority of women has an effect size of 0.11, which the researchers considered too small to measure reliably. The difference in female sensitivity to non-verbal cues was not much larger, at 0.19. This translates into a woman having only about a 6% better-than-even chance of having higher skill in this area than a randomly selected man. Looking in the other direction, the visuospatial superiority for men has an effect size of around 0.17.
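To get a feel for how small these effect sizes are, here is a minimal sketch that turns an effect size into the chance that a randomly chosen member of the higher-scoring group out-scores a randomly chosen member of the other group. It assumes the effect sizes are standardised mean differences and that both groups are normally distributed with equal spread; neither assumption comes from the studies themselves, and the function name is my own.

```python
from math import sqrt
from statistics import NormalDist

def probability_of_superiority(d: float) -> float:
    """Chance that a random member of the higher-scoring group beats a
    random member of the other group, for a standardised difference d.
    Assumes two normal distributions with equal spread (an assumption)."""
    return NormalDist().cdf(d / sqrt(2))

# Effect sizes quoted in the text; the labels are just for printing.
for label, d in [("verbal ability", 0.11),
                 ("non-verbal sensitivity", 0.19),
                 ("visuospatial ability", 0.17)]:
    p = probability_of_superiority(d)
    print(f"{label}: d = {d:.2f}, chance of out-scoring = {p:.1%}")
```

Under these assumptions every one of the three comes out only a few percentage points better than a coin toss.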

The third argument usually dresses itself up as evolutionary biology, and argues that men and women are differently optimised, as hunter-gatherers and homemakers respectively. Unfortunately, this begs the question of why we have different genders in the first place. Both on theoretical and experimental grounds, it seems that males represent a significant evolutionary cost, for which no benefit has yet been conclusively identified. Additionally, computer simulations of family evolution (the only experimental test available), which included gender differences, did not report gender specialisation, but a general cooperation factor as being most relevant to survival. This general cooperation factor arose from social learning, not genetic mutation. Curiously, the researchers found that variation in environmental adversity predicted a slightly lower level of this factor than unvarying adversity. Thus, there is no evidence that natural selection would select for men having worse homemaking inclinations than women, though levels of cooperation might vary for both: we have already seen that their aptitudes are very similar. It does support the view that gender role differences are likely to be learned, fluid, and based around cooperation to address environmental needs. 

It’s nice when the science confirms one’s values!


Whatever we may think of our own reproductive careers, as a species humanity has been enormously successful:  there is almost no land environment on earth which humans have not been able to establish societies in. As reproductive strategy (to provide random mutations at an appropriate rate and variety) is the other side of natural selection, we must be getting it right. An obvious feature of our strategy is that we are sexually dimorphic. As we have just seen, it isn’t possible to say what advantage this brings, as we have no idea what the advantage of being gendered is. From what we have just argued, it follows that there will be some role differences based on this dimorphism, communicated by social learning, and variable according to circumstances. An obvious example is the way, in some societies, women take over previously male labouring roles in wartime, despite their strength differences, and retreat from them when peace returns. 

This naturally raises the question of whether there are any role differences which are permanent across societies, qualitatively if not quantitatively. Because of our sexual dimorphism, it makes sense to look for such roles in connection with our choices of sexual partner. What, other than appearance, do we find sexually attractive about each other?  Secondly, three of our sexually dimorphic physical characteristics, height, strength and hair length, are shared by both genders. Does variation in such physical characteristics lead to gender-related role differences in the same gender?

Sexually attractive characteristics

In children of both genders, awareness of attractiveness develops early, with awareness of female attractiveness developing both earlier and more intensely than male attractiveness.  

Young infants are more likely to get close ups of female faces than male ones

This bias for attractiveness may be biologically mediated. Consistent with the implication that male attractive qualities are less observable, women reference their choices with respect to other women. There may also be a safety component in this, as British undergraduate women find the “dark triad” personality traits of narcissism, Machiavellianism and psychopathy significantly attractive: clearly these are things one can easily have too much of.  

“I have travelled oceans of time to find you” The dark triad in action as Dracula

Fortunately, this is counterbalanced by also seeking empathy among male friends, and of course observation of empathy requires paying attention to men’s interactions with women.  These “dark triad” findings alert us that sexual choice may be more than just liking, and if we recall that economics is the social science of value it then makes sense that much psychology focuses on sexual choice as a form of economic exchange, particularly when it comes to choosing long term partners. A recent study illustrates how this might differentially affect partner choice. When producing children, we want to give them the best start in life possible, and acquiring the best partner to make them with is therefore a no-brainer. This has been conceptualised as “mate value”, which has been broken down as follows:

  1. Physical attractiveness 
  2. Personality
  3. Education 
  4. Intelligence 
  5. Career prospects (aka earning potential)
  6. Social status 

An early test of mate value: the judgment of Paris

Of course, questionnaires can be used to measure and summarise mate value. It is reasonable to presume that we guard what is valuable to us, and in relationships this intention is expressed as “controlling behaviours”, which can also be reported by questionnaire. Having set the scene, we can now interpret the chart below 

133 women rating themselves and their partners for (relative) mate value and controlling behaviour

While women rate their partners’ controlling behaviour as slightly greater than their own (though with considerable overlap shown in the standard errors) the man having higher mate value than the woman requires fewer guarding behaviours in both genders than the reverse. Women with men of lower mate value than themselves are spending more effort on guarding them than the reverse. Guarding behaviour correlated positively with relationship length in this study, so it was an effective strategy for both genders. 

Let’s look more closely at mate value. In the majority of relationships (58%) the woman had lower mate value than her partner. The results from this classic study show that men are more willing to compromise than women for short term relationships, though not for more serious ones. 

Willingness to compromise standards to obtain a partner: higher values show greater willingness.

So, summary mate value is a more important sexual characteristic for men to have than women, especially for short term relationships. Compared to men, women who haven’t been able to establish relationships with men with higher mate values than themselves appear to be behaving defensively, trying to avoid a worse future partner, rather than believing their own higher mate value would enable them to do better. Astute readers will have noticed a significant portion of mate value (education, social status and career prospects) relates to societally acquired, rather than intrinsically individual properties: we have already seen that these are more important when long-term choices are being made. This interpretation is supported by the association of sexual infidelity in marriage with income inequality (considered as a proxy for mate value). 

2757 married men and women: negative values indicate female income greater

The pattern of infidelity suggests that women with higher mate values have something to guard against, as well as thinking that they can’t replace their untrustworthy partners with someone better. 

Quantitative differences in physical characteristics

Women are, on average, shorter than men, less strong, and can grow longer hair. What happens to men and women who aren’t gender-typical in these respects?

For height, the position is clear: being taller makes you wealthier and more powerful. While the reasons for this aren’t clear, shorter men will resemble average women more closely in the wealth and power they wield: they’ll have less of it. Consistent with what we just discussed regarding mate value, height is an important physical attractor for women, less so for men, despite the potential for women’s height to signal greater mate value as well. 

We tend to think of physical strength as a bit passé when it comes to societal roles 

but physical strength, expressed as muscle development, is both easily observable and a more acceptable component of male than female  attractiveness.  Height, though attractive in itself, does not suffice alone.  

Male and female bodybuilders competing. The woman just won

It seems that attractiveness is rather more important for men than women in obtaining good income, as these charts show. 

532 Women: attractiveness is an 11 point subjective scale; income is a log scale of monthly income in Euros

692 Men: axes identical to chart for women (above)

There is the expected increase in monthly earnings for women with increasing attractiveness. However, the much steeper and non-linear curve for men indicates that men are being heavily penalised for lack of attractiveness in the jobs market.  Thus, the social attributes contributing to men’s mate value are correlated more closely with physical attractiveness than those in women, which increases the less attractive a man is. This is likely to produce a longer lower tail in male mate values, compared with female ones. 

While the influence of long hair in men on societal roles isn’t discussed much in the academic literature, the popular press is full of it, and the issue seems to be in doubt. The direction is as we might expect. There are also suggestions that “short hair implies less need for nurture” has been a styling (and practical!) cue employed by women wishing to adopt independent roles.

Because of the social learning model we’ve presumed as a mediator of gender roles, finding universally expressed roles does not necessarily mean that the behaviours themselves are “hard-wired” in our genes. For example, aggression is generally accepted to be higher in boys and men, though in spousal relationships there is evidence of equivalent frequency, albeit with more adverse consequences of male aggression. However, consistent with our model, a recent study found it to be mediated by fathers’, but not mothers’, differential treatment of boys and girls.

Equally, we should be careful about assuming that genetics has nothing to do with how we teach our offspring. One large study has suggested that 23%–40% of how we parent our children can be accounted for by genetic influence, and another big study has broken this down further: care is more strongly heritable than control, and adverse styles more heritable than positive ones.

Despite these caveats over cause, we seem to be able to make four claims. 

  1. We evaluate our potential partners according to a complex scale of values, which we can summarise into a general mate value. 
  2. Mate value is harder to observe in males than females
  3. We are dimorphic in how we employ mate values: perceived male values are rated more highly than female, even by women. 
  4. The workplace scales male mate values differently from female ones, making them easier to detect, particularly at the lower end. This implies that the workplace is an important part of our reproductive as well as economic life, alerting women to differences in male mate value that might otherwise be hard to detect.


At a macro level, a powerful economic argument can be made for improving female participation in the workplace. 

In developed countries at least, women are strongly represented, with the emphasis now shifting towards achieving equality across professions, seniority and salary. 

US data

However, as the chart below shows, the increased participation of women in the workforce has led to increased competition between the genders, evidenced by the changing balance of male and female unemployment. 

From what we have just covered, we would expect males with lower mate value to be worse affected by this and, using race as an indicator for this in the racially discriminatory US, this is what we find

All unemployment changes are positive, as the differences are measured during recessions

 For the dimension of race, we can see that, as predicted above, men are more at risk of unemployment in recession than women, and this effect is exaggerated for black males. The picture for the Hispanic community differs for the 2007-9 recession, with equal increases for both genders. While the pattern for education is as expected, that for age appears contradictory, as younger ages would have higher mate values.  However, unemployment at younger ages is more likely to reflect increased difficulty in entering the jobs market, while that in older ages will be redundancies. The increased costs of the latter, together with loss of skills, makes reductions in hiring new young people the easier choice. We can see this differential being maintained even during economic recovery

Change in participation in the labour force compared with before the 2007-9 recession

Young men are at particular risk of being out of the market. Overall, in the US nearly one fifth of 20-24 year olds were neither enrolled in education nor working in 2013.

We have seen above that the labour market isn’t just about value production. It also allows women to more accurately assess the mate value of potential partners. We’ve just seen how the combination of recessionary pressure and female entry to the labour market has squeezed young people, and especially young men, out. From the above, what effects might this have?

  1. It seems likely that there will be an increase in male-female pairs where female mate value is greater than male. This might lead to an increase in guarding behaviour. 
  2. As economic mate value is more important in romantic relationships, there might be a higher proportion of short term relationships producing children.  
  3. Poorer information might lead to less stable mate choices, as random error is increased. This might be most apparent at the poorest and most discriminated end of the population, who therefore will have the lowest mate value. 

In the US, the proportion of married couples where the woman earns more than the man has been steadily increasing 

    US labor bureau data: married couples only

    and similar results pertain to education 

    We have seen above this might promote an increase in guarding behaviour. At the extreme end, this results in domestic violence, and some data suggests it might occur as predicted 

    Recession occurred 2007-9

    particularly as  the recession related rise in unemployment (which differentially affected young men) between 2007-9 occurred when murder overall was falling. 

    Homicide is likely to be an insensitive measure of guarding behaviour, especially among women. However, the U.K. has undertaken surveys of sexual attitudes and behaviour. It has produced some apparently contradictory results on sexual attitude change, which nonetheless fit our predictions. 

    • Between 1991 & 2013, disapproval of non-exclusivity in marriage has increased for both genders (18% increase for men from 45%; 17% for women from 53%). This fits our hypothesis regarding guarding behaviour, as well as validating the women’s reports of their partners in the study that discusses it: like the study, the men reported higher rates of disapproval. 
    • Over the same period, there has been no change in attitude over one night stands for men (20% saw no problems), but the proportion of women seeing nothing wrong in such behaviour more than doubled, from 5.4% to 13%. This is consistent with our hypothesis that mate choice for women is now more difficult. 

    There seems very solid evidence that a higher proportion of children are coming from less long term relationships, irrespective of racial group. 

    The pattern of spread among the different American ethnic groups is consistent with the differential effect of losing the workplace’s long tail towards the lower end of the jobs market. It is not simply people marrying later (e.g. after the birth of a child), as this US chart shows

    This would not matter if cohabitation was as stable as marriage, but it is significantly less stable,

    consistent with our hypothesis that the reduction of information about male mate value makes it harder for women to commit successfully, as the next chart shows. 

    Furthermore, despite the increasing proportion of children being born to non-married women, in a variety of relationships, the number of those women who do not work outside the home has remained relatively constant, despite increasing female employment, while fewer married women stay at home. Single women are taking the breadwinner role for themselves, rather than sharing it with a man. 

    Unfortunately, and consistent with the problem being at its worst (given assortative mating & competition) at the lower tail of the mate value distribution, they can’t do it alone. 

    What does all this mean for men?

    The market for male lemons

    My title for this subsection derives from a famous economics paper “The market for lemons: Quality uncertainty and the market mechanism” by George Akerlof. He discussed what he called “information asymmetry” in the second hand car market, which enabled the sale of “lemons”: cars which proved defective only after being bought. When market information is asymmetrical, only the seller completely knows the value of the goods; the buyer has to make a (maybe educated) guess. He set out the conditions for such a market as follows. 

    The argument we have just constructed suggests that the workplace functions to help provide men with a clear “disclosure technology” for their mate value and this has been impeded, especially regarding men with the lowest value. 

    Jane Austen’s “Pride and Prejudice” describes how information about mate value was transmitted in the early 19th Century. 


    We can see how the economic and social aspects of everyone’s mate value are either public (as a result of gossip or family connections), or exposed at gatherings (balls) designed for that purpose. 

    Staying at the top of society, we can see the same process operating at the “London Season” with its debutantes’ balls,  

    which still continues today.  While males who deceived over their mate value are well described by Jane Austen, the system was designed to expose them as much as possible. Outside (and even inside) those exalted circles, things  are much harder now 

    as greater equality combines with less workplace (or family) information to keep men’s true mate value hidden. 

    It seems like we have created the conditions for a market for male lemons (we have seen that women’s mate value is less significant). Akerlof found that, under those conditions, the buyers (women) apply an averaged discount across the whole market, so “peaches” (the opposite of lemons) get undervalued, while “lemons” are overvalued. In Akerlof’s model, owners of peaches walk away, while lemons pile in: here, because it’s reproduction, which is necessary for everyone, even the peaches will stay around. What might we expect?

    If women are less able to tell the difference between a peach and a lemon, they will maximise their chances of getting a peach by enlarging their pool of available men, and resampling as required. This effectively increases competition between women, as larger pools mean more overlap. For male lemons, this is great news, as, unfortunately for the women, their pool of available women has grown. This will be equally true for peaches, who will therefore also have a larger pool of women to choose the very best reproductive partner from. Using marriage as a proxy for this, and education as a proxy for the social aspects of mate value, we find those with more education showing a slower rate of marriage decline, 

    and a complete decoupling from the secular increase in divorce rates 

    Unlike their female peers, male peaches are doing well. 


    I have stuck to a single country to cover the economics because it allows comparison of the figures. Countries do differ in absolute statistics 

    and the OECD chart above shows that the US has the second largest number of single parents among developed nations. However, as the next chart (of divorce rate changes) shows

    the tendency is for even marriages to become less stable with time, with cohabitation being worse. The picture is similar: unsurprising, given the common approach to workforce gender balance across OECD countries. 

    It seems that women are responding rationally to the loss of information from the workplace about potential partners. 

    1. They are probably guarding the partners they have more tightly (at least, if they follow through on their attitudes)
    2. They are more prepared to have sex, and therefore risk offspring, with less permanent or even casual partners. 
    3. They provide for their offspring without partners. 

    While this strategy makes sense, there are costs. 

    The impact on children is well-known, of course.  What is less known is that these trends have also been associated with a decline in female happiness, even though men’s has increased

    While this graphic refers to the US general social survey, this result pertains across OECD countries

    The article I took this graphic from points out that neither economic burden  (including childcare & housework) nor anti-female discrimination account for this. Whether or not marriage makes you happier is contentious, and of course the changes we have just documented will affect people in marriages as well as outside it, as the rising divorce rates I mentioned above indicate. 

    However, the argument we have just developed has suggested a paradoxical consequence to the progress of female entry into the workplace.  The unavoidable displacement of male jobs through female competition has deprived women of a key tool in assessing male mate value, which has worked to men’s advantage. Given the importance of reproduction in our lives, the different trends of male and female happiness are consistent with our expectations. 

    It does seem that the triangle we have explored does need squaring, but it is not easy to see what the solution might be. Returning to the days of sexual inhibition, work restrictions and formality is no recipe for female happiness at the individual level, and is destructive of our wealth. The men who have benefited the most from this (the peaches) have only slight incentive (their daughters’ future) to lose the reproductive advantage these changes have given them: their marriages are nicely stable, so their own male offspring are less affected, and they experience the advantages before, through their daughters, they face the risks. The lemons are simply grateful for more access to more women. Nonetheless, our children require this problem to be solved, so an alternative way of evaluating male mate value needs to be found. 

    Clinical trials and how to make sense of them 

    Clinical trials are how we find out if our treatments work. They are fundamental to the debates over psychiatry. They are the evidence for the profession’s effectiveness, and instruments for its improvement. Understanding them, what their results mean, and how much they are to be trusted is therefore a fundamental skill, not just for professionals, but for service users trying to “open the hood” and see how and why the treatments they were recommended gained credence. They also reveal why diagnosis is so essential to making progress with treatment.

    This blog is therefore my attempt to demystify clinical trials sufficiently to allow people to begin to make sense of them, particularly now they are becoming increasingly available openly. As with all my blogs, I’m going to try to keep my language as non-technical as possible, while not simplifying the issues involved.

    To do that, we first need to tackle maths phobia.

    He doesn’t know it, but he’s looking at a recipe

    Yes, just like this one

    If recipes tell us how to select and process ingredients to get a result, mathematical formulae tell us how to select and process numbers to get a result. The wonderful thing about numbers is that they can be used to represent anything at all, so we can do far more with them. In clinical trials, our first task is to try to represent ourselves, because we want the treatments to work on us, not somebody else.

    The iid Assumption

    We cannot avoid assumptions: trying to make none eventually leads to solipsism.

    Descartes’ assumption of his own existence: even that’s been challenged these days, as this programmed brain shows

    We are all unique, but also all who are reading this are recognisable as members of the human race (I’m assuming any dogs around can’t read)

    It’s essential to be aware of the limitations of any assumption you make

    The iid in the assumption we have to make is usually expanded as “independent and identically distributed”; for our purposes, I’m going to read it as “identical and from identically distributed populations”. Though it sounds a bit repetitive, it isn’t.

    Here we can see two different populations; apples and oranges. So, the first i is an assumption that whatever we’re looking at can’t be divided like this.

    Everything in the picture above is a tomato. So, they meet the first i criterion of our assumption. The obvious difference we see results from them breaking the id part. The big and little tomatoes come from groups with different average sizes (and, if we look carefully, colours). So, their two populations do not have identical distributions for size and colour.

    It’s also important to notice that iid is a decision we make, not a property of things in themselves. Take our first picture, of apples and oranges. If we were talking about fruit, then the picture would be iid, because we’ve blended the apples and oranges together.

    Apple and orange distribution

    Apart from being fun, the picture above also warns us that distributions are harder to understand than simple categories. We need to look at them in more detail.

    The standard normal distribution. Read on to find out what it is

    To make a distribution, we need three ingredients

    • A population: that’s just a group of things. Quite literally, anything will do.
    • A dimension: some quality every population member has, but to a different extent. We are going to assume that the dimension can be measured using an interval or ratio scale, as described in my previous blog.  If you don’t want to read the blog, it’s a scale that works like an ordinary ruler. 
    • A measure:  something that can tell us when the population members differ on the dimension.

    We use our measure on our population, and it gives us a range of different results, as we expect. The set of results (or values) we get is called a variable.  Using our values, we can now arrange our population along our dimension.

    All nicely arranged. Yay! We made it!

    That arrangement is called a distribution, and can be drawn, as we did for our so-called “normal” distribution above. Distributions are usually drawn with their dimension (expressed as a variable) along the bottom (x-axis or abscissa), and the number of population members at each value as a stack, whose height is measured from the side (y-axis or ordinate). In our drawing, we’ve approximated the top of each stack by using a line. There are actually lots of possible distributions, but we are going to concentrate on the normal distribution, for two reasons. First, even if our measure is perfectly unbiased, and therefore perfectly valid, there will always be some degree of random error (unreliability). When we measure something repeatedly, with only random error, the distribution of the measurements follows the normal distribution. Now, because of what iid means (that individuals in an iid population may be substituted for each other), we can claim

    Provided the iid assumption holds, and for a given measure, measuring different members of a population is the same as measuring the same population member repeatedly

    So, measuring an iid population with a good enough measure will result in a normal distribution of values, assuming the measure is at least an interval scale.  The second reason relates to sampling, which I talk about next.

    Sampling and the normal distribution

    God has the time, resources and immunity from boredom to fully measure iid populations.

    Fancy measuring every hair in Longfellow’s beard?

    We do not. Fortunately, the iid assumption gives us an escape clause. Because all members of our iid population are the same, measuring the whole population won’t give us different information to a subset of it. So, we can sample. Sampling will, however, increase our measurement error, and unless we can say by how much, we’re stuffed, as the population distribution needs to be inferred from our sample. Fortunately, as our measure generates a normal distribution, this is estimable. If we look back to our drawing of the normal distribution above, we can see that we already know the proportion of our population in each part. If we know where the middle of our distribution is, and how widely it spreads, we can work out the rest. These quantities are called parameters.

    The parameters of the normal distribution

    We already know how to calculate where the middle of our distribution is; it’s the average value (mean), obtained by adding all the individual values up and dividing by the total number of values we used.

    By convention, it’s usually symbolised as μ (Greek for m). We write its calculation as

    μ = Σ(x)/n where Σ means “add up all the x’s values”. Yes, it’s a recipe.

    By analogy, our estimate of the width of spread is also a kind of average, though we call it the standard deviation. It’s calculated by subtracting the mean from each value, squaring them (which stops them adding up to zero), adding them together, dividing by the total number of differences we used, and then taking the square root to get us back to our original scale.

    It’s symbolised by σ (Greek for s) and we write its calculation

    σ = √(Σ(x-μ)²/n)
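    For anyone who prefers their recipes executable, here is a minimal Python sketch of the two calculations. The function names and the scores are my own inventions for illustration; nothing here comes from real data.

```python
from math import sqrt

def mean(values):
    """Add up all the values and divide by how many there are."""
    return sum(values) / len(values)

def population_sd(values):
    """Square root of the average squared difference from the mean."""
    mu = mean(values)
    squared_diffs = [(x - mu) ** 2 for x in values]
    return sqrt(sum(squared_diffs) / len(values))

# Hypothetical scores, invented for illustration only.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
mu = mean(scores)
sigma = population_sd(scores)
print(mu, sigma)  # prints 5.0 2.0
```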

    Before moving on, it’s important to see what these parameters let us do. First, if we subtract the mean from every value, and then divide each value by the standard deviation we will end up with exactly our drawing of the normal distribution above: mean zero, standard deviation of one. Because this recipe converts any normal distribution to this one, it is called the standard normal distribution, and it’s very useful. Here it is again, in more detail

    The standard normal distribution five ways

    The first two ways are simply a repetition of its previous appearance, with the scale revealed as being measured in standard deviations. The third way is a scale of cumulative percentages of the population at and below the scale value. However, thanks to the iid assumption, that is the same as the probability that a member of our population will have a score of that value or less. This means we can comment on either how typical the member is of the population as a whole, or, equivalently, how likely they are to be a member of our population, assuming we can trust our measure. The fourth way points out that we can as easily tick our scale using percentile as standard deviation units, if we don’t mind an ordinal scale, while the fifth shows that there are ways of adjusting the differences between different centiles so that the scale can be rendered interval once more.
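    The step from a raw score to standard deviation units, and from there to a cumulative percentage, can be sketched in a few lines of Python. This assumes Python 3.8 or later for `statistics.NormalDist`; the measure (mean 100, standard deviation 15) and the score of 130 are hypothetical.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Subtract the mean, then divide by the standard deviation."""
    return (x - mu) / sigma

# The standard normal distribution: mean zero, standard deviation one.
standard_normal = NormalDist(mu=0, sigma=1)

# Hypothetical measure with mean 100 and standard deviation 15.
z = z_score(130, mu=100, sigma=15)

# Proportion of the population expected at or below this score.
proportion_at_or_below = standard_normal.cdf(z)
print(z, round(proportion_at_or_below * 100, 1))
```

    A score two standard deviations above the mean sits at or above roughly 98% of the population, which is the third and fourth "ways" in action.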

    Our recipe for the standard deviation has another trick up its sleeve. If we leave out the square root stage, we end up, unsurprisingly, with σ², which turns out to be far more interesting. We call it the distribution’s variance. If we think back far enough in our education, we remember that squaring was called that because it defined a square. For that reason, the variance measures the area under the curve of our normal distribution. Now, remember that curve is the result of how our population are stacked along our dimension. That means the area below it has all the possible information our distribution can give us, which the variance has neatly summarised into a single value. Furthermore, our drawings show us that we can slice the variance into chunks, which allows us to make attributions to different amounts of it. This allows us to judge how much of the variance might be accounted for by an attribute. We have something that can potentially tease out cause.


    There are two problems to solve when sampling

    1. How to get a representative sample
    2. How to use the sample to measure the population it represents

    The sampling process

    Getting the sampling right for a clinical trial is just as important as for a survey, follows the same process, but aims to collect a very different sample. We are seeking a representative sample of an iid population. However, as we have seen above, what we are going to call iid is down to us.  We might have good reason to want an iid sample of fruit, as opposed to one of apples and one of oranges, but clearly how we need to sample “fruit” will be very different to sampling “apple”, and interpreting results from an apple-orange combination that doesn’t really exist could cause us problems.

    When we make the wrong iid decision in a clinical trial

    What kind of iid decision should we make for a clinical trial? Step up, diagnosis, and take a bow.

    • Diagnosis groups associated symptoms and signs together
    • These associations predict a common cause, even if that cause is unknown
    • There is a whole science (epidemiology) devoted to understanding populations of diagnoses.
    • In a clinical trial, the symptoms that make up the diagnosis are the target of our intervention.

    Using diagnosis gives us a credible iid population to sample (in the diagram above, it’s our sampling frame). Because the diagnosed population is iid with respect to diagnosis, we can simply collect as many diagnosed cases as we need, and be confident our sample can represent the diagnosed population (in real life, things aren’t quite like this, but we’ll come to that later). We can now define the basic question our measurement needs to answer.

    Does receiving a treatment stop a population being iid with those who did not?

    Notice that our question implies no direction. If the direction is one we want, we talk of “effects”, if not, it’s “adverse effects”.

    Also, because what we’ve got is a sample, that means we have a measurement gap to bridge.

    From our earlier discussion, it follows that, when we measure an iid population appropriately, the distribution measures error, so our interest focuses on the mean. But now, each time we measure an iid sample, that error will give us a slightly different estimate of the mean. These estimates form their own distribution, with its own average degree of variation, called the “standard error of the mean”, which can be calculated.

    The standard error of the mean (SE)

    • σ = population standard deviation
    • N = total sample size

    SE = σ/√N

    We’ve solved our mean estimation problem, but only if we can solve our standard deviation one. This turns out to be almost trivially easy. For a sample of any size, with a mean of m, the “sample standard deviation” s is s = √(Σ(x-m)²/(N-1)). Despite the confusing name, s is the estimate of σ from the sample. As, under iid, m is an unbiased estimator of μ, we can substitute m for μ and s for σ, and so calculate SE. We’re all set.
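    As a rough sketch of that calculation in Python: `statistics.stdev` uses the N-1 divisor, so it gives us s directly. The sample values are hypothetical, and `standard_error` is a name I have made up for the example.

```python
from math import sqrt
from statistics import mean, stdev

def standard_error(sample):
    """Estimate SE = s / sqrt(N), where s uses the N - 1 divisor."""
    s = stdev(sample)  # sample standard deviation (divides by N - 1)
    return s / sqrt(len(sample))

# Hypothetical sample, invented for illustration only.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
m = mean(sample)
se = standard_error(sample)
print(m, round(se, 3))
```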


    In an ideal world, we wouldn’t need randomisation. We could simply give our intended treatment to some of our iid sample, measure, and see if that group remain iid with the others. However we’ve only ensured our sample is iid with respect to our sampling criterion. Reality is more like the diagram below.

    The dots show they’ve all got the same diagnosis. But, they’re different colours.

    In this example, we’ve only got colour to worry about. But, the very fact that we are not clones of each other means that, however carefully we are sampled, we will differ non-randomly on many dimensions: we are separated by more than measurement error. Any of these other dimensions could contribute to the course of the diagnosed disorder: epidemiology shows that they frequently do.

    Ideally, we’d like the case and control to be the same subject

    Rubin’s causal model is a useful way to understand how randomisation works to overcome this problem. As the image suggests, Rubin imagines two futures: in one a subject gets the treatment, in the other he or she doesn’t. The difference between these two futures is the “unit-level causal effect”. As the accompanying equation shows, it is no more than the difference we measure between the subject in these two futures. Of course, in reality we can access only one of those futures.

    Schrodinger’s cat: Rubin’s causal model in action at the unit level

    However, a group of people who are iid with respect to another group allows measurement of a causal effect at the group level, because  the basis of iid is substitutability: either group can stand in for the other.  In a randomisation process, any individual is as likely to be selected as any other, so the individuals are substitutable in the process. So, two groups which result from a randomisation process are iid with respect to each other. This shows us two things

    1. As the randomisation process was done without respect to any dimension, the groups are iid with respect even to things we haven’t measured or recorded. Randomisation is unique in being able to guarantee this.
    2. The larger the groups, the better the approximation to iid they will be.

    Clinical trials following these principles are called Randomised Controlled Trials (RCTs) and are generally considered the gold standard for assessing treatments, due to their unique ability to cope with unmeasured variables.
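    A small simulation can make the point about unmeasured variables concrete. What follows is only a sketch under assumptions I have invented: every subject has a hidden trait that affects their outcome, the treatment adds a fixed amount (`true_effect`) to everybody, and in the non-randomised version people with a favourable hidden trait are the ones who end up treated. As in Rubin’s model, each subject has two futures but we only ever record one of them.

```python
import random
from statistics import mean

def estimated_effect(n, randomised, true_effect=1.0, seed=0):
    """Difference in group means after allocating n simulated subjects."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        hidden = rng.gauss(0, 1)       # unmeasured trait affecting the outcome
        y0 = hidden + rng.gauss(0, 1)  # the future without treatment
        y1 = y0 + true_effect          # the future with treatment
        if randomised:
            gets_treatment = rng.random() < 0.5
        else:
            gets_treatment = hidden > 0  # allocation depends on the hidden trait
        if gets_treatment:
            treated.append(y1)  # only one future is ever observed
        else:
            control.append(y0)
    return mean(treated) - mean(control)

rct = estimated_effect(20000, randomised=True)
self_selected = estimated_effect(20000, randomised=False)
print(round(rct, 2), round(self_selected, 2))
```

    With randomisation, the estimate lands close to the true effect of 1.0 even though the hidden trait was never measured. Without it, the hidden trait leaks into the comparison and the treatment looks far better than it is.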

    This chart shows how much difference controlling for unmeasured variables can make

    It’s actually not from a single RCT, but from what is called a meta-analysis, which groups studies together. There’s more on them below. The lowest set of bars shows that many more studies without randomisation support the benefits of homeopathy than do not, but that the balance corrects sharply when randomisation is introduced.

    The chart also mentions another important feature of good clinical trial design: single or double blinding. This is part of accurate measurement, which we now need to discuss, as measurement is harder than it looks.


    Those reading carefully will have noticed that the last chart talked about “efficacy” rather than “effectiveness”. This is because, even with randomisation, any new difference between our previously iid groups isn’t just the causal effect of the treatment. Unfortunately, what we measure has four components.

    1. Efficacy.  This is what we are after:  the part of the measured difference that is due to the treatment.
    2. Random variation. As we saw above, this is an unavoidable part of measurement, and its effect can be estimated. Even so, there are some measures that are especially sensitive to its effects. For example, hospital admissions occur when things are especially bad, and discharges when they are better. Random variation will therefore tend to exaggerate differences based on these two measurement points.
    3. Regression to the mean.  This is actually a consequence of random variation. Think back to our standard normal distribution.  Measurement of an iid sample will lead to our values clustering round the mean. So, if we measure a subject from that sample, and find an extreme value, measuring him or her again will most likely return a value closer to the mean. The size of this effect can be calculated from knowing how strongly the two measurements are associated.  This association is described by their correlation coefficient r

      Showing how the correlation coefficient relates to the strength of association between two variables

      and the proportion of an effect caused by regression to the mean is 1-r. Clearly, the more reliable a measure, the higher (and positive) r will be, and the less regression to the mean will be a problem.
    4. Placebo effect.  This is the effect expectation or desire can have on our measurement. While “placebo” implies that its effect will be benign, this is not always so, as bad expectations bias us in their direction also. When that happens, it’s called a “nocebo effect.”
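    The regression to the mean calculation in point 3 can be sketched as follows. The sketch assumes both measurements share the same mean and spread and are linearly related, in which case the expected re-measurement is the mean plus r times the original deviation. The numbers (mean 100, first reading 130, r = 0.8) are hypothetical.

```python
def expected_retest(first_score, mean, r):
    """Expected second measurement, given the first and the correlation r."""
    return mean + r * (first_score - mean)

def regression_share(r):
    """Proportion of an extreme deviation expected to vanish on re-measurement."""
    return 1 - r

# Hypothetical measure: mean 100, first reading 130, test-retest r = 0.8.
second = expected_retest(130, mean=100, r=0.8)
lost = regression_share(0.8)
print(round(second, 1), round(lost, 2))
```

    Here a fifth of the original 30-point deviation is expected to disappear on re-measurement without any treatment at all.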

    The placebo effect is not an artefact, as the diagram below shows, but a genuine psychological phenomenon, probably involving the dorsolateral prefrontal cortex.

    Transcranial magnetic stimulation temporarily, and Alzheimer disease permanently, cause hypofunction in the dorsolateral prefrontal cortex. Both reduce the placebo effect.

    Treatments such as homeopathy, whose remedies contain no active ingredients, rely on the placebo effect for getting their results. “Blinding” means disguising the treatment in the trial, so that expectancy cannot influence our measurements. In “single blinding” the patients are unaware of whether they receive a treatment, but the researcher does know who gets what. In “double blinding” neither the researcher nor the patient knows what they’re getting. In the review of studies of homeopathy pictured above, you can see the balance between positive and negative results shifting towards negative as the level of blinding increased.

    I have blogged about the importance of validity, and how it relates to reliability, elsewhere, so here I will simply observe that a valid measure is an unbiased measure, which we have already seen is an essential prerequisite for effective measurement.

    Analysis: simple or complicated



    In a very important way, analyses of clinical trials are like the two pictures above. Superficially, simple analysis seems to be giving us a lot less information than complexity.  However, think back to what our original question was: did application of our treatment stop our two groups being iid?   That’s not a complicated question. So, additional complications are doing one of two things

    1. Answering an additional question such as “how much of the variance (see earlier) in the change we’ve measured is due to our treatment (or some other causal factor).”
    2. Correcting for some flaw in the research design. 

    The trouble with complications, in both clinical trials and watches, is that they bring additional assumptions (not  least, that they will make things better rather than worse) and possibilities of error. Do we really understand what all those additional dials mean, and how they relate?  Nowadays, very complex statistical analyses can be run very easily on a computer. Unfortunately, understanding their subtleties has got no easier. This can be seen with a type of clinical trial called a cross-over study. 

    Doesn’t look too complicated, does it?

    The advantage of a cross-over design is that each subject can act as their own control, thus bringing us closer to Rubin’s ideal. The disadvantage is that even major journals can fail to identify errors in the analysis strategies used. So, the more complex the analysis, the more carefully and clearly it should be explained. If the explanation doesn’t seem to make sense, it might be because it doesn’t!  Even charts can be misleading, if they’re not read carefully. Here are some charts from a study (not an RCT) comparing two forms of psychological therapy in depression. 

    While not mentioned in the paper, counselling has been traditionally understood to be more appropriate for people who are struggling, but whose difficulties relate to specific life problems that might be expected to resolve. Even experienced psychologists have been known to misinterpret the lower chart as showing patients with counselling getting worse relative to CBT.  However, the upper chart shows that counselling, as predictable from its customary use, seems to have a preferred number of sessions (around seven), while the continuous lines used in the lower chart are misleading. Checking the Y axis makes clear that there are different groups of subjects represented at each time point, so the lines joining them should not be interpreted to suggest continuity. What the chart shows is that a smaller proportion of people in counselling recover than people in CBT, when the number of sessions attended rises beyond seven. 

    Post-Hocery and how to avoid it

    Read some statistics textbooks, and it would be easy to think that the way to proceed is to look at the data to get an impression of what it might be telling us (that’s called exploratory analysis) and then decide what we’re going to do. For clinical trials, that isn’t a good idea. The reason is easier to understand if I use an analogy.

    Think of our treatment as an arrow, and the centre of the target as a cure. A clinical trial should tell us where on the target the arrow has landed.

    In this context, exploratory analysis is like us test-firing the arrow a few times, seeing where it lands, and positioning the target accordingly. In itself, that’s not a bad thing, and is why we often pilot before undertaking a trial. But, here, we are using the same data from the same sample.  Clearly, moving the target means we can no longer claim that the difference we find reflects the impact of our treatment. This kind of analysis is called a “post-hoc” analysis. It can be useful in identifying other possibilities and potential areas for study, but to avoid post-hoc target moving, analyses in clinical trials need to be pre-planned. Nowadays, there are registers of clinical trials, which allow readers to check back and see if what was published matches what was registered. 

    Putting together everything that we’ve covered so far, we are now in a position to set out seven simple yes/no questions we should ask ourselves when reading a clinical trial. 

    1. Has the sampling frame provided a sample that is convincingly iid?
    2. Have the groups been properly randomised with respect to each other?
    3. Are the randomised groups large enough to be convincingly iid with respect to each other?
    4. Has blinding been used?
    5. Do we know how valid and reliable the measures are?
    6. Is there a simple analysis that shows whether the groups are different or not?
    7. Was the analysis pre-planned?

        When using this list of questions, a “don’t know” should be treated the same as a “no”.  In general, the fewer questions we can answer “yes” to, the less we should trust a clinical trial’s claims about the treatment it is investigating, and the more corroborative evidence we should seek. 

        Let’s apply these questions to the study on comparing counselling with CBT we used as an example above. 

        1. The study published a table showing both therapist types saw cases of similar severity 

          PHQ-9 is a depression measure

          which suggests that the sampling frame they used did provide an iid sample. It’s a “yes”. 
        2. That’s a clear “no”.  As the groups weren’t randomised, we can’t be sure they were the same on unmeasured variables. As discharge is driven by either patient recovery or desire to continue, then, given the charts, it seems likely the two groups weren’t. 
        3. The sizes reported in the table are very large for mental health studies, so that’s a “yes”
        4. Another “no”
        5. The study references the PHQ-9’s reliability and validity studies, so that’s a “yes”
        6. Despite my example with the charts above, there is such an analysis in the study. It reported 46.6% of patients receiving CBT improved, versus 44.3% of patients receiving counselling, if the patients met diagnostic criteria at outset. The comparable figures, including also people who did not meet diagnostic criteria, were 50.4% vs 49.6%. We can now see how misleading those charts were!
        7. The study report isn’t clear on this, so it’s a “don’t know”

        We can see that our questions don’t return a clear “yes or no” answer to how much trust we should place on the study. My own take home messages are that counselling and CBT might be best for different patients, but neither is going to get more than half of those they see better. 
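        For those who like their checklists mechanical, the tally for the counselling versus CBT study can be sketched like this. The question wording is abbreviated and the function name is my own; the only rule it encodes is that anything other than a “yes”, including “don’t know”, counts as a “no”.

```python
QUESTIONS = [
    "Sampling frame convincingly iid?",
    "Groups properly randomised?",
    "Groups large enough?",
    "Blinding used?",
    "Measures valid and reliable?",
    "Simple analysis of group difference?",
    "Analysis pre-planned?",
]

def count_yes(answers):
    """Count 'yes' answers; anything else, including "don't know", is a 'no'."""
    return sum(1 for a in answers if a.strip().lower() == "yes")

# Answers given in the text for the counselling versus CBT study.
study_answers = ["yes", "no", "yes", "no", "yes", "yes", "don't know"]
score = count_yes(study_answers)
print(f"{score} of {len(QUESTIONS)} questions answered yes")
```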

        The trial has reported its results were significant. What does that mean?

        Years ago, that statement was synonymous with “Yay! It works!”. We now know better. Let’s think about a very simple RCT, with only two groups, one treatment, and one outcome measure, with an interval scale. In this context, the term is short for “statistically significant”: after the treatment, when we measured the means of the two groups, they were sufficiently different to say that, even allowing for error, there was a less than 5% chance of them being means of the same group, so the groups were no longer iid.  This 5% limit has been arrived at by custom, but if we look back at where it cuts the normal distribution it’s outside most of the mass of the distribution, so it kind of makes sense.  

        However, if we look back to how the standard errors of our two means are calculated, then we can see that they will shrink the more cases we have. At very large numbers, we will be able to detect tiny differences: statistically significant, but practically pointless. What we need is a measure of what is called effect size.  For our example, the difference between the two means, measured in standard deviations, the “standardised mean difference” (SMD) works well. The table below gives some ways of interpreting it. 

        BESD = Binomial Effect Size Display; CLES = Common Language Effect Size
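        The difference between “statistically significant” and “big enough to matter” can be sketched with a simple two-group comparison. This assumes two equal-sized groups with the same known standard deviation and uses a normal approximation, which is a simplification of what a real trial analysis would do. The means, standard deviation and group sizes are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_group_z_test(mean_a, mean_b, sd, n_per_group):
    """Return (SMD, two-sided p) for two equal-sized groups sharing one sd."""
    smd = (mean_a - mean_b) / sd
    se_diff = sd * sqrt(2 / n_per_group)  # standard error of the difference
    z = (mean_a - mean_b) / se_diff
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return smd, p

# Hypothetical means and sd: the same tiny difference at two sample sizes.
smd_small, p_small = two_group_z_test(10.5, 10.0, sd=5.0, n_per_group=50)
smd_large, p_large = two_group_z_test(10.5, 10.0, sd=5.0, n_per_group=5000)
print(round(smd_small, 2), round(p_small, 3), p_large < 0.05)
```

        The effect size is the same 0.1 in both cases. Only the verdict on significance changes, and it changes purely because we collected more people.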

         Most treatments available in Mental Health that have been tested by RCT have an effect size (SMD) between 0.5 and 1.2. This is pretty average compared to equivalent treatments for physical conditions,  but does show that we should not expect even our best treatments to work all the time, and we need more than one treatment for any condition. 

        Thinking back to our example study, the 50% recovery rate can be related to the 4th column (none were recovered before, around half were afterwards, so the BESD was around 0.5), and equates to an SMD effect size of 1.2. However, the lack of blinding, and the use of “admission” and “discharge” endpoints are all likely to have inflated this figure, while lack of randomisation will have had an unpredictable effect. 
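        The link between an SMD, the BESD and the CLES can be sketched using the standard textbook conversions. I am assuming these are what sit behind the table: r = d/√(d² + 4) for two equal-sized groups, BESD success rates of 0.5 ± r/2, and a CLES of Φ(d/√2) for normally distributed outcomes. The function names are my own.

```python
from math import sqrt
from statistics import NormalDist

def smd_to_r(d):
    """Convert an SMD to a correlation, assuming two equal-sized groups."""
    return d / sqrt(d ** 2 + 4)

def besd_rates(d):
    """Success rates (treated, control) implied by the BESD."""
    r = smd_to_r(d)
    return 0.5 + r / 2, 0.5 - r / 2

def cles(d):
    """Chance a random treated subject outscores a random control subject."""
    return NormalDist().cdf(d / sqrt(2))

for d in (0.5, 1.2):
    treated, control = besd_rates(d)
    print(d, round(smd_to_r(d), 2), round(treated, 2), round(control, 2), round(cles(d), 2))
```

        On these formulas an SMD of 1.2 corresponds to a BESD correlation of about 0.5, which is the equivalence used for the example study.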

        We have derived our concept of effect size as standardised mean difference from the normal distribution. However, even when our outcomes are a simple yes/no, we can calculate a SMD from it. This has been very useful when combining studies together, which is our next topic. 

        Systematic Reviews and Meta-Analysis

        Let’s start by clearing up two common mistakes 

        1. A systematic review is not a meta-analysis 
        2. A meta-analysis is not necessarily better than a well designed single clinical study. 

        In times of yore, all reviews were what are now called “narrative reviews”, which are basically stories justified by references. While obviously valuable when it comes to making sense of things, bias, and the pressure to make some sense, leads to reviews which may support either “yes” or “no”, when the right answer is “I haven’t a clue.”

        Doing a narrative review is all about choosing…

        Systematic reviews don’t start with a library, but a computer. The idea is to use one or more search engines to identify–as this is about clinical trials–all the trials that have been done on a treatment, in theory since the treatment was first discovered but usually within some reasonable time frame. At this point one of two things can happen 

        1. The reviewer decides the studies can’t be meaningfully combined, so writes a narrative review of the different stories they tell, and explains why any combination won’t work. 
        2. The reviewer decides the studies (or at least some of them) can be combined as if they were one big study. The process of doing this, and interpreting what comes out, is called meta-analysis. 

        Whether the studies can be combined will depend on the extent they can be considered iid with respect both to each other and the population, and that’s a big ask.  

        Same thing, different researchers, different methods, different results

        To solve this problem, we assume that better designed studies about the same thing are likely to resemble each other more, as there are always more ways of being wrong than right. So, when reading a meta-analysis the first thing to check is how they decided which studies to put in. A lot of scholarly activity has gone into designing good criteria, such as these. The chart of studies of homeopathy shown above indicates how much including studies of different quality can change results. We can also check statistically whether the studies are sufficiently similar to combine; the differences between them are called study heterogeneity.  Here’s a graphical example, which helps explain what it means. 

        The horizontal axis reports the percentage recovered in the control group, while the vertical axis does the same for those receiving treatment: this is a way of looking at BESD correlations. The diagonal line splits the chart into two halves. For anything in the upper triangle, treatment is more effective than control, while the reverse holds for the lower triangle. The studies form a reasonably tight group, suggesting similar BESDs, supporting the view they might be regarded as iid. The dashed line is the best straight line we can draw through  all the studies. We can see that the two big studies (the largest circles) have dragged the line firmly to the left. If they were both from the same group of researchers, a reviewer might want to look at them more closely. 
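        Heterogeneity can also be put into numbers. The sketch below uses two statistics commonly reported in meta-analyses, Cochran’s Q and I², which are my choice of illustration rather than anything used in the chart above. The effect sizes and standard errors are invented.

```python
def heterogeneity(effects, standard_errors):
    """Return (Q, I2) for a set of study effect sizes and standard errors."""
    weights = [1 / se ** 2 for se in standard_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Q: weighted squared distance of each study from the pooled effect.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I2: share of the variation beyond what chance alone would produce.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, i2

# Hypothetical effect sizes (SMD) and standard errors.
similar = heterogeneity([0.50, 0.55, 0.45], [0.1, 0.1, 0.1])
mixed = heterogeneity([0.10, 0.50, 0.90], [0.1, 0.1, 0.1])
print(similar, mixed)
```

        An I² near zero suggests the studies differ by no more than chance would predict. An I² near one says most of the scatter is real disagreement, and combining them becomes harder to justify.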

        Another source of bias is called the “file drawer” problem: studies that give negative results are less likely to be published.  One way of detecting this is by a funnel plot, as shown below. 

        The horizontal axis measures the effect size (SMD), as discussed above. The vertical axis measures the standard error of each study. The studies in the analysis are all plotted as points. The two straight lines forming an isosceles triangle (the funnel) mark the boundaries of the 95% confidence interval of the effect size for a given standard error. The apex of the triangle is set at the average effect size the analysis calculated. The central vertical line makes it easier to read.
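        The sides of the funnel are easy to sketch: for a given standard error, they sit roughly 1.96 standard errors either side of the average effect size. The studies and the pooled effect of 0.5 below are hypothetical, and the function names are mine.

```python
def funnel_limits(pooled_effect, standard_error, z=1.96):
    """95% limits expected for a study with this standard error."""
    half_width = z * standard_error
    return pooled_effect - half_width, pooled_effect + half_width

def outside_funnel(effect, standard_error, pooled_effect):
    """True if a study falls outside the funnel at its own standard error."""
    low, high = funnel_limits(pooled_effect, standard_error)
    return effect < low or effect > high

# Hypothetical studies as (effect size, standard error) pairs.
studies = [(0.45, 0.10), (0.95, 0.30), (1.40, 0.25), (0.20, 0.05)]
pooled = 0.5
flags = [outside_funnel(e, se, pooled) for e, se in studies]
print(flags)  # prints [False, False, True, True]
```

        By chance alone we would expect about one study in twenty to land outside the funnel, so it is the pattern of where points fall, rather than any single point, that matters.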

        This is from a real meta-analysis of talking therapies for adult depression.  From what we’ve covered above about effect sizes and standard errors, we’d expect the dots splattered pretty evenly around the divided triangle,  but that’s clearly not happened here. In this analysis, a positive effect size favours treatment over control. There’s a general trend for there to be more studies than expected with results greater than the average effect size, and there is a serious dearth of small studies (larger standard errors) with less than the average effect size. People aren’t publishing small studies with small effects, but are if they have large effects, and even large studies won’t get published if the effect is sufficiently small (or negative!). The average effect size calculated by the researchers has actually been corrected for this bias prior to the plot being drawn, which is why we can see the effect so clearly. 

        Meta-analyses usually use fairly standard methods of displaying their results, but we need another concept to make sense of them. It’s actually been here throughout the blog, lurking in the background. 

        Measuring uncertainty

        In reality, we usually don’t take multiple samples to measure our iid population, we take one. After all, we’ve decided it’s iid. So, we measure our mean, and know it’s got a standard error of σ/√N. As we’re measuring an iid population, we’re pretty confident this distribution is going to be normal, just like the population distribution we are measuring. In fact, there’s a helpful theorem, the central limit theorem, that states this will be so provided our sample is big enough, so we can prove it if we really have to. As we discussed above, we can interpret the horizontal axis of our distribution chart as measuring the probability of a particular value occurring, and therefore we can think about the distance between two values as the probability that the true value lies between them: that’s what we just did in the funnel plot. We call this distance our confidence interval, and it’s usually set to cover 95% of the distribution. Here’s how they work

        In both the charts, our sample means are shown by dots, with their 95% confidence intervals as the lines stretching out either side, called error bars. The first chart, a, shows our not amazingly successful attempts to sample the population distribution at the top. Two of our samples, indicated in red, have 95% confidence intervals that do not cross the population mean. This has actually happened by chance, and the width of our error bars tells us we have been trying to use small samples. As we have seen above, the smaller the sample, the harder it will be to keep it iid with its parent population, particularly one as scattered as this distribution. The relationship between our confidence intervals and sample size is shown in chart b, with the widths of our error bars decreasing as the sample sizes used to estimate them increase (yes, that’s what makes the funnel in a funnel plot). 
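        Both ideas can be sketched in a few lines. The interval is the sample mean plus or minus 1.96 standard errors, which is a normal approximation (for a sample as small as this one, a proper analysis would use a slightly wider multiplier). The data, the standard deviation of 10 and the sample sizes are all hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def confidence_interval_95(sample):
    """Approximate 95% CI for the mean: m plus or minus 1.96 standard errors."""
    m = mean(sample)
    se = stdev(sample) / sqrt(len(sample))
    return m - 1.96 * se, m + 1.96 * se

def ci_width(sd, n):
    """Width of the 95% interval for a given standard deviation and sample size."""
    return 2 * 1.96 * sd / sqrt(n)

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data
low, high = confidence_interval_95(sample)

# Quadrupling the sample size halves the width of the error bar.
widths = [ci_width(sd=10, n=n) for n in (25, 100, 400)]
print(round(low, 2), round(high, 2), [round(w, 2) for w in widths])
```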

        We can now say what meta-analysis is trying to do, and it’s gratifyingly simple. 

        Meta-analyses try to estimate both the mean and the 95% confidence interval of the treatment effect from the combined iid studies. 

        Let’s have a look at a similar plot (they’re called forest plots) from a real meta-analysis, also about homeopathy. 

        Everything is measured in standard errors. TE=Treatment Effect, seTE=Standard Error of Treatment Effect, CI=Confidence Interval

        As shrinking error bars with sample size make smaller studies more visible, the mean treatment effects of each study have been displayed as black squares whose area is proportional to the number in the study’s sample. The meta-analysis’ contribution is the set of three diamonds below each column of studies. The vertical points of the diamonds are aligned with the average treatment effect, while the horizontal points define its 95% confidence interval. We can see that this meta-analysis is making the same point as the previous one about homeopathy:  as our control of possible bias gets better, so the effect size reduces: we can now see that, for the best studies, we cannot be confident that there is any effect of homeopathy at all, as the top diamond crosses the “no effect” line. 
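        How a diamond might be calculated can be sketched with the simplest pooling method, an inverse-variance weighted average, in which precise studies count for more. I am assuming this method purely for illustration; the meta-analysis pictured may well have used something more elaborate. The TE and seTE values are invented.

```python
from math import sqrt

def pool_fixed_effect(effects, standard_errors):
    """Inverse-variance weighted mean effect with its 95% confidence interval."""
    weights = [1 / se ** 2 for se in standard_errors]
    pooled = sum(w * te for w, te in zip(weights, effects)) / sum(weights)
    pooled_se = sqrt(1 / sum(weights))
    return pooled, pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se

# Hypothetical treatment effects (TE) and standard errors (seTE).
te = [0.30, 0.10, -0.05]
se_te = [0.20, 0.10, 0.10]
pooled, low, high = pool_fixed_effect(te, se_te)

# Does the diamond cross the "no effect" line at zero?
crosses_no_effect = low <= 0 <= high
print(round(pooled, 3), round(low, 3), round(high, 3), crosses_no_effect)
```

        In this made-up example the pooled effect is slightly positive, but its interval includes zero, so we could not be confident there is any effect at all.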

        We can now put together a set of 7 questions to help us evaluate a meta-analysis. 

        1. Is the literature search to obtain the studies likely to have got them all? (failure invites bias)
        2. Are there enough similar studies to do a convincing meta-analysis?
        3. Have the studies been adequately screened for quality?
        4. Have they described how they addressed heterogeneity?
        5. Have they addressed publication bias?
        6. How big is the average effect size?
        7. Does its 95% confidence interval overlap the “no effect” line?

          As we can see, there is no guarantee a meta-analysis will always provide better treatment data than a well designed RCT. 

          Let’s start by looking at what meta-analyses can do. If we compare the results of the meta-analysis of talking therapies (effect size around 0.5) with the large non-randomised comparison of CBT and counselling discussed earlier, we can see that less than half of the overall treatment effect (1.2, remember) is down to efficacy (our definition of variance, discussed above, lets us claim things like that). That doesn’t mean the therapists should give up and go home: the non-specific effects that are doing so much to get the patients better may well be embedded in the service, and would also disappear if the service was removed. However, it does tell us that, if around half of everybody is getting better with therapy, offering even more therapy is unlikely to change things much. It also tells us that therapy should remain an option: checking our table above shows a 0.5 effect size is still valuable. 

          Now let’s think about what they can’t do. Combining studies, as we’ve just seen, can add a whole new layer of error, and lots of additional assumptions. Large, biased studies can distort their results, as can hidden studies with negative findings. Meanwhile, lots of small studies introduce noise and unpredictable extreme results, as the smaller they are, the harder it is for their comparison groups to be iid with respect to each other. Because of how meta-analysis works, these errors propagate through the calculations. This means that a meta-analysis can never equal a single well-designed RCT of equivalent size. However, very large RCTs have their own problems, like cost, so the current system of using both as appropriate seems best. In the end, they are just different ways of answering the same simple question: does giving a treatment make a measurable difference?

          Hunting the Snark: problems in defining the causes of psychiatric diagnoses 

          I’d guess that there are more contested causes to diagnosis in psychiatry than any other branch of medicine. This blog is going to argue that these challenges misinterpret the role of cause in our discipline, contribute to misunderstandings and stigma, and undermine, rather than advance, our knowledge and understanding.

          What is Cause Anyway?

          Aristotle: formidable polymath and tutor of Alexander the Great

          To see how slippery the idea of “cause” is, we need go no further than Aristotle’s famous thought experiment, to identify the cause of a statue. He came up with four

          1. The material cause is the marble the statue was made from
          2. The formal cause is the thing the statue represents
          3. The efficient cause is the sculptor who made it
          4. The final cause is the reason the statue was made

          Today, we’d probably see both the formal and material causes as characteristics instead, while the final cause is an intention. This leaves the efficient cause being closest to what we mean, though we’d probably be more comfortable saying that the sculptor made the sculpture, rather than s/he caused it. We’re already starting to find language is giving us problems, and things are about to get a lot worse.

          Hume’s limit to understanding causation

          David Hume. At the time, his ideas about cause were so revolutionary some thought he was joking

          David Hume managed to completely demolish the idea that our everyday intuition of causality had any credibility. He made two claims

          1. That causation could not be deduced by reasoning (goodbye Aristotle’s thought experiment) but was a property of things in themselves.
          2. That the reason we perceive a relationship as causal is due to our mental processes regarding the cause and effect. All we can assert about a cause and its effect is that they occur together: it is what Hume calls “custom” (in the sense of our response to finding cause and effect together repeatedly) that leads us to infer cause and effect.

          Hume thus sets the limit to understanding causation inside our heads. We can no more see what cause and effect really is than see the continuous electromagnetic spectrum we perceive as light. For the former we rely on our sense of custom (in Hume’s words) as we rely on our three colour receptors for the latter. In taking this stance, we are effectively asserting that identifying a cause and effect relationship is identical to predicting it. As this is easier to see in relation to a rainbow, I’ll use it as an illustration

          We can detect, with some accuracy, light of wavelengths between 400 and 700  nanometres, by noticing the light’s colour. As the chart shows, this is done by us comparing the relative activation of our three colour receptor types, which are tuned to peak at different wavelengths. Our internal experience of receptor activation thus predicts the wavelength that is hitting them.  Hume proposed that we predict cause from a similar attunement to customary repetition and, just like colour, we can’t get beyond that unaided.

          Thus, the determination of cause is based on our psychological ability (augmented as required), just like our colour sense.

          Our colour analogy does allow an Aristotelian to make an objection. If our understanding of cause is like our understanding of colour, have we picked the right cause? After all, colour would be part of an object’s formal cause, which we have already dismissed as a characteristic using modern language. It is surprisingly easy to confuse formal with efficient causes, e.g., Aquinas’ argument that the soul is the formal cause of the body is perilously close to the popular biopsychosocial model currently employed in psychiatry. However, I am going to summon Newton to show why this argument is wrong, and simultaneously offer a further way of understanding cause, which will help us enormously.

          Isaac Newton: founder of classical physics

          Let’s conduct a physics experiment. You might even have done it already.

          Newton’s cradle

          What happens is that when the raised ball strikes the others, the ball at the other end lifts off, with the movement reciprocating until friction makes it run out of steam. Even knowing no  physics, we’re happy to say that the first ball has caused the last ball to move, though we might struggle to say why. In everyday language, we’d probably say that the first ball made the last ball move. This language use is identical to that of the sculptor and the statue, so it would be perverse to deny that we are looking at efficient causation. Let’s see what happens when we try to understand the physics of what’s going on.

          At the moment the first ball (of mass m) strikes the second at velocity v, it possesses kinetic energy of

          (mv²)/2

          There are five identical balls in our cradle, so the next ball also has mass m.  At the moment of impact the ball can be regarded as stationary (because the strike is instantaneous). It is therefore exerting a force f, because it’s been decelerated (-a) from v to zero. The magnitude of this force (according to Newton’s second law of motion) is

          f = ma

          but Newton’s third law states that “every action has an equal and opposite reaction” so the second ball exerts an equal force, -f, on the first ball. The first ball thus stops, as the two forces cancel each other out. However, there’s still the additional energy (mv²)/2 in the system. The second ball is already in contact with the third, and thus the process is repeated until the last ball is reached. At this point, as -f is absent, the energy becomes expressed as (mv²)/2 once more, and the ball swings out at initial linear velocity v.  Because of gravity, the ball returns and the whole process repeats, this time in the opposite direction.
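          The same behaviour can be sketched numerically. What follows is an idealised model of my own rather than the force-based account above: it assumes perfectly elastic collisions, no friction, and that the impact passes down the line one adjacent pair at a time. The collision formula is the standard result of conserving momentum and kinetic energy; the mass and speed are illustrative.

```python
def elastic_collision(m1, v1, m2, v2):
    """Velocities after a 1-D perfectly elastic collision.

    Follows from conservation of momentum and of kinetic energy.
    """
    total = m1 + m2
    v1_after = ((m1 - m2) * v1 + 2 * m2 * v2) / total
    v2_after = ((m2 - m1) * v2 + 2 * m1 * v1) / total
    return v1_after, v2_after

def kinetic_energy(m, v):
    return m * v ** 2 / 2

m, v = 1.0, 2.0                        # illustrative mass (kg) and speed (m/s)
velocities = [v, 0.0, 0.0, 0.0, 0.0]   # first ball strikes four stationary ones
energy_before = sum(kinetic_energy(m, u) for u in velocities)

# Pass the impact down the line, one adjacent pair at a time.
for i in range(len(velocities) - 1):
    velocities[i], velocities[i + 1] = elastic_collision(
        m, velocities[i], m, velocities[i + 1])

energy_after = sum(kinetic_energy(m, u) for u in velocities)
print(velocities)                  # only the last ball is moving, at speed v
print(energy_before, energy_after)
```

          With equal masses each collision simply swaps the two velocities, so the first four balls end at rest and the last leaves at v, carrying the whole (mv²)/2.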

          This account completely explains the behaviour of Newton’s cradle. However, what we have before us is a detailed description, or mathematical model of a dynamic system, which predicts the system’s behaviour. In Aristotle’s model, this would also be a formal cause!  Also, the difference between our original intuition that “the first ball makes the last ball move” and Newton’s explication is one of the degree of prediction each is capable of. I am therefore going to reframe Hume’s concept of causality as follows.

          Our understanding of a cause and effect relationship is proportional to our ability to predict the dynamic system that the cause and its effect expresses.

          We can draw three conclusions from all of this.

          1. Causality is how we experience one of our faculties, so is a matter for psychology.
          2. As it involves prediction it should be a “more or less” rather than “yes or no” faculty. However, we experience knowing the cause as “yes or no”, frequently with an added “aha!” when we come across it unexpectedly. This is not unlike our judgment of colour, when we split light into different categories of hue.
          3. Our comprehension of cause is updatable as new evidence improves our predictive abilities.

          The Psychology of Causation

          Unsurprisingly, given its importance as a tool to predict the environment, causal reasoning about events develops young, and is detectable by 24 months.

          There’s some evidence that the brain processes causation about things and people differently.

          It certainly seems to be the case that reasoning about causes in things and people shows very different psychopathology.

          Consider the following statement, taken from real life.

          A patient noticed that a ketchup bottle was stood upside down. S/he thereupon knew that the wombs of all the women in Ireland had been turned inside out.

          This is called a delusional percept, and is a symptom of psychosis. Irrespective of whether the relationship implies a cause (who knows?) it’s clear that the process of customary, predictive association we need for cause perception has completely broken down.

          The picture is rather different when we try to work out causes of human behaviour because we have a separate Theory of Mind (TOM).

          A simple test of theory of mind

          While ordinary children have no trouble saying where Sally will look for her marble, some children with autistic spectrum disorders struggle with this. Specifically, while they have no trouble understanding that the marble has been moved to the box, they are unable to work out that nothing has happened to change Sally’s behaviour.

          This corresponds to our everyday intuition that persons are fundamentally different from  things. Causes can have effects on both things and persons, but the kinds of causes are radically different, and denoted by different vocabularies.

          Dilbert on treating persons like things and things like persons

          Applying Causes to Psychiatric Diagnoses

          As we’ve just shown, trying to ascertain causes of psychiatric disorders begs the question: are we to identify our causes with respect to persons, or things?  This, of course, sets the stage for anti-psychiatry.

          Anti-psychiatry’s original spokesman

          Anti-psychiatry is strongly wedded to the idea that the causes of diagnosis relate to our status as persons. Szasz, who considered that mental illness was a metaphor for human problems in living, tried to establish his argument by exclusion, claiming that, outside obviously “organic” syndromes like the dementias, no biological evidence to support or refute psychiatric diagnoses exists. Essentially, he denied a difference between most psychiatric disorders and malingering, and considered psychiatry was succeeding to religion and defining our moral order. The alternative causal model was to suggest that mental illness arose out of a disjunction between individual needs and societal constraints, and appropriate adjustments of the latter would either cause symptom remission, or permit successful reappraisal of the disorder as an alternative but valid way of living.

          What about treating psychiatric disorder as a thing? As long as it’s not a person, “thing” is entirely general. Even “personality”, despite the name, can be treated as no more than a collection of traits. It is the traditional approach within conventional medical psychiatry and, unlike Szasz’s claim, does not necessarily assume or require a definitive biological cause to be useful or valid, allowing cause to be approximate, to be determined later.

          Adolf Meyer, originator of the psychobiological approach to psychiatry

          Adolf Meyer called this approach psychobiology, but it implicitly included a social dimension (eg he pioneered the use of occupational therapy & child guidance) and so is equivalent to the biopsychosocial model used today.

          Our previous argument means this is actually a measurement question: which of our two causal faculties will be best at explicating causal models of psychiatric disorder? Entirely unsurprisingly, we need both, but need to be aware they model different things.

          Psychiatric disorder as a metaphor?

          Compare my screengrab of the meaning of the term with the delusional percept described above. The very fact we can deduce lots of possible, but equally unconvincing metaphors

          They’re sort of the same shape, and can gloop red stuff unexpectedly

          warns us we can’t reliably identify any sense in it.  Calling it a metaphor makes as much sense as claiming the fuzz on our TV at home is performance art, because we saw something similar once at Tate Modern. Meanwhile, the other strand of anti-psychiatry can readily be reformulated as a “strong environmentalist” position within conventional psychiatry: it’s saying we only need to understand how to change a patient’s social circumstances in order to achieve a successful outcome.

          Conversely, trying to work out how to help a patient achieve the best in life purely mechanically, without considering their self-awareness, understanding, motivation and ambitions, together with both their values and those of the society surrounding them, is doomed to failure.

          What the language of the previous paragraph illustrates is that, consistent with our model of twin causal faculties, the two are irreducible to each other.

          The causes of psychiatric diagnoses

          Having adopted Hume’s perspective on cause, it’s easy to say what causes psychiatric disorders.

          The causes of psychiatric disorders are the systems which lead to the associations between their component symptoms. 

          I agree this is rather like saying “the first ball hitting the second in Newton’s cradle makes the last one swing out.” However, even with this terribly feeble causal model, we can make some useful predictions.

          As we can reliably demonstrate these associations, Hume’s model lets us assert that a set of psychiatric diagnoses and their unknown causes are “out there”. To paraphrase a current anti-psychiatry slogan “complaints ain’t all there is” and to claim the contrary is to disappear down a solipsistic rabbit hole, never to emerge.

          Psychiatry has neither red nor blue pills, just methodology

          Next, we don’t have to be upset that our model has lots of missing pieces.  We can update our model as we go, and our sense of dissatisfaction is a useful motivator for us, but not a property of our cause-effect system. Conversely, claims by anti-psychiatrists to the effect that because a cause hasn’t been found, there isn’t one, are simply untrue. In mathematics, this is the difference between proving a problem is solvable, and finding the solution.

          Finally, it is very clear we are talking about things. Whatever causal model finally evolves to fully explain the association between mental symptoms will not be able to predict the meaning the diagnosis will have for the person. Quite simply, that’s not its job.

          We can now see where the anti-psychiatrists went wrong. They were, and are, correct to insist that the practice of psychiatry includes understanding patients as persons. But they were wrong to assert that the personal order of causality was sufficient to capture all diagnostic functions.  I’ve already blogged extensively about diagnosis, so I won’t rehearse what diagnosis is and isn’t here, but diagnosis as cause is probably the least important, and certainly the most disposable of its uses.

          Consequences of the Diagnosis Wars

          Anti-psychiatry went to war over diagnosis, for ethical reasons, and also over the validity of the systems then in use.  At the time, they were right to have such concerns, and they were also right to focus on the human dimension of causality.

          This was published in 1973

          It contains a “Mental Patients’ Bill of Rights” written from a US perspective, which I reproduce below

          If we look over the 15 rights, only one, the ability to always refuse involuntary admission, remains  unmet, at least in the UK. Equally, the demands for a represented right of appeal, and freedom from discrimination, are enshrined in UK mental health and equality legislation. The fact these needed listing said how bad things were, at least in the US, when I began my medical training, and the fact these ideas are now mainstream is vindication of anti-psychiatry’s focus on human well-being and rights.  It also reflects the power and influence of the movement.

          Unfortunately, their insistence that just the human order of causality was sufficient for psychiatry has done enormous damage.

          In the UK in 1968, strongly influenced by the sociological models which also informed anti-psychiatry, the Seebohm committee recommended the replacement of separate specialist strands of social work training with a single generic qualification. Many UK Social Workers are now better trained in anti-psychiatry than psychiatry, with only the most superficial knowledge of either the disorders their clients have, or the benefits and risks of the treatments they are receiving, when they graduate. Compare the syllabus just linked to with the tasks a mental health social worker actually performs and there is a significant gap, which social workers must make up, while lack of appropriate knowledge contributes to the maintenance of stigma.

          Szasz’s equation of psychiatric patients without biological causes with malingering strips patients with medically unexplained symptoms of their dignity, and the anti-psychiatry proposal that they should receive psychosocial understanding would be met by fury.  Unfortunately, the drumbeat of insistence that psychiatric disorders require biological validation to be true has obscured the fact that they are medical diagnoses, and, like all medical diagnoses, are pragmatic, so may not have an identified biological aetiology.  The confusion that results can lead to harm from inappropriate medical attitudes, investigations, and failure to accept effective treatments because they are deemed to be presuming an unacceptable causal model.

          The final mischief I will mention relates to personality disorder.  As personality disorders are the topic of a separate post, I will not repeat what I said there, but the failure of anti-psychiatry to recognise that disorders of personality may be diagnosed as disorders has left those with this diagnosis in a dreadful position.  If they receive the diagnosis, and do not realise it does not refer to them as persons (as the anti-psychiatrists deny the possibility of this), they have to choose between the stigma of being in some way a “faulty person”, or denial, leading to refusal of treatment until it becomes very difficult, or even too late.

          It is time the diagnosis wars halted.  Diagnosis has come a long way since anti-psychiatry raised its objections, which in consequence are no longer valid.  Meanwhile, the harm continues.  It would be superb if anti-psychiatry repurposed itself to address the human causal order, where it has already done so much good.

          The Science Behind Modern Psychiatric Diagnosis 

          In the 1950s, diagnosis and formulation were the results of very similar processes: informed opinion from trained experts. We now know that, for diagnosis, this is simply not good enough, and a whole industry dedicated to improving diagnosis has worked to totally transform it.  This blog is about the scientific principles underpinning that effort.  


          Without measurement, science is impossible. If we think of psychiatric diagnosis as our effort to measure mental symptoms, the ruler analogy above suggests we have a daunting challenge. Fortunately, there are more ways of measuring than that, as shown below 

          Yes, naming something is a form of measurement! The tape measure in our first picture illustrates the strongest form of measurement, the ratio scale. It’s called this because ratios make sense e.g., 10cm is double 5cm. This is because 0cm is an absolute zero. 

          Here’s an interval scale. 

          Looks very similar to a ratio scale, doesn’t it? However, for both the Fahrenheit and Celsius scales, while it makes sense to say that 20 degrees is ten degrees cooler than 30 degrees, it makes no sense to claim that 40 degrees is twice as hot as 20 degrees. To see why, let’s use the ratio scale for temperature, degrees above absolute zero (degrees Kelvin). 40 degrees C is 313 degrees K, while 20 degrees C is 293 degrees K. On the ratio scale, they’re practically the same. 
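          To make the point concrete, here is a small sketch of the conversion. The only fact it relies on is that kelvins are Celsius degrees plus 273.15; the variable names are my own.

```python
def celsius_to_kelvin(celsius):
    return celsius + 273.15

hot_c, cool_c = 40.0, 20.0

# Dividing Celsius values gives a number, but it has no physical meaning,
# because 0 degrees C is not an absolute zero.
naive_ratio = hot_c / cool_c

# On the Kelvin (ratio) scale the division is meaningful.
true_ratio = celsius_to_kelvin(hot_c) / celsius_to_kelvin(cool_c)

print(f"Celsius ratio: {naive_ratio:.2f}")
print(f"Kelvin ratio:  {true_ratio:.2f}")
```

          The apparent doubling shrinks to a difference of about 7% once a true zero is used.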

          Ordinal, or ranking, scales mean that while we can put things in order, we can’t say that the differences between them are constant 

          Simple naming does no more than define differences between things

          In this rainbow, it makes no sense to say that the colours we see are anything other than different from each other. Diagnosis is closest to this type of scaling. 

          The rainbow also shows an important issue with categories. While we might draw our rainbow with each colour clearly demarcated 

          In reality the colours fade into each other, with fuzzy boundaries. I’ll have more to say about that later, but for now just notice that, despite the fuzz, the colours are there in reality too. 

          The Curse of Dimensionality

          While dimensions have a starring role in the debates over diagnosis, the curse is actually upon statisticians, who have to manage the things. Before we begin, it’s important to realise that all measurements define dimensions. Even naming defines a dimension, albeit of a single unit with only two values. 


          The Izunt-Iz dimension, as measured by Ricky Gervais

          The problem arises as the number of dimensions needed to describe something accurately increases, as the graph below illustrates 

          “Classifier performance” means how good we are at identifying something correctly. There are no numbers on the tick-marks because the chart is entirely general: the peak could fall at any number of dimensions, though it will be lower if more dimensions are needed

          This might seem odd, as one would expect that more features would lead to better identification. Unfortunately this isn’t so

          Imagine we’re trying to pick out people with a psychiatric disorder (represented by dots inside the red shaded area) from everyone (all the dots). With the same unit of measurement for each characteristic (dimension), we can see that the number of people we can identify drops off dramatically as the number of dimensions we have to use increases. Of course, if our measure was perfect, that would be fine, but no measure is. The proportion of cases correctly identified by a measure is called its sensitivity, while the proportion of non-cases correctly identified is its specificity.  We call the proportion of cases in the population prevalence, the proportion of people the measure flags as cases who really are cases positive predictive value, and the proportion of people it flags as non-cases who really are non-cases negative predictive value.  This chart shows how they relate 
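          The relationship between these quantities can also be worked through in a few lines. This is a sketch using the standard definitions; the 90% sensitivity and specificity and the three prevalence values are made-up figures for illustration.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive value from test accuracy and prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)   # flagged as cases who really are cases
    npv = true_neg / (true_neg + false_neg)   # flagged as non-cases who really are not
    return ppv, npv

for prevalence in (0.01, 0.10, 0.50):
    ppv, npv = predictive_values(0.90, 0.90, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.2f}, NPV {npv:.2f}")
```

          Even a measure that is right 90% of the time produces mostly false alarms when the thing it is looking for is rare.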

          The curse of dimensionality thus means, that the more characteristics we use to describe a psychiatric disorder, the worse we will become at identifying it, if we do not at the same time dramatically improve our measurement ability. It’s therefore not surprising that complex psychosocial formulation is hopeless at this task. The goal therefore has to be to find the minimum number of characteristics that will identify a psychiatric disorder, which leads to the next section. 
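          A toy calculation shows how quickly this bites. It assumes, purely for illustration, that people are spread evenly along every dimension and that “being a case” means falling in the same half of each one; neither assumption comes from real data.

```python
def fraction_inside(side, n_dimensions):
    """Share of an evenly spread population inside a box covering `side` of each axis."""
    return side ** n_dimensions

for d in (1, 2, 3, 5, 10):
    print(f"{d:>2} dimensions: {fraction_inside(0.5, d):.4f}")
```

          With one dimension half the population falls inside the region; with ten, fewer than one person in a thousand does, so any fixed amount of measurement error swamps the signal.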

          Reliability and Validity

          At its very simplest, reliability is the chance of a result being the same if it is repeated, while validity is whether the measure captures what is intended to be measured. For our purposes though, it’s better to reframe them like this. 

          • Reliability is the random error associated with a measure 
          • Validity is the bias a measure might have. 

          If we think of a measure as an attempt to hit a target, this becomes clear 

          From the observer’s perspective, a high validity/low reliability condition is as bad as a low reliability/low validity one, because we can only see the arrowheads.
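          The target picture can be imitated with a short simulation. This is a sketch under my own assumptions: the true value is zero, each “arrow” is one measurement, bias shifts the whole cluster and noise spreads it out. The function name and the particular numbers are hypothetical.

```python
import random
import statistics

def shoot(bias, noise, n_arrows=1000, seed=1):
    """Simulated 1-D measurements of a target whose true value is 0."""
    rng = random.Random(seed)
    return [rng.gauss(bias, noise) for _ in range(n_arrows)]

conditions = {
    "valid and reliable":     shoot(bias=0.0, noise=0.2),
    "reliable but not valid": shoot(bias=2.0, noise=0.2),
    "valid but not reliable": shoot(bias=0.0, noise=2.0),
}

for name, arrows in conditions.items():
    systematic = statistics.mean(arrows)      # bias: how far off-centre on average
    random_error = statistics.stdev(arrows)   # scatter around that average
    print(f"{name:<24} bias {systematic:+.2f}  scatter {random_error:.2f}")
```

          Averaging many arrows recovers the centre in the “valid but not reliable” case, but any single arrow there tells us very little.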

          If we focus on a single measurement point (arrow) it becomes clear that a measure can never be more valid than it is reliable, though it can be less valid. This means that diagnosis must first be measured in terms of its reliability, before validity can be considered. Diagnosis has used two approaches to this: prototypes and operationalised criteria. 

          A standard poodle being judged for conformance to its prototype

          Prototypes are the conventional means of identifying species in biology: a typical example is kept for comparison, which is one of the main scholarly functions of natural history museums worldwide. This system remains recommended for use in the clinical identification of psychiatric disorders in the World Health Organisation’s system 

          The alternative approach is to use “operationalised criteria”.  There is a “strong” and a “weak” version of this approach. 

          • The strong approach defines a number of criteria which have to be met, usually from a larger total set (inclusion criteria) and criteria which must not be present (exclusion criteria) together with a specified method for identifying them. 
          • The weak approach has inclusion and exclusion criteria, but does not specify a method for assessing them. 

          The strong approach is largely used for research, when it is implemented by structured interviews.  These often allow assignation to more than one diagnostic system. The weaker version is employed in the American DSM5. 

              The reliability of both systems is extensively tested before release, and a huge literature covering the reliability of their different diagnoses exists. In general, reliability using structured interviews has been found to be better, but, with care and training, both the carefully specified prototypes of ICD-10 and the inclusion/exclusion criteria of DSM5 show sufficient reliability though, unsurprisingly, variation between different diagnoses exists. 

              Unfortunately, validity is altogether trickier than reliability because, while every failure of validity introduces bias, there are many ways that bias can be introduced. This leads to there being several kinds of validity. 

              • Face Validity is the best known type of validity. It simply means that the measure should seem to refer to its target. 
              • Content Validity requires a measure to cover all aspects of the target.  For example, a depression measure should include enough questions to cover all the ways depression can present. 
              • Predictive Validity requires the measure to be able to predict other characteristics of its target, not included in the measure.  These might include response to treatment, associated features, or prognosis. 
              • Criterion Validity means the measure should be able to detect some specified characteristic of its target.  For example, a depression screen should be able to recognise when there are enough symptoms to make a diagnosis. The curse of dimensionality means we want no more. 
              • Construct Validity is the extent to which the measure truly reflects the nature of the target. 
              • Convergent Validity is when the measure tracks another measure of known validity when measuring the same target. 
              • Divergent (Discriminant) Validity is when the measure gives a different result to another measure, known to measure something else, when used on the same target. 

              Which type of validity is important depends very much on the purpose of the measure. For example, it is currently thought brain imaging provides good construct validity for many disorders. However, for most of these, criterion validity has not been established, so it is not widely used for diagnosis except in a few conditions, such as dementias. 

              The prime value of a diagnosis lies in its predictive validity, because that tells us what to expect, what to prepare  for, and what treatments might work. It can be measured by correlating the measure with what it needs to predict, as this chart of different personnel assessment tools shows.   

              Predictive validity of different assessments of likely job performance
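              For readers who like to see the sum, here is a minimal sketch of correlating a measure with the thing it is meant to predict. The scores are entirely made up for illustration, and the Pearson correlation is written out by hand rather than taken from any particular package.

```python
import statistics

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Made-up scores for eight people: an assessment now and an outcome later.
assessment = [2, 4, 5, 5, 7, 8, 9, 10]
outcome    = [1, 3, 6, 4, 6, 9, 7, 10]

print(f"predictive validity (r) = {pearson_r(assessment, outcome):.2f}")
```

              A value near 1 would mean the assessment tracks the later outcome closely; a value near 0 would mean it tells us almost nothing.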

              Here’s a more immediately relevant example 

              Asterisks indicate that the correlation is significant

              I’ve chosen to use personality disorders, as these are highly contested diagnoses which I’ve blogged about before. Here, what it shows is that the association between an avoidant adult attachment style and either physical or psychological intimate partner violence is actually explained by the propensity of this attachment style to predict borderline and antisocial personality disorders. It is the presence of these two diagnoses that predicts intimate partner violence, not the avoidant attachment style itself. 

              Of course, to be used, a diagnosis also needs criterion and content validity, which leads to our next section. 

              Cutting the Rainbow

              Psychiatric diagnosis has three components 

              1. A set of signs and/or symptoms, defined as above. 
              2. An abnormality criterion: the diagnostic features should be developmentally and socially unexpected. 
              3. An impairment criterion: the diagnostic features should cause harm either to the patient or others, or both.

               Avoidant adult attachment style doesn’t make the cut as a diagnosis, because, alone, it doesn’t meet either criterion 2 or 3, even though it is a risk factor, as we have just seen. However, both these criteria beg an important question. How should we set our cut-offs?  After all, it’s pretty obvious that there are going to be borderline examples of both “abnormality” and “harm”. Only a little more thought is needed to apply the same boundary question to the symptomatic criteria also. We are no longer with our convenient rainbow cartoon, but the real thing, and need to tackle its fuzziness head on. 

              Latent traits and latent classes

              The normal curve. The percentages refer to the proportion of the population in the labelled segment of the curve

              Many things in our population either follow, or can be transformed to, a curve like the one above, where the unit of measurement of the thing we’re measuring is standard deviations from the mean score.  Long tradition has suggested that either a 5% or 2.5% cut-off works well in defining abnormality. If we think back to our discussion of the curse of dimensionality, to impose such a cut-off in addition to adding the extra dimension (or two) associated with the distributions of abnormality and impairment is pretty stringent, so being able to identify such a diagnosis reliably means it has passed a high bar, albeit we often cannot do more than guesstimate these. However, we can also tackle the issue directly. 
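              To see where those cut-offs sit on the curve, here is a small sketch using Python’s standard library. It assumes a standard normal distribution and a one-tailed cut-off (only the high end counts as abnormal), which is my reading rather than something stated above.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

for tail in (0.05, 0.025):
    # Score (in standard deviations) above which the given share of people fall.
    z = standard_normal.inv_cdf(1 - tail)
    print(f"top {tail:.1%} of the population lies above {z:.2f} standard deviations")
```

              The 2.5% cut-off falls just short of two standard deviations above the mean, which is why “two standard deviations” is such a common rule of thumb.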

              Finding hidden categories in continuous measurement

              This model outlines how it is possible to look for hidden categories if we are using continuous measurement. Here’s an example, where the classes are distinguished by different profiles.  

              The colours simply indicate how the questions reference different disorders

               If we assume (quite reasonably) that abnormality and impairment correlate with symptom count,  then, despite using continuous measurement, we can identify four distinct classes, including a group without sufficient symptoms to meet disorder criteria. Just like our rainbow’s colours, we can find evidence of separate categories of disorder.  Here, Borderline Personality Disorder (BPD) may be distinguished from both simple and complex presentations of Post-Traumatic Stress Disorder (PTSD). 
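               The general idea of pulling categories out of a continuous score can be shown with a toy. Real latent class or latent profile analysis fits a probabilistic model across many indicators at once; the sketch below is only a crude stand-in of my own, splitting a single made-up symptom count into two groups by repeatedly re-estimating the group averages (two-means clustering).

```python
import statistics

def two_means(scores, iterations=20):
    """Split 1-D scores into two groups by repeatedly re-estimating group means."""
    low_centre, high_centre = min(scores), max(scores)
    for _ in range(iterations):
        # Assign each score to whichever centre it is closer to.
        low = [s for s in scores if abs(s - low_centre) <= abs(s - high_centre)]
        high = [s for s in scores if abs(s - low_centre) > abs(s - high_centre)]
        low_centre, high_centre = statistics.mean(low), statistics.mean(high)
    return low, high

# Made-up symptom counts for twelve people.
symptom_counts = [0, 1, 1, 2, 2, 3, 3, 9, 10, 10, 11, 12]
low, high = two_means(symptom_counts)
print("few symptoms: ", low)
print("many symptoms:", high)
```

               Nothing told the procedure where the boundary was; the gap in the scores did the work, much as the classes in the chart emerge from the profiles rather than from a pre-set threshold.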

              We can do the same trick the other way round. 

              “Indicators” can include one or more diagnoses


              In this example, the finding of a single latent trait covering both dependence and abuse led to a recommendation to combine these into a single category in DSM5. 

              Two categories (dependence and abuse) sharing a single latent trait

                To understand what’s happening with these two examples, we need to go back to why we see a rainbow the way we do. Light is, of course, a continuous electromagnetic spectrum. However, we detect this using only three detectors 

              As the diagram shows, our rainbow arises because we model the continuous spectrum by different levels of excitation of these three receptor types. This is effectively a kind of latent trait analysis. Conversely, neuronal measurement of different receptors allows us to deduce the latent classes also present in our rainbow, as well as precisely modelling the wavelength they receive. Even though the receptors represent latent classes, when combined they provide enough predictive validity to let us model the entire spectrum of visible light. In diagnosis we have begun to do something similar, with increasing use of the concept of comorbidity, while the term “spectrum” is now formally applied to autistic disorders in DSM5. Unlike the rainbow, in diagnosis we frequently cannot be sure whether dimensions or categories have better construct validity. However, as our primary goal is to establish predictive validity, the science allows us to see that diagnostic categories and dimensions (with cut-offs) may be interchanged, so our modelling may be fit for the purpose we intend. 

              We can also say that the diagnoses we now use are no longer expert guesstimates, but reliable and valid categories that are backed by good science.  They will continue to evolve as our ability to measure mental symptoms improves.  


              In Defence of the Medical Model in Psychiatry 

              The Medical Zombie

              Even zombies stop for a selfie

              To listen to most commentators, the medical model lumbers around the mental health landscape like some kind of zombie. It’s dead (or at least out of date), bits of it are always being refuted, we should run screaming when it appears, and if we let it, it will eat our brains, leaving us mindless husks. Even philosophers, who should know better, criticise it in passing without clearly saying what it is, leaving us to guess its evil dimensions from their own prose about what needs changing. 

              The Medical model is usually defined by contrast with something better

              So, this post is going to introduce us to the medical model as it really is. We shall see that it is nothing like what the negative accounts suggest: indeed, many of the “improvements” suggested are actually parts of the model. But, before I say what it is, I need to make clear what it isn’t, which is unfortunately how most commentators treat it. 

              The Medical Model isn’t a Concept

              Think back to when you learned to drive a car (or if you never have, imagine it). The instructor tells you to put your hands here, your feet there, pull this lever in this direction, adjust your feet on these pedals like so, and you’re moving. What you have learned is a procedure, and a procedure is profoundly different from a concept. 

              A centipede discovering the difference between concepts and procedures

              Students learn medicine the way we learn to drive cars.  Our examinations, diagnoses and treatments are expressions of these processes. The brain encodes processes so differently from concepts that we give different terms to the memory systems used to store them.

              The many kinds of memory

              Concepts end up in explicit memory, while procedures are stored in implicit memory. To understand the difference, think back to the car driving example. Your explicit memory could probably have told you what you needed to know about the controls almost immediately. However, it took weeks of practice before your implicit memory could reproduce the necessary movements sufficiently reliably for you to pass a test.

              The Medical Model: everything above the water is concept, everything below is procedure. The ship is full of philosophers

              This is of course why all branches of medicine, including psychiatry, have practical as well as theoretical examination. So, criticisms of the medical model on theoretical (i.e., conceptual) grounds are missing what the medical model is about. 

              The Medical Model is a Skill-Set

              Precisely because skills involve procedures, they can be hard to define. The best definition I could find for our purposes turned up, perhaps unsurprisingly, in a dictionary of business terms 

              The basic medical skills

              After diagnostic skills, which, as I discuss in another blog post, date back to ancient Egypt, practice skills are the oldest component of the medical model. They were first set out, sometime between the fifth and fourth centuries BCE, in the famous oath that is entirely incorrectly attributed to Hippocrates (to give it added force). 

              Despite the religious preamble, it’s obvious that we’re looking at a contract. The doctor has committed s/himself to practice in certain specific ways. Ethics are an integral part, but by no means all, of what the Oath covers. 

              The first paragraph, by far the longest, commits the novice practitioner to support and help maintain his teacher’s practice. 

              The second, a promise to “use diets”, reflects the practice of medicine at the time: diet (which actually referred to a combination of recommended food, exercise and sexual activity) was the preferred intervention, to be adjusted according to the patient’s state of health. Ancient Greek medicine had a spectrum between drugs and food, so this recommendation did not exclude the use of drugs as part of a therapeutic regimen. “Injustice” here refers to the doctor’s own judgment, so this is a guarantee of quality (which would of course also reflect favourably on the trainer). 

              The third paragraph says a lot about time-specific ethics (clearly abortifacients were as controversial then as now) but there is a key ethical guarantee: a doctor may not provide what is asked for, if it is harmful. There is also a general requirement for good ethical standards.

              The fourth paragraph promises not to claim untrained expertise, even if the problem is understood and the need is urgent. Procedural knowledge trumps conceptual knowledge. 

              The final two paragraphs introduce the ideas of sexual continence and confidentiality in relation to practice. 

              Updating the language and, mutatis mutandis, the skills’ descriptions, we can now define the basic skill-set of the Medical Model

              1. When practising, a doctor must deploy s/his best training to s/his best ability 
              2. A doctor must act according to s/his best judgment, to optimise the benefit/harm ratio for the patient. 
              3. A doctor may not act on requests that, in the doctor’s estimation, will hurt s/his patients. 
              4. A doctor will not do things s/he cannot do in practice, even if s/he understands the theory, so will have a good understanding of the limitations of s/his skills. 
              5. A doctor must regulate s/his own behaviour  to exclude sexual relationships with patients, ensure confidentiality, and live to high ethical standards.  

              Having set this out, what is so surprising is the longevity of the model. These principles, with some additions, still remain at the heart of modern medical practice, and remain standards doctors are judged by. Psychiatry is a branch of medicine, and the doctors who practise it, called psychiatrists, must adapt this basic skill-set to the needs of their patients. 

              Applying the Medical Model’s Skill-Set to Psychiatry

              The first rule of adapting the model is that the basic rules haven’t changed. As we no longer live in Ancient Greece, let’s switch to the up-to-date version. The British General Medical Council captures it under four headings. 

              1. Knowledge, skills and performance 
              2. Safety and quality 
              3. Communication, partnership & teamwork 
              4. Maintaining trust 

              We can see that what’s changed since Ancient Greece is mostly under heading 3, where stuff like teamwork and openness sit, consistent with our much more democratic and complex society. These days, consent, not mentioned in the Oath (as the arrangement was commercial, consent was implied), is under 4. 

              The first thing to notice is how general the model is, regarding the range of knowledge and skills it can use. If a profound knowledge of literature, or the ability to dance superbly, improved our patients’ conditions, then we would be expected to have those skills. However, the model does expect us to be judicious and competent, which are pre-requisites for trust, safety and quality. What does this mean in psychiatry?


              Not all judgment in medicine is medical

              Judgment is possibly the most important of all our medical skills. It is nothing to do with justice, but refers to our ability to make distinctions, so we can do different things to help our patients under different conditions. This is what diagnosis is for, as it has been since Ancient Egypt.  I have already blogged about how doctors use diagnosis, and that it may be used differently by other professions, so all I will say here is that diagnoses work as aids to our medical judgment. Provided the judgments reliably lead to ways in which we can help our patients, the Medical Model is entirely agnostic on how true they are.  However, the medical ability to diagnose is a procedure, which takes years to acquire. Without that procedural knowledge, which is what leads to treatment choices and prognostic judgments, our understanding of the meaning of diagnosis is incomplete. It is important that psychiatric diagnoses are not completely based in language, as they may refer to conditions that may not be appropriately described linguistically: a label may be the best we can do with words. We can see that, from this perspective, there is nothing reductive or restrictive in the use of diagnosis: if the current one doesn’t fit, we can change it or develop a new one, provided we are competent to do so.


              I guess we can think of competence as a kind of meta-skill: it says how good we are at the skills we claim. What are psychiatrists expected to be competent at?

              One thing that makes the Medical Model medical is the centrality of good ethical practice. Like everything else, moral behaviour is something we need to learn, and psychiatric ethics presents us with some of the most challenging problems in all of medicine. Psychiatry therefore explicitly includes lifelong ethical training, which is both elaborately systematised and constantly developing. This is consistent with research findings suggesting psychiatrists are at low risk for malpractice claims, compared to other medical specialities. The good ethical care psychiatrists give their patients arises directly out of the Medical Model’s requirement that ethical skills should not be distinguished from technical skills, combined with recognition of the special ethical problems that psychiatry presents over issues such as consent, meaning a higher level of ethical competence is necessary. 

              Of course, psychiatrists need technical skills too. These can be broadly divided into 

              1. Assessment skills. These come into play as soon as patients are referred or seen, and are required throughout the psychiatrist’s involvement. Diagnostic skills are probably the best-known of  these, but are by no means the only ones, as the psychiatrist must also assess how the diagnosis affects the patient’s life, what the impact of different treatments is likely to be, and how the patient responds. Without all these assessments the psychiatrist cannot know that the benefit/harm ratio (there is no such thing as a risk-free treatment) is correct. 
              2. Treatment skills. A psychiatrist must be capable of selecting the best treatment (which might be none at all); providing, either directly or indirectly, the recommended treatment or the best available, and adjusting or changing it according to the patient’s changing needs.
              3. Boundary skills. These are rarely mentioned, but are crucial for identifying when the limitations of the psychiatrist’s expertise are reached. An example might be the ability to recognise a psychiatric presentation of a physical disorder.  

              Notice that the model says nothing about what assessments or treatments should be used: the constraints arise from the requirement for competence, as we are clearly being incompetent if we choose an inappropriate treatment or assessment. 

              By now, it should be obvious that saying things like “the medical model is excessively biological”, “the medical model puts people in boxes”  or their various less flattering synonyms is paying attention to only those parts of the model that are visible as concepts, without also reflecting on the procedures which engage them, without which they cannot properly be understood. It’s time to join the dots.

              Working with a Living Fossil

              The Medical Model is very old, probably much older than the Oath we used as its starting point. Even that recently, concepts weren’t abstracted the way they are now. 

              Virtues are descriptors of people, so directly observable, not deduced concepts

              What the ancients did understand was tools, and so medicine, of course, has always used tools. We have no problem recognising surgical tools

              But here’s a picture of a modern psychiatric tool 

              which, like any good tool, is subject to redesign and improvement over time. Of course, tools are only one part of a system, which requires skill to use properly 

              As the image above suggests, we are back to my first blog, which is about how to use our tools to do the job we intend. 

              Another tool is even more important 

              The Literature, before it went digital

               If we refer back to the Oath, we can see that, once we have faithfully followed its precepts, we are not supposed to modify our therapeutic stance for harmful requests. These days, it’s more about effective communication and teamwork, as the GMC mentions in its third principle. That, however, does not relieve us of our duty to ensure we are indeed doing our best with our training for our patients. Evidence is the tool we should use to convince both ourselves and our colleagues: this is especially important if our approach is contested. This is not just theory: psychiatric services typically use evidence in practice, to a similar degree to physical medicine, where the use of the medical model is uncontested. Also, despite the handicap of poorer understanding of many psychiatric disorders, psychiatric drug treatment stands comparison with many accepted drug treatments for physical conditions. This does not preclude the use of non-physical treatments, either instead of, or in addition to, the drug treatment I’ve just discussed. Interrogation of the literature for clinical use is also a skill, demanding additional training.

              So, from inside the medical model, diagnosis, the various explanatory models of disorders, treatments, their choices and the evidence which supports them all, are simply tools to be used for the benefit of our patients. This of course does not mean that any idea or action is as good as any other, for without corroborative evidence these are no more than engaging stories or possibilities that cannot offer guidance. The medical model is embedded in empiricism, not theory, and has been so for 3,600 years.

              Using Tools Without Training

              People who aren’t surgeons generally don’t buy scalpels. Psychologists jealously guard access to many of their tests, for fear of misuse. However, many psychiatric scales, and the core diagnostic manuals, are “out there”, to be used by whoever picks them up. If an untrained person picks up a scalpel, it will still cut, just as an IQ test will provide a score, and a diagnostic manual may offer a diagnosis. But, the outcome can be as different as using amphetamine as a drug of  abuse, and as a treatment for ADHD. 

              What could possibly go wrong?

               So, without the procedural knowledge to employ them correctly, it is easy to raise concerns about how the tools of the medical model might be misused, or unwittingly misuse them oneself. Suspicion and mistrust are likely to follow, as the outcomes do not live up to expectations, and, unlike concepts, it can be hard to know what is not understood. Before you started to learn to drive, could you understand why it would be so hard?

              What are psychiatrists for?

              A psychiatrist isn’t there to “give a diagnosis”, though you might get one. They aren’t there to “offer medication”, though that might happen. They aren’t there to promote a “biological model” however you conceive it, though they may offer one as an explanation. A psychiatrist is there to do the same as any doctor since as far back as history can remember: use the medical model for your benefit. We have now seen that this means honestly and fearlessly exercising their skills and knowledge on your behalf, if necessary in collaboration with others, and without ideological limitation. It might be incredibly old, but I don’t think it’s reached its sell-by date yet. 

              Personality and its Disorders 

              For those who find the image below distressing, I’ve explained my choice at the end. 

              Last winter, I had to diagnose a young woman with an eating disorder as also having a Borderline Personality Disorder (aka Emotionally Unstable Personality Disorder). A capable researcher, she had googled her own symptoms, so was unsurprised, but despairing. By the time we had discussed current views in treatment and prognosis, we both had tears of relief in our eyes; hers because my take on her diagnosis gave new hope of recovery, mine because I was able to overcome the incorrect stigma the diagnosis carries. This blog is about trying to strip that stigma from these unfortunately named diagnoses, so that they can be used better. 

              Personality as Our Soul

              Almost no-one reared in a Christian environment will have trouble interpreting this image: folks struggling to get to heaven, encouraged by angels and saints, but some being dragged to their doom by pesky demons. We know they’re not people’s physical bodies, but souls, as they look the same at the top of the ladder (heaven) as they do at the bottom (earth): there is no sign of them leaving a physical body. We also know that the demons are being fair in their choices and actions, or the saints and angels, let alone Jesus, would be intervening. When we look at the souls, we can see that they still have the characteristics and identities of the living people they once were. Though other faiths take different views, the Christian conception of the soul is thus very similar to our everyday understanding of personality. In Western philosophy, personality was a metaphysical concept, synonymous with moral character, which only recently acquired an empirical dimension. This concept can also be found in law, with new offenders having been said to have “lost their good character”, which in turn affects their ability to access certain societal benefits, e.g., it can bar immigration, and restrict jury service. I am therefore going to suggest a rather strange everyday interpretation of personality, which will however be very useful in understanding why “personality disorder” gets under so many people’s skins, and which I think captures the moral nuances of the term.  

              Personality encompasses those aspects of ourselves about which we make moral judgments

              From this perspective, a diagnosis of  “personality disorder” carries within it a potential negative moral judgment. 

              Personality as a psychological construct.

               Let’s now take a different perspective and definition. Here’s the currently agreed psychological one 

              Personality refers to individual differences in characteristic patterns of thinking, feeling and behaving. The study of personality focuses on two broad areas: One is understanding individual differences in particular personality characteristics, such as sociability or irritability. The other is understanding how the various parts of a person come together as a whole.

              Our technical definition has completely removed the ethical dimension apparent in our everyday approach. Instead of our personalities being something metaphysical, they are simply either a class of individual differences, or an estimate of how our various characteristics integrate with each other. This makes personality disorders no more than a subset of all psychiatric disorders, referring to some disabling disturbance in these characteristics. 

              However, despite these differences, the two definitions can overlap on at least some of the same qualities. For example, “trustworthiness” is a quality on which individuals may differ, and which has a clear moral valence. 

              Where we go from here depends very much on the assumptions we make on mind and brain. If we assume that the mind is in some way non-physical, then we have no difficulty: we simply assert that the ethical dimension of personality belongs to the non-physical part of mind, and is separable from psychiatric disorders, which reflect brain disturbance. Of course, that gives us other problems, which I’ve discussed in a previous blog post on this site. 

              If, however, we do accept that mind is simply how the brain organises part of itself, then we have to admit the possibility of psychiatric disorders existing which will attract negative moral judgments, even though we agree that psychiatric disorder should not be subject to such judgments. This is exactly the cleft stick we find ourselves in with personality disorders. 

              Personality Disorders as Psychiatric Diagnoses which Attract Negative Ethical Evaluations

              It was not so long ago that all psychiatric disorders were morally connoted. The combination of early developments in genetics with hybrid terms such as “degeneracy” (implying both physical and moral decay within or across the generations) led to possibly the worst ever failure of the medical model: eugenics, which still casts its shadow over biological theories of mental illness. 

              Why eugenics is a bad thing: Nazi-style ideology in Oregon in the 1920s

              We now know that eugenics was genetically as well as morally misguided, but does that mean that there are no biological failures of “moral character”?  

              The strange story of gambling. 

              Curiously, for so enduring a vice, gambling (unlike greed) isn’t mentioned in the Christian Bible, though it does make it into the Koran. Excessive indulgence in it has been correctly associated with the complete destruction of family fortunes 

              The 7th Duke of Leinster, who gambled away the fortune of one of Ireland’s wealthiest families

              Historically, it has also been associated with companion vices of promiscuity and intoxication, making it a fine topic for instructive paintings

              However, there is another side to this story. 

              Parkinson’s disease is a neurological condition, named after the doctor who first described it, which induces tremor, interferes with movement, and can impose mental, as well as physical inflexibility, with dementia as a severe consequence.

              Its mechanism is reasonably well understood 

              Insufficient dopamine produced by the substantia nigra

              and it’s long been treated, with some success, with drugs that increase dopamine levels, most famously dopamine’s metabolic precursor, L-DOPA (called levodopa when prescribed). 

              If we look on its list of side effects, we find 

              Drug induced moral turpitude?

               It turns out that dopamine does more than let us move properly. It also is the major neurotransmitter for the brain’s reward system, amongst much else. 

              ACC Anterior Cingulate Cortex; PFC Prefrontal Cortex; NAcc Nucleus Accumbens; HC Hippocampal Complex; VTA Ventral Tegmental Area

               The key bit that concerns us here is the Nucleus Accumbens, falsely called the brain’s “pleasure centre”; it’s probably better described as the brain’s encouragement centre. The relationship between it and dopamine can be summed up as

              Anything that puts up dopamine in the Nucleus Accumbens is something we want to do more of, and the more we do it the more dopamine levels there will rise. 

              To show this, here’s what happens to our brains when we gamble 

              Yellow indicates raised dopamine levels

              If we compare this picture with the map of the dopamine system above, we can see the Nucleus Accumbens is highlighted. The L-DOPA story shows that the same relationship can also work in the opposite direction. It’s also been found that the effect occurs when particular genes encoding a particular type of dopamine receptor, DRD4, are present. These last two studies were not done on folk with pathological gambling, so we are talking about ordinary genetic variation in ordinary brains.  Our worst fears are realised: moral behaviour is just as dependent on brain states as anything else we do. If so, then impaired mental health could disrupt our moral functioning, and not just as a result of being cut off from reality. 

              Mental health and moral responsibility in society 

              Our everyday notion of moral responsibility assumes freedom of will, and the latter seems necessary for retributive justice. However, brain states are about anatomy and biochemistry: things determined and irrelevant to “will”.   One could argue that this, as much as religion, has encouraged a dualist approach to mind: our brain is the horse, but we are the rider, and while it might throw us from time to time, we are still responsible for what we make it do. 

              The exercise of will against inclination

              This lets us try to judge whether the brain has thrown its rider, or whether the unacceptable conduct was the rider’s decision.  However, as we have already assumed that states of mind are no more than expressions of brain states, we have to reject this as a convenient fiction. 

              Fortunately, we don’t have to mire ourselves in the intricacies of the relationship between moral responsibility and freedom of will. Instead, we may simply claim that it wouldn’t be fair to treat differently functioning brains the same way. If we build on my previous blog about brain-mind identity, and assume that diagnoses are imperfect but useful indicators of systematic and impairing differences in brain function, then diagnosis may be used to guide us.  

              Let’s start with our formal statement of the identity hypothesis, as developed in that blog.  

              “For every state of mind (∀M), any individual state (Mi) can be mapped to a particular state of brain (Bi), contingent on that brain’s characteristics (Vi)” 

              In symbols, we write 

              ∀M(Mi ≡ Bi) | Vi

              Let us assume, with English law, that criminal (or vicious, it doesn’t matter which in this context) conduct requires both an evil intention and its related action. All our vicious and evil intentions (let’s call them wicked) {W} are part of {M}, so, allowing someone to be anything up to totally vicious and evil, {W} ⊆ {M}.  Furthermore, our definition allows us to assert that a wicked intention includes the wicked action in terms of brain states, otherwise it wouldn’t have been wicked (because we would have rejected it and done something different). This enables us to write, for a wicked intention/action

              Wi ≡ Bi | Vi

              Remember, Vi is the relevant brain condition, i.e., the brain organisation that makes Bi possible. It therefore follows that {Vi} includes the brain state associated with any relevant diagnoses {Δi}, which in symbols is {Δi} ⊆ {Vi}. Also, the relationship between Vi and Bi is one of conditionality, not causality.

              Unfortunately, neither Vi nor Δi is directly accessible to us, so we have to make do with the admittedly imperfect proxy of descriptive diagnosis itself, Di.  Because it’s the best we have, we write 

              Wi ≅ Bi | Di
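Collected in one place, the steps so far can be typeset as below. This is only a restatement of the symbols already introduced, with the vertical bar read as “conditional on” rather than “caused by”, and the last line being the approximate, proxy version that uses the descriptive diagnosis.

```latex
\begin{align*}
  &\forall M\,(M_i \equiv B_i) \mid V_i
     && \text{identity hypothesis} \\
  &\{W\} \subseteq \{M\}
     && \text{wicked intentions are states of mind} \\
  &W_i \equiv B_i \mid V_i
     && \text{a wicked intention/action as a brain state} \\
  &\{\Delta_i\} \subseteq \{V_i\}
     && \text{diagnosis-related brain states are among the conditions} \\
  &W_i \cong B_i \mid D_i
     && \text{descriptive diagnosis as an imperfect proxy}
\end{align*}
```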

              This means that no diagnosis can be held to cause a wicked act. To see the implications of this in action, let’s look at something that used to be thought wicked, but is now more accepted: suicide. People may choose to take their own life for a range of reasons: we also know that suicidal intent is one of the most dangerous symptoms of depression.  However, it makes no sense to claim that what we normally understand an intention to be can also be a symptom, as a symptom is no more than an expression of a pathological brain state. It would be like saying that snow or interference in the picture of a badly tuned TV was part of the programme. 

              Part of the picture, but not the programme: how symptoms affect our states of mind

              This means that, if we decide someone’s suicidal intent is a symptom of depression, it is pointless to debate whether they “really want” to do it, any more than someone “really wants” to have a headache.  It’s there in the same way that the headache is.  As wickedness requires both act and intention, we can assert that the suicide was a fatal outcome of depression’s brain state, so not wicked, irrespective of our views of suicide otherwise. Why have I said “outcome” and not simply claimed that depression caused suicide? Because it hasn’t, as our symbol-writing has shown. The correct term for what’s happened is “moderation”, as I’ve explained in a previous post on this site.  Let’s look at what all this means for how we should treat people with psychiatric disorders in general, because that’s what we’re discussing right now.  

              1. People should be held to account for Wi ≡ Bi.  
              2. How they should be held to account should be influenced by Di.

              This seems to fit comfortably with current approaches to forensic mental health, so is unlikely to be far wrong. 

              What our model has also shown is that, once we accept the admittedly uncomfortable idea that our ethics simply reflect a set of brain states (which we possess for excellent reasons) and can therefore become disordered like any other brain state: – 

              1. There are no grounds for awarding a different moral status to those with personality disorders, from those with any other disorder. 
              2. Equally, the nature of the cause of the disorder, be it trauma, deprivation or genetic variability, makes no difference to the disorder’s moral status, because no disorder can have one. 

              Some may well recognise this as being one way of stating the principle of Parity of Esteem.  As the brain is an organ of the body, we should no more morally evaluate disorders of the brain than disorders of the liver. 

              Understanding the symptoms and signs of personality disorder

              Let’s see what happens if we try to make sense of personality disorders as just another kind of psychiatric disorder.

              Currently, personality is described in terms of 5 overarching qualities, easily remembered if we use the acronym OCEAN

              1. Openness 
              2. Conscientiousness
              3. Extraversion 
              4. Agreeableness 
              5. Neuroticism 

              Correlating personality disorders with personality dimensions

              As the table above shows, though there is inevitably some variation, the various  personality disorders have been shown to relate to the various dimensions of personality across a large number of studies. So, calling them all “personality disorders” isn’t too bad a description. However, it’s important not to overinterpret what this means.  Here’s a picture illustrating how even the strongest associations reported in the table are pretty fuzzy.  

              Visual representation of strength of association between variables reported as correlations
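One way to put a number on that fuzziness is to square the correlation, which gives the proportion of variance two measures share. The sketch below does just that; the function name and the example values are hypothetical and are not taken from the table above.

```python
def shared_variance(r):
    """Proportion of variance two variables share, given their correlation r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    return r * r


# Hypothetical correlations chosen for illustration, not values from the table.
for r in (0.2, 0.4, 0.6):
    print(f"r = {r:.1f}: shared variance = {shared_variance(r):.0%}")
```

Even a correlation as high as 0.6 leaves nearly two thirds of the variation in one measure unaccounted for by the other, which is why a strong association is still a long way from an identity.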

              Also, conditions that are not classed as personality disorders may be associated with the dimensions, e.g.,  anxiety disorders and Neuroticism. It might be better to understand them as (among others) “disorders which affect personality”, particularly if we wish to remind ourselves that we are wanting to denote brain-states. 

              However, as I’ve argued previously on this site, the value of diagnoses for clinicians and patients lies in their predictive validity, which is how good they are at letting us know what to expect from them, and what will best work to ameliorate their impact.

              Borderline Personality Disorder is a good example to take. I’ve already mentioned in the introduction that it can be successfully treated. Here are its symptoms 

              No-one wants to go through life in that way, so being able to reliably identify it, and thereby discover what’s needed to prevent it as well as treat it, would be good. In fact, it can be identified very reliably indeed, and its epidemiology can be explored like any other psychiatric disorder; nothing special is required. 

              We can also go a bit further, and visualise some of Δi.

              Meta-analysis of differences in amount of grey matter (HC = Healthy Controls; BPD = Borderline Personality Disorder)

              For BPD at least, our model fits, and this is the commonest personality disorder presenting in psychiatric clinics. 

              Denial of Personality Disorder is Unethical

              Not so long ago, we thought that the best way to stamp out racism was to become “colour-blind” and simply enforce a rule that black skin tones meant nothing. We found it didn’t work. 

              • Thanks to previous discrimination, black people had inequality of access to qualifying characteristics for rewarding roles in Western society. 
              • Black skin tone reflected a different cultural identity & different physical needs, from haircare to health risks, none of which could be accommodated in a colour-blind approach. 

              The denial of personality disorder as a diagnosis has identical effects to the colour-blind approach to racism, and does at least as much harm. 

              Let’s do the theory first. Personality Disorders are simply a subset of {Di}, which means that, conditional upon the diagnosis, their symptoms shouldn’t be subject to moral censure. However, we have already seen that, in the everyday theory of personality, their symptoms are exactly those characteristics which are likely to lead to moral judgments. So, in the absence of a diagnosis, we will assume that the person is culpable in the same way as anyone else, which we have already argued is unfair. 

              Instead of being symptoms, the overweening arrogance of narcissistic personality disorder, the dependency and unreliable emotional expression of BPD, and the aggression of antisocial personality disorder become invalidating moral defects, leading us to avoid, criticise or punish the sufferer, rather than helping them overcome their disorder. Sadly, this view also holds sway amongst some ill-informed (and sometimes would-be) professionals, expressed in the view that “personality disorder isn’t a psychiatric disorder”. For example, it is currently fashionable among some psychiatrists and psychologists to claim that President Donald Trump has a narcissistic personality disorder. However, it is clear that this is deliberately using the stigmatising power of the term for political ends. This abuse arises precisely because these psychiatrists and psychologists are blurring the distinction between everyday and technical definitions of personality and its disorders, so hiding the distinction between the brain state associated with narcissistic personality disorder Δi, and his inflammatory pronouncements (Wi ⊂ Mi) ≡ Bi. This is not a diagnosis, but an insult: the diagnosis is being recruited as a synonym for ordinary wickedness, and its separate validity denied in consequence. This is also why proper assessment (which was not conducted by Trump’s accusers) is essential for all psychiatric disorders; it is what lets us distinguish between Mi (or Wi) and Di in the first place. 

              Would these make you more or less likely to seek help?

              While little research has been done, narcissistic personality disorder sufferers make significantly more lethal suicide attempts than those with other personality disorders, and are also amenable to treatment, though research is also more scarce than for BPD. From this perspective, if they’re right, the accusatory clinicians are (probably minimally) harming rather than improving Donald Trump’s life expectancy and quality of life. Far worse is the barrier this creates for those who suspect they might have this, and take the attack to reflect how professionals might treat them. As the two images above show, they may very well be right. Under these conditions, it is understandable that service users with personality disorders may eschew and ridicule these diagnoses. However, they may unwittingly be helping to perpetuate the very prejudice they are trying to fight against, and making it harder to get help which can literally be life-saving, for themselves and others. 

              I have always taught my students that the  nature of psychiatric disorders means that they can be hard to be with. This is especially true of the personality disorders, and is one of the reasons they can be so hard to treat. However, we have seen that we have no ethical reason to judge folks with personality disorders more harshly than those with any other kind of psychiatric disorder, and failing to recognise and treat them as psychiatric disorders makes us more likely to do so. 

              Why “Silence of the Lambs”

              Since the blog was published, I’ve had several comments arguing that this image was both distressing, and maintained the very stigma this blog post opposes.  I’ve removed it from the title screen, but have kept it as my initial image, setting out my reasons for choosing it, rather than simply replacing it with something more inoffensive. 

              1. As you’ll have realised if you’ve read this far, this blog is about all personality disorders, and that is the everyday perception of them. Though fictitious, Hannibal Lecter challenges us to realise that he has a psychiatric disorder, and it isn’t always easy to find it in ourselves to accept that those as bad as he need help as well as punishment. The alternative is to talk of “better” and “worse” personality disorders in moral terms, and if you’ve followed my argument that would never do. 
              2. Hannibal Lecter is also a psychiatrist. The idea of the deadly, dangerous amoral psychiatrist who sacrifices people for knowledge continues to be fed to us in the media, and sadly reflects social contagion from these patients, who we do our best to treat. 
              3. In the picture, Hannibal Lecter is restrained.  No-one who commented to me has mentioned the level of restraint, but it is outrageous. It brings home how much training is actually needed to treat this class of patient humanely. The film itself flags the inhumanity of his confinement, when untrained staff were in charge, but, for someone with a personality disorder, we read it as a sign of what he needs or deserves, rather than cruelty towards him.  It is high time our attitude changed. 

              Social Justice Warfare and Mental Health 

              If your ambition is to rescue people from oppression and misery, then mental health is a good place to start.

              People with poor mental health struggle to find employment

              They die younger…

              …and the gap does not diminish as longevity advances generally, even in the most supportive societies.

              They’re more likely to be homeless…

              …to have been maltreated

              …and that continues through their lives

              Of course, these discoveries about mental health could only be made because people had ways of telling the difference between good and bad mental health in the first place. This difference is codified in either diagnostic criteria or questionnaire cut-offs, and judicious use of either (or sometimes both) is pretty good at identifying people with difficulties in their mental health.

              Now consider this

              The idea that diagnosis identifies mental disorders which may become objects of study has created theoretical and practical divisions between ‘normal’ and ‘abnormal’ which have hindered understanding of behaviour and experience in general – not just that said to be symptomatic of mental illness. Abandoning diagnosis is therefore an important step in practising what we preach – in creating a unified approach to our subject.

              This is part of a position statement in UK Clinical Psychology. My previous blogs on this site explain, at some length, why I think this is mistaken in principle. Here, I’ll merely point out that without any ability to identify the mentally unwell, none of the statistics which justify interventions, and can measure improvement, would have been collectible. As this statement has been made by people who have dedicated their working lives to helping this group, something must be seriously amiss. It’s time to meet

              The Social Justice Demon

              Lucifer: according to Milton, the first social justice warrior

              Social Justice sounds so good that it’s tempting to agree with it without further ado. It also has a long and honourable history going back to Augustine of Hippo, who argued that justice was the yardstick by which states could measure their legitimacy. Its most important recent proponent, John Rawls, both defined it and extended it to all justice using the concept of “justice as fairness”. Unpacking this apparently simple definition eventually required an awful lot of words

              which expand on a few basic principles

              The original position

              Imagine we get together to agree on a society. If everyone is to receive fair treatment, it follows that they should not know of their own personal characteristics or position in society when deciding its rules, to avoid bias in rule selection. Rawls called this “the original position”, and saw it as the first principle to be fulfilled, his view being that fair rules could not be decided otherwise. He called the state of mind associated with the original position “the veil of ignorance”.

              The difference principle

              If, like Rawls, we accept that there has to be some difference in society’s roles and rewards, then the veil of ignorance means we don’t know whether we are going to be winners or losers when we agree society’s rules.  It is therefore no more than prudence to propose that society’s differences should be arranged so that the poorest and weakest are protected from adverse effects of the differences we accept.

              Principle of equality of opportunity

              By the same kind of reasoning, to avoid keeping  ourselves from the roles that would most benefit ourselves in our society, the opportunity to attain these roles must be equally available for all.

              Rawls saw these principles as fundamental to liberal and social democracies, and most folks agree with him. After all, what’s not to like?

              Given that we are currently in the land of pure principle and fluffy bunnies, it’s not surprising that we have to be positively demonic to see what could possibly go wrong

              When Lucifer demanded equal rights for angels

              It is striking that the sentiments Milton gave Lucifer were very similar to those that we’ve just heard from Rawls, and probably account for some of the sympathy we feel towards him when we read Paradise Lost. In a society which insisted reason was ultimately bounded by revelation (and the authorities which claimed the right to interpret it), this is obviously good propaganda. However, revelation has been replaced by law as the justification for our society, and following Rawls, this is conceived as a form of social contract. As we can no longer turn to an omniscient God to rescue us, let’s think about what happens when Rawls’ principles hit the real world.

              The real world is an uncomfortable place for pure principle

              Our precious veil of ignorance gets replaced by a veil of approximate knowledge, whose reliability varies according to topic. We have absolutely no guarantee that the rules we make will have the results we desire. Physical realities such as distance and time, as well as societal decisions, impinge on equality of opportunity.

              If we try to get back to Rawls’ original position from here, we end up with the first rule of the social justice demon

              True justice is based on refuting difference 

              What this does is to reverse the direction of Rawls’ veil of ignorance. If Rawls said “if we have a veil of ignorance then we may have justice” this claims “if we are to have justice then we need a veil of ignorance.” Unlike Rawls, who only used the original position as a rational basis for his other principles, this goes against the basic conception that Justice is about treating equal things equally and unequal things unequally, as we are now seeking the same implementation of rules for different things, rather than simply designing rules to fairly cover all possible circumstances. This leads to the second demonic rule

              If people experience different degrees of benefit in their lives, then unfairness has occurred

              This is  the exact reverse of Rawls’ difference principle, which specifically allows for differences in benefit, according to what people do in practice. Of course, if we retain ignorance of what people are doing differently in relation to our benefit assessment, then equality of benefit is the inevitable consequence. A similar reversal occurs with the third demonic rule, which is

              Opportunity should be regulated to ensure equality of benefit. 

              This of course arises because inequality of benefit has already been declared unfair. Though the implementation will look superficially like positive discrimination, it is different, because the target is not prior inequality of opportunity, but inequality of benefit.

              Politically, the social justice demon belongs to the Left, just as the Nazi demon belongs to the Right…

              Pol Pot giving voice to the social justice demon “only several thousand Kampucheans might have died due to some mistakes in implementing our policy of providing an affluent life for the people”

              …but its relationship to socialism is like that of lung cancer to the organ. If we think of the classic socialist quote

              “From each according to his ability, to each according to his needs”

              It’s clear that the evaluation of personal differences, including abilities and needs, is at the heart of this statement, and that it disappears when we consider its monstrous progeny. It’s like cancer in another way too: it starts small, but grows through the social body it infests until that body is destroyed.

              How the social justice demon influences our actions

              The person who first wrote about how such schemas as the social justice demon influenced us was Fyodor Dostoevsky.

              Fyodor Dostoevsky: the man who wrote about demons

              He coined the term I’m using, and named one of his greatest novels after them. Writing from a rightist, patriotic, and religious perspective, he described how the introduction of Western, progressive ideas destroyed the social structure of a provincial Russian town, by shaping the behaviour of those who believed, or came to believe in them. More recently, they have entered public consciousness as memes: fragments of thought or learned procedure that travel from person to person by either deliberate communication or mimesis.

              There are three qualities which, in combination, make the social justice demon so virulent.  The first we’ve already discussed: it closely resembles a genuinely virtuous moral position, so statements and actions based on it can be hard to argue against, and conforming to its precepts is likely to feel good. Secondly, it’s a warrior meme. Milton’s myth has Lucifer going to war with God over a perceived injustice to angels. Other warriors for the social justice demon have fought for heaven, to rescue souls from feared damnation through attachment to the wrong beliefs

              The social justice demon’s warriors fighting to get everyone to heaven

              Because the social justice demon ultimately opposes knowledge of difference, and in the real world most differences are relative, it can fight on either, and frequently both sides, in any dispute. Its third quality, (which may be deduced from the previous two, and its base in pure principle), is its absolutism. The idea of incrementally approximating towards perfection, which might thus never be reached, is anathema to the demon. Imperfections must be rooted out, and the promised perfection makes the cost worthwhile

              A 15th century illustration of the costs of insisting on perfection

              This also means that, while the social justice demon claims to be responsive to evidence, in practice no evidence can convince it otherwise, because all evidence, and especially scientific evidence, carries a margin of uncertainty. So, it will only accept (and promote) evidence which supports its a priori view: confirmatory bias. As this involves denying knowledge of difference, the evidence it uses will tend to be skeptical.

              The social justice monster in mental health

              As the social justice demon offers itself as an ethical position, it makes sense to look for it in ethical discourse. In mental health, it has taken up residence in what was previously called “anti-psychiatry” and more recently the “critical” movement in psychiatry and clinical psychology. It’s not hard to find statements indicating its presence: we’ve already quoted one insisting that we deny difference between those with mental illness and those without, in terms of the signs and symptoms which denote disorder. This of course is the first demonic rule.

              It is also not hard to find examples supporting the second rule. This quote is from a fringe mental health treatment group, “re-evaluation counselling”.

              “Mental health” oppression is the systematic suppression of discharge (their term for symptomatology)  and the invalidation of people’s minds. It is the attempt to control people by enforcing standards of conduct, invalidating the discharge/re-evaluation process, categorizing people into diagnoses, pressuring them to take drugs and other harmful treatments, and punishing attempts to stand up for their liberation. The point of “mental health” oppression seems to be to oppress “mental” patients. However, it is actually to maintain the status quo by reinforcing and obscuring the functioning of other oppressions, and enforcing conformity.

              Compare that with this, from a very senior UK Clinical Psychologist, prominent in the “critical” movement

              ‘If the authors of the diagnostic manuals are admitting that psychiatric diagnoses are not supported by evidence, then no one should be forced to accept them. If many mental health workers are openly questioning diagnosis and saying we need a different and better system, then service users and carers should be allowed to do so too.’

              The authors of the diagnostic manuals are of course saying that their evidence for diagnosis has uncertainty and variability, not that there is no evidence for any of them. This transformation from doubt to denial is a good sign our demon is at work, particularly when it leads to a claim of oppression: the major difference between the quotes is that here the claim is expressed implicitly.

              The third rule, limitation of opportunity to ensure equal benefit,  has recently had an airing in the popular media.

              Despite the balancing qualifier “a tiny minority”, the message is clear: take these pills and you risk becoming a monster and ending up here

              Hell: the destination for all sinners (and some medication users)

              The social justice demon has two problems with pills, or indeed physical treatments of any kind. First, if they are successful, then they have been successful through modification of individual difference, which the social justice demon forbids: all such modification is “oppressive”. Secondly, physical treatments come with side effects, and side effects, however balanced with benefits, are barriers to perfection, however impossible. Hell beckons.

              I’d agree this is a bit extreme for a professional position, but it can nonetheless be found in professional recommendations influenced by the “critical” movement.

              Many people find that ‘antipsychotic’ medication helps make experiences such as hearing voices less intense, frequent or distressing.

              It can be particularly useful at times of crisis when the experiences can feel overwhelming.

              However, the drugs appear to have a general rather than a specific effect: there is little evidence that they are correcting an underlying biochemical abnormality.

              There are significant risks as well as potential benefits, especially when people take medication over many years.

              Prescribers need to help people to weigh up the risks and benefits of taking particular drugs or indeed taking medication at all. People need to be able to try things out and arrive at an informed choice.

              Services should not pressurise people to take medication.

              While measured and thoughtful, there’s no doubt that the mood music of this quote is the same as the television programme, albeit less intense. It’s worth taking a moment to dissect these superficially reasonable statements, to uncover the baleful influence of the social justice demon within them.

              Consider the second and final paragraphs together. A psychotic crisis involves much more than simply having odd experiences: thought processes themselves can become incoherent and incomprehensible.

              Florid psychosis written down

              Imagine someone who talks to you like this (and I can assure you that some psychotic folk do). Do you know what they’re talking about? Do they? Take it from me, after they recover, they won’t be able to explain this stuff to you — they’ll struggle to remember it at all.  If they say “yes”, or “no”, do we even know if it’s being directed at us, or part of the conversation we think we’re having? We also know what someone in this state is capable of

              Note for animal lovers: the lions were shot for behaving like lions, and discovering humans were good eating

              As stated, the final paragraph is virtue signalling, making clear that whatever the reasons, giving drugs without agreement is oppressive. The social justice demon is setting its battle lines according to its second rule: it simply cannot accept that there are individual differences that might affect capacity to consent.

              The conversion of doubt to denial is also well in evidence. To read this summary, it would appear that nothing more is known about antipsychotic drugs than that they are calming, i.e. sedative. This is misleading

              Across multiple studies, antipsychotic drugs affect brain areas associated with schizophrenia

              Brain regions structurally and/or functionally affected in schizophrenia

              Antipsychotic drugs have visualisable targets of action, that correspond to those areas affected by psychosis (eg frontotemporal & parietotemporal) and their side effects (eg striatum and cerebellum) as the above images show. We do not fully understand how the drugs work because our theory of schizophrenia isn’t complete, but the summary has taken this uncertainty and used it to convert our understanding of antipsychotics from a specific treatment for psychosis to something that will keep patients calm and biddable. In fact, the sedative effect of antipsychotic medications is temporary, but the summary gently introduces the idea that they are no more than “chemical coshes”. Once again, drug use has been linked to oppression, now irrespective of consent.

              On the other hand, psychotherapy, especially that which aims towards an idealised human relationship, fits the social justice demon’s bill perfectly.  Human relationships are universal: no differences need apply. Also, if they are perfect, then there is no difference in benefit, as benefit is distributed through relationships. We are in heaven

              The social justice demon’s take on psychotherapy

              When we look for the equivalent summary for professional recommendations for psychotherapy, we read

              Psychological therapies – talking treatments – are helpful for many people.

              The National Institute for Health and Care Excellence (NICE) has reviewed the evidence and recommends that everyone with a diagnosis of schizophrenia should be offered talking therapy. However, most are currently unable to access it.

              The most researched therapy is cognitive behaviour therapy (CBT). Trials have found that on average, people gain as much benefit from CBT as from medication.

              ‘Family interventions’ have also been extensively researched and many people find family meetings very helpful.

              Talking therapy is very popular: demand vastly outstrips supply in the NHS.
              There is an urgent need for further investment in psychological approaches to ensure that all services come up to the standard of the best, and so that people can be offered choice.

              Different approaches suit different people. Not everyone finds formal psychological therapy helpful and some find it positively unhelpful. We need to respect people’s choices.
              All staff need to be trained in the principles of a psychological approach as outlined in this report so that it can inform not only formal therapy but also the whole culture of services and every conversation that happens within them.

              We are invited to conclude that the only reason schizophrenia isn’t being treated with psychotherapy is either lack of resources or some patients’ distaste for formality.

              Compare this with the section on drugs and it’s a no-brainer

              To unpick this, let’s start with the guidance from NICE (the National Institute for Health and Care Excellence, which evaluates treatments). What the standard states is

              CBTp (Cognitive Behaviour Therapy for psychosis)  in conjunction with antipsychotic medication, or on its own if medication is declined, can improve outcomes such as psychotic symptoms. It should form part of a broad‑based approach that combines different treatment options tailored to the needs of individual service users.

              The meaning’s quite clear: a combination approach is best, though acceptability is important, too. (Offering any treatment that the patient won’t use is pointless).

              As might be expected, given the demon’s preference for confirmatory bias, the equivalence of drugs and psychotherapy is asserted without any caveat or qualification, even though this area is highly contested

              The effects of CBT may be less than small

              While the effect of antipsychotics is moderate or greater

              with their non-preferred studies giving different results.   Entirely different standards are being used to bias decision-making towards psychotherapy, irrespective of the quality of evidence of benefit.
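
              To give a feel for what “less than small” and “moderate” mean in practice, an effect size can be translated into the chance that a randomly chosen treated patient does better than a randomly chosen untreated one. The sketch below assumes the effect sizes are standardised mean differences (Cohen’s d) and that outcomes are normally distributed with equal variance in both groups; the function name is hypothetical, and the values 0.2 and 0.5 are Cohen’s conventional thresholds for “small” and “moderate”, not results from the studies pictured.

```python
import math


def probability_of_superiority(d: float) -> float:
    """Chance that a random treated patient outscores a random control.

    Assumes normal outcomes with equal variance, so the result is
    Phi(d / sqrt(2)), which simplifies to 0.5 * (1 + erf(d / 2)).
    """
    return 0.5 * (1.0 + math.erf(d / 2.0))


# Conventional benchmark values only; not taken from any trial.
for label, d in (("no effect", 0.0), ("small", 0.2), ("moderate", 0.5)):
    print(f"{label:>9}: d = {d:.1f} -> {probability_of_superiority(d):.1%}")
```

              Under these assumptions, the gap between a small and a moderate effect is the gap between a treatment that barely beats a coin toss and one that visibly shifts the odds.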

              Lurking behind this false contrast between psychotherapy and drug therapy is the third demonic rule: access to physical therapies must be restricted as they lead to imperfect and intrinsically unequal benefits.

              Exorcising the Social Justice Demon

              Social justice is a good thing. We therefore need ways of deploying it in mental health that avoid invoking its demon.  Two philosophers, Beauchamp and Childress, have developed a system of principled biomedical ethics which can include it. They propose four, equally important principles

              • Acting to respect our patients’ autonomy
              • Trying to do good
              • Avoiding doing harm
              • Acting with justice

              And, being philosophers, are well aware that principles struggle in the real world

              A principle being tormented by some philosophers

              They recommend a 3-step process to connect principle to reality. First, the relevant principles must be specified in terms of the real situation: hopefully it’ll be less fiendish than that in the trolley problem. It is, however, more than likely that the principles will point in different directions: for example, drugs have intended and adverse effects. Beauchamp and Childress recommend, in their second step, trying to balance the opposing principles. Their third step is to employ dialectical reasoning to synthesise the opposing, weighted and balanced principles (for none can be ignored) into a course of action that expresses them all to the fullest extent the constraining circumstances allow. The result is what the social justice demon hates: an ethical compromise.
              Compromise lacks the grandeur and purity of the heroic stance pushed by the social justice demon. However, ethical compromises are constrained to benefit their subjects. Dostoevsky reminds us that the endless battles over principle that the social justice demon promotes can destroy what was being fought over

              which in our case are patients, services, scientific integrity and professional credibility. As my introduction implied, achieving social justice for our patients is a goal we should all be contributing to. However, this is one of those occasions when the best really is the enemy of the good.

              Why Psychiatrists Should Care about the Identity Hypothesis 

              The other day I was reading a blog post by a psychiatrist who I greatly respect. Her post was about how worthy, dreary and unexciting she found modern neuroscience, and how it had failed to live up to its clinical promises, compared to more psychosocial approaches of understanding mental health. However, she also said this.

              I would really like to hear more how we can link up a little more across the multidisciplinary divide – and try to understand the interactions between the person, their environment and their brain.

              This post is my attempt to say a little more about how I see those links developing, and why, far from being drearily “mindless”, they represent a massive intellectual challenge to clinical psychiatry (and psychology), which we have not begun to address. This challenge may be stated in three words, which make up my first heading. 

              The Identity Hypothesis

              The identity hypothesis is for cognitive neuroscience what the “efficient market hypothesis” is for economics, an unproven assumption which acts as a conceptual foundation for the discipline. It claims that states of mind are also states of brain. Most people these days know the hypothesis, even if they don’t know its name, but tend to say “Yeah, OK” and carry on as before. To show how big a mistake this is, let’s start with an analogy. 

              This is, of course, the famous image which can be interpreted either as an old woman in profile, or as a young woman turned three-quarters away. It follows that every feature of the old woman is also a feature of the young woman, so here we have not just an identity hypothesis, but a truly dual identity. Despite this, we cannot see both images simultaneously, even though we can readily choose which image to observe. This is true even if we try to focus down on the individual features which are key to the transformation. 

              Both optical illusions and our inability to see past them provide valuable information about how our visual understanding of the world is processed, including when and how far its apparent certainty may be trusted. This diagram, courtesy of Richard Gregory, shows how much of our visual reality arises from internal processing.

              What we are doing when we look at a cocktail glass

              Our “old-young lady” illusion demonstrates just how powerful that hypothesis generator is.  Once we have decided, say, that we have seen an ear, rather than an eye, our visual hypothesis generator imposes such a large confirmatory bias as we process other forms in the image that we must perforce encode the whole as a young woman, despite knowing things can be otherwise. 

              We may also be affected by illusions of identity, as wittily demonstrated in this painting by Magritte 

              We are confronted by a painting of the sea, set on an easel in front of, and depicting, the same sea, in the same moment that is recorded in the painting. Most viewers will feel this is uncanny. What is happening is that the painting, through our long familiarity with pictures, is creating an illusion that the background sea is more real than that in the frame, when the frame, the frame’s contents and the background, arise simply from cunningly contrived reflections of light on painted canvas: there is no sea. The flaming tuba illustrates the greatest power of the illusion of identity: it can make the impossible seem possible, as we now see so often in our films

              An impossible duel brought to us by computer generated imagery

              We need to unpack the identity hypothesis a little, to appreciate what it is really saying. 

              The Identity Hypothesis

              We are talking about all our possible states of mind: in symbols ∀{Μ}

              We are also talking about a set of possible brain states {Bi}

              We go on to claim that, for all our states of mind, there will be corresponding brain states, and that each of those brain states will be reflected in the equivalent state of mind. 

              In symbols we write ∀M(Mi ≡ Bi)

              Note that this allows the existence of a complementary set of brain states {Bj} that do not contribute to states of mind. That’s needed to cover things like coma, the brain’s internal housekeeping etc. 

              ∀M(Mi ≡ Bi) allows us to predict any state of mind from its corresponding brain state (assuming we can identify it), and vice versa.  

              However, we also know no two brains are identical, and as brains develop and senesce, an individual brain’s function will also vary through the life cycle, as well as being  affected by e.g., illness.  We also need to include this brain variability {V}

              In symbols, we write ∀M(Mi ≡ Bi | Vi)

              Which means, “states of mind and brain states may be used to predict each other, conditional upon relevant brain functioning” 
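              For readers who prefer code to symbols, here is a minimal sketch of what ∀M(Mi ≡ Bi | Vi) is claiming. Every label in it (the V_, B_ and M_ names) is a hypothetical placeholder, not a finding from any study; the only point is that, once we condition on V, the mapping can be read in both directions, and that it stops being readable backwards if V is ignored.

```python
# Hypothetical labels throughout: none of these states come from a real study.
# Each entry reads: under brain variability V, brain state B corresponds to mind state M.
CORRESPONDENCE = {
    ("V_language_A", "B_region_1"): "M_reading_symbols",
    ("V_language_A", "B_region_2"): "M_doing_arithmetic",
    ("V_language_B", "B_region_3"): "M_reading_symbols",
    ("V_language_B", "B_region_2"): "M_doing_arithmetic",
}


def mind_from_brain(variability, brain_state):
    """Predict M from B, conditional on V. None stands for the {Bj} with no state of mind."""
    return CORRESPONDENCE.get((variability, brain_state))


def brain_from_mind(variability, mind_state):
    """Predict B from M, conditional on V. None if there is no unique match."""
    matches = [b for (v, b), m in CORRESPONDENCE.items()
               if v == variability and m == mind_state]
    return matches[0] if len(matches) == 1 else None


def brain_states_ignoring_v(mind_state):
    """All brain states paired with M when V is ignored (sorted for readability)."""
    return sorted({b for (_v, b), m in CORRESPONDENCE.items() if m == mind_state})


def is_identity(correspondence):
    """True when, within each V, no two brain states share the same state of mind."""
    seen = set()
    for (v, _b), m in correspondence.items():
        if (v, m) in seen:
            return False
        seen.add((v, m))
    return True
```

              In this toy table, brain_from_mind gives a single answer once V is supplied, whereas brain_states_ignoring_v returns more than one candidate for the same state of mind. That is all the "| Vi" in the formula is doing.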

              Here is a visual representation of all three components of the identity hypothesis.

              The image shows brain activity related to two different tasks, performed by two different groups of people. We can see that tasks involving symbols and those involving numbers involve different (albeit overlapping) brain geographies. This is Mi ≡ Bi. However, which geographies are involved also depends on which language the brain employs (Vi). This doesn’t prove the hypothesis: we have only tested it in one direction, as we cannot trigger brain states with sufficient accuracy, and it only exemplifies one set of tasks and conditions. But, to date, no exception has been found. 

              Thinking about the Identity Hypothesis from Inside the Box

              Read books of philosophy and one could easily believe that thinking happens in an unrestricted and infinite mental space, where thought can freely move without obstacles or pitfalls. However, if we think with our brains, then this is nonsense. Optical illusions teach us that the brain struggles with dual identity in the visual world, and there is good reason to think similar limitations hold more generally. 

              At present, we write about states of mind using language, while states of brain are described using mathematics (either visually or algebraically expressed). We have already seen that these all use different brain systems. Are there limits to how we can put them together?  While we don’t fully know the answer, we do know that different languages support learning mathematics to different extents, and we are also aware of mathematical objects which exist, but cannot be denoted in language, such as the square root of -1. There is also an everyday example, which is particularly telling as it involves our emotions: music. 

              A musical challenge to psychosocial formulation

              I love opera. Richard Wagner (who also helped to prove that talent is independent of moral rectitude) described it as “Music Drama.” These days, when we go to a performance, we are assisted by “surtitles”: translations of the lines into our local vernacular displayed above the stage. However, even before surtitles, people who did not speak the opera’s language could still follow most of the development of the plot, from a combination of the action on stage and the music that was playing. In ballet, opera’s close relative, we follow the story simply from music and dance. From this, as well as musical forms such as tone poems, we know that music can tell a story. Furthermore, the story is being told through our emotions. 

              Psychologists argue that music bridges language and emotion, eliciting the latter through temporal information common to language and music, so suggesting an innate dimension of emotional encoding

              This encoding can be visualised as a brain state  

              When people are familiar with a tune, their brains show increased activity in the regions shaded in green in this fMRI image. Red areas respond to salient autobiographical memories, and blue areas respond to tunes that a person enjoys. The brain region known as the dorsal medial prefrontal cortex responds both to familiarity and autobiographical associations (yellow).

              We are seeing dual identity, as this description of the research makes clear 

              A lifelong music buff, Janata had earlier created a model for “mapping” the tones of a piece of music as it moves from chord to chord and into and out of major and minor keys. By making tonal maps of each musical excerpt and comparing them to their corresponding brain scans, he discovered that the brain was tracking these tonal progressions in the same region as it was experiencing the memories: in the dorsal part of the medial pre-frontal cortex, as well as in regions immediately adjacent to it. And in this case, too, the stronger the autobiographical memory, the greater the “tracking” activity.
              “What’s cool about this is that one of the main parts of the brain that’s tracking the music is the same part of the brain that’s responding overall to how autobiographically salient the music is,” Janata (the researcher) said.

              In this particular case Mi is the state of mind associated with familiar, emotionally charged music, while the dorsomedial prefrontal cortex is necessary for Bi. This state of mind is pretty universal, and we can recognise its commonality, mutatis mutandis, across individual memories and pieces of music. However, like the rest of us, the researcher ends up with an ungainly mix of memory, music and emotion to describe something we experience as a single state of mind, without fragmentation being apparent in Bi either.  

              Now, let’s think back to our old-young woman. Every feature was both a part of the young and the old image: a perfect correspondence. What we have expressed here is Mijk ≡ Bi, where the subscripts ijk refer to the combination of memory, music and emotion that is equivalent to Bi.  Notice, however, that there is nothing in the language-based definitions of music, memory and emotion to link them (in symbols D(i,j,k) ≉ Mijk). We can therefore write, for our everyday definitions, (Di ⊥ Dj ⊥ Dk) | Bi, where ⊥ signals conditional independence, and |, as previously, denotes conditionality. It is obvious that simply staring at the images (Bi), without knowing the mental states being explored (Mijk), wouldn’t be enough to deduce the latter. Our demonstration of the independence of D(i,j,k) from each other, conditional on Bi, shows that we also have no way of reasoning from how we normally understand these states to the idea that they are conjointly instantiated in a single brain state, as independence means “no link is as likely as any link.” So, we may not even put the three together, and would be able to argue with anyone who does. What we have shown is that our linguistic and brain based formulations are as separated as our young and old female images; without separate knowledge that they correspond, we cannot combine them, even though we can move smoothly enough between them once we have ascertained their joint existence. 

              Having chopped all the logic we need, we can now set out our musical challenge 

              1. In the absence of brain–based evidence, no psychosocial formulation, no matter how reasonable, and even if the patient agrees, can be assumed to describe how the patient has encoded the formulation’s topics in the brain. 
              2. If we are formulating a patient’s emotional trajectory, then a musical formulation, validated by the patient’s confirmation of the experiences elicited by the music, is likely to have a closer connection with the brain than a linguistic one,  as the additional layer of linguistic definition, and requirement for reason, is absent. 

              The purpose of this challenge is not to suggest we should all start humming to our patients, and music therapy is not the topic of this post. Instead, it shows that the constraints imposed on psychosocial formulations by the identity hypothesis, when combined with our cognitive limitations, undermine our received wisdom that any psychosocial formulation, however well constructed, and even if checked with the patient, is the best way of understanding our patients. The next section thinks about how this happens.  

              My Tuba is on Fire! The spurious rationality of the psychosocial formulation

              Most psychiatrists, even religious ones, accept the identity hypothesis in their practice. In fact, it does not deny spirituality per se, as it simply asserts that, like all other states of mind, a spiritual experience will have an identifiable brain state associated with it. However, though there are a few exceptions, psychiatrists make an additional assumption that, even if it occurs at all, parapsychological influence on the brain is too rare and/or faint to account for the hallucinations and delusions presenting in their clinics. 

              This version of the identity hypothesis asserts that brain states cause (using Aristotelian terminology, are the efficient cause of) states of mind. Attempts to help people with mental illness by assuming otherwise tend to end badly. 

              It’s worth listing what follows from this version of the identity hypothesis. 

              • The brain is connected to the world (through its sensory and motor nervous systems) and interacts with it
              • Accessible states of mind are how it communicates to us (and partly also to itself)
              • Mental health business is changing certain brain states, recognised through states of mind
              • If it is to be effective, any treatment, be it physical, psychological or social, must improve the brain state causing the problem

              So, any understanding of our patients must be able to predict how their brain states will respond to the intervention we choose, even if that brain state is measured through a corresponding state of mind. The previous section showed how our psychosocial formulations could not be trusted to make the necessary link. This section examines how they can nonetheless trap us in committing to them. 

              Meaning and its imitators

              Consider this list of words 

              furiously sleep colourless ideas green 

              They appear pretty random, don’t they?  However, if we rearrange them we get

              Colourless green ideas sleep furiously.

              This famous sentence was coined by Noam Chomsky in 1957, to illustrate how meaning arises from syntactic structure 

              Syntactic structure giving meaning to a set of otherwise random words

              Now read this


              Jean Arp with a piece of his work

              While Chomsky was trying to create an illusion of meaning, and Paul Eluard was trying to capture the thought and work of an abstract artist, both independently discovered a similar technique: to arrange words so that their juxtaposition suggests meaning, even though their individual definitions do not. The discovery that symbols derive their meaning from connection to other symbols, rather than from what they symbolise, was actually made by Ferdinand de Saussure, Chomsky’s great predecessor. If we look back to our Magritte, we can see exactly the same technique being used on our visual sensorium. It works because both language and vision rely on feature detection to construct the reality we observe. Chomsky’s sentence is a verbal analogue of Magritte’s burning tuba: the plausible juxtapositions of Magritte’s images and Chomsky’s words are both perceived as meaningful, even though the individual interpretations of each word or form warn us they should not be combined.  

              The hidden surreality of psychosocial formulation

              Both psychiatry and psychology have broadly  similar approaches to formulation. Summarising greatly, they are purposeful documentary accounts of the patient’s condition, which specifically include and link to relevant theory, and thereby provide both understanding and guidance to appropriate remedial action. 

              This statement is part of the psychology guidance 

              However, psychological formulation starts from the assumption that ‘at some level it all makes sense’ (Butler, 1998, p.2). From this perspective, mood swings, hearing voices, having unusual beliefs and so on can all be understood as psychological reactions to current and past life experiences and events, in the same way as more common difficulties such as anxiety and low mood.  

              In terms of the identity hypothesis, the authors are asserting that the states of mind associated with psychosis, {Mp}, should not be understood differently from ordinary mental responses to the environment {M}, so we should write Mpi ≡ Bi ≡ Mi. Because equivalence is transitive, we are also asserting that Mpi ≡ Mi, which is obvious nonsense.  Why then is the claim so plausible? First, it uses descriptors of psychosis (hearing voices, unusual beliefs etc) {D(p)} instead of the set of psychotic experiences itself {Mp}. As our musical challenge showed, substituting one for the other leads to different conclusions. It also proposes an entirely different cause for states of mind: “psychological reactions”.  To the identity hypothesis, a psychological reaction is simply another state of mind, reflecting a brain process intermediate between two others. It is therefore no more than a subset of either {Mp} or {Mi}.  However, its linguistic position within the verb part of its sentence biases us to believe, impossibly, that something non-physical can be an efficient cause. The truth being presented here is the same as that found in Eluard, Arp or Magritte, not everyday reality.

              Why we should never confuse life with literature

              Psychiatric  formulation is less committed to continuity with the normal population, and more accepting of diagnosis, than psychological formulation. However, current UK guidance on formulation introduces its  value for psychiatrists as follows 

              At the same time, a major criticism of psychiatry at present is that there is a reductionist overemphasis on diagnosis and biology. Psychological care is sometimes seen as the domain of psychologists, with psychiatrists’ roles becoming increasingly confined to prescribing and managing psychiatric problems that have a physical basis. A less limited view of what it is to be a good psychiatrist depends on psychiatrists being able to offer psychologically minded care. Formulation is a key part of this, and of making psychiatric practice more therapeutic.

              It’s clear that a “reductionist…overemphasis on biology” is not consistent with an appreciation of the significance of the identity hypothesis. Our current understanding of brain states requires more biology,  not less. The implied claim that the identity hypothesis’ biological approach impedes psychological care is equally wrong. Think back to our old-young woman. We can use either our “old” or “young” understanding of her features, provided we know either denotes the same feature set. If we know that our psychological descriptions match brain states, there is nothing to stop us using them. In the absence of knowledge about the corresponding brain state, knowing that a treatment is effective for a condition implies it can alter brain states in the desired direction. So, far from being problematic, the identity hypothesis predisposes us to adopt an evidence-based approach to the choice and delivery of psychological treatments. The fact that the same arguments are entirely general, so can also be applied to the equally sketchy theories surrounding drug or social therapies, is an added bonus, as it gives a theoretical base for using physical, psychological and social therapies conjointly, as well as separately.  

              By now, it should be clear that there is something seriously wrong with how we think about our mental health, if we accept the identity hypothesis.  The cognitive errors we’ve explored explain how they have arisen; it’s time to move on to thinking about why. 

              The Religious Interpretation of the Biopsychosocial Model

              This model underpins both sets of guidance we’ve just critiqued.  It is usually ascribed to George Engel’s papers in 1977-80,  though it was actually developed in the early 20th century by Adolf Meyer. The identity hypothesis has no problem with it, as  biology, psychology and sociology simply describe different classes of influence on brain states (though in sociology, the brains are studied en masse, and their states therefore assumed to be averaged, rather than proceeding brain by brain). Despite the  comments quoted above, this is not reductionistic, as it does not preclude patterns  which are only apparent when interactions are considered, as the picture below illustrates. 

              Dots interacting to make an image. Now imagine them as brain states

              Given the problems with studying brain states, and especially if brain states are not our object of  study, it makes good sense to move from the suspected cause to an observable consequence, as, according to the identity hypothesis, this should be mirrored in the brain state(s) which generate the consequences we see.    

              Where the Biopsychosocial Model happens

                However, this is very different from claiming that the cause leads to the consequence without the necessary mediation of the brain. Then, the psychological and social domains become separate orders of metaphysical reality, superimposed upon but interacting with the physical world. Having done that, it’s easy to add more layers e.g.,

              This kind of reasoning has a venerable (and venerated) history, as it was used by St Thomas Aquinas to describe how body and soul combined to create human beings 

              There are two requisites for one thing to be the substantial form of another. One requisite is that the form be the principle of substantial being to that whereof it is the form: I do not mean the effective, but the formal principle, whereby a thing is and is denominated ‘being.’* The second requisite is that the form and matter should unite in one ‘being’; namely, in that being wherein the substance so composed subsists.

              With the difference that he’s talking about four parts, rather than two, here’s Steven Rose with the modern version 

              every aspect of our human existence is simultaneously biological, personal, social and historical

               Interestingly, this modern version is cited in both the psychology and psychiatry guidance quoted above. Though religious psychiatrists (and others) might take comfort in the possibility of building levels all the way to God, as Aquinas did, this is of course unnecessary: Steven Rose is himself an atheist.

              However, the problem with all metaphysical accounts is that there is no way for the non-physical to influence the physical, which gets us here 

              This approach thus creates an impassable causal barrier between states of mind (which is where we find the issues we subject to formulation) and brain states (which is what we want to change). Unfortunately, it follows from our previous arguments that, without the possibility of such a link, we will not be able to see past whatever illusion the language of our formulation puts before us.

              An Unsatisfactory Solution, but the best we’ve currently got: diagnosis as Sancho Panza.

              Formulation and diagnosis exploring the unknown

              Compared to the intellectual sophistication of generating a formulation, assigning a diagnosis seems almost pathetically simple:  elicit a set of symptoms and/or signs, match them to a standard template, and off you go. However, if the diagnosis has been well established, the list has been standardised across lots of brains in lots of states of mind.  Let’s now think back to our full version of the identity hypothesis. It was “every state of mind M can be mapped to a corresponding brain state B, conditional upon relevant brain functioning V“.  In symbols we wrote this as ∀M(Mi ≡ Bi | Vi).  Provided the validity samples for the diagnosis were sufficiently large, all relevant states of mind and brain states will have been captured, even if we don’t know precisely what they are. That means our diagnosis is pointing squarely at relevant brain functioning Vi.  Better still, if we make the not unreasonable assumption that Mi ≡ Bi would be normal in the absence of Vi, then we can restrict our attention to Vi, rather than the much more challenging (because it’s so much larger and more variable) set of Bi.  Unfortunately, in mental health we still mostly have to infer Vi through the set of symptoms and signs we observe, summarised as a diagnosis. Furthermore, the symptoms and signs we usually use to make a diagnosis will very often be a subset of the complete list of criteria. In symbols, we write this as follows 


              • {} indicate sets 
              • () indicate terms to be read together 
              • s is the symptoms and/or signs we observe 
              • S is the full set of diagnostic criteria 
              • Δ is the diagnosis 
              • V is brain functioning 
              • ⊆ means the former term is a subset or equal to the latter term
              • X ⊃ Y means “If X then Y”
              • i indicates correspondence 

              {s ⊆ S} ⊃ (Δi ⊃ Vi)

              Which shows why things can go downhill. First, because of {s ⊆ S}, we should re-write it as

              {s ⊆ S} ⊃ (Δi ⊃ {V₁…N})

              Where {V₁…N} are the brain states associated with each possible subset of S.  We can’t even be sure if our initially proposed set of criteria is complete!  We can see the practical effects of this when we come to treatment; we often have to sift through several (including some combinations) before we get one that can modify Vi.  There is, however, a ray of sunshine. Because {s₁…N} overlap each other, the identity hypothesis predicts that {V₁…N} will overlap too, as the state of mind associated with any elementary symptom will associate with a specific brain state, even if we’re not quite sure what an elementary symptom is. This is consistent with our real life observations that alternative treatments do resemble each other. So, we can think of even imperfect diagnosis as a kind of lens, focusing our attention on the treatments most likely to work.

               Secondly, because we’re only talking about one-way inferences, this can be re-written as

              ~{s ⊆ S} ⋁ (~Δi ⋁ {V₁…N})

              Where “~” means “not” and “⋁” means “and/or”. 
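              If you don't trust the re-writing, it can be checked by brute force. The sketch below is my own illustration, treating "symptoms present", "diagnosis present" and "brain states present" as simple true/false values (an assumption made purely for the check). It runs through all eight combinations, confirms that the two forms always agree, and lists which combinations the chain of implications actually rules out.

```python
from itertools import product


def implies(x, y):
    """Material implication: 'if x then y' is false only when x is true and y is false."""
    return (not x) or y


def chained(symptoms, diagnosis, brain_states):
    # {s ⊆ S} ⊃ (Δ ⊃ V)
    return implies(symptoms, implies(diagnosis, brain_states))


def rewritten(symptoms, diagnosis, brain_states):
    # ~{s ⊆ S} ⋁ (~Δ ⋁ V)
    return (not symptoms) or ((not diagnosis) or brain_states)


# Every (symptoms, diagnosis, brain_states) combination of True/False.
rows = list(product([False, True], repeat=3))

equivalent = all(chained(*row) == rewritten(*row) for row in rows)
allowed = [row for row in rows if chained(*row)]
excluded = [row for row in rows if not chained(*row)]
```

              The only excluded combination is symptoms and diagnosis both present with the brain states absent. Everything else is allowed, so the rewritten form ~{s ⊆ S} ⋁ (~Δi ⋁ {V₁…N}) carries exactly the same information as the chain of implications, and no more.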

              In English, this is the almost incomprehensible “either there isn’t a set of symptoms, which may either be a subset or the full set of diagnostic criteria, and/or there either isn’t a diagnosis and/or a corresponding set of brain states.” But, what it means is that the set of brain states may exist without the diagnosis, and either the diagnosis or the brain states might exist without the symptoms. Though theoretically awful, this can be managed empirically by validity studies.  Let’s use a simple example to see how this works. 

              Using the same notation, what he’s discovered is 

              Rub sticks ⊃ stick fire

              Let’s imagine he keeps experimenting, until he can make fire reliably. He will probably deduce

              Rub sticks & long time & stick dry ⊃ stick fire

              However, because he can do it reliably, he can also say (provided he ignores the time dimension)

              Stick fire ⊃ rub sticks & long time & stick dry

              This is because he also knows that, while he’s seen fires caused by volcanoes and lightning, they’re not happening nearby right now (or the story would have a tragic ending). 

              Putting these together, he arrives at

              Stick fire ≡ rub sticks | sticks dry, long time. 

              Notice this is conditional predictive validity: he knows that spending a long time rubbing sticks together will result in fire, and if he sees a fire of sticks it’ll be because someone else has done the same, provided the circumstances are similar. Obviously, the process of validating diagnosis is much more complicated, and the conditional assumptions harder to verify. But the basic principle, to move from “if then” to “equivalent to” under specified conditions, is the same. However, the chained implications that currently link diagnosis to its causal brain conditions, and their likely multiplicity, make validation prolonged, demanding and difficult to achieve.  
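              The same move can be sketched in a few lines of code. The observations below are made up for illustration (there is no real data behind them), and the function names are my own. The sketch shows that neither one-way "if then" survives on its own, but that "stick fire ≡ rub sticks" does hold once we restrict ourselves to the stated conditions.

```python
# Made-up observations, for illustration only.
OBSERVATIONS = [
    {"rub": True,  "dry": True,  "long_time": True,  "fire": True},
    {"rub": False, "dry": True,  "long_time": True,  "fire": False},
    {"rub": True,  "dry": False, "long_time": True,  "fire": False},
    {"rub": True,  "dry": True,  "long_time": False, "fire": False},
    # A fire from another cause (e.g. lightning), outside the stated conditions.
    {"rub": False, "dry": False, "long_time": True,  "fire": True},
]


def implies(x, y):
    """Material implication: false only when x is true and y is false."""
    return (not x) or y


def if_then(observations, cause, effect):
    """'cause ⊃ effect' holds if no observation shows the cause without the effect."""
    return all(implies(o[cause], o[effect]) for o in observations)


def equivalent_given(observations, a, b, conditions):
    """'a ≡ b | conditions': a and b always agree in observations meeting every condition."""
    relevant = [o for o in observations if all(o[c] for c in conditions)]
    return bool(relevant) and all(o[a] == o[b] for o in relevant)


rub_gives_fire = if_then(OBSERVATIONS, "rub", "fire")
fire_means_rub = if_then(OBSERVATIONS, "fire", "rub")
conditional_equivalence = equivalent_given(
    OBSERVATIONS, "fire", "rub", ["dry", "long_time"]
)
unconditional_equivalence = equivalent_given(OBSERVATIONS, "fire", "rub", [])
```

              With these toy observations, only conditional_equivalence comes out true. Validating a diagnosis has the same shape, except that the conditions are far harder to list, let alone check.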

              We can think of diagnosis under these conditions as being rather like Sancho Panza to formulation’s Don Quixote. It’s more limited, has smaller ambitions, and its reliability (and therefore validity) can sometimes be suspect. However, it makes up for all this by offering a route to the real world, which is formulation’s Achilles heel. 

              The Promise of Cognitive Neuroscience

              The promise of cognitive neuroscience can be expressed quite simply: it offers to directly represent both brain states Bi and their associated brain variability Vi, thus drastically shortening the chain of implication and its associated uncertainty.  This would allow us to create the conjoint psychological and neurological space we need to make best use of the identity hypothesis. 

              If we set aside our religious interpretation of the biopsychosocial model, and accept that, at the very least, we can’t confidently connect a formulation to reality without a brain state, then cognitive neuroscience has already started to prune unrealistic formulations. 

              In men who had ADHD, PET scans showed that they processed a memory task in visual areas in the occipital lobe of the brain, as indicated by the yellow spots in the left image. Non-ADHD men used the temporal and frontal lobes, shown at right.

              We can eliminate the proposal that ADHD is simply medicalisation of ordinary behaviour, as we can show that men with ADHD process the same tasks Mi ≡ Bi differently from men without, so V(Normal) ≉ V(ADHD).  

              We can also show that effective psychological treatments target brain states thought to generate psychiatric diagnoses, and aren’t just soothing talk. Here’s a theory of brain states Vi relating to depression

              Neural regions involved in voluntary and automatic regulation of emotion, and emotion and reward processing, shown in the human brain. OFC: orbitofrontal cortex, ACC: anterior cingulate cortex, VLPFC: ventrolateral prefrontal cortex, mPFC: medial prefrontal cortex, DLPFC: dorsolateral prefrontal cortex. Source: Kupfer et al., 2012

              And here’s the changes observed in the brains of depressed patients, for both CBT and an antidepressant (Paroxetine) 

              Changes in regional glucose metabolism (fluorine-18–labeled deoxyglucose positron emission tomography) in cognitive behavior therapy (CBT) responders (top) and paroxetine responders (bottom) following treatment. Metabolic increases are shown in orange and decreases in blue. Frontal and parietal decreases and hippocampal increases are seen with CBT response. The reverse pattern is seen with paroxetine. Common to both treatments are decreases in ventral lateral prefrontal cortex. Additional unique changes are seen with each: increases in anterior cingulate and decreases in medial frontal, orbital frontal, and posterior cingulate with CBT and increases in brainstem and cerebellum and decreases in ventral subgenual cingulate, anterior insula, and thalamus with paroxetine. oF indicates orbital frontal Brodmann area (BA) 11; vF, ventral prefrontal BA 47; Hc, hippocampus; dF, dorsolateral prefrontal BA 9/46; mF, medial frontal BA 10; pC, posterior cingulate BA 23/31; P, inferior parietal BA 40; T, inferior temporal BA 20; vC, subgenual cingulate BA 25; ins, anterior insula; and Th, thalamus. Slice location is in millimeters relative to anterior commissure. Numbers are BA designations. Goldapple et al 2004

              As glucose is the fuel the brain uses, glucose metabolism is a marker of brain activity. We can see that the frontal areas of the brain identified in our model have become less active following Cognitive Behaviour Therapy (CBT). As frontal brain activity is largely inhibitory, it suggests that CBT might lead to less inhibition of our innate emotion regulation systems. However, the involvement of other areas as well, and the very different pattern shown by an equally effective antidepressant, suggest that our current model Vi is overly simplistic, and needs updating. Applying the identity hypothesis means that we might also want to look for additional symptoms and signs of depression, which we know, from our previous analysis, might have no conceptual relationship with those we currently use, and so could not be guessed at otherwise.  This is using constant comparison of our psychological and neurological maps to improve both, rather like this

              The Cave Brothers encouraging each other to go faster

               We’re now in a position to say six things about cognitive neuroscience in clinical practice. 

              1. We will only find cognitive neuroscience helpful if we learn to understand the identity hypothesis correctly. As we have seen, that may not be easy, or even acceptable for some. 
              2. Cognitive neuroscience has the potential to significantly improve our diagnoses, which are what connects our formulations with the bit of reality we need to change, if our patients are to get better. 
              3. Cognitive neuroscience (used with the identity hypothesis) enables us to require that any formulation is consistent with cognitive neuroscience studies.
              4. In the event of an inconsistency between a formulation and a cognitive neuroscience study, the onus is on the formulator to demonstrate errors in the study, irrespective of the plausibility of the formulation. 
              5. Including neuroscience in our formulations helps us define what we do not know about our patients. The linguistic structure of psychosocial formulations makes them bad at this. 
              6. Just as with diagnosis, cognitive neuroscience offers the opportunity to identify potential treatments which could not otherwise have been imagined. 

              It is now generally accepted that there have been no major clinical advances in treatment for a generation. While this can be regarded as a failure of the promise of neuroscience, an alternative could be that our religious interpretation of the biopsychosocial model has prevented us from generating good clinical hypotheses that neuroscience could test.  

                In the blog that inspired this response, my colleague commented that she 

                wanted to be a doctor who listened, thought carefully about the options, discussed them, and then tried to help using every avenue available.

                Unfortunately, given the current state of our knowledge, “trying to help” (especially without guidelines) can involve an awful lot of false alleys. It would be great if this could be improved, and maybe thinking differently about neuroscience in clinical practice, and the identity hypothesis, might help. I agree with her that there is certainly still lots to think about. 

                If diagnosis is so good, why do so many people hate it?

                Diagnosis is at the core of what I do when I practice psychiatry.  I’ve already blogged about how I use it, and I get good feedback from my patients (yes, I still use that term, as I don’t think of myself as a service to be used: I’m nothing like a mobile phone).  So, today’s blog is setting down some of my impressions about why things seem to go wrong with diagnosis for so many.  And, as it’s a blog, I’ll be making assertions without always quoting data, so please correct me in the comments section if you think I’m wrong, saying why.

                The Perils of Chinese Whispers

                I currently do my clinical work in private practice.  OK, my practice involves families and children, so assessments take longer than usual, but I typically take 3.5 hours across two appointments and some background work if an assessment is uncomplicated.  That’s despite using an online interview schedule to cover the routine questions.  What am I doing?

                • Explaining to the patient what the assessment is going to cover, and why.
                • Reading the results of the online assessment, and preparing my candidate list of diagnoses.
                • Reviewing my candidate diagnoses with the patient.  They’re usually correct, as the online assessment tool is reliable and valid, but I explain how they’ve been arrived at.
                • Checking that the diagnoses made make sense to the patient: in particular, whether they still support what they’ve put into the online assessment.
                • Undertaking some standardised tests of cognitive function, and seeing if any of my diagnoses need reviewing in the light of those findings.  There may be blood tests, brain scans (rarely) and the like too.
                • Describing the scientific background to the diagnoses, and how they have been modified or clarified by any additional tests.
                • Discussing the implications of the diagnoses for the patient’s life, and how their identification can improve it.  This may include a discussion of appropriate and available treatments.

                Health service assessments are now often made by teams of people with different backgrounds and skills. The general rule followed is that the cheapest person (based on salary) with the skills does the part of the assessment that their skills can cover, and the patient is then handed on to more skilled (and expensive) team members for additional assessment components if required.  Diagnosticians (mostly doctors) are among the most expensive team members, and may make routine or obvious diagnoses from information provided by other team members, without seeing the patient themselves, which saves scarce medical time.  Sometimes, the diagnosis may be so obvious that a professional diagnostician may not be needed.

                But, that’s the health service perspective.  What about the patient’s? He or she gets to see one or more different people, whose relationships, background and training are mostly invisible, beyond them working (sometimes) in the same building.  Let’s imagine the patient is initially seen by a nurse, and a doctor confirms a diagnosis without seeing the patient.  What the patient (and probably the doctor) will not realise is that the nurse is trained to understand diagnosis quite differently from the doctor, as this figure shows


                A diagrammatic account of the relationship between medical diagnosis and the nursing process

                Here, the nurse has defined two sets of signs and symptoms from the medical diagnosis, grouped them either as assessment targets, or as new “nursing diagnoses” and used these, rather than the medical diagnosis, to inform the actions the nurse will take.  The nursing textbook I used offers a range of common nursing diagnoses occurring in DSM psychiatric disorders.  For autism, it suggests as common nursing diagnoses “risk for self-mutilation”; “impaired social interaction”; “impaired verbal communication” and “impaired personal identity.”  Doctors would only recognise the middle two as part of the autistic syndrome.  Other differences in this book are more extreme: for example, the nurses’ diagnoses associated with Tourette’s syndrome do not mention tics, the cardinal feature of the medical diagnosis.  While nursing diagnoses per se have not been widely adopted in the UK, the nursing process which underpins them has, but with nursing assessments leading directly to nursing actions.  The same argument can be made, mutatis mutandis, with respect to the other professions making up a multi-disciplinary team.

                So, dependent on which professional the patient talks to, and in what order, there is likely to be a melange of possible explanations of what having a diagnosis means.  These different explanations will be scattered across several appointments, possibly weeks apart. Each individual appointment will probably be too short to explore the patient’s understanding of what they’ve been told in any depth, even though, in total, they probably took up more time than I spend.  The end result can be patients who don’t understand what they’ve got, don’t think what they’ve been told fits their experience, or don’t know why people think they’ve got it.

                Diagnosis by Kafka

                kafka the trial

                Confusion, incomprehension and contradiction are not the only complaints made against diagnosis.  An assertion, common to some service users (I’m happy with that term in this context) and anti-diagnosticians alike, is that diagnosis is a kind of sentence, allowing The System to make all sorts of arbitrary, unhelpful decisions about people’s lives, where appeals get lost in endless bureaucracy, and failure to agree is countered by mind-bending chemicals or deprivation of even more liberty.  Even worse, many service users describe experiences where this is exactly what is happening.  While anti-diagnosticians and I can argue the toss endlessly about whether this is inherent in diagnosis, my position means I must come up with another explanation of how this is happening, as it clearly is.  Simply saying “bad practice” isn’t good enough: very few health professionals go to work to make their patients’ lives worse, and those that do are rare enough to be international news, and go on to long prison sentences.


                Beverley Allitt with a baby

                Also, while these sorts of things occur more frequently in poorly performing units, they are part of US inpatient culture, and similar patterns can be identified in even the best UK units.  As both the presentations I’ve linked show, it doesn’t have to be that way.  The presentations I’ve linked also give one explanation: the way these over-controlling behaviours by staff are backed by a framework of false stories that prevent the discovery of alternatives.  Pacifists incarcerated in mental hospitals in the Second World War were able to discover de-escalation techniques similar to modern recommendations very rapidly, while the usual staff were not.

                1943 pacifist mental hospital

                These stories aren’t coming from the patients’ diagnoses, but rather from the formulations we use to understand their behaviour.  I’ve already blogged about the dark side of formulation, so I won’t repeat myself here, but medical science has a long and honourable history of removing the worst of these from psychiatric patients.


                However, there is another possible reason, which deserves its own section.

                The Misuse of Diagnosis in the Nursing Process

                Nurses are the most widely employed staff in Mental Health: they serve as its helping hands, its backbone, and a good chunk of its nervous system, especially the bits related to organisation and planning.  Almost universally, they use the Nursing Process as a template for delivering care, and it is seen as essential to having defined them as a profession.  If you compare the flow-charts of the Nursing Process with how I use diagnosis (both given above) you can observe a striking difference.  In my treatment approach (which is typical for a doctor) diagnosis comes mid-way through the process, which begins with symptoms, and ends with an agreed treatment.  However, in the Nursing Process the medical diagnosis is the beginning, with the signs and symptoms observed by the nurse following from it. This is actually a serious error

                but to understand why it’s important, and what the consequences can be, we need to be clear about how psychiatric diagnoses are constructed.

                How to build a psychiatric diagnosis

                As I explained before, diagnoses, including psychiatric diagnoses, don’t depend on understanding the cause of a problem to be useful.  Their primary job is to predict a range of useful treatments, and indicate a likely outcome.  Individual symptoms (what the patient complains of) or signs (what the diagnostician observes) are usually lousy at this.

                Yup, she’s worrying.  But is it depression? or anxiety? or hyperthyroidism?

                after all, she looks pretty worried too, unless you realise the reason for her “worried” expression is actually lid lag, and yes, worry is a symptom of hyperthyroidism.

                Diagnoses start to become reliable when we combine symptoms and signs together: these combinations are called “syndromes”.  The image of hyperthyroidism just above is an example of a syndrome based on signs, though in reality these are also combined with symptoms and laboratory investigations to make the diagnosis: the key is to combine symptoms and signs that all relate to the condition they denote.  Needless to say, this is no easy task, especially in mental health, where there is usually no way to directly measure whatever system is causing the problem.  So, psychiatric diagnoses make it into the great classification manuals, currently ICD-10 and DSM 5, only after extensive review and testing, which never stops.  They aren’t safe even after they arrive, as they may be modified or dropped as new research findings emerge.  Because the science that validates them has, inevitably, a margin of uncertainty, subtle differences may also arise in how the different systems describe and encode the same condition.  This is why a psychiatric diagnosis should always be qualified by the system and version it has been made under.

                However it has been arrived at, making a psychiatric diagnosis involves collecting signs and symptoms from the patient, and then seeing if there are enough, that have enough severity, to fit one or more diagnostic categories.  So, the diagnostic profile arrived at is directly related to the patient’s presentation, and the categories have been extensively tested to be fit for purpose.

                Using the language of formal logic, we can state

                If a patient has a set of symptoms (and signs etc) x, then we can infer that they have diagnosis y (in symbols, this is x → y)

                However, what logicians, but not a lot of other people, know is that this statement can be re-written as follows

                Either there is not a set of symptoms x and/or the diagnosis is y (in symbols, ~x ∨ y)

                So, while it’s safe to infer (a valid and reliable) diagnosis from symptoms, it’s not safe to infer symptoms from a diagnosis. We experience this in real life as the same diagnosis presenting differently in different people. More formally, it’s because a set of symptoms sufficient to make a diagnosis is only a subset of all the possible symptoms related to the diagnosis, so many different subsets are possible.
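For anyone who would rather check the logic than take my word for it, here is a minimal Python sketch. It builds the truth table for "if x then y", confirms that it matches "not x, and/or y" on every row, and shows a row where the rule holds but its converse ("if y then x") fails. The function names are my own, and the "at least 5 of 9 symptoms" rule used at the end is a purely hypothetical threshold chosen for illustration, not one taken from ICD or DSM.

```python
from itertools import product
from math import comb


def implies(x: bool, y: bool) -> bool:
    """Material implication, read as 'if x then y'."""
    return y if x else True


def truth_table():
    """Return rows of (x, y, x -> y, ~x or y, y -> x)."""
    rows = []
    for x, y in product([True, False], repeat=2):
        rows.append((x, y, implies(x, y), (not x) or y, implies(y, x)))
    return rows


def qualifying_subsets(n_symptoms: int, threshold: int) -> int:
    """Count the symptom sets meeting a hypothetical 'at least threshold of n' rule."""
    return sum(comb(n_symptoms, k) for k in range(threshold, n_symptoms + 1))


if __name__ == "__main__":
    print("x      y      x->y   ~x|y   y->x")
    for row in truth_table():
        print("  ".join(f"{value!s:<5}" for value in row))
    # Hypothetical rule: any 5 or more of 9 listed symptoms are sufficient.
    print(qualifying_subsets(9, 5))
```

The row to look at is x false, y true: the diagnosis is present without that particular symptom set, the rule x → y is still satisfied, and the converse is not. Under the hypothetical 5-of-9 rule, 256 different symptom sets all lead to the same diagnosis, which is one way of seeing why the same diagnosis can present so differently in different people.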

                …meanwhile, back in the Nursing Process…

                The cart has been put firmly before the horse by requiring the nurse to assess the patient in terms of a pre-existing psychiatric diagnosis.  As the diagram shows, this means that there is a risk of the diagnosis being seen as a cause of a presenting problem.  From what we’ve just said, it’s clear that it’s nonsense to say that a diagnosis causes anything, as it’s simply a collection of signs and symptoms with predictive value. Instead, its role is as a moderator for what we observe.  Hopefully, a diagram will make this point more clearly


                In the diagram I’ve borrowed, causes are called “Predictors”, and their effects are “Outcomes”.  It’s good practice to think this way, as all too often that’s the most we can say about a putative cause. Each arrow denotes a path of influence, hence the name “path diagram”. They’re arrows because, as we’ve just argued, the paths go in one direction only. Path “a” tracks the predictor to the outcome. Path “b” tracks the moderator. Path “c” tracks the interaction between the predictor and the moderator. Let’s create an example, to see how it works in practice.

                A patient has a Conduct Disorder

                He is served some soup with a hair in it

                He throws the soup at the wall

                We can easily fit this to our path diagram

                a) predictor = hair in soup -> rejection of soup

                b) moderator = Conduct Disorder -> aggressive behaviour

                c) interaction = hair in soup X Conduct Disorder -> rejection of soup X aggressive behaviour -> soup hits wall

                The extra step in path “c” describes the interaction.  Conduct Disorder is a moderator because its influence is on how the soup is rejected; the cause of the rejection is clearly the hair.
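The three paths can also be written as a small numerical sketch. I am assuming the usual linear form of a moderation model here (outcome = a × predictor + b × moderator + c × predictor × moderator); the coefficient values are invented purely for illustration, and the "outcome" is a hypothetical score for how forcefully the soup is rejected, not anything measured.

```python
def outcome(predictor: float, moderator: float,
            a: float = 1.0, b: float = 0.5, c: float = 2.0) -> float:
    """Linear moderation model: a*P + b*M + c*P*M (coefficients are illustrative)."""
    return a * predictor + b * moderator + c * predictor * moderator


def predictor_effect(moderator: float, **coeffs: float) -> float:
    """Change in outcome when the predictor goes from absent (0) to present (1)."""
    return outcome(1, moderator, **coeffs) - outcome(0, moderator, **coeffs)


if __name__ == "__main__":
    # predictor = hair in soup (0 or 1); moderator = Conduct Disorder (0 or 1)
    print(predictor_effect(moderator=0))  # effect of the hair without the moderator
    print(predictor_effect(moderator=1))  # effect of the hair with the moderator
    print(outcome(0, 1))                  # moderator present, but no hair in the soup
```

With these made-up numbers the hair changes the outcome whether or not the moderator is present; the moderator only changes by how much. Take the hair away and almost nothing happens, which is the sense in which the diagnosis moderates the reaction rather than causing it.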

                I’m now going to construct two interchanges between our patient and a nurse.  In the first, the nurse focuses on the patient’s Conduct Disorder as causing the problem, in the second, s/he focuses on the hair, while treating the Conduct Disorder as a moderator.

                Focus on Conduct Disorder

                (Crash! Splat!)

                N: “Oh my! what’s going on!”

                P: “There was a f-cking hair in my soup!”

                N: “You’re angry, and I can appreciate that.  But, I’m worried that your anger might spiral out of control.  I think it’s important we tackle that”

                P: “J-sus f-ck! I just want something I can eat, that’s not sh-t!”

                N: “When you’ve managed to control your anger, we can discuss the soup.  Do you think you need some help with this?”

                P: “I’ll f-cking take your help and stick it up your a-se!  I’m hungry and want something that’s not been cr-pped in!”

                Conduct Disorder as Moderator

                (Crash! Splat!)

                N: “Oh my! what’s going on!”

                P: “There was a f-cking hair in my soup!”

                N: “Oh dear! That really won’t do!  I’ll see about getting another bowl for you.  The only thing is that we’ve now got two more problems…”

                P: “Uh?”

                N: “We’ve got broken crockery on the floor, and soup dripping down the wall.  Would it be OK for you to sort those out while I tackle the soup problem?”

                P: “I suppose…”

                N: “That would be great! I can get the soup sorted, and complain to the kitchen, while you get rid of the mess.  Thank you!”

                Of course, these are not real-world dialogues, and even if they were, there is no guarantee that the first would lead to escalation, or that the second would de-escalate the problem.  However, these artificial vignettes do demonstrate how much harder it was to validate the patient when the diagnosis, rather than the hair in the soup, was understood as the cause of the problem.  They are also consistent with this study of involuntary hospitalisation (referred to in the table as the “nonexistent fact”), as their tabulated results show.

                While law requires involuntary admission to be an explicit care topic in many jurisdictions, the pattern of commentary will be familiar to all who work, or have been patients in mental health.  We can see how so much invalidation of the patients’ experience has been through the improper understanding of diagnosis as a cause, rather than a moderator of what is happening. As the Nursing Process is used in Brazil (usually without a nursing diagnosis) it also fits with UK observations that, despite claiming a more holistic approach than medical diagnosis, the nursing process carries a risk of reductionism.

                Diagnosis as a Scapegoat for Organisational Oppression

                William Holman Hunt, The Scapegoat

                Organisations have three routes to oppress people: they can create confusion; they can dominate the discourse; and they can invalidate experience.  In relation to mental health, we have seen the first of these arise in the opacity and fragmentation of the multidisciplinary team; the second in the deployment of formulations; and the last in misusing diagnosis within the Nursing Process as a causal, rather than a moderating factor in patients’ experiences.  It is ironic that all these three are seen as protections from the tyranny of medical diagnosis, rather than contributors to the disempowerment of patients.

                “One Flew Over the Cuckoo’s Nest” is probably the paradigmatic anti-psychiatry film, and has demonised ECT, possibly forever.

                However, it’s primarily about the power wielded by psychiatric nurses: doctors are portrayed as ineffectual legitimisers of Nurse Ratched’s regime.  In the book, a striking feature is that treatments are provided to control disruptive behaviours, or justified by formulations.  The main protagonist, McMurphy, is eventually treated without a diagnosis ever being finalised for him.  Diagnosis, properly constructed and appropriately used, acts to protect patients’ rights and freedoms.

                A Hi-Fi nerd’s approach to psychiatric diagnosis

                The debate over whether formulation or diagnosis best captures psychiatric disorder waxes and wanes, but never goes away.  In the world of those who spend huge sums (think six and even seven figure sums) on perfect music reproduction, a very similar debate occurs, only this time the opponents are those who favour digital, versus those who prefer analogue reproduction.  In fact, the analogue-digital debate among audiophiles can teach us a lot about our own argument, and why we aren’t likely to resolve it anytime soon.

                Just like psychiatry, Hi-Fi has two approaches to evaluate perfection.  One involves lots of gadgetry, which can say how similar the sound that comes out of the system is to the sound that goes in.  This is the science.  The other is our ears, and the response we get to hearing the reproduction: of course, we usually don’t get to hear the original sound that was recorded, so we have to imagine it instead.  You can guess what happens: the reproduction the machinery reports as best isn’t always what our ears prefer.  Furthermore, while we understand some of why this is, some of this gap remains unexplained.  In the world of diagnosis, it’s the gap between one that’s serviceable, and one that allows perfect understanding.  In the commercial world of Hi-Fi, bridging that gap is what costs so much.  Digital and analogue are alternative approaches to Hi-Fi nirvana, just as diagnosis and formulation are to the psychiatric equivalent.  Before we get into the nuts and bolts of the alternatives, we need to understand that bias, reliability and validity impact on Hi-Fi as much as they do psychiatry.

                Despite the tendency of Hi-Fi companies to stick “Research” in their names, the science of electronic music reproduction, be it analogue or digital, is fully understood.  However, when we listen to music, there is so much more happening than modulated pressure waves in the air hitting our ears.  Our ears do not simply transfer the sounds they receive to our nervous system: instead, like our eyes, they sample and reconstruct.  We are more sensitive to sounds inside the range of human speech than outside; rhythm engages us emotionally, and other cues, such as echo, interference and binaural hearing lead to different kinds of understanding e.g., spatial awareness. Our emotional responses to music are also influenced by some distortions e.g., a subtle emphasis placed on high frequencies makes music sound “brighter”, and while the addition of some harmonics is experienced as harshly discordant, others are experienced as enhancing. From the perspective of the Hi-Fi engineer, there is thus a constant tension between providing accurate sound reproduction, and introducing subtle tweaks that can simulate the sense of emotional immediacy which comes from the social cues and anticipation associated with a live performance.  The Hi-Fi engineer is thus seeking what we will consider “the best” rather than “the correct” reproduction.  Because “the best” involves our emotions, which signal our values to ourselves, some of us are willing to spend huge sums to get the value we seek.  So, the engineer introduces some distortions to bias the output, which are insufficient to invalidate the connection between the reproduced music and the true original, while also reliably reproducing the emotion we anticipate from our imagined original.

                The diagnostician has a very similar task to the Hi-Fi engineer, though the science of diagnosis is much less complete than that of music reproduction.  A diagnosis must be serviceable i.e., it should point to effective treatments, if available, and indicate a prognosis.  However, the purpose of both treatment and prognosis is to increase the value in our lives, either by removing or adapting to the condition the diagnosis denotes.  Thus, in exact parallel to Hi-Fi, there is a tension between description and utility, and because it involves values, people can (and do) make enormous investments in their choices here.

                The analogue and digital modes of music reproduction represent two, radically different solutions to the same problem; how to store the information encoded in sound.  They’re both shown in the image below


                The red, smooth line shows a perfect, visual analogue transcription of a single tone (a sine wave).  The blue staircase is a digital attempt to code it, using only two bits, giving four steps in total.  These are shown to the right of the figure.  The digital coding looks pretty rubbish, doesn’t it?  However, that’s because only two bits were used, and there is no limit to the number of bits that can be used.  If you look closely at the red line, you’ll notice its edges seem a little blurred, compared to those of the blue one.  Look more closely, and you’ll discover that the red line is also a digital staircase, the blurred edges resolving into tiny steps with enough magnification.  Thus, digital offers a “gradus ad parnassum” approach to perfection; able to get as close as one likes, but theoretically never reaching it.
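If you want to see the staircase get finer for yourself, here is a minimal Python sketch. It rounds one cycle of a sine wave to the nearest of 2^bits evenly spaced levels and reports the largest gap between the smooth curve and its staircase. The function names, the choice of evenly spaced levels between -1 and 1, and the 1000 sample points are all my own assumptions for illustration; real audio converters are considerably more involved.

```python
import math


def quantise(value: float, bits: int) -> float:
    """Round a value in [-1, 1] to the nearest of 2**bits evenly spaced levels."""
    levels = 2 ** bits
    step = 2.0 / (levels - 1)            # distance between neighbouring levels
    index = round((value + 1.0) / step)  # which level is closest
    return -1.0 + index * step


def max_error(bits: int, samples: int = 1000) -> float:
    """Largest gap between one cycle of a sine wave and its quantised copy."""
    worst = 0.0
    for i in range(samples):
        v = math.sin(2 * math.pi * i / samples)
        worst = max(worst, abs(v - quantise(v, bits)))
    return worst


if __name__ == "__main__":
    for bits in (2, 4, 8, 16):
        print(bits, 2 ** bits, max_error(bits))
```

Two bits give the four steps of the blue staircase. Each extra bit doubles the number of levels, so the worst-case gap keeps shrinking towards zero, although with any finite number of bits it never quite gets there.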

                Purely analogue reproduction (e.g., vinyl recordings), on the other hand, suffers from a “Garden of Eden” problem


                In theory, there are no losses or distortions associated with analogue reproduction.  However, this ideal state never survives contact with the real world, and the serpents that live therein.  Materials have impurities in them; there is no tolerance without a margin of error, and there are plenty of opportunities for all sorts of added noise to creep into the system.  Formal measurement usually confirms that digital approaches provide more accurate reproduction than the best analogue systems.  However, analogue systems have never completely died out, and their popularity is now increasing again.  Curiously, while quantitatively greater, the biases introduced by analogue systems do not disrupt, and may even enhance, our musical enjoyment, which seems not to be true for digital systems.

                It’s hopefully becoming clear that I liken diagnosis to digital, and formulation to analogue methods of sound reproduction.  In fact, the earliest system of diagnosis was a 2-bit system.  The Smith Papyrus, which dates from the 17th Century BCE and is sometimes attributed to Imhotep, defines 1) ailments I can treat; 2) ailments I may fight with (though not necessarily win) and 3) ailments not to be treated.  Not having an ailment forms the (implicit) fourth step.  Here’s an example of how it recommended its diagnoses be employed:  the principle hasn’t changed much in around 3,700 years

                smith papyrus.001

                Of course, many more diagnoses have been developed since then, ICD and DSM being the depositories for those that are most widely accepted in mental health.  Keeping our “gradus ad parnassum” analogy going, modern diagnoses are like digital staircases attempting to approximate a smooth incline: dependent on the resolution we seek, we can either say that our approximation is sufficient for our purposes, or insufficient, in which case we need to change our steps accordingly, knowing that while we will never reach absolute perfection, we may be able to get close enough for it to make no difference.  In fact, the argument we’ve developed has shown that a “perfect” diagnosis is actually a chimaera.  What we’re after is the “best” diagnosis: something that simultaneously denotes the condition sufficiently accurately, while flagging up effective treatments and reliable prognoses which may be worked with, or adapted to.  The technical term for this is predictive validity.

                Let’s now turn to formulation, the psychiatric (or, in this context, psychological) equivalent of the analogue approach.  In theory it should be a perfect model, expressed in words of equivalent meaning and value to all who use it, of everything relevant to the condition and the person experiencing it.  This is definitely Garden of Eden territory, given the current state of our knowledge.  In particular, the brain, whose functioning seems best expressed by complex mathematical simulations, tends to get left on the “too hard” pile when formulations are constructed, thus excluding the organ we use to experience our mental health.  Expressed like this, the formulation seems a futile attempt to erect a pavilion of understanding upon an ocean of ignorance.  However, the effectiveness of formulation in delivering understanding on minimal data has long been understood in the arts.  Here’s a photograph of the damage done to Guernica


                and here’s Picasso’s famous visual formulation of the same event


                While the photograph is probably sufficient to determine that something bad had happened, communicating the dreadful truth of mass bombing on unprepared towns required, not more data, but more imagination.  Picasso’s Guernica works because verisimilitude has been tempered with Picasso’s emotions, signalling the values we should apply.

                Unfortunately, it’s often possible to make opposing formulations about the same states of affairs.  Here’s what the Spanish fascists were saying


                so it should come as no surprise that psychiatric (or psychological) formulations show a much greater propensity to bias, demonstrated through reduced reliability and validity compared to diagnoses.  At their worst, they can simply be sales pitches or propaganda, because the information they provide really comes from the storyteller’s understanding, which stands between us and the facts themselves.  However, just as with psychiatric diagnosis, we are after the “best” formulation, so our imagination should be limited to stories which have the closest possible link to the circumstances before us, not the circumstances we would like to believe.  This constrains the best formulation to be consistent with the best available science about the circumstances it is attempting to explain.

                It’s instructive to see how the Hi-Fi industry has handled its two, very different approaches to music reproduction.  There are both digital and analogue purists: the systems built to either design produce great sound, differing in character but of very similar value.  However, whichever you choose, the costs for a top-end system will be eye-watering, and the systems remarkably temperamental (especially if you have gone the analogue route).  Ordinary mortals need a different strategy.  We cherry-pick our systems across different manufacturers, trying to select components whose strengths support each other, and whose weaknesses cancel each other out.  Digital and analogue components thus frequently end up in the same system.  Does this approach work? The short answer is yes, with significant cost savings and no appreciable overall loss of sound quality.

                In psychiatry, the everyday reality is that formulations and diagnoses are used side-by-side, either explicitly or implicitly.  Diagnosis keeps us rooted in our data and evidence.  Formulation lets us co-construct imaginative stories that link the science to experience and value, offering both understanding and credible ways forward.  Purists continue to try for a “one size fits all” approach, but, just like the hi-fi purists, set themselves tasks which can only be achieved expensively, intermittently and with difficulty, if at all.