How unreliable are the judges on Strictly Come Dancing?

That very clean glass wall won’t hold itself up. Photo by Dogboy82 – Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=44203685

Strictly Come Dancing, one of the BBC’s most popular shows involving celebrities moving in specific ways with experts at moving in specific ways while other experts check if they’re moving specifically enough, contains certainties and uncertainties. We’re not sure who will be voted out in any particular week. We don’t know what the audience are going to complain about. An injured woman not dancing! I was furious with rage! We do know that Craig Revel Horwood will use the things he knows to make a decision about whether he likes a dance or not while saying something mean. We can be pretty sure what Len Goodman’s favourite river in Worcestershire, film starring Brad Pitt and Morgan Freeman and Star Trek: Voyager character is. But can we be sure that the scores awarded by the judges to the dancers are accurate and fair?

In science, a good scoring system has at least three qualities. These include validity (it measures what it’s supposed to measure), usability (it’s practical) and reliability (it’s consistent). It’s difficult to assess the extent to which the scoring system in Strictly Come Dancing possesses these qualities. We don’t really know the criteria (if any) that the judges use to assign their scores other than they occasionally involve knees not quite being at the right angle, shoulders not quite being at the right height, and shirts not quite being able to be done up. As such, deciding whether the scores are valid or not is tricky. The scoring system appears to be superficially usable in that people use it regularly in the time it takes for a person to walk up some stairs and talk to Claudia Winkleman about whether they enjoyed or really enjoyed the kinetic energy they just transferred. In some ways, checking reliability is easier. Especially if we have a way to access every score the judges have ever awarded. And we do. Thanks Ultimate Strictly!

For a test to be reliable, we need it to give the same score when it’s measuring the same thing under the same circumstances. If the same judge saw the same dance twice under consistent conditions, we’d expect a dance to get the same score. This sort of test-retest reliability is difficult to achieve with something like Strictly Come Dancing. The judges aren’t really expected to provide scores for EXACTLY the same dance more than once. Otherwise you’d end up getting the same comments all the time; which would be as difficult to watch as the rumba is for men to dance. Ahem. However, you can look at how consistently (reliably) different judges score the same dance. If all judges consistently award dances similar scores, then we can be more sure that the system for scoring dancing is reliable between raters. If judges consistently award wildly different scores for the same dances, we might be more convinced that they’re just making it up as they go along, or “Greenfielding it” as they say in neuroscience.

To test this, all scores from across all series (except the current series, Christmas specials and anything involving Donny Osmond as a guest judge) were collated and compared. Below, we can see that by and large the judges have fairly similar median scores (Arlene Phillips and Craig = 7, Len, Bruno Tonioli, Alesha Dixon and Darcey Bussell = 8). The main differences appear to be in the range of scores, with Craig and Arlene appearing to use a more complete range of possible scores.

Box plot (shows median scores, inter-quartile ranges, maximum and minimum scores for each judge)

A similar picture is seen if we use the mean score as an average, with Craig (mean score = 6.60) awarding lower scores than the other judges, whose mean scores awarded range from 7.05 (Arlene) to 7.65 (Len and Darcey). Strictly speaking (ironically) we shouldn’t be using the mean as an average for the dance scores. The dance scores can be classified as ordinal data (scores can be ordered, but there is no evidence that the difference between consecutive scores is equal) so many would argue that any mean value calculated is utter nonsense... meaningless... not an optimum method for observing central tendency. However, I think in this situation there are enough scores (9) for the mean to be useful; like the complete and utter measurement transgression that I am. At a first glance, these scores don’t look too different and we might consider getting out the glitter-themed cocktails and celebrating the reliability of our judges.

Bar chart showing mean scores and variance for each judge.
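If you fancy comparing medians and means yourself, a minimal sketch using Python's standard library might look like the one below. The judge names and scores are invented purely for illustration (they are not the real Strictly data), and `summarise` is just a name I made up.

```python
from statistics import mean, median

# Hypothetical scores for illustration only; NOT taken from the real data set.
scores_by_judge = {
    "Judge A": [4, 6, 7, 7, 8, 9],
    "Judge B": [6, 7, 8, 8, 8, 9],
}


def summarise(scores):
    """Return (median, mean rounded to 2 d.p., lowest score, highest score)."""
    return median(scores), round(mean(scores), 2), min(scores), max(scores)


summaries = {judge: summarise(scores) for judge, scores in scores_by_judge.items()}

for judge, (med, avg, low, high) in summaries.items():
    print(f"{judge}: median={med}, mean={avg}, range={low}-{high}")
```

The median only cares about the order of the scores, which is why it is the safer choice for ordinal data; the mean quietly assumes the gap between a 6 and a 7 is the same as the gap between a 9 and a 10.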

In order to test the hypothesis that there was no real effect of “judge” on dance scores, I did a statistics at the data. In this case a Kruskal-Wallis test because of the type of measures in use (one independent variable of ‘judge’ divided into different levels of ‘different judges’ and one dependent variable of ordinal data). And yes, it would be simpler if Kruskal-Wallis was what it sounded like, a MasterChef judge with a fungal infection. Perhaps surprisingly, the results from the test used could be interpreted as showing that the probability that the judge doesn’t affect the score was less than 1 in 10,000 (P < 0.0001). The table below shows between which judges the differences were likely to exist (P < 0.0001 for all comparisons shown as red).

Table showing potential differences between judges in terms of scores they give to dancers
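For anyone who wants to poke at this sort of test themselves, here is a minimal sketch of the Kruskal-Wallis H statistic in plain Python. The function name `kruskal_wallis_h` and the toy scores are made up for illustration; this is not the code or the data used for the analysis here. Turning H into a P value would additionally need a chi-squared distribution with (number of groups minus 1) degrees of freedom, which this sketch leaves out.

```python
from collections import Counter


def kruskal_wallis_h(groups):
    """Return the tie-corrected Kruskal-Wallis H statistic for a list of samples."""
    pooled = [score for group in groups for score in group]
    n = len(pooled)
    counts = Counter(pooled)

    # Give every distinct value the average of the ranks its ties occupy.
    rank_of = {}
    ranks_used = 0
    for value in sorted(counts):
        tie_count = counts[value]
        rank_of[value] = ranks_used + (tie_count + 1) / 2
        ranks_used += tie_count

    # Uncorrected H, built from the sum of ranks within each group.
    weighted_rank_sums = sum(
        sum(rank_of[score] for score in group) ** 2 / len(group) for group in groups
    )
    h = 12 / (n * (n + 1)) * weighted_rank_sums - 3 * (n + 1)

    # Tie correction; it is zero only when every score is identical.
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    if correction == 0:
        return 0.0
    return h / correction


# Hypothetical scores for three imaginary judges (illustration only).
toy_scores = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h_statistic = kruskal_wallis_h(toy_scores)
print(round(h_statistic, 2))
```

Because the test works on ranks rather than raw scores, it doesn't need the gaps between scores to be equal, which is what makes it a reasonable fit for ordinal data like dance scores.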

Thus it would seem that the probability that Craig isn’t having an effect on score is relatively small. In this instance, Craig appears to be awarding slightly lower scores compared to the other judges. The same could be said for Arlene, except if she is being compared to Craig, where she seems to award slightly higher scores.

So it transpires that the scores on Strictly Come Dancing are indeed unreliable. Arlene did and Craig is throwing the whole system out of alignment like a couple of Paso Doble doing a Jive at a Waltz. Tango!

Possibly not though, for a number of reasons. 4.) I am clearly not an expert in statistics, so I may have just performed the analysis incorrectly. 2.) If differences do exist, they are relatively subtle and are likely to be meaningless within individual shows, only coming to light (and bouncing off a glitter ball) when we look across large numbers of scores. That is to say, that a statistical difference may exist, but this difference likely makes no practical difference. A.) At least it’s not The X Factor.

Keep dancing. And doing maths.

Shocking evidence of stereotyping in Mr Men and Little Miss

I have very occasionally been asked the question, “Why are all Mr Men good and all Little Miss bad?” I’m sure this was meant to be rhetorical, with the underlying assumption that all Mr Men are good and all Little Miss are bad, but my admittedly limited recall was not in agreement with this statement. I was sure Little Miss Sunshine existed for a start and unless exposure to her was the cause of skin disease, I didn’t remember her being bad as such. I also remembered Mr Uppity, a wealthy character who was rude to everyone and could potentially run for parliament as a member of the Conservative party. I don’t think he could be considered good per se.

For those who are unaware, the Mr Men and Little Miss are a series of semi-popular children’s books, originally written by Roger Hargreaves, which took shapes, gave them faces and one bit of a personality and asked us to enjoy ourselves by judging their actions. Luckily, their popularity meant other people had heard of these Euclidean protagonists. When I asked others about the Mr Men/Little Miss morality divide, the general response was not that Mr Men were good and Little Miss were bad, but that the characters as a group were sexist. It was generally felt that the characters conformed to harmful gender stereotypes. This is certainly understandable. For a start they all live in Misterland. The place they live in is actually named just after the males of the population. It’s like if the countries were called Manada, Mance or Oman. Which is obviously ridiculous. Secondly, the creation of the female characters, the Little Miss, began in 1981, much later than the Mr Men, whose creation began in 1971. I don’t know the actual reasoning behind this, but it does somewhat make the Little Miss seem like an afterthought. Finally (for this list, by no means for all reasons why Mr Men/Little Miss might be sexist) why don’t the Little Miss follow the same naming convention as the Mr Men? Why aren’t they the Ms Women? Or something better? “Little Miss” seems a little demeaning, like describing something that’s demeaning as “a little demeaning.”

They’re jeans! They’re all essentially the same. Just like people. Depth!   “Clothing Rack of Jeans” by Peter Griffin – Licensed under Public Domain via Wikimedia Commons

There is very little reason to even divide the characters based on binary gender. If they were real people, we could say that they each identify with a gender or different aspects of genders i.e. they all have different traits as people, and that would be fine. Except that these are characters which have been assigned a traditional gender and a specific characteristic. We don’t know how this decision is made other than the gendered title is not based on primary or secondary sexual characteristics. There’s nothing specific about the characters that even makes them stereotypically male or female other than their names. They’re all just shapes with personalities. Technically I suppose this is true for most people.

So far these are all opinions based on perceptions. Perceptions, psychologically speaking, are prone to an enormous amount of bias. For example, Distinction Bias, where there is a tendency when considering two things to see them as more dissimilar when evaluating them at the same time than when evaluating them separately. Like when comparing different pairs of jeans in a shop and tiny differences are magnified, but really they’re all incredibly similar because they’re just blue trousers for crying out loud! Or potentially when comparing Mr Men and Little Miss. Or there’s Trait Ascription Bias; where individuals consider themselves to be variable in terms of behaviour and mood, while considering others to be much more consistent and predictable. To be fair, this may be understandable when it comes to the Mr Men and Little Miss. Our judgement on the relative goodness of Mr Men and Little Miss may therefore be influenced by such bias. Can the morality of these shapely (literally) populations be objectively examined?

Each book in the original Mr. Men and Little Miss series introduced a different title character with a single dominant personality to convey a moral lesson. The dominant personality trait was also their name. Luckily this is not how humans or Piers Morgan are named. To examine whether the Mr Men and Little Miss are separated by some sort of weird moral judgement, it should therefore be relatively easy to use their names to observe if there are any trends.

The populations of Mr Men (n=50) and Little Miss (n=37) were examined. Based on their names alone, each character was assigned a moral weighting of good, bad or neutral. For example, Little Miss Brainy was considered good, Mr Greedy was considered bad and Mr Bounce was considered neutral. These decisions were just made by me, which will almost certainly introduce a source of bias towards my own values, determined by upbringing, culture, socialisation and so on, regarding what’s good, bad and neutral. I could have attempted to correct this by hiring a suitably varied team of Hargreaves-trained research assistants and averaging their judgements, but I haven’t the money, time, inclination or money.

The proportion of the total population for each moral assignation was then calculated. No further statistical tests were performed to compare the two populations, as the numbers involved weren’t large enough to make these comparisons meaningful. Any differences observed can therefore be considered trends or as a real statistician might technically call them, “nonsense.”
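The tallying described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the actual workings: the three characters and their labels are the examples given earlier in this post, while the helper names `group_of` and `moral_proportions` are my own invention.

```python
from collections import Counter

# The three example characters mentioned in this post, with their labels.
# A real run would need all 87 characters, which are not listed here.
moral_labels = {
    "Little Miss Brainy": "good",
    "Mr Greedy": "bad",
    "Mr Bounce": "neutral",
}


def group_of(name):
    """Assign a character to a population based on its title (hypothetical helper)."""
    return "Little Miss" if name.startswith("Little Miss") else "Mr Men"


def moral_proportions(labels_by_name):
    """Return {population: {label: percentage}} for good, bad and neutral."""
    grouped = {}
    for name, label in labels_by_name.items():
        grouped.setdefault(group_of(name), []).append(label)
    proportions = {}
    for population, labels in grouped.items():
        counts = Counter(labels)
        proportions[population] = {
            label: round(100 * counts[label] / len(labels), 1)
            for label in ("good", "bad", "neutral")
        }
    return proportions


proportions = moral_proportions(moral_labels)
print(proportions)
```

With only three characters the percentages are obviously silly, but the same function would work unchanged on the full populations.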

As Figure 1 illustrates, contrary to what was originally proposed, there were fewer good (18% vs. 24%) and more bad (48% vs. 38%) Mr Men compared with Little Miss. So it would seem that generally Mr Men are (a bit) morally worse than Little Miss.

Figure 1. Moral Proportions of the Populations of Mr Men and Little Miss.

Figure 1

However, we know that what is considered morally good or bad changes over time. For example, it was formerly considered a moral failing to be left-handed. This attitude is now agreed to be a bit sinister. Previously there was a lot of public judgement as to the type of clothing women should wear. Nowadays, this is also done on social media. There may be one or two other examples in history. Perhaps the moral association of the Mr Men and Little Miss has also changed with time. To examine this, the populations of Mr Men and Little Miss were divided into new and old characters based on whether the book featuring them was published before or after 1990. This year was selected as a fairly natural cut-off as in 1988, Roger Hargreaves unfortunately died and his son, Adam, began writing and illustrating new stories and characters.

Figure 2. Moral Proportions of the Populations of Old and New Mr Men and Little Miss.


Figure 2 illustrates that there are fewer good (10% vs. 24%) and more bad (56% vs. 48%) old Mr Men compared with old Little Miss. It can also be seen that there were fewer good (18% vs. 25%) and more bad (25% vs. 18%) new Mr Men compared with new Little Miss.

From a slightly different perspective we can also see from these data that (numerically at least) there are more good and fewer bad new Mr Men than old Mr Men and approximately the same number of good, but fewer bad new Little Miss than old Little Miss. So it would seem:

  • Mr Men have been historically morally worse than Little Miss and continue to be so into the present day
  • New Mr Men are morally better than old Mr Men
  • New Little Miss are more morally neutral than old Little Miss

Because we’re humans with prejudices and bias, it is easy to interpret these trends in a number of ways. For example, it may be argued that it displays the prejudice of the Mr Men and Little Miss book series, with the Mr Men being allowed more complex characters and the Little Miss, where they have moral character at all, being relegated to the old “good, sweet and innocent” stereotype. Sugar and spice and all things nice, that’s what little female polygons are made of. Without looking in greater detail at the actual traits assigned, it is difficult if not impossible to say what this may reveal; if there is any stereotyping present or if these trends are simply random.

It could be argued that rather than morals changing over time, these data show the change in morals between Roger and Adam Hargreaves. I don’t know either of them, so can’t really say anything in that regard, but I do know that books are rarely just produced by one person on their own and the differences will at least reflect the views of two teams.

Judgement across gender stereotyping is obviously more complicated than a seemingly simple good versus bad dichotomy. The idea of gender as a binary concept is laden with all sorts of complex and subtle stereotypes and comparisons. It may be possible to broadly determine if there are any obvious stereotypical comparisons by matching the names within the Mr. Men and Little Miss populations to see if they conform to any traditional gender roles.

To examine the roles of the Mr Men and Little Miss, the populations were examined to see if their names could be paired with a counterpart with the same meaning e.g. Mr. Birthday and Little Miss Birthday, with a counterpart with the opposite meaning e.g. Mr Messy and Little Miss Tidy, or if there was no counterpart e.g. Mr. Moustache. Where pairs were available, the moral weighting (good or bad) and the meaning of the names themselves were compared. Again, it was just me that was checking, so interpretation is potentially based on any prejudice I may have lurking within my poor tired brain.

Table 1. Matched and Opposing Mr Men and Little Miss Characters

Table 1

From Table 1 we can see that it was relatively more common for Mr Men to be matched with Little Miss than for them to be opposing. We should perhaps be pleased about this meagre hint of equality, although it is perhaps notable that the majority of the matching pairs may be considered bad characteristics.

Where the Mr Men and Little Miss are compared in terms of their opposite character, they seem to be reasonably balanced in terms of which group is good or bad. However, when we look at the actual words associated with the Little Miss (tidy, neat, helpful, scary) and Mr Men (messy, brave, mean) it begins to sound too much like the parents in a sitcom for us to be comfortable about the lack of gender stereotyping. The sitcom where the husband is the silly, humorous idiot and the wife is an attractive, home-based nag. I’m sure you know the one. However, these characters represent only 13% of the total pooled population. This is perhaps too small a proportion with which to judge all of the 2D people.

In summary, we have managed to get a few bits of information by looking at the total population of Mr Men and Little Miss. We know that the population of Mr Men contains more bad characters than the population of Little Miss and this is also the case historically. Pretty much just like with humans. We also know that stereotyping is likely present in this population, but we can’t say more without cooperation between more people. Pretty much just like with humans. Finally, we know that gender and how it can be used to stereotype is a complex issue (even the word gender means different things to different individuals) and that there is a lot of thought needed to advance many issues in this field. Pretty much just like with shapes with personalities.

Why Pudsey Bear is awful: An annually pointless grudge.

A bear that isn’t Pudsey. I wasn’t sure on the copyright and didn’t want to give him another reason to come after me.

Every year in connection with Children in Need I tell the story of why I don’t like Pudsey Bear. I’m told by my friends (who despite what I’m told by others, do exist) that it wouldn’t be a real Children in Need without this story. They’re humouring me of course, but humouring me is 92% of the work of being my friend, so that’s fine.  I apologise if you started reading this thinking it was a complex critique of the inadequate wealth redistribution of Children in Need or a political discourse on how if society were better we wouldn’t even require Children in Need.  I don’t know if the former is true and while the latter certainly is, there are people far better qualified than I am to discuss it. I’m afraid my story is a short, bitter, pointless grudge against a monocular bear associated with a worthy cause. If you like, at the end, you can tut and say “One night isn’t Children in Need, children are always in need.” Yes.

Are we sitting comfortably? Then I’ll begin.

As a much younger man, a child even, I was ill and had been to see a doctor. I can’t remember what the illness was. I imagine it was probably just a virus that had gone on a bit too long or possibly the ongoing inflammation of my pedantry gland. Of course I would be remiss if I didn’t point out that the pedantry gland doesn’t exist. After leaving the clinic, in fact just outside the clinic, I did a manly collapse (fainted). On my trajectory towards the ground, I decided that my head should take a slight detour towards the wall. I broke my glasses. Like most people who wear them, (*narrows eyes at hipsters*) I need my glasses for seeing. As a result, this was almost literally adding insult to injury. Actually, I guess it was just adding inconvenience to injury. As I lay there, bewildered and pathetic, head hurting, glasses broken, I noticed a blurry figure approach out of the blurry distance into the slightly less blurry foreground. It was Children in Need at the time and this figure was Pudsey Bear! He was obviously out collecting money for Children in Need. That being the thing that he’s into. Who better than the mascot of Children in Need to help a child in need outside a healthcare professional’s building? Pudsey stepped over me and carried on walking.

I’m not a fan of Pudsey Bear.

“Perhaps Pudsey didn’t see you, his vision can’t be that good.”

“Why did he step over me and carry on down the street instead of tripping over me and carrying on towards the pavement?”

I’m not a fan of Pudsey Bear.

Another acceptable bear.

It is known from studies into altruism, that the decision to stop and help someone is influenced by a number of factors. If people feel they are short of time, see someone is bleeding, think there are lots of people around so one of them will help (diffusion of responsibility) or simply don’t identify with the person who needs assistance, then they are much less likely to engage in altruistic behaviour (the bystander effect).

Perhaps Pudsey was late for an important bear appointment, was put off when he saw I was losing haemoglobin, thought one of the other people would help me and noticed I wasn’t a bear like him, so didn’t help. Perhaps Pudsey’s just awful.

I’m not a fan of Pudsey Bear.

I am a fan of the work done by Children in Need. They do good work that shouldn’t be necessary. So please give generously. Because Pudsey won’t.

Or there are lots of good charities, so you can pick one. You might as well, otherwise reading this stupid story about my ridiculous grudge against a visually-impaired ursine has been a complete waste of time.

Does Sean Bean Always Die at the End?

The Alpha Sean Bean, shown here to be still alive.
“Sean Bean TIFF 2015” by NASA/Bill Ingalls. Licensed under Public Domain via Wikimedia Commons .

There’s a quote from a character in The Lord of the Rings: The Fellowship of the Ring, and J.R.R. Tolkien’s character from some book or other, that has been doing the rounds as an internet meme for quite some time: “War makes corpses of us all.” Of course you all know it, it’s ridiculously famous, after all, one does not simply forget a Faramir quote. Much better than Boromir. In Sean Bean’s case however, the quote might as well be “appearing in a role in television or film makes a corpse of me, Sean Bean.” Sean Bean is well known for dying in films. So much so, that there exists a campaign specifically against the further onscreen killing of Sean Bean. At least, I think it still exists. It might have died.

Basically it is a fairly common assumption that if Sean Bean is in something, he will most likely not make it to the end. However, everyone knows what happens when you assume; you make a prick of yourself. Is it actually true that Sean Bean always dies? In psychology, confirmation bias describes the tendency for people to better recall information that confirms their existing beliefs than information that would refute them. The frequency illusion is where something (it can be an event or just an object) which has recently been brought to a person’s attention suddenly seems to occur or appear with greater frequency than it did before it had been noticed. This is also known as the Baader-Meinhof Phenomenon and once you know about it, you’ll start seeing it everywhere. So it is possible that the appearance of Sean Bean’s repeated celluloid mortality is a function of some common cognitive biases rather than him actually ending more times than a Sunday furniture sale. The following information that was collected to test this may contain spoilers for Sean Bean projects. Unless you believe the appearance of Sean Bean in a cast list is in itself a spoiler.

Using some sort of internet search engine (if you want to find a similar one, you can look it up on Google) all of Sean Bean’s roles in film and television were listed to create a population of Sean Beans. From here forward, the collective noun for Sean Beans used will be “population” rather than the perhaps more common “can” or “cemetery.” Sean Bean’s roles in theatre or performing voiceover in video games were not included due to a combination of being too difficult to include, laziness and the words “Sean Bean” starting to lose all meaning. The actual actor Sean Bean (the Alpha Sean) was also included, as while technically it is an ongoing role, we do know with reasonable certainty that Sean Bean will die at the end of it. The Alpha Sean was not included in any cause of death calculations in case I end up as a suspect in a future murder investigation. Jupiter Ascending was not included for obvious reasons.

The number of times Sean Bean was dead at the end of a film/TV show and the number of times Sean Bean was alive at the end of a film/TV show were counted and used to calculate the incidence of death for the total population of Sean Beans. The incidence rate is the number of new cases of a disorder or death within a population over a specified period of time. This is commonly expressed in terms of per 100,000 persons per year. In terms of deaths, this in some ways can be seen as equivalent to the Mortality Rate. Some basic demographics, causes of deaths and intentionality of deaths were also calculated.
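As a rough sketch of how an incidence rate per 100,000 persons per year is usually worked out: divide the number of deaths by the total person-years of observation and scale up. The function name `incidence_per_100k` and the numbers below are hypothetical, and I'm assuming person-years as the denominator; this is not a reconstruction of the exact sums behind the Sean Bean figure.

```python
def incidence_per_100k(deaths, person_years):
    """Deaths per 100,000 persons per year, given total person-years observed."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return 100_000 * deaths / person_years


# Hypothetical example: 5 deaths observed across 250,000 person-years.
example_rate = incidence_per_100k(deaths=5, person_years=250_000)
print(example_rate)
```

Person-years are just the sum of how long each member of the population was under observation, so 10 Sean Beans each followed for 3 years would contribute 30 person-years.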

The demographics for the population of Sean Beans are shown in Table 1.

Table 1. Sean Bean Demographics

Characteristic              Sean Bean Numbers
N                           75
Mean (SD) age, years        60,810,851.05 (523,114,369.60)
Species, n (%)
  Actor                     1 (1.33)
  Human                     71 (94.67)
  Lion                      1 (1.33)
  Portrait                  1 (1.33)
  God                       1 (1.33)
Survival
  Alive, n (%)              45 (60.00)
  Dead, n (%)               30 (40.00)

The incidence of Sean Bean deaths across the total existence so far of Sean Beans (6000 BCE to 2072) is 4.85 per 100,000 persons per year. The causes of Sean Bean death and intentionality of Sean Bean death are shown in figures 1 and 2, respectively. The most common cause of death was being shot by a gun. The best cause of death was a fall from a cliff due to a herd of cows. Most Sean Bean deaths were intentional (as a result of homicide) compared with accidental and orcicide.

Figure 1

Figure 1. Cause of Sean Bean death.

Figure 2

Figure 2. Intentionality of Sean Bean death.

The aim of all this Beanian death numbering was to determine if there was any truth to the common belief that Sean Bean always dies at the end. Examination of a fairly complete population of Sean Beans shows that this is not the case, with 60% of Sean Beans managing to survive the time it takes for many film and TV directors to tell a story. If you are a Sean Bean though, it seems you are most likely to die by being shot by a human. There may be some money to be made in a line of Sean Bean-specific bullet-proof vests.

So why is the belief that Sean Bean always shuffles off the mortal coil at the end so common? The application of confirmation bias to this has already been discussed, but for that particular bias to take effect, there must be an existing belief to confirm. The earliest manifestation of Sean Bean’s tendency for premature televisual corpse shenanigans that I could find was around his fourth appearance. However, at a preliminary glance, Sean Beans don’t seem to kick the bucket often enough early on in the ascendance of Sean Beans to make any reputational impact.

If we divide the appearance of Sean Beans into tertiles (an ordered distribution divided into three parts, each containing a third of the population, not an aquatic reptile with a shell) and look at the proportion of deaths as time progresses, we get something that looks like figure 3.

Figure 3

Figure 3. Proportion of Sean Bean deaths by Sean Bean time tertile.
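The tertile split behind a chart like this can be sketched as follows. The roles below are invented for illustration (they are not real Sean Bean data) and `death_proportion_by_tertile` is a name I've made up; the sketch simply orders the roles by year, cuts the list into three roughly equal parts and works out the proportion of deaths in each.

```python
def death_proportion_by_tertile(roles):
    """Return the proportion of deaths in each time tertile, oldest first.

    `roles` is a list of (year, died) pairs, where `died` is True or False.
    """
    if len(roles) < 3:
        raise ValueError("need at least three roles to form tertiles")
    ordered = sorted(roles, key=lambda role: role[0])
    n = len(ordered)
    first_cut, second_cut = n // 3, (2 * n) // 3
    tertiles = [
        ordered[:first_cut],
        ordered[first_cut:second_cut],
        ordered[second_cut:],
    ]
    # True counts as 1 and False as 0, so the sum is the number of deaths.
    return [sum(died for _, died in tertile) / len(tertile) for tertile in tertiles]


# Hypothetical roles for illustration only.
toy_roles = [
    (1986, False), (1990, False),
    (1995, True), (2001, True),
    (2010, False), (2015, True),
]
toy_proportions = death_proportion_by_tertile(toy_roles)
print(toy_proportions)
```

When the number of roles doesn't divide neatly by three, the later tertiles end up with the extra roles, which is one of several reasonable ways to handle the leftovers.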

We can see that if 3 is the most recent tertile and 1 is the furthest in the past, then the Sean Bean death rate appears to be greatest in the middle of the population’s progression through time. In psychology, the serial position effect describes the tendency for people to recall items earlier (the primacy effect) and later (the recency effect) in a list the best, with items in the middle being recalled the least. This would not explain the Sean Bean always dies reputation, as in such a model we would expect more deaths in the first and last tertile. Besides, one explanation for the serial position effect is that earlier items are stored more effectively in long term memory than the other items, while more recent items are still present in working memory and are thus easily available for recall. This would only apply to these data if people experienced Sean Bean necrosis as a list in front of them, which most people (besides me) don’t. Even if the data matched a serial positioning explanation, it would be a stretch (i.e. wrong) to use it to explain the Sean Bean deceased at the finale reputation phenomenon.

Rise of the Nicole Kidmen would be a good episode of Doctor Who.

Characters don’t become instantly well known in popular culture. It takes time for a reputation to build and saturate society. In this respect, perhaps we can consider the middle tertile to be more akin to the starting point for a reputation i.e. Sean Beans will be more well known, with more opinions being formed about them. The Sean Bean death rate here is 52%, meaning that during this period Sean Beans were slightly more likely than not to die at the end. This may be enough to start the rumour of Sean Beans’ non-existence by the credits and establish a source for confirmation bias.

Characters don’t exist in isolation. They usually exist in a complex ecosystem of other populations. The Sean Bean population exists alongside the population of Bruce Willises (Willi?) and the population of Nicole Kidmans (Kidmen?) among others. Important data to consider would therefore be how often Sean Beans die in comparison to other populations. If the comparative death rate of Sean Beans is noticeably higher than that of other comparable populations, then this may explain the Sean Bean clog-popping conundrum. Future “research” should focus on this (I can’t be bothered right now).

It was suggested to me by KTBUG (kgwright73) that the popularity of the mode of presentation of Sean Bean would have an impact on the perception of his tendency for pushing up the daisies. It seems feasible that if Sean Beans die in more popular things and live in less popular things, then the public perception would be that of a gentleman prone to leaving his life behind. To this end (where available) I took an average of lifetime box office takings for films where Sean Bean died and films where Sean Bean lived (figure 4).

Figure 4

Figure 4. Average lifetime box office takings by Sean Bean survival.

Figure 4 shows that films where Sean Bean shook hands with the Grim Reaper on average took more at the box office than films where Sean Bean continued respiring. If we use this as a crude measure of popularity (and it is very crude, subject to bias from missing TV shows and films where I simply couldn’t get the info) and impact on cultural awareness, then films where Sean Bean becomes an ex Sean Bean seem to have made a larger cultural impact. This could certainly be at least one source of the idea that Sean Bean always dies.

Please note, I am in no way suggesting that Sean Bean dying in it makes a film popular. As the old saying goes, “Sean Bean’s death correlation, does not prove film popularity causation.” You all know it.

In conclusion it would seem that Sean Bean’s reputation for always dying at the end is somewhat over exaggerated, with a death rate of approximately 40%. Sean Beans are most likely to die from being shot intentionally by a human or from being in the middle of their career trajectory. The Sean Bean Ex-Parrot Meme may be best explained by a high death rate at a time when Sean Beans were likely to be reaching their maximum prevalence in the public eye and by films which feature a Sean Bean death having made a larger cultural impact than films that feature a living Sean Bean at the end. These perceptions feed into confirmation bias. And then Sean Bean died.

Women are Funny.

Do not, under any circumstances, Google “funny women” to find an image for your blog post.

When you type the phrase “women comedians” into Google, the second suggestion that appears is “women comedians aren’t funny.” Now I’ve no idea how Google works, probably librarian-trained crows, but this does seem like a worryingly commonplace opinion. I have had a discussion fairly recently which involved the other person saying, “But women just aren’t funny”, which made me concerned that the person I was talking to had never met or spoken to a woman. And the person I was talking to was a woman! Probably still is.

It’s not up to me to decide what’s funny. What people find humorous, while sharing many commonalities, varies wildly and so does what people say and do in an effort to be funny. Farts! This variation is obviously true of women who, much like snowflakes, fingerprints or human beings, are all individual and unique. Some women will be funnier on average than other women and funnier on average than some men. The funniest woman is likely as funny as the funniest man. I don’t even know how you’d reliably judge “funniest”. What unit would it be measured in? MilliMillicans?

It’s not up to me to defend women. They are perfectly capable of defending themselves. Declaring that women simply lack the ability to be funny is odd though. While there are many theories as to what is humorous, one prevalent idea is that laughter comes with incongruity. This theory states that humour is perceived at the moment of realisation of incongruity between a concept and the real thing in relation to that concept. If this is the case (and it certainly seems to be at least some of the time), then by claiming that women can’t be funny you are claiming that women can’t conceive of ideas and situations not matching. This is an ironically difficult notion to conceive of.

Oestrogen and laughter are apparently not contra-indicated.

I’m not especially interested in whether the ideas that women aren’t funny or that women aren’t as funny as men are true or not. They’re blatantly not. The Funny Women Awards have just celebrated their 11th year with the 2013 winners being duo Twisted Loaf. The Funny Women Awards are unlikely to have years where they can’t award anything due to women being unusually mirthless for a select 365 days. There are multiple examples of very funny women including Sara Pascoe (@sarapascoe), Sarah Millican (@SarahMillican75), Rachel Parris (@iamrachelparris), and Gabby Hutchinson Crouch (@Scriblit). I have purposefully not made this list extensive as I am sure to miss out some excellent individuals and some idiot is bound to sweep a paw across the list and state that “None of dem are funny” as if it were an objective truism rather than a subjective comedic preference.

I’m more interested in considering the arguments people use to justify this opinion and whether they stand up to scrutiny (they won’t). I’m going to use a vague biopsychosocial approach to do this. Not because I think detractors of female comedy (or, as it is sometimes known, “comedy”) do so, but because it’s a reasonably simple way to manage the ideas.

Evolution/Biology

Evolutionary psychologist Geoffrey Miller (when he wasn’t busy tweeting about students being fat) proposed that human characteristics like humour evolved by sexual selection. Sexual selection: good name for a part of evolutionary theory, bad name for a box of confectionery. He argues that humour (which he states has little survival value) emerged as an indicator of other traits that were of survival value, such as intelligence. On this basis, if you argue that women aren’t or can’t be funny you would be arguing that either women can’t use humour to show their intelligence (clearly wrong), that they can but they don’t (clearly wrong because of examples) or that if they did men might not appreciate it (ahem). Women are showing intelligence through humour and people are ignoring it or, at worst, threatened by it? They would have to be pretty small-minded, insecure people. At this stage you can assume I am giving meaningful looks.

Another evolutionary psychology theory takes a break from copying Rudyard Kipling and argues that, like male deer clashing antlers, humour is produced by males competitively to impress potential mates for breeding. Consistent with this theory is research that females indicate a preference for mates who make them laugh, whereas males prefer a mate who laughs at their humour.

However, the data are not entirely consistent with this view. Most studies find male humour appeals most to other men. In purely evolutionary terms, if you are in search of a mate to breed with, attracting a bunch of guffaws and their supposed sexual advances from members of the same gender isn’t the best move. Secondly, this theory in no way explains why women can’t do the same thing. If you’re arguing for a theory, it’s not really enough to state that they just don’t. Any attempts by MRI to catch the ovaries strangling jokes before they leave the body have thus far failed. So we’re left with a theory that tries to make humour the exclusive domain of rutting men, but fails like a pleasant look on Piers Morgan’s face.

Psychology

Lee Mack on Radio 4’s Desert Island Discs has said fewer women become comedians because they are not so inclined to show off or be competitive in conversation. Lee Mack stated, “I am only quoting other scientific reports on it. When men sit around together and talk they are very competitive… When you get six women in a room together they share a lot more…and it’s a more interactive.” This idea may have links to the evolutionary theories seen previously.

The concept that men are more likely to do stand-up comedy or just be funny because they are more competitive than women is pervasive. Generally, research into how groups of single and mixed sexes converse agrees with what Lee Mack is saying. A sentence I never thought I’d type. But these are just tendencies. Women may be more likely to support each other in conversation, but that doesn’t mean they all do it all the time. They can also be competitive and try to show off. The same goes for men and being supportive, and chances are it’s largely context dependent.

These studies investigated conversation and weren’t about being funny and/or a stand-up comedian. Just because a woman is on average more likely not to be competitive in conversation, doesn’t mean she won’t change her style of interaction when “performing” to her friends or performing onstage as a comedian.

It was depressingly difficult to find a picture of a female clown that wasn’t trying to be “sexy”.

Finally, and more importantly, competition and showing off don’t necessarily equate to funnier. For some reason people who make this argument seem to be focussing on one style of comedy. One-upmanship is fine for some things (human pyramids and so on), but a lot of comedy relies on interaction, support and listening e.g. improvisation, sketch comedy. Stand-up itself doesn’t need to be competitive as such and many a skilled comedian can build a hilarious act through audience interaction and support. Just watch Dara Ó Briain open a show.

Social (and some psychology)

The entertainment industry seems to agree with the idea that women are not or can’t be funny, or at least can’t be as funny as men. One figure tossed around is that only 10% of stand-up comedians are women and it’s relatively rare to see more than one woman on one of the ubiquitous comedy panel shows.  I don’t have the data to argue that many more women want to be or are funny and hard-working enough to be successful stand-up comedians and lack or don’t see the opportunity, but given societal and prevalent psychological bias it seems a likely explanation.

It would seem that across an alarming swathe of society, humour and the production of humour is not valued or even recognised in women. If you think women aren’t funny and as a result ignore it when they are, then what’s the incentive for women to be funny? Lo and behold, you fulfil your own bias. Or you try to. If you hold the ridiculous opinion that women aren’t funny and as proof try to point out a non-existent lack of funny women, then by your own logic you only have yourself to blame. Luckily there are women who defy this societal bias to produce excellent comedy.

Research shows humorous items are often remembered more successfully, in a phenomenon known as the humour effect. For example in one study (linked to already in these ramblings) related to providing funny captions, the items judged as funnier were remembered better. The analyses also provided evidence for a humour-based retrieval bias.  Individuals of both genders tended to misattribute humorous captions to male writers. This was true both for misremembering captions whose author’s sex the participants knew and for when participants were only guessing the sex of a caption author. So again it’s not that women can’t or aren’t being funny, it’s that due to existing societal bias, when they are you don’t remember or worse, you remember the humour and think it was a man that did it. Again you only have yourself to blame for thinking there are no funny women. “I don’t remember ever doing this!” you might shout. Quite.

The Guff at the Long-Awaited End

Ultimately there appears to be no strong argument that women can’t be funny or aren’t funny or aren’t as funny as men. If you think there is, then you are contributing to the biased social and psychological forces that contrive to give that appearance. This isn’t surprising and I’m sorry if any of this has come across as patronising. I don’t think that people who hold that opinion have even thought about it that much other than as a subtle impact of prejudice. Then why bother taking apart the arguments behind women being “not funny” at all? To paraphrase Joss Whedon, “I’ve got a theory, it could be bunnies…”

 

Medicus Ex Machina: Is the sonic screwdriver in Doctor Who a deus ex machina?

Let’s hope they don’t slash the special effects budget too much.

I like Doctor Who. “I am getting a bit fed up of the sonic screwdriver being used as a deus ex machina” is what I said in a brief fit of being wrong after watching a recent episode. I wasn’t wrong about me being fed up. I am capable of identifying my emotional state at least 20% of the time. I was wrong about the use of the favourite sonic tool of one of fiction’s most popular Time Lords. Yet that the sonic screwdriver gets used as a deus ex machina is one of the most common arguments involving the noise-based lock pick. So much so in fact that you might think that the people using the phrase think that the small amount of incorrectly used Latin will act as a deus ex machina in their argument and automatically solve any logical problems their point has. Quod erat demonstrandum. However, it is true that this literary device can be seen as lazy writing, leaving audiences unsatisfied. So what is a deus ex machina, is The Doctor’s sonic screwdriver a good example of one and, if it is, why is the use of a deus ex machina problematic?

Doctor Who is a British science fiction programme produced by the BBC about an alien known as The Doctor who can travel through time and space. It’s been going a little while and a couple of people watch it. The sonic screwdriver, first introduced to the programme in 1968, is a tool commonly used by The Doctor. It is multi-functional, with the most common use being as a lock pick (unless the lock is wooden or a deadlock seal because of rules). To this date the sonic screwdriver has been used to heal injuries, modify phones, scan and identify objects, probe another’s physiology, fix barbed wire, redirect the teleportation of the mayor of Cardiff, cut or burn substances, remotely control a time machine, summon a flying shark and generally put devices made by Apple to shame. This list is by no means exhaustive. Chances are if The Doctor comes across a problem, he’ll reach for his sonic screwdriver. Screwdrivers are cool.

Despite being so obviously useful (or because it was so obviously useful) the sonic screwdriver was briefly written out of the series in 1982. This was done on the instructions of the show producer John Nathan-Turner, who argued that such a device, which could help the main character out of almost any situation, was limiting to the script. It would become boring to the viewers if, in response to any obstacle, the solution was always to produce this magic wand. Conversely, if the screwdriver wasn’t used in response to a problem, pedantic viewers may be justified in asking why The Doctor didn’t just use one of the many known functions of this handyman’s dream tool. Luckily pedantic science-fiction fans are rare. Rare in the whole of the known universe that is. It is this omni-usefulness that has led fans of the show to complain that the screwdriver is used as a deus ex machina.

A deus ex machina, literally a “god from the machine”, is a plot device whereby an apparently unsolvable problem is suddenly or abruptly solved, with the contrived and unexpected intervention of some new event, character, ability, or object. The potential original use of the phrase is from Horace’s Ars Poetica. Horace argued poets should never resort to a god from the machine to solve their plots. This referred more literally to the crane or other device used in Greek tragedies to lower actors playing gods onto the stage or lift them up through a trap door.

There are a number of requirements for a plot development to be categorised as a deus ex machina:

1.)    Deus ex machina are solutions. They shouldn’t make things worse. They can’t be twists that only change the understanding of a story.

2.)    The plot device must be sudden or unexpected. If the relevant item is featured or referenced earlier in the story, it will not change the course of the story at that point or even appear to be a likely solution to the problem it eventually solves.

3.)    The problem the deus ex machina solves must be otherwise unsolvable. If the problem could be solved with common sense or another simple intervention, the solution is not a deus ex machina no matter how unexpected it seems. It’s just a bit fancy and unnecessary.
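If you like your literary criticism machine-readable, the three requirements above boil down to a checklist where every box has to be ticked. A rough sketch in Python follows. The class, its field names and the second example are my own invention for illustration, and deciding whether each box is ticked for a given story is still a judgement call that no amount of code will make for you.

```python
from dataclasses import dataclass


@dataclass
class PlotDevice:
    """Hypothetical record of how a plot development scores on each requirement."""
    name: str
    solves_problem: bool        # 1.) it is a solution, not a twist or a complication
    unexpected: bool            # 2.) it was not set up as a likely solution beforehand
    otherwise_unsolvable: bool  # 3.) nothing simpler would have done the job


def is_deus_ex_machina(device: PlotDevice) -> bool:
    """A development only counts if all three requirements hold at once."""
    return (
        device.solves_problem
        and device.unexpected
        and device.otherwise_unsolvable
    )


# A rescue that arrives out of nowhere when all else has failed ticks every box.
sudden_rescue = PlotDevice("sudden rescue from the sky", True, True, True)

# An invented counter-example: a key shown in an earlier scene fails requirement 2.
foreshadowed_key = PlotDevice("key shown in an earlier scene", True, False, True)

print(is_deus_ex_machina(sudden_rescue))
print(is_deus_ex_machina(foreshadowed_key))
```

Failing any single requirement is enough to get a plot device off the hook, which is worth bearing in mind for later.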

Popular examples of deus ex machina in literature and film include the random rescue of hobbits by giant eagles in The Lord of the Rings and the sudden arrival of King Richard in Robin Hood: Prince of Thieves to shuddenly sholve all the heroe’sh problemsh.

A deus ex machina is usually criticised as undesirable in writing and often used to imply lack of imagination in the writer. Reasons given are that it acts as a sudden disregard for a story’s logic and can challenge the suspension of disbelief required for an audience to remain emotionally involved in a narrative. Elephants on unicycles. It is usually argued it is better for characters to have agency within a story. Characters should be responsible for events, with identified skill-sets leading to a more likely and perhaps more palatable story conclusion. In turn this leads to possible acceptable uses of the deus ex machina as a device. A writer may want to highlight the powerlessness of the characters in a large and mysterious universe. Or the use of a deus ex machina might be funny or used to make some other point. This point may or may not exist until after the use of a deus ex machina has been pointed out to the writer.

Perhaps surprisingly there has been little research investigating why deus ex machina are experienced as unacceptable. I could not find any apparent examples when searching PubMed, PsycINFO (search engines for a certain type of scientific research paper) or Google Scholar and nothing turned up at the last minute to unexpectedly deliver any to me. Experiments with babies show they pay more attention to unexpected events inconsistent with their rudimentary understanding of the world. For example, if they are shown a doll, a screen covers that doll and they see another doll placed behind that screen, they look for longer at the rigged experimental outcome of there being only one doll when the screen is lowered than when there are two. Similarly babies are shown to look longer at a ball which appears to roll on its own than a ball that is rolled by a person. Neither of these really tells us anything about the use of deus ex machina in literature and in fact could be twisted out of recognition to support some theory that says people prefer unexpected events or solutions. Sadly these shoehorned studies do not suddenly save us in exploring why deus ex machina are generally unsatisfying in stories.

Deus ex machina are definitely undesirable in science. Scientists devise hypotheses, deduce implications for observations from them, and test those implications. Any explanation that invokes some mysterious, unexpected solution to a problem without reference to the internal logic i.e. established scientific laws of the universe, is not a scientific theory at all. Even Bayesian statistics or “inverse probabilities”, which start with a prior distribution and make assumptions about probability, can be used to check scientific models. Implications of assumptions of the model are compared to the empirical evidence. If the model makes wild claims from unlikely data that doesn’t fit the existing “good” evidence then it is likely not an accurate model. I’m talking to you Andrew Wakefield. Wakefield being another person in this post that’s not a real doctor.

None of this however answers our original (and likely now nearly forgotten) question as to whether the sonic screwdriver is a deus ex machina. As hinted, I would now argue that it isn’t. It certainly would fit our first criterion in acting as a solution or a quick fix. It also fits the third criterion, in that the problems may be unsolvable without the screwdriver. However, it is certainly not unexpected, so it fails the second. As Andrew Ellard, script editor on such popular television programmes as The IT Crowd and Red Dwarf, has argued, The Doctor as a Time Lord is an alien with extremely advanced technology. Sufficiently advanced in fact to often appear as magic. The sonic screwdriver is an example of this. The fact that it has a lot of functions appearing for the first time in certain episodes is also in keeping with this. You don’t use all the applications of your smartphone all the time. An episode where The Doctor lists every function of the sonic screwdriver, set in stone for the rest of the series’ lifetime, would not be interesting. Unless the idea of a Time Lord-inspired Top Gear-style “Top Screwdrivers” appeals to you.

The sonic screwdriver is used to solve realistic (locked doors, wounds and flying sharks) but dull problems. We don’t want our hero to spend an episode staring at a locked door, fiddling with his scarf. We want him to use his established technology to move through the story to the more interesting problems. The sonic screwdriver allows this. It is not a deus ex machina and, if used responsibly and not too frequently, it is not a problem. Also Doctor Who is a thoroughly enjoyable series and even if the sonic screwdriver were an occasional deus ex machina I’m not sure it would make it any less fun. Even if you are a surprised baby.