Blue Tuesday: Is there too much work against Blue Monday?


This bear is leaving home because its owners believe that Blue Monday has a scientific origin. (Attribution)

Yesterday wasn’t Blue Monday. Or to use its full name, Blue Monday (A Normal Day Of The Year Which Was Rebranded Through Marketing With A False Veneer Of Misleading Science). Blue Monday (ANDOTYWWRTMWAFVOMS) became a “not a thing” which happens as a result of holiday seller, Sky Travel, and public relations company, Porter Novelli, selling holidays and public relating. They invented a formula which supposedly calculates that the third Monday in January is the most depressing day of the year and stuck what looks like a scientist on the front to complete its fancy-dress costume of a sexy fake-science concept. Needless to say, the average mood of everyone is too complex a thing to calculate with the simple equation being touted. Saying it can is a horrendous misrepresentation of the scientific method, human emotions and mental health. The added scientist, Cliff Arnall, is not a doctor or a professor of psychology. Or of anything. Saying he is is…

It’s difficult to argue with the success of the Blue Monday (ANDOTYWWRTMWAFVOMS) idea as a piece of marketing. On the day itself, the number of companies, including charities, that use the term to promote their products or causes is vast. With the general theme of spending money to improve your mood, Blue Monday (ANDOTYWWRTMWAFVOMS) is used to sell pretty much everything; be that the holidays it was designed to sell, cars, chocolate or financial advice. Perhaps more subtly, some groups have tried to re-purpose Blue Monday (I’ll stop now). They argue that while the supposed science might be a gargantuan heap o’ nonsense, it can still be a day to consider and support those who are unhappy. In addition, a lot of people have put a lot of work into explaining why, as a scientific concept, Blue Monday has the same credibility as half a brick with a picture of Dr Emmett Brown sneezed onto it by a guinea pig. So much so, that the publication of pieces debunking the science of Blue Monday has become as much of a tradition as the shower of gaudy sadverts.


This dog is more scientific than the formula for Blue Monday. (Attribution).

For the last few years, I have gained the impression that the pieces attempting to counteract the Blue Monday information have become more common than the items using its selling power. If this were indeed the case, the main thing keeping Blue Monday alive would be the valiant efforts to kill it. This could be placed in the overlap of the Venn diagram of ironic things and bad things. However, whether this is the case is far from decided. While I have seen the same claim from others, my perception that anti Blue Monday work is more common than pro Blue Monday work is just that, a perception. Perceptions are at risk of bias.

Confirmation bias would mean that I might be interpreting information in a way that confirms my pre-existing beliefs. All the evidence I’ve seen shows that confirmation bias exists. The Baader-Meinhof phenomenon (or frequency illusion) would mean something that’s recently been noticed by me, suddenly seems to occur at a greatly increased rate. Once you’ve noticed the Baader-Meinhof phenomenon, you’ll start seeing it everywhere. Finally, the perception that anti Blue Monday work is more common than pro Blue Monday work might be the result of an echo chamber. I’m more likely to associate (digitally or in the great outdoors) with people who hold similar points of view to me. I’ll therefore see opinions the same as mine with greater frequency, and if I’m not careful will come to believe that those opinions are the most common. Everything I’ve seen on Twitter confirms I’m right.

One potential antidote to the plethora of human bias is correctly analysed data. I didn’t have that, so I took to the internet. On 16th January 2017, I searched for the term “Blue Monday” on Twitter. I didn’t specifically use the hashtag because I wanted to avoid people or organisations using it just to make their tweets more locatable on the specific day. On a separate note, SEX! I then counted the tweets that seemed to believe the effect of Blue Monday, the tweets that actively opposed the effect of Blue Monday, and the tweets that didn’t believe Blue Monday, but wanted to use it to at least gain some benefit. I did this until the total tweets I’d counted reached 100. To be counted, a tweet had to at least hint at belief in Blue Monday or otherwise. It couldn’t just spout a load of nonsense about sofas and then end with a hashtag. I also did a similar thing with Google (incognito window to avoid the influence of my search history) to count sites, news items, blog posts etc. and place them in the same categories as were used for the tweets. This was also stopped when the total number of links counted reached 100. I later checked the Google search on a separate device and found the resulting list to be practically the same.
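For anyone repeating the count, the bookkeeping itself is trivial. A minimal Python sketch is below; the category totals are invented for illustration, since the real labels were assigned by hand while reading each tweet:

```python
from collections import Counter

# Hypothetical labels, one per sampled item; in the real exercise each
# tweet/link was categorised by hand. These numbers are made up.
labels = ["pro"] * 62 + ["anti"] * 28 + ["repurposed"] * 10

counts = Counter(labels)
total = sum(counts.values())

# Report each category as a count and a percentage of the sample
for category, n in counts.most_common():
    print(f"{category}: {n} ({100 * n / total:.0f}%)")
```

Stopping at a round total of 100 makes the counts double as percentages, which is handy for the charts below.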

The results can be seen below. In summary, the pro Blue Monday items were much greater in number than the anti Blue Monday items. These were both much more prevalent than items trying to re-purpose the day. My perception was wrong, and unfortunately the work to demonstrate that the idea of Blue Monday is anti-scientific rubbish appears to still have some way to go.


Pie chart showing the proportion of pro Blue Monday, anti Blue Monday and re-purposing Blue Monday items.


One thing to note, however, was that of the pro Blue Monday items, 72% were advertisements. As discussed, these make the argument that it’s the saddest day of the year, so why not buy chocolate/hair gel/happiness? It is unclear to what extent the people behind these believe that Blue Monday is a scientific concept. While their adverts vaguely hint at belief, it’s just as likely that the mention of Blue Monday and its supposed effects is being used as a device to enhance how noticeable their brand is on a specific day; an increasingly difficult task given how common the use of the Blue Monday “brand” is. It seems to me that an advert that went with something other than Blue Monday marketing on the third Monday in January would be the one to stand out.

I’m not sure why efforts to educate people as to the non-scientific origins of Blue Monday are not working or even if they are actually not working in the first place. As discussed, it’s possible people know all of this, but find the term useful for their purposes; whether these are charitable or otherwise. Indeed, some news outlets may be using anti Blue Monday work to join in and take advantage of the temporary interest while maintaining an appearance of credibility. There’s no point in having your cake if you can’t eat it.

Ultimately and unfortunately, it appears that not much can be done about the Blue Monday juggernaut. I might still hold out hope for those valiantly explaining the gibberish behind the claims and even for those re-purposing the day for more noble causes. Judging by the current proportions, these efforts need to increase or change their methods to become more effective. How? I don’t know, although at least I’ve got nearly a year to think about it.

How unreliable are the judges on Strictly Come Dancing?


That very clean glass wall won’t hold itself up. Photo by Dogboy82 – Own work, CC BY-SA 4.0.

Strictly Come Dancing, one of the BBC’s most popular shows, in which celebrities move in specific ways with experts at moving in specific ways while other experts check if they’re moving specifically enough, contains certainties and uncertainties. We’re not sure who will be voted out in any particular week. We don’t know what the audience are going to complain about. An injured woman not dancing! I was furious with rage! We do know that Craig Revel Horwood will use the things he knows to make a decision about whether he likes a dance or not while saying something mean. We can be pretty sure what Len Goodman’s favourite river in Worcestershire, film starring Brad Pitt and Morgan Freeman, and Star Trek: Voyager character is. But can we be sure that the scores awarded by the judges to the dancers are accurate and fair?

In science, a good scoring system has at least three qualities. These include validity (it measures what it’s supposed to measure), usability (it’s practical) and reliability (it’s consistent). It’s difficult to assess the extent to which the scoring system in Strictly Come Dancing possesses these qualities. We don’t really know the criteria (if any) that the judges use to assign their scores other than they occasionally involve knees not quite being at the right angle, shoulders not quite being at the right height, and shirts not quite being able to be done up. As such, deciding whether the scores are valid or not is tricky. The scoring system appears to be superficially usable in that people use it regularly in the time it takes for a person to walk up some stairs and talk to Claudia Winkleman about whether they enjoyed or really enjoyed the kinetic energy they just transferred. In some ways, checking reliability is easier. Especially if we have a way to access every score the judges have ever awarded. And we do. Thanks Ultimate Strictly!

For a test to be reliable, we need it to give the same score when it’s measuring the same thing under the same circumstances. If the same judge saw the same dance twice under consistent conditions, we’d expect a dance to get the same score. This sort of test-retest reliability is difficult to achieve with something like Strictly Come Dancing. The judges aren’t really expected to provide scores for EXACTLY the same dance more than once. Otherwise you’d end up getting the same comments all the time; which would be as difficult to watch as the rumba is for men to dance. Ahem. However, you can look at how consistently (reliably) different judges score the same dance. If all judges consistently award dances similar scores, then we can be more sure that the system for scoring dancing is reliable between raters. If judges consistently award wildly different scores for the same dances, we might be more convinced that they’re just making it up as they go along, or “Greenfielding it” as they say in neuroscience.

To test this, all scores from across all series (except the current series, Christmas specials and anything involving Donny Osmond as a guest judge) were collated and compared. Below, we can see that by and large the judges have fairly similar median scores (Arlene Phillips and Craig = 7; Len, Bruno Tonioli, Alesha Dixon and Darcey Bussell = 8). The main differences appear to be in the range of scores, with Craig and Arlene appearing to use a more complete range of possible scores.


Box plot (shows median scores, inter-quartile ranges, maximum and minimum scores for each judge)

A similar picture is seen if we use the mean score as an average, with Craig (mean score = 6.60) awarding lower scores than the other judges, whose mean scores range from 7.05 (Arlene) to 7.65 (Len and Darcey). Strictly speaking (ironically), we shouldn’t be using the mean as an average for the dance scores. The dance scores can be classified as ordinal data (scores can be ordered, but there is no evidence that the difference between consecutive scores is equal), so many would argue that any mean value calculated is utter nonsense, sorry, meaningless, sorry, not an optimum method for observing central tendency. However, I think in this situation there are enough scores (9) for the mean to be useful; like the complete and utter measurement transgression that I am. At first glance, these scores don’t look too different and we might consider getting out the glitter-themed cocktails and celebrating the reliability of our judges.



In order to test the hypothesis that there was no real effect of “judge” on dance scores, I did a statistics at the data. In this case a Kruskal-Wallis test, because of the type of measures in use (one independent variable of ‘judge’, divided into levels of ‘different judges’, and one dependent variable of ordinal data, the scores). And yes, it would be simpler if Kruskal-Wallis was what it sounded like, a MasterChef judge with a fungal infection. Perhaps surprisingly, the results from the test could be interpreted as showing that the probability of seeing score differences this large, if the judge awarding them made no difference, was less than 1 in 10,000 (P < 0.0001). The table below shows between which judges the differences were likely to exist (P < 0.0001 for all comparisons shown as red).
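For anyone wanting to repeat the test, a minimal sketch using SciPy’s implementation of Kruskal-Wallis is below. The score lists here are invented stand-ins; the real analysis used every score from the Ultimate Strictly archive:

```python
from scipy.stats import kruskal

# Hypothetical example data: each list holds the scores one judge awarded.
# These values are made up purely to show the shape of the analysis.
craig = [5, 6, 6, 7, 7, 7, 8, 8, 9]
len_g = [7, 8, 8, 8, 8, 9, 9, 9, 10]
bruno = [7, 7, 8, 8, 8, 9, 9, 9, 10]

# Kruskal-Wallis compares the distributions of the independent "judge"
# groups without assuming the scores are anything more than ordinal.
h_statistic, p_value = kruskal(craig, len_g, bruno)
print(f"H = {h_statistic:.2f}, p = {p_value:.4f}")
```

A small p-value here suggests at least one judge’s score distribution differs from the others; it doesn’t say which, hence the pairwise comparisons in the table.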


Table showing potential differences between judges in terms of scores they give to dancers

Thus it would seem that the probability of seeing these differences if Craig had no effect on the scores is relatively small. In this instance, Craig appears to be awarding slightly lower scores compared to the other judges. The same could be said for Arlene, except when she is being compared to Craig, where she seems to award slightly higher scores.

So it transpires that the scores on Strictly Come Dancing are indeed unreliable. Arlene was, and Craig is, throwing the whole system out of alignment like a couple of Paso Dobles doing a Jive at a Waltz. Tango!

Possibly not though, for a number of reasons. 4.) I am clearly not an expert in statistics, so I may have just performed the analysis incorrectly. 2.) If differences do exist, they are relatively subtle and are likely to be meaningless within individual shows, only coming to light (and bouncing off a glitter ball) when we look across large numbers of scores. That is to say, that a statistical difference may exist, but this difference likely makes no practical difference. A.) At least it’s not The X Factor.

Keep dancing. And doing maths.

Marmite: checking whether it really is a love or hate relationship


What do you get for the person who has everything? And who you also hate? By Gilda from London, UK (Marmite pop-up shop; uploaded by Edward) [CC BY-SA 2.0], via Wikimedia Commons

Jokes about Marmite; most people don’t have strong responses to them. This is unlike the recent news that, as a result of potential Marmite price rises, one supermarket might have stopped stocking it. It was generally reported that people were furious with rage, which continued when the dispute was resolved approximately 24 hours later. And because it was opinions on the internet, people said that those opinions were wrong. And because it was definitely opinions on the internet, people went out of their way to say how little they cared about the issue. Whatever your thoughts regarding this particular spread, it’s difficult to deny that the specifics of its own “you either love it or hate it” advertising slogan have been pervasive. So much so that the name ‘Marmite’ is almost synonymous with something which polarises opinion. It’s a real Marmite situation. But what’s the question at the end of the first paragraph that reveals what the rest of the blog post is about? And is it true that people either love or hate Marmite, with no place for yeasty apathy? Luckily, surveys, maths and toast could be used to check.

The information regarding people’s opinions on Marmite was taken from the YouGov UK website. According to this website, YouGov surveys approximately 5 million online panellists across 38 countries including, among others, the UK, USA, Denmark, Saudi Arabia and China. They claim that their panellists are from a wide variety of ages and socio-economic groups, allowing them to create online samples which are nationally representative. The UK panel, from which the data used here were taken, includes more than 800,000 people. So essentially I went to the YouGov UK website, searched for ‘Marmite’ and took the numbers regarding what the people sampled thought of it. And ate some toast.


Figure 1. Numbers of people with certain opinions regarding Marmite.

Figure 1 shows the number of people who reported that they loved, liked, felt neutral about, didn’t like or hated Marmite. The YouGov website actually shows picture representations of a heart, a smiley jaundice face, a straight-mouthed jaundice face, a sad jaundice face and an angry rosacea face, which I interpreted to mean the aforementioned categories. I’m good at emoticons; sideways punctuation smiley face.

You can see that the two tallest bars are for Love It (3,289 people) and Hate It (2,235 people), followed by Like It (1,870 people), Neutral (1,067 people) and Don’t Like It (909 people). However, these aren’t necessarily the groups we’re interested in. The claim is that people either love or hate Marmite. Figure 2 shows the number of people who love or hate Marmite (Love It plus Hate It) and the number who don’t feel that strongly about it (Like It plus Neutral plus Don’t Like It). Of the two populations, Love It or Hate It (5,524 people) is larger than Don’t Feel That Strongly (3,846 people). This is perhaps shown more intuitively in Figure 3, which depicts that the love-or-hate group is 17.9 percentage points larger than the don’t-feel-that-strongly group.


Figure 2. Numbers of people who love or hate Marmite and who don’t feel that strongly.

The presence of a group of people that don’t feel that strongly about Marmite would seem to contradict the idea that there are only two populations with respect to Marmite desire. However, it could be argued that we are really examining the effect of Marmite on Marmite apathy. Does Marmite have an effect on whether you love or hate it or don’t feel that strongly about it? What is the probability of this many people loving or hating Marmite if Marmite doesn’t make you love or hate it?


Figure 3. Proportions of people who love or hate Marmite and who don’t feel that strongly.

As this was a single population (people who give their opinions to YouGov UK) and we are looking at two possible categories within that population (Love It or Hate It and Don’t Feel That Strongly About It), I used a binomial test to determine the probability of seeing this split if there were no effect of Marmite on Marmite emotiveness. This demonstrated that the chances of this many people loving or hating Marmite, if Marmite doesn’t make you love or hate it, were less than 1 in 100,000,000 (P < 0.00000001). Depending on your threshold for such things, this would seem to be a reasonable argument that Marmite has a tendency to make people feel strongly about it.
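The binomial test itself is a one-liner, using the Love It or Hate It count (5,524) out of the total sample (9,370) against a null hypothesis of an even 50:50 split. A sketch with SciPy (my tool of choice here; the original analysis may have used something else):

```python
from scipy.stats import binomtest

# 5,524 of the 9,370 respondents fell into Love It or Hate It.
# Null hypothesis: each respondent is equally likely to land in
# either camp (p = 0.5).
result = binomtest(k=5524, n=9370, p=0.5)
print(f"p = {result.pvalue:.2e}")
```

With a split this lopsided in a sample this large, the p-value comes out vanishingly small, comfortably below the 1-in-100,000,000 threshold quoted above.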

There are some potential problems with this reasoning. Firstly, the analysis could be wrong. I’m far from an expert in statistics, and it’s entirely possible that I performed the wrong tests or interpreted the results incorrectly. While eating toast.

Secondly, these data only cover people who provide information to YouGov UK. While YouGov UK would certainly claim that they are representative of the whole population, we can’t know this for sure. The same YouGov UK page claims that being a Marmite customer correlates with having gardening as a hobby and being a customer of Waitrose, and I can count on the fingers of no hands the number of times I’ve seen someone pruning the roses while eating a Marmite sandwich and some Waitrose pickled quail eggs. This is a real product, although I think it’s cruel to pickle quails. Although, that’s not really the issue. Ultimately, there might be something different about the people who report to YouGov (such as a tendency to feel strongly about yeast-derived devil’s treacle) compared with the general population, and we can’t know that just from these results. Basically, we’re saying that these results may be influenced by self-selection bias.


Well, what would you have put a picture of? Photo licensed under the Creative Commons Attribution-Share Alike 3.0 Unported license.

When people are in groups, their opinions and behaviour have a tendency to be more extreme than when they are acting as individuals. This is known in psychology as group polarisation. For example, if you have racist and sexist attitudes and join a group with racist and sexist attitudes, your racist and sexist attitudes will worsen; the group influence will trump your own lesser tendencies. Ahem. This process has also been seen to occur through social media, even though people aren’t physically interacting as groups. Observed over time on Twitter, discussion regarding political issues with like-minded individuals becomes more homogeneous and more extreme. In this instance, the hypothesis is that people identify with others who have a similar opinion to theirs regarding Marmite, and over time polarise that existing opinion until they state that they love or hate it. In reality, the truth is closer to a more moderate Marmite approval or disapproval. However, the online poll doesn’t involve group discussion and polls are completed anonymously, so even if people are basing part of their social identity on how much they enjoy a salty brown loaf goo, group polarisation seems unlikely.

Of relevance here may be a type of response bias called ‘extreme responding’. This is a tendency for people to select the most extreme responses available to them and usually depends on the wording of the question, but has been linked to age (younger = more extreme), educational level (lower = more extreme) and cognitive ability (lower = more extreme). We don’t know how the poll was worded or the composition of the poll responders, so speculation as to the extent of extreme responding is fairly pointless even though it DEFINITELY HAPPENED!

Alternatively, the well-known advertising for Marmite may have introduced another kind of response bias called ‘demand characteristics’. Here, participants in an experiment or survey change their response because they are in an experiment or survey. This is assumed to be an attempt to comply with what they believe the aims of the experiment to be. Respondents asked about Marmite may be more likely to give an extreme response based on the advertised ‘consensus’ that people either love or hate Marmite. And so the opinion spreads like a pun-based analogy.

Finally, it could actually be the case that Marmite has such a distinct flavour that people really are more likely to have an extreme response than an ambivalent one. Although at this stage you may have stopped caring. I prefer jam anyway.

How much wood WOULD a woodchuck chuck if a woodchuck could chuck wood?


Woodchucks are sometimes known as “whistle pigs”. Why are they even occasionally known as woodchucks? WHISTLE PIGS! Picture by D. Gordon E. Robertson (Own work) [CC BY-SA 3.0]

She sells seashells by the seashore? Does she? You wouldn’t think there would be demand for that in a place which is essentially a continuous supply of abandoned bivalve property. Or is that the warning? I suppose you never hear, “She sold shed-loads so she scandalously stowed some offshore, scheming sneakily and salvaging a steady supply of surreptitious savings. The sod.” Although, you do hear about how she (Mary Anning) was not eligible to join the Geological Society of London despite her several important paleontological finds and expertise on account of being a woman. The sods. Seventy seven benevolent elephants. Sounds like quite a specific zoo. How much wood would a woodchuck chuck if a woodchuck could chuck wood? Sounds like a question that can actually be answered. And as sure as someone who knows that the sign language equivalent of a tongue twister is called a finger-fumbler will try to shoehorn that fact into a barely-related conversation, a question that can actually be answered will eventually get answered.

A quick search finds that three potential answers to the woodchuck conundrum are already in existence. One traditional reply holds that, “A woodchuck would chuck as much wood as a woodchuck could chuck if a woodchuck could chuck wood”. While clever, this doesn’t really answer the question that is being asked. It’s specific in what it’s saying, but vague in how it’s of any use in the real world; like a George Osborne budget. The second potential answer is that a woodchuck can’t chuck wood. This too is not good enough in that it essentially denies the existence of a problem that it should be trying to solve. Like a George Osborne budget. The third potential answer is much better. In 1988, Richard Thomas (a wildlife technician, which is probably a thing) calculated that if a typical woodchuck burrow is 7.6 to 9.1 metres long, and the volume of dirt the woodchuck had to move to dig that burrow was translated into an equivalent volume of wood, then the woodchuck could move approximately 323 kg of wood. Much better; or at least it would be if we were asking, how much wood could a woodchuck move if the ground was made of wood? As it is, we’re left with some unsubstantiated numbers which nobody can really explain the relevance of. Which reminds me of something.

So let’s start by defining our terms. To “chuck” can mean several things, although we can safely discount most of them. It’s unlikely that the rhyme is about a woodchuck ending a relationship with some wood. Even if it were, I couldn’t find anything about paraphilia in rodents or if they could choose to abandon the object of their paraphilia, so it’s unlikely I could get an answer to that. Now if you’ll excuse me, I need to have my computer destroyed. Similarly, we can assume that we’re not trying to work out how much tree a woodchuck can vomit, as like rats and indeed most rodents, woodchucks can’t vomit (although there is one report of them vomiting due to red squill poisoning). Top tip: sit behind the woodchuck if you go on a rollercoaster. Overall, it’s likely that we want to know how much wood a woodchuck could throw if it was able to.


Some tosser. Photo by Cory Hughes [Public domain], via Wikimedia Commons

I was unable to find any reports of woodchucks throwing anything, never mind wood, so decided to take the data from human wood chuckers and extrapolate. Arguably, the best example of humans throwing large amounts of wood in an environment where this bark flinging is measured can be seen with the Scottish athletic feat of tossing the caber. Here the tosser (the definite proper technical term, so shut up) attempts to throw a large wooden pole (typically made from larch wood) so that it turns end over end in a straight line. The straightest end over end toss scores the most points. It’s essentially extreme timber filing. A typical caber is 5.94 metres tall and weighs 79 kg. According to the Guinness Book of World Records, the largest caber ever tossed was 7.62 metres long and weighed 127 kg. This is pretty impressive, but I would argue that for “how much wood” we need a large amount of wood to be chucked several times in a set period.

The most caber tosses in three minutes is 14, achieved by Kevin Fast (a strong reverend) in Canada in 2013. Kevin used two 5.02 metre long cabers, each weighing 41.73 kg. Unsurprisingly, Kevin is famous as a multiple Guinness World Records title holder. Woodchucks, also known as groundhogs, are famous for other things.

Using a fairly basic equation for Power (Work/Time, where Work = Force x Distance), we can work out that in completing one caber’s worth of his magnificent feat of tossing, Kevin transferred 55.69 Watts.

Height lifted (Kevin’s height) = 1.75 metres
Force (Mass [41.73 kg] x Gravity [9.80665 metres per second squared]) = 409.23 Newtons
Time (180 seconds/14 tosses) = 12.86 seconds

Power = (409.23 x 1.75)/12.86
Power = 55.69 Watts.
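The arithmetic above can be checked in a few lines of Python. Note that using the exact per-toss time (180/14 seconds) rather than the rounded 12.86 seconds nudges the final digit slightly:

```python
# Reproducing the caber-toss power estimate from the figures above
mass_kg = 41.73          # one caber
gravity = 9.80665        # m/s^2
height_m = 1.75          # Kevin's height, i.e. the lift distance
tosses = 14
total_time_s = 180.0     # three minutes

force_n = mass_kg * gravity            # ≈ 409.23 N
time_per_toss = total_time_s / tosses  # ≈ 12.86 s per toss
power_w = (force_n * height_m) / time_per_toss

print(f"{power_w:.2f} W")  # ≈ 55.7 W
```

This treats each toss as lifting one caber’s weight through Kevin’s height, which is the simplification the text itself makes.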

In woodchucks, the forelimb (their woodchuck arms for woodchucking) contains 44 muscles, with the two largest groups being the latissimus dorsi and pectoralis superficialis. Apparently, woodchucks have great pecs. Being specialised for digging, the highest individual power available from woodchuck forelimb muscles is 4.0 Watts. The height of a woodchuck is 0.8 metres and we’ll give our marmot friend the same amount of time to chuck his wood as we gave Kevin.

So, if Power = (Force x Distance)/Time

Then, 4.0 = (Force x 0.8)/12.86

And Force = (4.0 x 12.86)/0.8 = 64.3 Newtons = 6.56 kg.

Adjusting for scale, this means the best woodchuck woodchucker can throw a 6.56 kg caber of 1.99 metres in length 14 times in three minutes. This is both an answer and an adorable image.
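The rearrangement above can be reproduced in code, using the same rounded per-toss time as the text:

```python
# Rearranging Power = (Force x Distance)/Time for the woodchuck
power_w = 4.0     # peak woodchuck forelimb muscle power (from the text)
height_m = 0.8    # woodchuck height, i.e. the lift distance
time_s = 12.86    # same per-toss time as Kevin (180 s / 14 tosses, rounded)
gravity = 9.80665

force_n = power_w * time_s / height_m   # ≈ 64.3 N
mass_kg = force_n / gravity             # ≈ 6.56 kg

print(f"{force_n:.1f} N, {mass_kg:.2f} kg")
```

Converting the 64.3 Newtons to an equivalent mass under gravity gives the 6.56 kg quoted above.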

To check our calculations, we can work out maximum woodchuckage in another way. We know that Kevin Fast weighs 136.078 kg and can therefore estimate his lean body mass to be 74.57 kg. I used the Hume formula for this. Other formulae are available, although all are just estimates and in fact, none of them are probably suitable for a man such as Kevin, who is likely more muscular than the average pastor. Since skeletal muscle is, on average, 54% of lean body mass, we can estimate that Kevin has 40.27 kg of muscle.

For woodchucks, body mass is typically about 3.13 kg in the Spring and 4.20 kg in the Summer. A woodchuck definitely wouldn’t stand for a ludicrous “beach body” advertising campaign. Given that in Spring, adipose tissue is 40.31% of a woodchuck’s body mass (56.10% in Summer) and skeletal muscle is 52.41% of lean body mass (56.10% in Summer), then a woodchuck will typically have 0.98 kg of muscle (1.00 kg in Summer).


Trying to chuck a whole tree might be a bit hopeful. By D. Gordon E. Robertson (Own work) [CC BY-SA 3.0 (

So in the Spring, Kevin has 41.09 times the muscle mass of a woodchuck, and 40.27 times in the Summer. This is assuming that, as a non-hibernating mammal, Kevin’s weight and adipose proportions don’t fluctuate as wildly as woodchucks’ do. Muscle strength is proportional to cross-sectional area, so it is perhaps more relevant to state that woodchucks have a muscle cross-sectional area 6.41 times (Spring) and 6.35 times (Summer) smaller than Kevin’s. Correspondingly, this means that a muscular woodchuck vicar could toss a 6.51 kg caber of 1.98 metres in length in the Spring and a 6.57 kg caber of 1.99 metres in length in the Summer. In the Winter, it would probably be asleep. You’ll note that this is satisfyingly similar to our original estimate.
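This scaling can be sanity-checked in code. The cross-sectional area ratio appears to be taken as the square root of the muscle mass ratio (6.41² ≈ 41.09), and the caber length likewise shrinks with the square root of the area ratio; both relationships are assumptions inferred from the numbers in the text rather than stated in it:

```python
import math

# Cross-check of the muscle-scaling argument
kevin_muscle_kg = 40.27
caber_mass_kg = 41.73    # one of Kevin's cabers
caber_length_m = 5.02

for season, woodchuck_muscle_kg in [("Spring", 0.98), ("Summer", 1.00)]:
    mass_ratio = kevin_muscle_kg / woodchuck_muscle_kg   # ≈ 41.1 / 40.3
    csa_ratio = math.sqrt(mass_ratio)                    # ≈ 6.41 / 6.35
    wc_caber_mass = caber_mass_kg / csa_ratio            # scaled caber mass
    wc_caber_length = caber_length_m / math.sqrt(csa_ratio)
    print(f"{season}: {wc_caber_mass:.2f} kg, {wc_caber_length:.2f} m")
```

Run as written, this reproduces the roughly 6.5 kg, roughly 2 metre woodchuck caber for both seasons, matching the first estimate.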

So in conclusion, depending on the season, a very strong woodchuck member of the clergy could chuck a 6.6 kg stick of wood that was nearly 2 metres long 14 times in three minutes. In addition, if on that occasion it saw its shadow, it would mean six more weeks of maths.

How tall was Princess Jasmine’s mother?


For who could ever learn to love… metrology. Photo by Jennifer Lynn.


We can learn a lot of important lessons about genetics from Disney. For example, from The Muppet Christmas Carol (released in 1992 by Walt Disney Pictures) we learn that the ‘being a frog’ genes are on Kermit’s Y chromosome. Thus when Kermit and Piggy have children, the boys are all frogs and the girls are all pigs. We also learn that Muppet frogs and pigs are close enough as species to interbreed, although we can’t comment on how close without observing the fertility of their offspring. These are slightly more confusing lessons. I also assume that the reason that Muppet Tiny Tim couldn’t walk well was that he was actually still a tadpole and just had pushy parents. After all, there’s only one more sleep until metamorphosis.

The biological processes behind Beauty and the Beast are slightly more difficult to work out. Mrs Potts is a teapot and her son, Chip, is a cup. We know that the curse that transformed the servants of the castle into theatrical IKEA stock had been in place for 10 years. Chip seems younger than this. It should be hoped that from the moment they were transformed, the staff didn’t age and that young Chip was one of those who the witch literally made a mug of when she cast her spell. Otherwise we have to consider the idea that a teapot got pregnant and gave birth to a cup. A tale as old as time.

The biological variation within that happy crockery family is far from unique within the world of Beauty and the Beast. A person conducting a preliminary comparison of Belle and her father, Maurice, would be hard-pressed to find much of a family resemblance. Belle is tall and slim, while Maurice more closely resembles an owl that rolled itself in pastry and finished the disguise with a moustache it fashioned from leftover rodent hair. I’m not judging. I have a similar body type. The same could be said of the Sultan in Aladdin and his daughter, Princess Jasmine. Again Jasmine is tall, with barely enough abdomen to contain her colourful Disney internal organs, while the Sultan is practically spherical and would struggle to see over a crouching slug while he was wearing platform shoes. For this to work, Jasmine and Belle’s mothers must have been 10 feet tall and essentially boneless. Either that, or Disney fathers are constructed entirely from recessive genes.

We don’t have to guess at the heights of Jasmine and Belle’s mothers. These can be calculated from the heights of the princesses and their fathers. Within medicine, a person’s adult height can be estimated from their parents’ heights, using an estimation called the mid-parental height. The calculation is as follows:

Mid-parental height = (Mother’s height + Father’s height + 13 for boys, or − 13 for girls) ÷ 2

NB: Heights are in centimetres (cm).



This method isn’t perfect. For example, it doesn’t allow for extremes of parental height. Very short or very tall parents tend to have offspring of a less extreme height through simple regression to the mean. This wouldn’t be predicted by the mid-parental height estimation. However, it is a useful tool to help assess an individual child’s growth and to calculate the height of fictional princesses’ mothers. By rearranging the equation, we find that

Disney Mother Height = (Disney Daughter Height × 2) + 13 − Disney Father Height
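The two equations can be sketched as a couple of Python functions (heights in cm; the function names are mine, not any medical standard):

```python
def mid_parental_height(mother_cm, father_cm, is_boy):
    """Estimated adult height of a child from parental heights (cm)."""
    adjustment = 13 if is_boy else -13
    return (mother_cm + father_cm + adjustment) / 2

def disney_mother_height(daughter_cm, father_cm):
    """The same equation rearranged to recover a daughter's mother."""
    return daughter_cm * 2 + 13 - father_cm

# Round-trip check: the rearrangement undoes the original equation
mother = disney_mother_height(170, 130)
print(mother)  # 223
print(mid_parental_height(mother, 130, is_boy=False))  # 170.0
```

(The 170 cm and 130 cm figures happen to be the estimates used for Jasmine and the Sultan further on.)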

This idea can be tested in cases where we see both of the parents and the daughter e.g. Aurora in Sleeping Beauty and Rapunzel in Tangled.

Unfortunately, Disney doesn’t provide us with the vital statistics of the characters. Which is pretty thoughtless of them. As a result, we’re going to have to make some crude estimates. In Sleeping Beauty, Aurora’s father can be seen holding a wine bottle. Given that he ended up sleeping for 100 years, it must have been some pretty strong stuff. Or a witch did it. A standard wine bottle is approximately 30.5 cm tall and from a couple of stills from the film, Aurora’s father looks to be about 5.8 wine bottles tall. As a side note, if you start to measure your height in wine bottles, it might be time to take out the recycling. This may not be the least of your problems. We can therefore estimate Aurora’s father to be 176.9 cm (5 ft 10 inches) tall. From more stills, Aurora’s mother looks to be the equivalent of Aurora’s father’s head shorter than Aurora’s father. A human is roughly 7.5 heads tall, so Aurora’s father’s head must be 23.6 cm, which makes Aurora’s Mum 153.3 cm (5 ft) tall.

From the film, Aurora comes up to her father’s shoulders and so appears to be about 153 cm (5 ft) tall; similar to her mother. Using the mid-parental height equation, Aurora’s height is estimated at 158.6 cm (5 ft 2 inches). So we’re about 5 cm off. However, in Sleeping Beauty, Aurora is 16 years old. A woman’s final adult height can be reached at around 18 years of age, so perhaps it’s not impossible for her to grow those last 5 cm, especially if she manages to eat well and get plenty of sleep. This probably isn’t a problem.

We can test our height prediction in a similar fashion with Rapunzel from the film Tangled. In one scene, Rapunzel’s mother is observed holding a book. If we assume the book to be one octavo (a unit of measurement which should be familiar to Terry Pratchett fans, and is approximately 15.3 cm), we can see that Rapunzel’s mother is about 10.5 books tall. We can guess Rapunzel’s mother is 160.7 cm (5 ft 3 inches) tall and that her librarian is messy. Rapunzel’s father is roughly another book taller than Rapunzel’s mother, making his height 176.0 cm (5 ft 9 inches).

From pictures, Rapunzel is about one third of her mother’s head shorter than her mother. If we estimate her mother’s head to be 21.4 cm long, this gives Rapunzel’s height as 153.6 cm (5ft). The mid-parental height calculation predicts Rapunzel’s height as 161.9 cm, so we’re about 7 cm off. As with Aurora, Rapunzel may still grow a bit more (although she’s 18 years old in the film) and we might argue that she is shorter due to being mistreated and held captive in a tower. Perhaps the weight of all that hair is compressing her spinal column and making her shorter. Overall, our height estimates using mid-parental height are within 10% of what we see on screen, so should be adequate for estimating the heights of Jasmine and Belle’s mothers.
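Running both test cases through the calculation (a sketch using the screen estimates above; small differences from the quoted figures come from rounding intermediate values):

```python
def mid_parental_daughter(mother_cm, father_cm):
    # Mid-parental height for a daughter (heights in cm)
    return (mother_cm + father_cm - 13) / 2

# Aurora: father is 5.8 wine bottles tall, mother a father's-head shorter
aurora_father = 5.8 * 30.5                           # ≈ 176.9 cm
aurora_mother = aurora_father - aurora_father / 7.5  # ≈ 153.3 cm
predicted_aurora = mid_parental_daughter(aurora_mother, aurora_father)
print(round(predicted_aurora, 1))  # ≈ 158.6, vs ~153 cm on screen

# Rapunzel: mother is 10.5 octavo books tall, father one book taller
rapunzel_mother = 10.5 * 15.3              # ≈ 160.7 cm
rapunzel_father = rapunzel_mother + 15.3   # ≈ 176.0 cm
predicted_rapunzel = mid_parental_daughter(rapunzel_mother, rapunzel_father)
print(round(predicted_rapunzel, 1))  # ≈ 161.8, vs ~153.6 cm on screen
```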


Quite a bitey tape measure. Photo by Tony Hisgett.

In Aladdin, at some stage, both Jasmine and the Sultan are shown next to their pet tiger. A male tiger can be 110 cm from the ground to the shoulder. Judging by where the tiger comes up to on Jasmine, we can estimate her to be 170 cm (5 ft 7 inches) tall. Similarly, we can estimate the Sultan to be 130 cm (4 ft 3 inches) tall. The tiger method for measuring height is an exciting one, but probably won’t catch on with parents. It’s difficult to see the pen marks where you’ve marked off your child’s height on the side of a tiger. Also, it’s a tiger. Using the Disney Mother Height Calculator, Princess Jasmine’s Mum’s height is estimated to be 223 cm (7 ft 4 inches).


To put this height into context, the world’s tallest living woman, Siddiqa Parveen, is estimated to be 7 ft 8 inches tall (2.1 tigers, 15.3 books or 7.7 bottles of wine). Although of course she isn’t animated. Or fictional. And we cannot work out Siddiqa Parveen’s mother’s height using the Disney Mother Height calculator. That would be a ridiculous waste of time. There are other reasons.

Now it’s Belle’s turn. Luckily, in Beauty and the Beast, both Belle and her father get attacked by wolves. Luckily for us anyway. Like most wolf attacks, it’s shown as a bad thing in the story. An adult wolf is approximately 83 cm from ground to shoulder. In terms of height, Belle looks to be a double wolfer, coming in at 166 cm (5 ft 5 inches) tall. Belle’s father, Maurice, is approximately 1.67 wolves tall and therefore has a height of about 138.6 cm (4 ft 7 inches). Using the Disney Mother Height Calculator, Belle’s Mum’s height is estimated to be 206.4 cm (6ft 9 inches).
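Both mothers drop out of the rearranged equation in a couple of lines (a sketch; the tiger and wolf shoulder heights are the estimates given above):

```python
def disney_mother_height(daughter_cm, father_cm):
    # Mid-parental height rearranged for a daughter (heights in cm)
    return daughter_cm * 2 + 13 - father_cm

# Jasmine and the Sultan, measured in tigers (shoulder height 110 cm)
print(disney_mother_height(170, 130))  # 223 cm, about 7 ft 4 in

# Belle and Maurice, measured in wolves (shoulder height 83 cm)
belle = 2 * 83       # a double wolfer: 166 cm
maurice = 1.67 * 83  # ≈ 138.6 cm
print(round(disney_mother_height(belle, maurice), 1))  # ≈ 206.4 cm
```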

Of course, all of this assumes that Maurice was Belle’s biological father and that the Sultan was Jasmine’s. It’s likely that they were. Belle had a whole library at her disposal, so you’d think she’d have the necessary information to hand to work out if her father wasn’t related to her. Although she may have been too busy buying new furniture. All of hers recently turned into people after all.


How bad is Stormtrooper aim exactly?


A Stormtrooper gun. It’s possible they don’t know what these are for. Photo by Roy Kabanlit.

For some unknown reason, I’ve been thinking a lot about Star Wars recently. Going forward, I’ll assume you’re familiar with the events and characters of at least the first six films. If not, what have you been doing? Living in a recent, recent time in a galaxy that’s very close to here? This post inevitably contains minor spoilers for Episodes II−VI of the Star Wars films. If you haven’t seen them, inexplicably want to find out about Stormtrooper aim and don’t mind knowing some plot details, then feel free to read on.

There are some characteristics of characters or groups of characters within the Star Wars canon that are widely held to be fact, despite not always being explicitly stated within the films. Red lightsabers are for the evil, Jar Jar Binks is rubbish and Stormtroopers have worse aim than a urinating drunk man in a vibrating chair trying to hit a toilet located on The A-Team van.

Can Stormtroopers really be that bad at shooting? There is an assumption that the Empire want effective troops to maintain their evil hold of the galaxy. Surely they get some training in marksmanship rather than signing up, being given armour that doesn’t even protect against Ewoks (weirdly, the autocorrect on my phone turns ‘Ewoks’ to ‘useless’) and being told to, “go forth and do bad stuff.” In fact, Obi Wan Kenobi in Episode IV: A New Hope comments, “only Imperial Stormtroopers are so precise” when examining some blast marks on a massive used droid dealership tank. So Stormtroopers have a reputation in the Star Wars galaxy for good aim. There are a number of explanations for this:

  • Stormtroopers have good aim compared to everyone else, who is really awful (maybe the Star Wars galaxy is windy, wobbly or makes everyone slightly drunk for reasons)
  • Stormtroopers do have rubbish aim, but are good at marketing (history may contain examples where propaganda has been used by states with less than altruistic intentions)
  • Stormtroopers do have rubbish aim, but everyone is concerned about their self-esteem and tells them otherwise
  • Stormtroopers normally have good aim, but during the events of the Star Wars films develop bad aim; almost as if the Imperial Stormtrooper Marksmanship Academy has informed its troops that they should imagine themselves as antagonists in a series of films that won’t progress very far if the protagonists keep getting shot

Seems about right. Photo by The Conmunity – Pop Culture Geek from Los Angeles, CA, USA.

Why would Stormtroopers’ aim be so bad? Is it their tools? This seems unlikely given that non-Stormtroopers steal Stormtrooper weapons and seem to have no issue with shot accuracy or gaining a reputation for terrible aim. Perhaps their helmets obscure their vision and make aiming difficult. Possibly, but the Stormtrooper helmet eye holes don’t appear to be any smaller than human spectacles, which can’t be said to obscure vision. Not if they’re doing their job. They are tinted though, which may make aiming difficult in badly lit conditions and make Stormtroopers look like posers when wearing their helmets indoors.

Perhaps Stormtroopers are just human. In spite of the impression given to us by world events, it is actually quite difficult to get one person to actively shoot to kill another person. During World War I, British Lieutenant George Roupell reported that the only way he could get his soldiers to stop firing above their enemies’ heads was to beat them with his sword while ordering them to aim lower. Later reports of Lieutenant Roupell winning a medal for being a slightly charming human being may have been an exaggeration. Similarly in World War II, US Brigadier and army analyst S.L.A. Marshall reported that during battle, only 15−20% of soldiers would actually fire their weapons. This should perhaps be considered sceptically, as later analysis hints that Marshall may have fabricated at least some of his results. A 1986 study by the British Defence Operational Analysis Establishment’s field studies division found that in over 100 19th- and 20th-century battles, the rate of killing was actually much lower than potentially should have been the case given the weapons involved. Some reports from the Vietnam War state that the average US soldier fired approximately 50,000 rounds before they hit their target.

Lieutenant Colonel Dave Grossman claims that psychologically this is a result of soldiers choosing to posture (falsely display active combat to attempt to intimidate or deter the enemy) rather than fight, flee or submit to the enemy. In this regard, posturing is chosen as the least costly (psychologically, socially and physically) of the four possible options available to a soldier in combat. In terms of Star Wars, we know that the Empire is not averse to a bit of posturing with their giant shooty snow dinosaurs, Nazi-chic uniforms and ‘tis no moon space stations. Perhaps the legendary terrible aim of the Stormtroopers is simply due to a human tendency to try and look scary rather than murder another individual. Should they be renamed as ScaryLookingHugtroopers?

To even start to get an answer to this we need to at least get some idea of the accuracy of Stormtrooper aim. Luckily, counting exists and can be used to get numbers for percentage purposes. In order to calculate the Stormtrooper hit rate, the number of shots fired by Stormtroopers in Star Wars Episodes II−VI (the ones with Stormtroopers and that aren’t currently in cinemas) was counted. The number of times that the Stormtroopers hit what they were aiming at was also counted.


Let the Wookiee in. Photo by William Tung from USA (SWCA – A Stormtrooper and Chewie) [CC BY-SA 2.0], via Wikimedia Commons.

Stormtroopers were identified as such by their armour. Han Solo and Luke Skywalker were not counted as Stormtroopers when they were wearing said armour as a disguise. The Stormtroopers wearing the special armour in Episode V: The Empire Strikes Back (the ones dressed as Arctic pepper pots) and in Episode VI: Return of the Jedi (the ones with helmets like sad bulldogs) were counted as Stormtroopers. A hit was counted as such when a Stormtrooper launched or fired a projectile that hit what the Stormtrooper was judged to be aiming at. A miss was defined as when that stuff happened but the projectile didn’t hit the target. When the final resting place of a projectile was not seen on screen, it was presumed to be a miss, unless there was some kind of sound effect that hinted otherwise (like a character saying, “Ouch, this laser wound is relatively painful”). Only shots fired from hand weapons were counted. Shots fired from vehicles were not counted as some sort of computer-aided guidance may have been used. We know they have that and that it’s not as good as trusting your feelings when you’re a bit forcey.

It should be noted that the resulting Stormtrooper accuracy ratings will be rough estimates only. It’s quite difficult to count shots fired in the reasonably frenetic action scenes of these films and it is likely that the number of shots fired here is an underestimate. Also it’s not real and this may be a waste of time.
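The tallying itself is simple enough (a sketch; the shot and hit counts here are placeholders, not the real tallies behind the tables below):

```python
# Placeholder tallies -- NOT the real counts from the films
shots_fired = {"Episode IV": 252, "Episode V": 180}
shots_hit = {"Episode IV": 12, "Episode V": 13}

def accuracy(hits, shots):
    """Hit rate as a percentage, to one decimal place."""
    return round(100 * hits / shots, 1)

for film in shots_fired:
    print(film, accuracy(shots_hit[film], shots_fired[film]))

overall = accuracy(sum(shots_hit.values()), sum(shots_fired.values()))
print("Overall", overall)
```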

Table 1 illustrates the accuracy of Stormtrooper aim for each of the films and the overall Stormtrooper shot accuracy rate across all of the films. Stormtrooper aim appears to be most accurate (37.4%) in Episode III and least accurate in Episode IV. Otherwise Stormtrooper accuracy is reasonably consistent at around 7% across the other episodes with an overall accuracy of 9.8% calculated across all of the films. Of note is that Episode III is the only film where Stormtroopers can feasibly be argued to be on the side of good. It would seem that it’s being evil that’s bad for your shooting accuracy.

Table 1: Stormtrooper shot accuracy in the Star Wars films.

Table 1

However, many have noted that during the events on the Death Star in Star Wars Episode IV: A New Hope, the plan was to let Princess Leia and company escape so that the Empire could locate the Rebels’ headquarters and blow it up along with the planet they were on. The Empire is apparently not that concerned about conservation. Or about killing lots of people. As such, it is likely that the Stormtroopers firing on the protagonists had been ordered not to kill their escaping prisoners. This may change the accuracy rate for this film as we suddenly have to count every miss in these sequences as a hit. So the space abacus (calculator) was broken out again and the Stormtrooper shot accuracy rate for Episode IV and the overall Stormtrooper shot accuracy was recalculated. Table 2 shows these new figures.
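The recalculation just moves the deliberate Death Star misses into the hit column (a sketch with placeholder figures; the miss count is an assumption for illustration):

```python
def adjusted_accuracy(hits, shots, deliberate_misses):
    # Shots fired under orders to miss count as successes
    return round(100 * (hits + deliberate_misses) / shots, 1)

# Placeholder Episode IV figures: 252 shots, 12 genuine hits, and
# suppose 200 of the misses came during the let-them-escape scenes
print(adjusted_accuracy(12, 252, 200))  # 84.1
```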

Table 2: Stormtrooper shot accuracy in the Star Wars films (assuming they were aiming to miss during those bits on the Death Star in Episode IV).

Table 2

Suddenly, the accuracy of Stormtroopers doesn’t look so bad. In order to determine if this is the case, it is necessary to compare these rates with others. Ideally, this would be with other accuracy rates from the Star Wars films (probably not Greedo’s) in order to remove any confounding windy, wobbly drunken influences that the Star Wars galaxy might have. I didn’t do this for reasons of time, illness, difficulty and laziness. However, we do have some shot accuracy rates from our galaxy. These are shown in Figure 1.

Figure 1

Figure 1. Comparison of Stormtrooper shot accuracy with real-world examples.

We can see here that Stormtroopers don’t fare too terribly, with greater shot accuracy than archerfish and the average US soldier (aiming at a human-sized target) at 300 metres, but lower shot accuracy than a US sniper at 600 metres. So Stormtrooper aim suddenly doesn’t seem so bad. In terms of accuracy. Their aim is obviously “bad”. They tried to shoot Chewbacca!

If we discount the US sniper (unfair to compare to a trained specialist with more time and calibrated equipment) and the archerfish (a fish which spits water at land-insects in order to eat them and which is rarely found in conditions of modern warfare) the Stormtrooper is four times more accurate than our only remaining comparator, the average US soldier aiming at a human-sized target from 300 metres. If we accept that reduced soldier accuracy is due to posturing in favour of other combat choices, it suddenly seems that Stormtroopers are choosing to fight rather than flee, posture or submit. This makes Stormtroopers seem less human and more terrifying. Fitting soldiers for the Dark Side indeed and certainly not deserving of their reputation for inaccuracy! Unless they didn’t read the Death Star memo. Then, they’re just average.



Shocking evidence of stereotyping in Mr Men and Little Miss

I have very occasionally been asked the question, “Why are all Mr Men good and all Little Miss bad?” I’m sure this was meant to be rhetorical, with the underlying assumption that all Mr Men are good and all Little Miss are bad, but my admittedly limited recall was not in agreement with this statement. I was sure Little Miss Sunshine existed for a start and unless exposure to her was the cause of skin disease, I didn’t remember her being bad as such. I also remembered Mr Uppity, a wealthy character who was rude to everyone and could potentially run for parliament as a member of the Conservative party. I don’t think he could be considered good per se.

For those who are unaware, the Mr Men and Little Miss are a series of semi-popular children’s books, originally written by Roger Hargreaves, which took shapes, gave them faces and one bit of a personality and asked us to enjoy ourselves by judging their actions. Luckily, their popularity meant other people had heard of these Euclidean protagonists. When I asked others about the Mr Men/Little Miss morality divide, the general response was not that Mr Men were good and Little Miss were bad, but that the characters as a group were sexist. It was generally felt that the characters conformed to harmful gender stereotypes. This is certainly understandable. For a start they all live in Misterland. The place they live in is actually named just after the males of the population. It’s as if countries were called Manada, Mance or Oman. Which is obviously ridiculous. Secondly, the female characters, the Little Miss, were first created in 1981, much later than the Mr Men, whose creation began in 1971. I don’t know the actual reasoning behind this, but it does somewhat make the Little Miss seem like an afterthought. Finally (for this list, by no means for all reasons why Mr Men/Little Miss might be sexist), why don’t the Little Miss follow the same naming convention as the Mr Men? Why aren’t they the Ms Women? Or something better? “Little Miss” seems a little demeaning, like describing something that’s demeaning as “a little demeaning.”


They’re jeans! They’re all essentially the same. Just like people. Depth!   “Clothing Rack of Jeans” by Peter Griffin – Licensed under Public Domain via Wikimedia Commons

There is very little reason to even divide the characters based on binary gender. If they were real people, we could say that they each identify with a gender or different aspects of genders i.e. they all have different traits as people, and that would be fine. Except that these are characters which have been assigned a traditional gender and a specific characteristic. We don’t know how this decision is made other than the gendered title is not based on primary or secondary sexual characteristics. There’s nothing specific about the characters that even makes them stereotypically male or female other than their names. They’re all just shapes with personalities. Technically I suppose this is true for most people.

So far these are all opinions based on perceptions. Perceptions, psychologically speaking, are prone to an enormous amount of bias. For example, there’s Distinction Bias: the tendency, when considering two things, to see them as more dissimilar when evaluating them at the same time than when evaluating them separately. Like when comparing different pairs of jeans in a shop and tiny differences are magnified, but really they’re all incredibly similar because they’re just blue trousers for crying out loud! Or potentially when comparing Mr Men and Little Miss. Or there’s Trait Ascription Bias: where individuals consider themselves to be variable in terms of behaviour and mood, while considering others to be much more consistent and predictable. To be fair, this may be understandable when it comes to the Mr Men and Little Miss. Our judgement on the relative goodness of Mr Men and Little Miss may therefore be influenced by such bias. Can the morality of these shapely (literally) populations be objectively examined?

Each book in the original Mr. Men and Little Miss series introduced a different title character with a single dominant personality to convey a moral lesson. The dominant personality trait was also their name. Luckily this is not how humans or Piers Morgan are named. To examine whether the Mr Men and Little Miss are separated by some sort of weird moral judgement, it should therefore be relatively easy to use their names to observe if there are any trends.

The populations of Mr Men (n=50) and Little Miss (n=37) were examined. Based on their names alone, each character was assigned a moral weighting of good, bad or neutral. For example, Little Miss Brainy was considered good, Mr Greedy was considered bad and Mr Bounce was considered neutral. These decisions were just made by me, which will almost certainly introduce a source of bias towards my own values, determined by upbringing, culture, socialisation and so on, regarding what’s good, bad and neutral. I could have attempted to correct this by hiring a suitably varied team of Hargreaves-trained research assistants and averaging their judgements, but I haven’t the money, time, inclination or money.

The proportion of the total population for each moral assignation was then calculated. No further statistical tests were performed to compare the two populations, as the numbers involved weren’t large enough to make these comparisons meaningful. Any differences observed can therefore be considered trends or as a real statistician might technically call them, “nonsense.”

As Figure 1 illustrates, contrary to what was originally proposed, there were fewer good (18% vs. 24%) and more bad (48% vs. 38%) Mr Men compared with Little Miss. So it would seem that generally Mr Men are (a bit) morally worse than Little Miss.
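The percentages can be reproduced from raw head-counts (a sketch; the counts here are back-calculated from the stated percentages, e.g. 18% of 50 Mr Men is 9 good characters, so they’re an assumption rather than the original tally):

```python
def moral_proportions(good, bad, total):
    """Percentage of a population judged good, bad or neutral."""
    neutral = total - good - bad
    return {label: round(100 * count / total)
            for label, count in (("good", good), ("bad", bad), ("neutral", neutral))}

# Counts implied by the stated percentages (an assumption, not raw data)
mr_men = moral_proportions(good=9, bad=24, total=50)
little_miss = moral_proportions(good=9, bad=14, total=37)
print(mr_men)       # {'good': 18, 'bad': 48, 'neutral': 34}
print(little_miss)  # {'good': 24, 'bad': 38, 'neutral': 38}
```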

Figure 1. Moral Proportions of the Populations of Mr Men and Little Miss.

Figure 1

However, we know that what is considered morally good or bad changes over time. For example, it was formerly considered a moral failing to be left handed. This attitude is now agreed to be a bit sinister.  Previously there was a lot of public judgement as to the type of clothing women should wear. Nowadays, this is also done on social media. There may be one or two other examples in history. Perhaps the moral association of the Mr Men and Little Miss has also changed with time. To examine this, the populations of Mr Men and Little Miss were divided into new and old characters based on whether the book featuring them was published before or after 1990. This year was selected as a fairly natural cut-off as in 1988, Roger Hargreaves unfortunately died and his son, Adam, began writing and illustrating new stories and characters.

Figure 2. Moral Proportions of the Populations of Old and New Mr Men and Little Miss.



Figure 2 illustrates that there are fewer good (10% vs. 24%) and more bad (56% vs. 48%) old Mr Men compared with old Little Miss. It can also be seen that there were fewer good (18% vs. 25%) and more bad (25% vs. 18%) new Mr Men compared with new Little Miss.

From a slightly different perspective we can also see from these data that (numerically at least) there are more good and fewer bad new Mr Men than old Mr Men and approximately the same number of good, but fewer bad new Little Miss than old Little Miss. So it would seem:

  • Mr Men have been historically morally worse than Little Miss and continue to be so into the present day
  • New Mr Men are morally better than old Mr Men
  • New Little Miss are more morally neutral than old Little Miss

Because we’re humans with prejudices and bias, it is easy to interpret these trends in a number of ways. For example, it may be argued that it displays the prejudice of the Mr Men and Little Miss book series, with the Mr Men being allowed more complex characters and the Little Miss, where they have moral character at all, being relegated to the old “good, sweet and innocent” stereotype. Sugar and spice and all things nice, that’s what little female polygons are made of. Without looking in greater detail at the actual traits assigned, it is difficult if not impossible to say what this may reveal; if there is any stereotyping present or if these trends are simply random.

It could be argued that rather than morals changing over time, these data show the change in morals between Roger and Adam Hargreaves. I don’t know either of them, so can’t really say anything in that regard, but I do know that books are rarely just produced by one person on their own and the differences will at least reflect the views of two teams.

Judging gender stereotyping is obviously more complicated than a seemingly simple good versus bad dichotomy. The idea of gender as a binary concept is laden with all sorts of complex and subtle stereotypes and comparisons. It may be possible to broadly determine if there are any obvious stereotypical comparisons by matching the names within the Mr Men and Little Miss populations to see if they conform to any traditional gender roles.

To examine the roles of the Mr Men and Little Miss, the populations were examined to see if their names could be paired with a counterpart with the same meaning e.g. Mr. Birthday and Little Miss Birthday, with a counterpart with the opposite meaning e.g. Mr Messy and Little Miss Tidy, or if there was no counterpart e.g. Mr. Moustache. Where pairs were available, the moral weighting (good or bad) and the meaning of the names themselves were compared. Again, it was just me that was checking, so interpretation is potentially based on any prejudice I may have lurking within my poor tired brain.

Table 1. Matched and Opposing Mr Men and Little Miss Characters

Table 1

From Table 1 we can see that it was relatively more common for Mr Men to be matched with Little Miss than for them to be opposing. We should perhaps be pleased about this meagre hint of equality, although it is perhaps notable that the majority of the matching pairs may be considered bad characteristics.

Where the Mr Men and Little Miss are compared in terms of their opposite character, they seem to be reasonably balanced in terms of which group is good or bad. However, when we look at the actual words associated with the Little Miss (tidy, neat, helpful, scary) and Mr Men (messy, brave, mean) it begins to sound too much like the parents in a sitcom for us to be comfortable about the lack of gender stereotyping. The sitcom where the husband is the silly, humorous idiot and the wife is an attractive, home-based nag. I’m sure you know the one. However, these characters represent only 13% of the total pooled population. This is perhaps too small a proportion with which to judge all of the 2D people.

In summary, we have managed to get a few bits of information by looking at the total population of Mr Men and Little Miss. We know that the population of Mr Men contains more bad characters than the population of Little Miss and this is also the case historically. Pretty much just like with humans. We also know that stereotyping is likely present in this population, but we can’t say more without cooperation between more people. Pretty much just like with humans. Finally, we know that gender and how it can be used to stereotype is a complex issue (even the word gender means different things to different individuals) and that there is a lot of thought needed to advance many issues in this field. Pretty much just like with shapes with personalities.