Tom Moultrie’s Sad facts – occasional musings on the abuse of statistics
I was alerted to this statistical horror by a friend who found it on the Guardian’s website; the screengrab above shows the original source cited by the Guardian. Nowhere on oneinnine’s website is evidence marshalled to support their contention.
“In South Africa a woman is more likely to be raped than learn how to read”. Say what?
A micro-nanosecond’s consideration should have you scratching your head in trying to interpret this “fact”.
First, the comparison is invalid. Learning how to read is an event that can occur only once in one’s life (and, typically, at a very young age). It is not, in any meaningful sense, repeatable. It is also not exposure-dependent: it is (with exceedingly rare exceptions) not the case that the longer it takes you to learn how to read, the more likely it becomes that you will. As a counter-example, think of a lightbulb – the longer it has been left on, the more likely it is to burn out. The experience of rape, by contrast, is both repeatable and exposure-dependent. The comparison is then one of apples and pears.
Second, what does it mean to assert “more likely to be raped”? The only sensible context for this statement is an implicit “in her lifetime”. (One could surmise that it might mean “more likely to be raped in the next ten minutes/week/month …”, but that assertion would be patently ridiculous. So let’s assume that oneinnine did indeed mean “in her lifetime”.) But doing so raises a slew of statistical problems of how to measure this. The only women for whom one could empirically calculate the probability are those who have reached an age at which they are deemed no longer to be at risk of rape: one would evaluate the proportion of them who have been raped. You can’t do it for younger women (say, by looking at the proportion of 30 year olds who have been raped), because some of those women may suffer rape later on in their lives. But if you interrogate the proportion of (say) 80 year olds who have been raped, many of those violations would have occurred a long time in the past – possibly up to 80 years ago, somewhere around 1930. Is the incidence of rape the same now as it was in the past? There is almost certainly a secular (period) trend in the incidence of rape, which the statement does not allow for. There is also certainly an age effect in the incidence of rape: rape does not affect all women of all ages equally. There may also be a cohort trend in the incidence of rape: think of women aged in their early 20s living in Rwanda in 1994.
To calculate the lifetime risk of being raped in 2010, one would need to make assumptions about future period trends in the incidence of rape; assume an age pattern of the incidence of rape into the future, and accommodate any past, present, or potential future cohort effects. By definition, this requires prognostication, and is inherently unreliable. [An identical phenomenon occurs with the calculation of life expectancies: how many years can a baby born in Swaziland today expect to live before he or she dies? The interpretation is not straightforward, as the assumption that has been made here is that when that child is (say) 10 years old in 2020, he or she will experience the mortality of a 10 year-old in Swaziland in 2010.]
So oneinnine’s assertion is contrived from a melange of unstated assumptions about age, period, and cohort dynamics of sexual violence in South Africa, covering a period of time of roughly two hundred years (the retrospective experiences of women nearing the end of their lives combined with the prospective experience of an infant in 2010). These assumptions should be clearly set out, so that people can verify and validate them.
Third, even if we gave oneinnine the benefit of the doubt that they were able to collect and correctly interpret the data on rape in South Africa, the comparison to women’s literacy is simply ludicrous. A global programme of surveys, the Demographic and Health Surveys, has been conducted using a standardised questionnaire for several decades. The surveys use an extensively tested instrument, and are designed and weighted to be nationally representative. The last such survey done in South Africa was in 1998 and showed that – of women aged 15-49 – 8.2% of African women, 3.9% of Coloured women, 0.9% of Indian women and 0.1% of White women could not read at all. In aggregate, 6.9% of South African women aged 15-49 could not read at all. Differently put, 93.1% of South African women aged 15-49 HAD learned to read. (Other surveys, for example the 2008 National Income Dynamics Study, suggest that literacy has increased since the 1998 survey – they find that 3.9% of women aged 15-49 across the country reported not being able to read at all in their mother tongue.)
So oneinnine are suggesting that more than 93% of South African women would be raped in their lifetime. Really? (If one was to be generous, one would note that the incidence of illiteracy would be markedly higher among very old African women, but then we are descending into the age-period-cohort rabbit-hole again). No matter, this number is so outrageously high, it simply cannot be true.
In passing, while acknowledging that reports of rape are notoriously difficult to collect in surveys, the disjuncture between oneinnine’s statistics and those from the 1998 Demographic and Health Survey is astonishing: In answer to the question “Has anyone ever forced you to have sexual intercourse against your will by threatening, holding you down or hurting you in some way?”, 4.5% of South African women aged 15-49 answered in the affirmative (interestingly, the figure was lower for older women than younger women); while only 2.4% answered in the affirmative to the broader question “Has anyone ever persuaded you to have sexual intercourse when you did not want to?”. While I have no doubt that these data almost certainly underestimate the incidence of rape or coerced sexual intercourse, it is hard to see how one gets from answers in the range of less than 5% to an implied incidence of rape of 93%.
The incidence and extent of rape in South Africa is a major, major issue and is a shocking reflection on the country, and its attitudes to women, gender roles, patriarchy and domestic violence. Nothing can be said to condone it or explain it away. However, activists should not demean and undermine the severity of the problem by simply making up numbers and facts to suit their position. It helps no-one, and achieves nothing (and is, if anything, counterproductive).
In time, as Goebbels argued, once the lie has been repeated enough times, people will believe it; take it as axiomatic. Oneinnine gets it wrong; it is used uncritically by a journalist on a reputable paper; which is then cited by other journalists, reports and documentation. It will not be surprising to find this ‘fact’ being used to justify and mobilise support for any number of public, NGO and civil society campaigns into the future. And the error is not socially neutral – it distorts the assessment and evaluation of social priorities and funding. The perpetuation of this analytical and statistical lie and its concomitant social reproduction must be stopped.




Excellent post, thanks. I saw that claim on Twitter and my bullshit detector went crazy. Thanks for refuting it so thoroughly…
Please send more “say what?” social statistics …. I am always on the lookout for spurious and fraudulent (and simply bullshit) assertions….
Tom
Another thing, how do they get the 1-in-9 statistic? It’s a strange number considering the many assumptions and unknowns on which it is based. Maybe it’s true, but it smells dodgy.
Many public campaigns are tainted by such abuse of statistics. More’s the pity, when it undermines a campaign that addresses a serious issue.
I like your article and your reasoning and agree wholeheartedly that oneinnine’s use of statistics and logic is absolute fiddlesticks – but instead of indulging in navel-gazing and existential reasoning about the validity or not of various rabbit-holes of reasoning, surely a more solid way to attack the article would have been to send a list of questions to the originators of the ad, asking for the evidence and supporting documentation – and then use that as the basis on which to construct a valid rebuttal. Sorry, but while the end result is sound, the writer’s method is weak. George Monbiot shows how to do it properly http://www.guardian.co.uk/environment/georgemonbiot/2011/apr/13/anti-nuclear-lobby-interrogate-beliefs?CMP=twt_iph
A comment via Facebook (permission to repost here granted). Gareth Fouche said:
Thanks for Gareth’s post.
We may differ, but I do believe that the first paragraph (by which I presume Gareth means the fourth of the post) is NOT garbage. His point *might* be valid if the “risk” of learning to read was largely independent of age. And of course it isn’t. While ABET programmes *do* bring some adults to a point of functional literacy, such outcomes are comparatively rare, as could be shown by synthetic cohort analyses of census and survey data on literacy – the increment in the proportions literate is rather small after school-leaving age. So we are back, again, to a comparison of the lifetime risk of an event that may occur at any point in one’s life, versus that of an event which, if one has not achieved it by one’s early twenties, is distinctly unlikely to occur.
By way of example, from the 1998 SA DHS, only 5.3% of African women aged 25-29 could not read. The potential for significant increases in the proportion literate in this cohort (who would be approaching age 40 now) is, shall we say, somewhat limited. Among African women 15-19 in the same survey only 1.7% could not read. The same point applies, and more so.
So – according to 1-in-9’s numbers – fewer than 1.7% of African women aged 15-19 in 1998 will NOT be raped in their lifetime? This would be (roughly) equivalent to a constant (age- and time-independent) risk of being raped of 5% p.a. over a further 60 years of lifetime.
The numbers simply don’t add up. Cheers.
Correction. The constant annual risk equivalent to a lifetime risk of NOT being raped of 1.7% (assuming, optimistically, survival to one’s late 70s) is 6.5% p.a. for 60 years.
To put that 6.5% into perspective: among South African women in 2011, that is approximately the risk that a woman who has just turned 77 has of dying before reaching her 78th birthday.
Hi,
I’m the Gareth that posted what Jacques said.
The first paragraph is garbage. There aren’t two different types of probability at work here. Probability is simply the odds of achieving a specific result out of all possible results in a set. So you can, for example, compare the probability of getting a single head when you flip a coin 3 times VS your lifetime chance of getting eaten by a shark (once, obviously).
The contexts are different, yes, but the number that pops out of the equations is a probability that can be compared to any other probability. There aren’t different ‘types’ of probability at work here, such that they can’t be compared.
As to the actual issue, whether a woman has a higher chance of being raped than learning to read. I don’t know how they arrived at that claim, so I can’t really judge. I agree that, in general, it seems unlikely. Then again, we may be missing context of some sort. Ideally, we could contact the originator of the claim and ask them.
I did some cursory searching on the net (not enough to draw any reasonable conclusions, who knows how reliable most of these sources are) and discovered that Wikipedia (not a reliable source, obviously), lists the claim as ‘women have a higher chance of getting raped than finishing secondary school’.
Which leads me to wonder if there is some sort of broken telephone effect going on here. The 2003 SADHS report says only about a third of women in the 15-49 age category finish secondary education, for whatever reasons. Which is clearly a lower target to match than nearly 100%.
Consider that the official reported figure for rape was 55000 in 2006, and the National Institute of Crime Rehabilitation estimates that only 1 in 20 rapes are reported, putting the figure at 494000 a year.
There are 50 million people in SA; assume half are women, and 494k is around 2% of that total (we’re assuming all rapes were of women; of course they won’t be). The odds of getting raped at least once between 15 and 49 are then:
1 – (0.98)^35 ≈ 0.507
In other words, if those figures are accurate, a woman has a 50.7% chance of getting raped in that time period. Which is greater than her chance of finishing secondary school, but less than her odds of learning to read, indeed.
IF the figures are accurate, IF they are representative, etc etc. There are a number of assumptions there; it was simply to illustrate that a tiny probability, over time, will snowball. I have no real idea how anyone is calculating these things, whether particular studies are based on small, non-representative samples, etc. Could be complete rubbish, could be a bit of broken telephone, could be cherry picking studies in non-representative samples.
Even if it is just ‘higher than chance to finish secondary school’, it’s still fairly horrifying.
Gareth
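For readers who want to check the compounding arithmetic in the comment above, here is a minimal sketch. It assumes, as the comment does, a constant annual risk that is independent from year to year, and it treats every estimated rape as affecting a different woman; both are simplifications. The function name `cumulative_risk` is purely illustrative.

```python
def cumulative_risk(annual_risk: float, years: int) -> float:
    """Probability of at least one event over `years` years.

    Assumes a constant annual risk and independence between years,
    which is the simplification used in the comment above.
    """
    return 1.0 - (1.0 - annual_risk) ** years


# Figures quoted in the comment (not independently verified here).
estimated_rapes_per_year = 494_000
women_in_population = 25_000_000  # half of 50 million, as assumed above

# Crude annual risk: assumes each estimated rape involves a different woman.
annual_risk = estimated_rapes_per_year / women_in_population
print(round(annual_risk, 4))  # roughly 2%

# Chance of at least one rape between ages 15 and 49 at a flat 2% per year.
print(round(cumulative_risk(0.02, 35), 3))
```

The same function applied to a flat 5% per year over 60 years gives a cumulative risk of a little over 95%, which is why a small annual figure should never be compared directly with a lifetime figure.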
1. My point was, and is, that the only sensible metric for 1-in-9’s comparison is LIFETIME risk.
That is not explicit in their claim, but it’s the only sensible claim.
Therefore one is comparing the LIFETIME risk of being raped vs the LIFETIME “risk” of learning to read. Anything else is comparing apples and oranges.
2. For younger women, the DHS data are as reliable as we can get.
And it’s probable that (if anything) there is greater retention of kids in school now than in the mid-late 1990s.
So. Consider again. 98.3% of those women have ALREADY learned to read. For their risk of being raped to be greater than that, fewer than 1.7% of these women must NOT be raped in their lifetime.
3. You seem to agree that the probability of being raped in the next t years is 1-(1-p)^t, where p is the probability of being raped in any year, i.e. if p = 5%, there is a 95% chance of NOT being raped each year. Raising this to the power of 60 (assuming independence between years) gives the cumulative probability of not being raped in 60 years. One minus that is the probability of BEING raped.
4. You choose p = 0.02 and t = 35. Why? You are equating the ANNUAL risk of being raped with the lifetime risk of not learning to read among younger women. And why 35? It should be over her entire life… Or do you think that no woman over 50 gets raped?
I solved for p (assuming t to be 60, i.e. e(17.5), the remaining life expectancy at age 17.5, approximately equal to 60; we can quibble about t, but it’s certainly greater than 35) such that (1-p)^t < 0.017, or equivalently 1-(1-p)^t > 0.983. p comes close to 0.065, as claimed earlier.
5. There are not two kinds of probability. There would appear to be a massive confusion in your mind between an annual risk and a cumulative risk.
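The inversion described in points 3 and 4 can be sketched as follows: given a target lifetime risk and a number of years of exposure, solve 1-(1-p)^t for the constant annual risk p. This is only an illustration under the same assumptions as above (constant risk, independence between years, t = 60); the function name `annual_risk_for_lifetime_risk` is hypothetical.

```python
def annual_risk_for_lifetime_risk(lifetime_risk: float, years: int) -> float:
    """Constant annual risk p such that 1 - (1 - p)**years == lifetime_risk.

    Assumes the same risk every year and independence between years.
    """
    return 1.0 - (1.0 - lifetime_risk) ** (1.0 / years)


# For rape to be "more likely" than literacy among women aged 15-19 in 1998,
# the lifetime risk would have to exceed 98.3% (only 1.7% could not read).
p = annual_risk_for_lifetime_risk(0.983, 60)
print(round(p, 3))  # about 0.066, close to the 6.5% quoted above

# Round trip: compounding p over 60 years recovers the lifetime risk.
print(round(1.0 - (1.0 - p) ** 60, 3))
```

With t = 35 instead of 60 the required annual risk is higher still, so the choice of t does not rescue the claim.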