In the news today, two 12-year-old girls who have taken Mensa assessments have been pronounced to have IQs of 162, and to “already be cleverer than Einstein or Stephen Hawking”. As someone qualified to test IQ in a validated way, I find this infuriating, as it compounds public misconceptions about IQ.
Let me start with some basic explanation of how an IQ test works. An IQ test is a set of ten or more different types of puzzles and questions that is scored by comparison to a sample designed to represent the population. It is set up so that an IQ score of 100 is the population average score; the higher you score, the better you compare to the population. Most people score near the average IQ, so the distribution is a bell curve: a distribution with fixed mathematical properties called a normal distribution. A normal distribution with a mean of 100 and a standard deviation (SD) of 15 points is used to create standard scores. IQ scores are calculated by taking the raw score on the test, comparing it to the norm group, and then transposing the result onto the standard distribution.
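To make that last step concrete, here is a minimal sketch in Python of how a raw score might be placed on the standard scale. It is an illustration only: real tests combine several subtests and use published lookup tables, and the function name `standard_score`, the mid-rank percentile formula and the tiny norm sample are all my own assumptions rather than any test publisher's method.

```python
from statistics import NormalDist

# The standard IQ scale described above: mean 100, SD 15.
IQ_SCALE = NormalDist(mu=100, sigma=15)

def standard_score(raw_score, norm_raw_scores):
    """Place a raw score on the IQ scale via its percentile rank in a norm sample.

    Hypothetical helper for illustration; not a real scoring procedure.
    """
    n = len(norm_raw_scores)
    below = sum(1 for s in norm_raw_scores if s < raw_score)
    ties = sum(1 for s in norm_raw_scores if s == raw_score)
    # Mid-rank percentile, kept strictly between 0 and 1 so inv_cdf is defined.
    percentile = (below + 0.5 * ties + 0.5) / (n + 1)
    return IQ_SCALE.inv_cdf(percentile)

# Made-up norm sample of raw scores, purely for demonstration.
norm_sample = list(range(1, 10))
print(round(standard_score(5, norm_sample), 1))  # the median raw score
print(round(standard_score(9, norm_sample), 1))  # the top raw score in the sample
```

The median raw score in the sample lands on 100, and higher raw scores land further up the scale.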
A properly validated IQ test shows IQ scores in terms of the number of standard deviations that person’s score is from the mean in a normal distribution of other people of their age in their country. As you can see from the graph below, the average IQ score is set at 100, and just over two thirds of people have an IQ score between 85 and 115. The further a score is from 100 the more unusual it is in the population. Only 2.28% of people score more than 2 standard deviations above the mean, and the same proportion score more than 2 standard deviations below it. At the lower end, 2.28% of people have an IQ below 70 (considered a Learning Disability) and at the upper end of the range 2.28% have an IQ over 130 (considered “gifted” or “very superior”).
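If you want to check those proportions yourself, they follow directly from the normal distribution. This short Python sketch uses the standard library's `statistics.NormalDist`; the variable names are my own.

```python
from statistics import NormalDist

# IQ scale: mean 100, standard deviation 15.
iq = NormalDist(mu=100, sigma=15)

within_one_sd = iq.cdf(115) - iq.cdf(85)  # share scoring between 85 and 115
below_70 = iq.cdf(70)                     # share scoring below 70
above_130 = 1 - iq.cdf(130)               # share scoring above 130

print(f"85 to 115: {within_one_sd:.2%}")
print(f"below 70:  {below_70:.2%}")
print(f"above 130: {above_130:.2%}")
```

This gives roughly 68% for the middle band and about 2.28% for each tail, matching the figures above.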

So what about people who are super-bright? How accurately can we measure their ability?
Given that IQ tests are normed on only a few thousand people, the sample and thus the knowledge we have about the distribution of IQ in those ranges is pretty limited. By the time we look at the sample 3 SDs above the mean at 145+ we are studying about 0.13% of the population (roughly 13 people in every ten thousand) and that means that the normative sample will probably contain only two or three of those people at best. Of course most of the tests are normed in the USA, and the UK sample used to ensure it transfers to this country was only hundreds of people, so it probably didn’t contain any.
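The arithmetic behind that point can be sketched as follows. The tail of a normal distribution beyond three SDs is about 0.13%, so the expected number of such people in a norm sample is just that share multiplied by the sample size. The sample sizes here are hypothetical round numbers, not the actual norming samples of any particular test.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)
share_above_145 = 1 - iq.cdf(145)  # share of a normal population above 145

def expected_above_145(sample_size):
    """Expected number of people scoring above 145 in a representative sample."""
    return sample_size * share_above_145

# Hypothetical sample sizes, chosen only to illustrate the scale of the problem.
for n in (500, 2000, 5000):
    print(n, round(expected_above_145(n), 1))
```

A sample of hundreds is expected to contain fewer than one such person, and a sample of a couple of thousand fewer than three.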
We then have measurement error, which increases as you get to the edges of the distribution. A noise in the background during one item, or a moment of misunderstanding, can change the score by one or two points, and in these ranges that could make a huge difference. How valid can it be to stratify people in these extremes according to individual items of knowledge?
So, you would typically give either a confidence interval (the range in which the person is 95% likely to score if re-tested, according to the statistical properties of the test, usually the individual IQ score plus or minus 4-6 points) or a percentile. And the general practice is to say “the top 0.5%” for all scores above 140 and be no more specific than that. So the answer to my question is that we know fairly little about the distribution of IQ scores above 130 and the tests are not very good at reliably differentiating between scores in that range. That means I’d be sceptical about anyone claiming “genius” who cites a specific IQ score, as they clearly aren’t enough of a genius to understand the statistics or science of IQ measurement!
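For the curious, one common textbook way to arrive at an interval of that size is the standard error of measurement from classical test theory: SEM = SD × √(1 − reliability), with the 95% interval being the score plus or minus 1.96 × SEM. The sketch below assumes a reliability of 0.97, which is my illustrative figure rather than one taken from any test manual, and real manuals may build their intervals somewhat differently.

```python
import math

def confidence_interval(score, reliability, sd=15.0, z=1.96):
    """Approximate 95% interval using the classical standard error of measurement.

    Illustrative helper; `reliability` is an assumed test reliability coefficient.
    """
    sem = sd * math.sqrt(1.0 - reliability)
    margin = z * sem
    return score - margin, score + margin

low, high = confidence_interval(130, reliability=0.97)
print(f"{low:.1f} to {high:.1f}")
```

With those assumptions the margin comes out at about five points either side, in line with the 4-6 points mentioned above.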
Does someone with a high IQ as a child keep getting higher?
No, cognitive assessments are designed to measure ability relative to your age peers, so your score is likely to remain similar as you get older. A child with an IQ of 130 is likely to become an adult with an IQ of 130, give or take the error of measurement, unless there is a significant head injury or some other explanation for the change. If someone under-performs on the test for some reason (for example because their attention is very poor) their score might improve if that reason is addressed (if their attention is improved by medication, a change in their environment or practice at similar tasks). You can practise the specific tasks used in IQ tests and learn general knowledge and vocabulary deliberately to improve your score, and the score is not normally considered valid if the same test is used again within 24 months because of practice effects.
The norms for IQ tests are gathered for each language and country, so although they are meant to be “culture free” you also need to be mindful of cultural or language barriers to performance. And of course, in the end IQ scores measure how good you are at IQ tests, which may not reflect how “intelligent” you are in real life, where social skills, emotional intelligence, interests, ability to use executive functions to concentrate, self-monitor, learn from feedback and many other factors affect the degree to which you can succeed or appear exceptional. In fact it is often the people with the narrowest focus in their skill-set who are able to make the most impact in that area, and they may often not have the breadth of skills to appear that intelligent in other contexts. Of course there are some genuine polymaths, but it isn’t clear that IQ scores reflect functional skills beyond being reasonably predictive of academic attainments.
What about the IQ test I did online?
There are many online “IQ tests” that don’t have any proper norms and give meaningless results (in fact some of them like to give everybody high scores so that they will be more likely to share the link). In short, all of these “IQ tests” are done for entertainment and have little or no relationship with your actual IQ. There are also many assessments that are used in employment or education that have various levels of validation, and may be helpful to understand functional skills or predict attainments, but they don’t measure IQ. There was even a TV programme linked with a “test the nation” survey a few years ago that used a simplified IQ-like survey to let people test themselves and see how they compared to other participants, but the sample collected is likely to be biased towards people who think they are clever and want to know their IQ, and to exclude people with learning disabilities.
For this reason a lot of people think they have done IQ tests, or know their IQ, when this isn’t really the case. In reality, if you want a properly validated IQ test then this is harder to come by. Only a practitioner psychologist is licensed to assess and interpret cognitive functioning with a validated IQ test. Because there is limited access to these tests in health and education settings, and the materials are restricted to certain professionals and expensive to use, IQ assessments are usually used to help identify areas of difficulty (eg for people with a learning disability or specific learning difficulties), or as part of an assessment for a particular condition (eg when looking at forms of neurodiversity like autism or ADHD). Having a validated IQ assessment is also expensive in the private sector (approx £600-£1500). So, for all these reasons, they are not typically used for vanity testing for people who think they are clever.
So what about these scores cited in the media of IQs of 162? Are they cleverer than Stephen Hawking or Einstein?
Many high IQ societies don’t use the cognitive assessment tools that Clinical Psychologists use (like the Wechsler tests, or the Stanford-Binet). Mensa, for example, use the Cattell, which is not widely accepted as a valid test of IQ and has relatively low correlation with the standardised tests I mentioned. This test has a completely different scoring system, with a standard deviation of 24 points. This makes the highs look higher and the lows look lower, and it seems it is popular amongst high IQ societies because it is cheap to administer and pleasing to their members as it gives nice high numbers (you can also practise for it by doing the same kind of puzzles). It is also a good income generator for them. On this test a standard IQ score of 130 (achieved by that top 2.28% I mentioned earlier) would be a score of 148, a higher number that is still designed to indicate the same level of ability (a score in the top 2.28%), and a score of 145 on a standard test (achieved by roughly the top 0.13% of the population) would be a score of 172.
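The conversion between the two scoring systems is simple enough to sketch in a few lines of Python: work out how many standard deviations the score is from 100 on one scale, then express that on the other. The function name `convert_scale` is my own, and the last line assumes the reported 162 was on the 24-point scale.

```python
def convert_scale(score, from_sd, to_sd, mean=100.0):
    """Re-express a score from a scale with one SD on a scale with another SD."""
    z = (score - mean) / from_sd  # distance from the mean in standard deviations
    return mean + z * to_sd

print(f"{convert_scale(130, 15, 24):.2f}")  # standard 130 on the 24-point scale
print(f"{convert_scale(145, 15, 24):.2f}")  # standard 145 on the 24-point scale
print(f"{convert_scale(162, 24, 15):.2f}")  # 162 on the 24-point scale, back on the standard scale
```

The first two lines reproduce the 148 and 172 above. The third suggests that 162 on the 24-point scale corresponds to a little under 139 on a standard test.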
So those published scores of 162 are high, but statistically we’d expect about 13 in every ten thousand people to score over 172 on that scoring system (the equivalent of 145 on a standard test). Simple multiplication tells us that across the population of the UK we’d expect there to be more than 80,000 people with that level of ability, whilst the error of measurement in that range must be enormous. In fact, we can say little about their IQ beyond “it is above 130 on a standardised test” due to the confidence intervals being very wide. To differentiate amongst super high ability people we would not only need a test that is able to be sufficiently granular at that ability range, we would have to norm it on representative samples of higher ability people studied in sufficient numbers to see what happens to the distribution as raw scores go up.
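That multiplication can be sketched as below. The UK population figure is a rough round number I have assumed for illustration, and the calculation takes the normal distribution at face value even in the extreme tail, which is exactly the assumption the paragraph above says we cannot properly check.

```python
from statistics import NormalDist

UK_POPULATION = 65_000_000  # assumption: rough round figure, not an official statistic
cattell = NormalDist(mu=100, sigma=24)  # the 24-point scoring system described above

def expected_people_above(score, population=UK_POPULATION):
    """Expected number of people above a score, if scores were perfectly normal."""
    return population * (1 - cattell.cdf(score))

for score in (162, 172):
    print(score, f"{expected_people_above(score):,.0f}")
```

Even at 172 the expected count runs to tens of thousands of people, and at 162 it is several times larger again.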
As to scores higher than Stephen Hawking or Einstein, that’s purely speculation. Neither of these two people has taken comparable IQ tests, and the norms change year by year. Plus people can be brilliant at some things and less good at others, and there is not a perfect relationship between IQ and ‘intelligence’, let alone IQ predicting who will be a “genius” and extend the boundaries of current knowledge.
I’ll leave the last word to Stephen Hawking, when asked what his IQ was by the New York Times:
“I have no idea. People who boast about their I.Q. are losers”.