By Rebecca Mansley, Head of Data & Insights UK
To answer this question, we need to define what Nationally Representative, or ‘Nat Rep’, means. When we discuss Nat Rep, we mean that the sample for a study reflects the population of the country in terms of gender, age, region, and socioeconomic group (SEG). The proportion of men and women in the study should be representative, as should the spread of age groups, the percentage of respondents from each region, and so on.
The aim is to create a truly representative sample to provide an accurate and complete picture of views in a real-world context at a given time.
Select your key variables
For most studies in the UK, age, gender and region tend to be the only, or the most common, key variables. Depending on the objective of the study, other variables such as education, income and SEG can be added or combined. Across Europe, the common key variables are very similar in many countries and can also include language or a socioeconomic category. In the US, by contrast, many studies also include ethnicity.
While compiling your list of key variables for a study, it’s important to bear in mind any limitations and guidance on permitted demographic data in the market you are researching, such as restrictions on collecting race or ethnicity data in France, Germany and Italy, amongst many other European countries. Similarly, political opinion, religious beliefs and health can be perceived as sensitive information that is impermissible to collect in many countries in Europe and America. Research associations such as the Market Research Society, The Research Society and ESOMAR have the expertise and knowledge to advise on restrictions on the collection and use of personal data and are usually a good place to start.
Other market variations to consider:
- How is SEG defined or broken down in the country you are researching? Markets often ask about household income, but does this translate to the data you are trying to collect?
- How do education levels map across different countries?
- Internet penetration and hard-to-reach audiences: whether that’s 18-24s or those aged 65 and over, some groups are more accessible in some markets than in others.
Aside from any rules relating to the collection of demographics, the management of quotas can also be challenging, particularly when quotas are interlocked, meaning that a target is set for each combination of variables rather than for each variable separately. For example, a quota for 18-24 males in Wales could be tricky to fill if the required sample is large. Hard-to-reach groups are one of many factors that can affect the collection of Nat Rep quotas for studies.
Eliminate data distortion
Fundamentally, we want the sample to be balanced across all sections of society and to closely match official government statistics in the form of census data, but we also need to ensure that it is neither biased nor skewed.
Panel companies will often think that they are targeting correctly but still struggle to fill certain groups, which creates challenges in supplying samples and managing quotas in the survey. Collecting and using online responses only from those who are more accessible, or more willing to be selected for survey participation, can already skew the sample. Similarly, excluding hard-to-reach groups, such as older age categories and those without internet access, means those groups are underrepresented in the collected responses. These types of sampling bias can lead to inaccurate representation and skew the results of your study.
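One simple way to spot this kind of skew while a study is in field is to compare the share of each group among completed interviews with its target share. The sketch below does that; the target shares and the batch of responses are hypothetical, as is the helper name `share_gaps`.

```python
from collections import Counter


def share_gaps(responses, target_shares):
    """Return sample share minus target share per category, in percentage points."""
    total = len(responses)
    if total == 0:
        raise ValueError("no responses to compare")
    counts = Counter(responses)
    return {
        category: round(100 * (counts.get(category, 0) / total - target), 1)
        for category, target in target_shares.items()
    }


# Hypothetical targets and a hypothetical batch of 100 completed interviews.
target_age_shares = {"18-34": 0.30, "35-64": 0.45, "65+": 0.25}
completed = ["18-34"] * 40 + ["35-64"] * 50 + ["65+"] * 10

gaps = share_gaps(completed, target_age_shares)
for category, gap in gaps.items():
    print(f"{category}: {gap:+.1f} points versus target")
```

A negative gap marks a group that is falling short of its target, which in this made-up batch is the oldest age band.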
Solutions to balance the data
Consider the topic of age and how it is categorised to identify trends and behavioural patterns amongst generations (most commonly 18-24, 25-34, 35-44, 45-54, 55-64, 65+). Given that the average life expectancy in the UK is around 80, and even higher in some European countries, we must question why everyone over the age of 65 is banded together in such a broad age range. Is it altogether a fair representation of the attitudes and behaviours of all senior citizens? Possible solutions are to use narrower age bands, or to drill down further by collecting exact age or birth year and then grouping retrospectively. This could be beneficial for studies that require more specific and precise data, such as research into how early people start considering life insurance or retirement plans, or the social impact of Coronavirus on older age groups.
Alternatively, if we view it from the perspective of the respondent, they may unconsciously be skewing their own responses by under- or over-reporting their answers, particularly when it comes to income, SEG and job title.
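As a rough illustration of grouping retrospectively, the sketch below turns a collected birth year into a band under two different schemes. The finer scheme that splits 65+ is a hypothetical choice, not a recommended standard, and the age calculation is approximate because it ignores whether the respondent’s birthday has passed in the survey year.

```python
from bisect import bisect_right

# Lower bounds of each band. STANDARD mirrors the common bands listed above;
# FINER is a hypothetical alternative that splits the 65+ group.
STANDARD = [18, 25, 35, 45, 55, 65]
FINER = [18, 25, 35, 45, 55, 65, 75, 85]


def age_band(birth_year, survey_year, lower_bounds):
    """Return a label such as '25-34' or '65+', or None if below the first band."""
    age = survey_year - birth_year  # approximate age, birthday not considered
    if age < lower_bounds[0]:
        return None
    index = bisect_right(lower_bounds, age) - 1
    lower = lower_bounds[index]
    if index + 1 < len(lower_bounds):
        return f"{lower}-{lower_bounds[index + 1] - 1}"
    return f"{lower}+"


# The same respondent falls into different bands depending on the scheme.
print(age_band(1950, 2024, STANDARD))
print(age_band(1950, 2024, FINER))
```

Because the raw birth year is stored, the bands can be redrawn after fieldwork without going back to respondents.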
SEG in the UK is grouped as shown below:
AB: Higher & intermediate managerial, administrative, professional occupations
C1: Supervisory, clerical & junior managerial, administrative, professional occupations
C2: Skilled manual occupations
DE: Semi-skilled & unskilled manual occupations, Unemployed and lowest grade occupations
If we look closely at DE, we could question who would in fact describe themselves as an unskilled worker. It can be argued that some skill is required for every job. If a participant doesn’t believe themselves to be unskilled in their job, they’re likely to select an option that fits better with their own interpretation, upgrading themselves and therefore skewing the data set. So, are you really getting the representation you want?
One way to get around this is to reorder or reword the categories, suggest specific job titles, or provide job roles as examples. Try rethinking the natural hierarchy that runs from employed down to retired, and consider what other options there may be for people who are out of work. For example, are they travelling? Taking a sabbatical? Or looking after a family member? Looking outside the traditional constructs might lead to a richer dataset.
A similar approach could be taken to how income data is collected, whether at a household or personal level. Are people more likely to know their monthly take-home pay than their annual salary? If we’re trying to gauge disposable income, maybe the question needs to be flipped to gain more context, perhaps by asking instead about the number of takeaways per month, holiday budgets or annual paid subscriptions.
Research your researcher
As precise and conclusive as we try to be, there are pitfalls in trying to achieve a truly Nat Rep sample. The examples and solutions outlined above can help improve matters and eliminate some of the challenges in obtaining a sample that accurately reflects the population’s demographics. Ultimately, though, we need to closely examine whether Nat Rep is really the most suitable quota design for the objectives of the study.
Research buyers may not be fully aware of the best approach to achieve high quality and accurate data for their research, or they may have little visibility of how groups are selected and recruited into panels. We therefore highly recommend choosing a research provider who upholds professional ethics and standards, has robust security and quality assurance checks in place for their panel and is a member of a recognised research association.
If you would like to learn more about establishing quotas for your next study, get in touch with us. Pureprofile has access to over 20 million global panellists, and we have run studies in over 100 countries, delivering everything from large-scale nationally representative samples to highly targeted, niche audiences.