Randomised control trials: Are they really the gold standard?
Randomised control trials (RCTs) have become a dominant methodology in research. However, issues with RCTs, including their failure to take account of history, context, or relevant findings, bring into question the superiority of the method. Professor Naila Kabeer, at the London School of Economics, highlights these issues in her research and discusses how the overfocus on RCT methods has led to important findings in relation to gender equality being dismissed.
How do the experts decide on public policy? Traditionally, by generating new theories about what will work, or by asking big questions with even bigger answers. However, this theoretical approach has been criticised for being ungrounded in reality, leading to failed development efforts. A more down-to-earth approach to policy development is the use of randomised control trials (RCTs), which are being held up as the gold standard for determining the effectiveness of an intervention. For example, the 2019 Nobel Prize in economics was awarded to three scholars for pioneering the use of RCTs to study ‘what works’ in the fight against global poverty. Research using RCTs is increasingly favoured over other evaluative methods. Regarding themselves as the ‘plumbers’ of policy development, the ‘randomistas’ say that they are interested in solving real-life problems with science, rather than theory.
Randomised control trials
To understand whether an intervention is working, non-RCT experimental methods would compare the outcomes of people who have chosen to participate in the intervention (treatment group) against people who have not chosen to participate (control group). The problem with grouping people in this way, however, is that there may be selection biases – differences between the groups which could influence the results of the study – such as personality types of people who are keen to participate in interventions versus those who are not, or unintended differences in access to the intervention due to location.
RCTs seek to overcome this problem by identifying a group of people who are eligible for the intervention and randomly assigning them to either a treatment or control group before the intervention. In so doing, RCTs aim to reduce the effect of selection biases and ensure that any evidence of impact can be ‘cleanly’ attributed to the intervention. As a result, RCTs are considered by their practitioners to be more rigorous than other methods, leading them to dismiss the usefulness of other approaches and, by extension, the findings they generate.
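The contrast between self-selected and randomly assigned groups can be illustrated with a small simulation. This is only a sketch on made-up data: the unobserved "motivation" trait, the baseline of 10, the true effect of 2.0 and the function name `simulate` are all assumptions chosen for illustration and are not taken from any study discussed here.

```python
import random
from statistics import mean


def simulate(n=20000, true_effect=2.0, seed=0):
    """Compare a self-selected comparison with a randomised one (hypothetical data)."""
    rng = random.Random(seed)
    # Hypothetical unobserved trait that raises outcomes regardless of treatment.
    motivation = [rng.gauss(0, 1) for _ in range(n)]

    def outcome(m, treated):
        # Outcome = baseline + influence of motivation + treatment effect + noise.
        return 10 + 3 * m + (true_effect if treated else 0.0) + rng.gauss(0, 1)

    # Self-selection: only the more motivated people choose to take part.
    chose_in = [outcome(m, True) for m in motivation if m > 0]
    chose_out = [outcome(m, False) for m in motivation if m <= 0]
    naive_estimate = mean(chose_in) - mean(chose_out)

    # Random assignment: a coin flip decides who is treated.
    treated, control = [], []
    for m in motivation:
        if rng.random() < 0.5:
            treated.append(outcome(m, True))
        else:
            control.append(outcome(m, False))
    rct_estimate = mean(treated) - mean(control)

    return naive_estimate, rct_estimate


if __name__ == "__main__":
    naive, rct = simulate()
    print(f"self-selected estimate: {naive:.2f}")
    print(f"randomised estimate:    {rct:.2f}")
```

Under these assumptions the self-selected comparison overstates the effect, because it mixes the effect of the intervention with the effect of motivation, while the randomised comparison lands close to the assumed true effect.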
However, not everyone agrees. Professor Naila Kabeer, of the London School of Economics, argues that reliance on RCTs has given rise to a new way of thinking and storytelling that ignores history and the bigger picture – what Professor Kabeer terms ‘randomista economics’ – a sub-field in development economics. Using the insights from feminist economics, she illustrates this through her detailed analysis of an essay by Professor Esther Duflo, one of the Nobel Prize winners, on the relationship between gender-equality policies and economic development.
RCTs: generalising from the particular
One of the drawbacks of randomista economics that Professor Kabeer points to is its tendency to use studies of micro-level interventions to generalise about the larger whole. For example, the essay by Professor Duflo cites findings from an RCT in India which showed that the dissemination of job advertisements to villages within commuting distance of Delhi had increased women’s employment in those villages. This is used to support her contention that there has been a rise in female employment in India. But there has not. On the contrary, much of the literature in India is focused on the fact that women’s labour force participation in India, especially rural India, has been in steady decline for a number of years.
Similarly, evidence from an RCT in one state in India is used to suggest that there is no evidence for gender discrimination in India, at least in relation to immunisation. Once again, this claim is contradicted by the national data, which show that girls were immunised significantly less than boys in almost every other state in India. By examining a small-scale situation and relying only on their own findings, randomista economists tend to provide a very skewed picture of the true state of affairs.
RCTs: ignoring structures
Another drawback is the tendency to focus entirely on individual behaviour without taking account of the history and context in which this behaviour plays out. For instance, Professor Duflo suggests that the gains from gender-affirmative policy – policies designed to improve gender equality – are exaggerated. On this view, such policies may benefit some women, but they are not efficient from the perspective of economic growth. An example put forward is a study from Sri Lanka which showed that transfers in cash and kind to micro-entrepreneurs increased profits for men but not for women. The implication drawn is that these transfers would have been more efficiently used if they had been directed only to men. This narrative by randomista economists would discourage policymakers from implementing gender-affirmative policies in their society.
Professor Kabeer, however, rebuts this argument from a feminist economics perspective. She argues that feminist movements and gender-affirmative policies have contributed hugely to gender equality in politics, such as women’s right to vote and electing women into public office. This has helped policy decisions to better reflect women’s interests. There is also substantial evidence to suggest there is a positive impact of gender-affirmative policies on critical aspects of family wellbeing, an important dimension of development.
She suggests that transfer policies in Sri Lanka only profited men because patriarchal constraints on women (including their unpaid domestic responsibilities and restrictions on their mobility) made it more difficult for them to use transfers profitably; one policy alone will not be sufficient to level the playing field between men and women. The larger patriarchal structures that prevail in different contexts are not considered by randomista economists in judging questions of efficiency, but feminist economists argue they are essential to the argument that gender inequalities, rather than individual women, are the source of inefficiency.
RCTs: the issue of causality
Finally, Professor Kabeer argues that RCTs deliver poorly on what seems to be their main claim to methodological superiority: the unbiased attribution of causality in the evaluation of interventions. This, their practitioners claim, allows them to make a rigorous distinction between what works and what does not. However, by failing to explore the causal processes through which interventions work or fail to work, or why they work in one context and not another, RCTs add limited value to the broader task of policy design. In one of her own studies, she compared the insights provided by RCTs and qualitative methods in evaluating very similar interventions in West Bengal and Sindh, which aimed to transfer assets to women in extreme poverty.
The RCT evaluations concluded that the transfers worked well in both contexts but provided little further insight into how and why. Kabeer’s qualitative study supported the finding that the transfers worked well in the Sindh community, but found that they largely benefited the better-off there because of their initial advantage. Counterintuitively, the main beneficiaries in West Bengal were the most marginalised. They were the most strongly motivated to take advantage of the intervention because they had been bypassed by previous development efforts. In this case, the RCTs’ lack of attention to how interventions work meant that they glossed over the fact that the transfers worked better for some of their intended beneficiaries than others and, by extension, failed to consider why this may be the case.
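How a single headline result can hide uneven benefits may be easier to see with numbers. The sketch below uses invented data only: the subgroup labels `better_off` and `worse_off`, their assumed effects of 4.0 and 0.5, and the function name `subgroup_effects` are hypothetical and do not reproduce the West Bengal or Sindh evaluations.

```python
import random
from statistics import mean


def subgroup_effects(n=20000, seed=1):
    """Overall versus subgroup treatment effects in a simulated randomised trial."""
    rng = random.Random(seed)
    # Hypothetical effect sizes per subgroup, chosen purely for illustration.
    assumed_effects = {"better_off": 4.0, "worse_off": 0.5}
    groups = list(assumed_effects)

    rows = []
    for _ in range(n):
        group = rng.choice(groups)
        treated = rng.random() < 0.5  # random assignment
        y = 10 + (assumed_effects[group] if treated else 0.0) + rng.gauss(0, 1)
        rows.append((group, treated, y))

    def difference_in_means(subset):
        treated_y = [y for _, t, y in subset if t]
        control_y = [y for _, t, y in subset if not t]
        return mean(treated_y) - mean(control_y)

    overall = difference_in_means(rows)
    by_group = {
        g: difference_in_means([r for r in rows if r[0] == g]) for g in groups
    }
    return overall, by_group


if __name__ == "__main__":
    overall, by_group = subgroup_effects()
    print(f"overall estimate: {overall:.2f}")
    for name, estimate in by_group.items():
        print(f"{name}: {estimate:.2f}")
```

In this invented case the overall estimate suggests the intervention "worked", yet almost all of the gain accrues to one subgroup. Even a subgroup breakdown of this kind only shows who benefited; explaining why is the job the article assigns to qualitative methods.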
Redressing the balance
Professor Kabeer’s research highlights critical issues in one of the most popular and widely used research methods. She argues that RCTs are limited in the information they provide. They ignore the bigger picture and historical context of current situations, which limits their explanatory power. Practical difficulties with implementing RCTs, as well as methodological oversights, often mean that RCTs underestimate challenges within an intervention. Particularly problematic is the randomista economists’ dismissal of evidence that has not used RCT methods, such as non-experimental findings or past knowledge. This means that evidence built up over the years to support claims for gender equality policies is dismissed because it was not produced using RCT methods. Yet it is often the case that ‘new’ findings from RCTs are not new at all, but simply past knowledge that has been dismissed.
By contrast, feminist economists argue that policy analysis must be grounded in an appreciation of the interaction between ground-level realities and the larger historical contexts in which they play out. They also argue for the use of a diversity of methods in order to generate a more holistic understanding of the problems under investigation. For example, a balance of quantitative and qualitative methods can provide rich information to inform policies intended to address long-standing problems of gender inequality. Quantitative methods, including sensitively designed RCTs, can provide insights into the statistical regularities in the findings of an evaluation, separating out the systematic from the random so we know whether it worked or not. Qualitative methods, which include interviews and focus groups, provide a more detailed and contextualised understanding of these findings. They can tell us how and why an intervention may or may not have worked.
RCTs and qualitative methods seem limited in the sample size they can practically achieve, which could limit the applicability of their findings in country-wide studies. What would be the best research method to overcome this?
It is true that neither RCTs nor qualitative methods work with large samples. There are two alternatives. One would be to do several RCTs within a country to take account of different contexts and use qualitative studies to tease out causal processes for what works and what doesn’t. But RCTs are very expensive. So, the other would be to go for well-designed old-fashioned surveys, but using econometric techniques for creating valid control groups, and relying on qualitative methods once again to understand causality.