The sociopolitical crisis in Cameroon's two English-speaking regions (North West and South West), which has persisted for nearly eight years, is an example of how data sampling bias creates fertile ground for misinformation and disinformation. Cameroon's population is predominantly French-speaking, and the two Anglophone regions form a minority. Given this linguistic disparity, data about the country may not properly represent its population as a whole, and this mismatch creates opportunities for bias.
This article explores how data sampling bias contributes to the spread of misinformation and disinformation, using the ongoing sociopolitical crisis in Cameroon’s Anglophone regions as a case study. I will also discuss strategies to mitigate data sampling bias to ensure a more accurate representation of reality.
Misinformation vs. Disinformation
The Oxford Advanced Learner's Dictionary (OALD) defines misinformation as incorrect or inaccurate information shared without the intent to deceive. This type of information often includes misleading or false details, which can spread through ignorance, negligence or human error. Disinformation, on the other hand, is false information spread intentionally to mislead and manipulate public perception, often for political or strategic reasons (OALD). The key difference between the two is intent. Disinformation is deliberately crafted and shared by entities such as government organizations, political actors or other stakeholders to achieve a specific goal. Both misinformation and disinformation can damage society by distorting public understanding of critical issues, reducing trust in institutions and undermining social harmony.
Data Sampling Bias
In the context of research, data sampling refers to the process of selecting a subset of a larger population in order to estimate the characteristics of the entire group. Ideally, the sample should be representative of the population from which it is drawn. Bias emerges when the sample does not accurately reflect that population. Sampling bias can arise from various causes, such as convenience sampling, where data is collected only from easily accessible participants, or volunteer sampling, where data comes only from those who choose to participate. These forms of sampling can skew results because they do not represent the entire population, so conclusions drawn from such data are likely to be flawed (Loftus and Klemfuss, 2022). Biased data can thus serve as the foundation for misinformation and, in more extreme cases, as a tool for disinformation. For instance, data sampling bias in conflict-affected areas can misrepresent the situation on the ground, leading to conclusions that do not reflect the realities faced by affected populations. Inaccurate data can easily be misinterpreted or manipulated by different actors to create misleading narratives, spreading false information that intensifies tensions.
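The effect of convenience sampling described above can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the town names, population sizes and attendance counts are entirely hypothetical and are not drawn from any real data on Cameroon. It compares an estimate based on one easily accessible town with an estimate based on a simple random sample of the whole population.

```python
import random

# HYPOTHETICAL population: town -> (number of students, number attending school).
# These figures are invented for illustration only.
TOWNS = {
    "town_a": (1000, 900),   # accessible town with high attendance
    "town_b": (4000, 800),
    "town_c": (5000, 500),
}

def build_population(towns):
    """Return a list of (town, attends) records, one per student."""
    population = []
    for town, (size, attending) in towns.items():
        population += [(town, True)] * attending
        population += [(town, False)] * (size - attending)
    return population

def attendance_rate(records):
    """Share of records in which the student attends school."""
    return sum(1 for _, attends in records if attends) / len(records)

def convenience_sample(population, town, n, rng):
    """Sample only from one accessible town (a biased design)."""
    accessible = [record for record in population if record[0] == town]
    return rng.sample(accessible, n)

def simple_random_sample(population, n, rng):
    """Sample from the whole population; every student is equally likely."""
    return rng.sample(population, n)

rng = random.Random(42)  # fixed seed so the run is repeatable
population = build_population(TOWNS)

true_rate = attendance_rate(population)
biased_rate = attendance_rate(convenience_sample(population, "town_a", 500, rng))
random_rate = attendance_rate(simple_random_sample(population, 500, rng))

print(f"True attendance rate:        {true_rate:.2f}")
print(f"Convenience sample estimate: {biased_rate:.2f}")
print(f"Random sample estimate:      {random_rate:.2f}")
```

In this invented setup, the convenience sample reflects only the accessible town and therefore greatly overstates attendance, while the random sample lands close to the true rate. The size of the gap depends entirely on the assumed numbers; the point is the direction of the error, not its magnitude.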
Impact of Sampling Bias on Cameroon’s Anglophone Regions
The impact of sampling bias can be seen clearly in Cameroon’s Anglophone regions, which have been caught in a violent conflict for nearly eight years. One example is the reporting of school resumption rates in the North West and South West regions (NW & SW). On 9 September 2024, Cameroon Radio Television (CRTV) reported that about 5,000 students had resumed school in these regions. At face value, this number seems encouraging, suggesting that many students were returning to school despite the ongoing crisis. However, a more detailed report by Equinox Television indicated that these 5,000 students were primarily from a single town, Nkambe. Put in context, the figure paints a different picture: compared with the total number of students expected to return across both regions, the number who actually resumed is alarmingly small. This example demonstrates how sampling bias can obscure the truth. The initial CRTV report may have unintentionally misled the public into believing that the situation in the Anglophone regions was improving when, in reality, a significant portion of the student population had not returned to school. This is a clear case where biased data reporting, whether intentional or not, can spread misinformation. When such one-sided information is deliberately used to advance a narrative, it becomes disinformation, which can have far-reaching consequences, especially in politically charged environments.
In addition, in 2017, Cameroon's Ministry of Public Service, in collaboration with the Ministry of Justice, launched a recruitment program for English-speaking common law magistrates. Originally intended to last four years, this program is still active. According to the recruitment guidelines, 80% of successful candidates must come from the crisis-affected NW and SW regions, while the remaining 20% are to be drawn from the other eight regions. This strategy aims to address the Anglophone problem.
However, a significant and unresolved question persists among Cameroonians: who qualifies as an Anglophone? Is it someone from the NW or SW, someone who grew up there and speaks English, or someone educated in the Anglophone educational system? Many Cameroonians from the other eight regions have also studied exclusively in English and are familiar with common law. Without a clear definition, the population from which candidates are drawn is itself ill-defined, which fosters sampling bias and can negatively affect decision-making, communication and social harmony.
Additionally, approximately 7,000 Cameroonians registered for the exams in 2017, but by 2024 that number had fallen to fewer than 200 candidates. Meanwhile, an impressive building has been constructed at the National School of Administration and Magistracy for the common law section, even though the recruitment program is not meant to continue indefinitely. This sampling bias can lead to disinformation. As a result, instead of resolving the Anglophone problem, the government may be inadvertently creating new issues.
References:
1. Boukouvalas, Z. and Shafer, A., 2023. Role of statistics in detecting misinformation: A review of the state of the art, open issues, and future research directions. Annual Review of Statistics, 11, pp.27-50. Available at: https://doi.org/10.1146/annurev-statistics-040622-033806 [Accessed 12 Sep. 2024].
2. Loftus, E.F. and Klemfuss, J.Z., 2022. Misinformation—past, present, and future. Psychology, Crime & Law, 30(4), pp.312-318. Available at: https://doi.org/10.1080/1068316X.2023.2219813 [Accessed 20 Sep. 2024].
3. Nguyen, T., Shih, M.-H., Srivastava, D., Tirthapura, S. and Xu, B., 2021. Stratified random sampling from streaming and stored data. Distributed and Parallel Databases, 39, pp.1-46. Available at: https://doi.org/10.1007/s10619-020-07315-w
4. Oxford Advanced Learner's Dictionary, 7th Edition.
5. Rahman, M., Tabash, M., Salamzadeh, A., Abduli, S. and Rahaman, M.S., 2022. Sampling Techniques (Probability) for Quantitative Social Science Researchers: A Conceptual Guidelines with Examples. SEEU Review, 17, pp.42-51. Available at: https://doi.org/10.2478/seeur-2022-0023
Background illustration: Generated by the author using the Microsoft Copilot AI tool.