July 15, 2024

While designing an online experiment to evaluate the impact of social media edutainment materials on anemia (see Blog Post #3), we identified a gap in publicly available data and academic research on rates of anemia testing in India. Because that statistic was needed to inform accurate sample size calculations for the online experiment, we designed a quick “dipstick” study on anemia testing. CSBC uses “dipstick” quite frequently; it means a back-of-the-envelope, rough estimate, an approximate gauge.
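To make the dependence concrete, here is a minimal sketch of how a baseline testing rate feeds a standard two-proportion sample size calculation, written in Python with statsmodels; the 20% baseline and 5-point lift are purely illustrative assumptions, not our actual study parameters.

```python
# Sketch: why the baseline anemia-testing rate matters for power calculations.
# The rates below are illustrative assumptions, not CSBC's actual parameters.
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

baseline_rate = 0.20  # assumed share of people already getting tested (the unknown we lacked)
treated_rate = 0.25   # hypothetical rate after exposure to the campaign materials

# Cohen's h effect size for the difference between two proportions
effect_size = proportion_effectsize(treated_rate, baseline_rate)

# Required respondents per arm for a two-sided test at 80% power
n_per_arm = NormalIndPower().solve_power(
    effect_size=effect_size,
    alpha=0.05,
    power=0.80,
    alternative="two-sided",
)
print(f"Required sample size per arm: {n_per_arm:.0f}")
```

Because the required sample size is highly sensitive to the assumed baseline rate, a poor guess at the testing rate can leave an experiment badly underpowered, which is why we went looking for a quick empirical estimate.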

I launched a very simple five-question survey, aimed primarily at understanding anemia testing behavior in the states of Uttar Pradesh and Bihar (the same target region as the online experiment and subsequent social media campaign), and distributed it through CSBC’s online research participant pool with a 100-rupee incentive for anyone who qualified for and completed the survey. The effort was not entirely successful, and I wanted to share a few learnings on online data collection in India below:

  1. Talk is cheap. This is no new insight for any researcher who has gathered data through surveys, but self-reported behavior in a survey can differ wildly from behavior in real life. Our sample suggested a 67% anemia testing rate, an improbably high figure, likely because respondents were subject to social desirability bias, didn’t care enough to answer honestly, or hoped that answering “Yes” would improve their chances of a payout.
  2. Beware of fraudsters. Related to the previous point, we found suspicious activity among respondents, with one IP address generating 18 unique survey responses and another generating 10. Those responses were likely fraudulent, entered by a single individual to maximize their payout. This happened despite us collecting phone numbers as unique identifiers (one of those individuals used the same IP address to submit 18 distinct phone numbers). Given this was more of a “dipstick” to guide internal power calculations, there were no severe consequences; we simply extrapolated a benchmark for the power calculations from several other data sources. However, it is a good learning for me as a researcher, and for the subsequent phase of our online experiment, which is much longer and more thorough, in which we actually intend to evaluate the social media campaign materials and for which the results carry greater consequence. A minimal sketch of the kind of duplicate-response screen that would have flagged this appears after this list.
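As a concrete illustration, here is that sketch, assuming the survey platform exports a CSV with (hypothetical) response_id, ip_address, and phone_number columns:

```python
# Sketch of a basic duplicate-response screen; column names are hypothetical.
import pandas as pd

responses = pd.read_csv("survey_responses.csv")  # assumed one row per completed response

# Count responses and distinct phone numbers per IP address
per_ip = responses.groupby("ip_address").agg(
    n_responses=("response_id", "count"),
    n_phone_numbers=("phone_number", "nunique"),
)

# Flag IPs submitting an implausible number of responses. The threshold is a
# judgment call: shared connections (offices, cybercafes, family plans) can
# legitimately produce a few responses each.
MAX_RESPONSES_PER_IP = 3
suspicious = per_ip[per_ip["n_responses"] > MAX_RESPONSES_PER_IP]
print(suspicious.sort_values("n_responses", ascending=False))

# Exclude all responses from flagged IPs before analysis
clean = responses[~responses["ip_address"].isin(suspicious.index)]
```

In practice this would be paired with checks on completion time and answer patterns, but even a screen this simple would have caught the IP address behind the 18 duplicate submissions.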

In an ideal world, we would have been able to gather real behavioral data on anemia testing from the Ministry of Health, networks of healthcare practitioners, and similar sources. However, given we were short on time, this “dipstick” assessment met our needs and taught me some cautionary lessons about conducting online research well in India.