In my past life as a molecular biologist, I learned Arthur Kornberg’s “Ten Commandments: Lessons from the Enzymology of DNA Replication” (J. Bacteriology. 2000 Jul; 182:3613). Dr. Kornberg was a biochemist (he won the Nobel Prize in Physiology or Medicine in 1959 for identifying the enzymes that mediate DNA replication), and his fourth Commandment was, “Do not Waste Clean Thinking on Dirty Enzymes”. His point was that cells are complex (lots of different enzymes!), so before drawing conclusions, try to ensure that you’ve eliminated or controlled for unexpected variables that might have influenced your results.

Recent WASH sector discussions, project reports, and calls for impact assessments have reminded me of Dr. Kornberg’s lesson. Across our sector, there is increasing emphasis on experiments and impact assessments for measuring how well interventions work and how cost-effective they are. The growing demand for good evidence is, of course, welcome. However, I wonder if the requirements for obtaining good evidence are well considered.

Clearly, a critical requirement for measuring program impacts is an understanding of what would have happened if the intervention had not been implemented: i.e., would the targeted outcomes have changed if there had been no program?

Nevertheless, a few years ago I heard the head of a development organization claim credit for declines in infant mortality rates in areas of Kenya where his organization was implementing agricultural improvement programs. The very next day the Kenyan government and development partners released data indicating that the dissemination of insecticide-treated bednets was driving significant reductions in infant mortality across the country.

So how much impact did the agricultural improvement programs actually have on infant mortality? It’s hard to say, because the development organization did not have estimates of how much infant mortality would have changed in their intervention communities if they had not implemented their agricultural improvement programs.

The best strategies for estimating what would have happened if a program wasn’t implemented employ comparisons with a valid “control” group: i.e., intervention units (communities, households, institutions, etc.) that did not receive the program but were very similar to the intervention units that did receive the program.

The key is the level of similarity between the control and intervention groups: the more the two groups differed (e.g., in economic development, education levels, geographies, occupations, political leadership, other development programs, etc.) when the program was implemented, the harder it is to determine whether differences in the outcome of interest (e.g., infant mortality) between the two groups are due to the presence/absence of the program or to some unrelated “confounding” factor such as a difference in average education levels between the two groups.

The best option for establishing a valid control group is to randomly select intervention units out of a large pool and assign unselected units to the control group. If the selection pool is big enough, average measures for any parameter will be nearly identical between the intervention and control groups – this follows from the law of large numbers. If the selection pool is too small for random selection or other considerations are important for targeting the intervention, it is possible to establish a matched set of controls by selecting a group that looks similar to the intervention group across many different parameters.
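To make the law-of-large-numbers point concrete, here is a minimal simulation sketch. Everything in it is an assumption for illustration: the covariate (average years of schooling per community), its distribution, and the function names are hypothetical and not drawn from any real program. It randomly splits pools of different sizes in half and reports how far apart the two group averages land.

```python
import random
import statistics


def mean_gap_after_random_assignment(pool_size, seed=0):
    """Randomly split a simulated pool in half and return the absolute
    difference in the covariate mean between the two halves."""
    rng = random.Random(seed)
    # Hypothetical covariate: average years of schooling in each community.
    pool = [rng.gauss(8.0, 3.0) for _ in range(pool_size)]
    rng.shuffle(pool)  # random assignment
    half = pool_size // 2
    intervention, control = pool[:half], pool[half:]
    return abs(statistics.mean(intervention) - statistics.mean(control))


def average_gap(pool_size, trials=200):
    """Average the gap over many independent random assignments."""
    return statistics.mean(
        mean_gap_after_random_assignment(pool_size, seed=s) for s in range(trials)
    )


if __name__ == "__main__":
    for n in (20, 200, 2000):
        print(n, round(average_gap(n), 3))
```

In this sketch the typical gap between the groups shrinks as the pool grows, which is why randomization from a small pool can still leave the two groups noticeably different.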

Random selection obviously has to happen prior to program implementation. Ideally, matching a control group to the intervention group should also occur prior to starting implementation. Retrospective matching (i.e., finding a matched control group after a program is completed) requires pre-intervention data for both the intervention group and potential controls that covers many different parameters. Often this data is not available or is of poor quality.
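As a rough illustration of what matching on pre-intervention data can look like, the sketch below pairs each intervention unit with its most similar candidate control using greedy nearest-neighbour matching on standardized covariates. This is only one simple approach among many, and the village names, covariates (years of schooling, monthly income), and values are invented for the example.

```python
import math


def standardize(rows):
    """Rescale each covariate column to mean 0 and standard deviation 1."""
    columns = list(zip(*rows))
    means = [sum(col) / len(col) for col in columns]
    sds = [
        math.sqrt(sum((x - m) ** 2 for x in col) / len(col)) or 1.0
        for col, m in zip(columns, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def match_controls(intervention, candidates):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    Assumes there are at least as many candidates as intervention units.
    Returns a dict mapping each intervention unit to its matched control.
    """
    names = list(intervention) + list(candidates)
    rows = list(intervention.values()) + list(candidates.values())
    scaled = dict(zip(names, standardize(rows)))
    available = sorted(candidates)
    matches = {}
    for unit in intervention:
        best = min(available, key=lambda c: math.dist(scaled[unit], scaled[c]))
        matches[unit] = best
        available.remove(best)
    return matches


# Hypothetical pre-intervention data: [years of schooling, monthly income].
intervention = {"Village A": [6.0, 120.0], "Village B": [9.5, 300.0]}
candidates = {
    "Village X": [9.0, 310.0],
    "Village Y": [6.5, 110.0],
    "Village Z": [12.0, 800.0],
}

if __name__ == "__main__":
    print(match_controls(intervention, candidates))
```

The quality of the match depends entirely on the covariates available: if the pre-intervention data is missing or unreliable, no matching procedure can compensate.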

The lack of useful pre-intervention data means that poorly matched intervention and control groups compromise many of the non-randomized WASH impact assessments that are initiated after program completion. Confounding factors (known and unknown) that differ between the two groups can then distort measurements of program impact.

This posting, then, is one more call for government agencies, development partners, and implementing organizations to consult with impact evaluation specialists before designing and rolling out programs. Don’t wait until it’s too late: we should not waste clean thinking on confounded comparisons.

By 2030, achieve universal and equitable access to safe and affordable drinking water for all.  Not many will argue against the importance of the United Nations’ Sustainable Development Goal target 6.1.  But there is an irony underlying this target: market trends indicate that even as public drinking water supplies improve in the developing world, consumers will spend more on bottled water and private water treatment because they don’t trust the water that does come out of their taps.

A commentary last year in Global Water Intelligence claimed that, “total spending on water utilities is growing at 3.5% per year, while total spending on bottled water, point-of-use water systems and tanker supply is growing at 9%.  Spending on private domestic solutions to water is likely to exceed total utility spending by 2030.”
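The arithmetic behind a claim like this is compound growth. The sketch below estimates how many years it takes for faster-growing private spending to overtake utility spending, using the two growth rates quoted above. The starting ratio of private to utility spending is not given in the commentary, so the value of 0.5 used here is purely a hypothetical placeholder, as is the function name.

```python
import math


def years_to_overtake(private_to_utility_ratio,
                      private_growth=0.09,
                      utility_growth=0.035):
    """Years until private spending exceeds utility spending, assuming both
    keep growing at constant annual rates.

    private_to_utility_ratio is today's private spending divided by today's
    utility spending (hypothetical input; not stated in the source).
    """
    if private_to_utility_ratio >= 1:
        return 0.0
    # Each year the ratio is multiplied by this factor.
    relative_growth = (1 + private_growth) / (1 + utility_growth)
    return math.log(1 / private_to_utility_ratio) / math.log(relative_growth)


if __name__ == "__main__":
    # Hypothetical starting point: private spending is half of utility spending.
    print(round(years_to_overtake(0.5), 1))
```

Under these assumptions the ratio of private to utility spending rises by a little over 5% per year, so the timing of any crossover is sensitive to where the ratio starts.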

This lack of public trust has important consequences:

  • It decreases support for water tariffs and taxes, which are required to achieve target 6.1.
  • It increases the amount that families spend to obtain water that they believe is safe.


Clearly, collecting and sharing water quality information is critical for building consumer trust and for promoting public services.  Information is also essential for holding water suppliers accountable.  In most countries, water quality testing is the law: suppliers (public and private) are required to monitor parameters, generally specified in national standards, and report their results to the government.

But water suppliers often don’t do enough testing because:

  • Supplying water is a marginal business – most providers require government and/or donor inputs, particularly to serve low-income areas.
  • Resource-constrained suppliers will not spend money on regular testing because they are not penalized for breaking the law – poor enforcement contributes to low motivation.


These are systemic barriers to better testing: Aquaya’s Monitoring for Safe Water research shows that until these barriers are removed, new programs and innovations, including those listed below, will only have short-term impacts:

  • Project-based funding for laboratories, supplies, and training
  • New technologies (water testing kits, mobile phones, data platforms)
  • New water testing business models including private services


To address the systemic barriers to water quality data collection, Aquaya’s next goals for Monitoring for Safe Water are to design and evaluate interventions that are focused on increasing both resources and motivation for testing.

Aquaya has studied and optimized water quality testing strategies in low-resource settings for over 10 years. We are happy to provide insights to help your organization develop or improve a monitoring program. Please contact us.
