Abstract
Disaggregated population counts are needed to calculate health, economic, and development indicators in Low- and Middle-Income Countries (LMICs), especially in settings of rapid urbanisation. Censuses are often outdated and inaccurate in LMIC settings, and rarely disaggregated at fine geographic scale. Modelled gridded population datasets derived from census data have become widely used by development researchers and practitioners; however, none of these datasets has been evaluated for accuracy of population estimates at the grid cell level. This is because the finest-scale population figures generally available to data producers are the same figures that are input into gridded population models and disaggregated to smaller grid cells (e.g., 100 × 100 m). We simulate a realistic "true" 2016 population in Khomas, Namibia, a majority-urban region, and introduce realistic levels of outdatedness (over 15 years) and inaccuracy in slum, non-slum, and rural areas. We then aggregate these simulated realistic populations by census and administrative boundaries (to mimic census data), and generate 32 gridded population datasets that are typical of an LMIC setting using WorldPop-Global's gridded population approach. We evaluate the cell-level accuracy of these simulated WorldPop-Global datasets, using the original "true" population as a reference. In our simulation, we found large cell-level errors, particularly in urban cells, driven by WorldPop-Global's use of average population densities in large areal units to determine cell-level population densities. Age, accuracy, and aggregation of the input data also played a role in these errors. We suggest incorporating finer-scale training data into gridded population models generally, and WorldPop-Global in particular (e.g., from simulated populations, routine household surveys, or slum community profiles), and using new building footprint datasets as a covariate to improve cell-level accuracy of gridded population data.
It is important to measure cell-level accuracy of all gridded population datasets, especially if they are to be used for monitoring key development indicators.