July 15, 2020 - New Population Council research shows that the COVID-19 data being collected and reported across the United States (U.S.) is inconsistent and incomplete, exposing the country’s inability to analyze and use data to guide its response.
These shortcomings further restrict health officials’ ability to understand the key demographic factors that place certain populations and communities at increased risk of contracting COVID-19 and experiencing adverse health outcomes from it.
Researchers at the Population Council reviewed 70 COVID-19 data sources from the Centers for Disease Control and Prevention (CDC) and health departments across 50 states, nine territories, and 10 major cities in the United States between May 14 and May 30, 2020, to assess how COVID-19 data is reported on testing and four key outcomes: cases, hospitalizations, recoveries, and deaths. They further looked at whether and how this data is disaggregated by a core set of demographic indicators: age, race/ethnicity, sex/gender, education level, economic status, geography, and underlying health conditions. These indicators are essential to understanding the circumstances of people’s lives and what may put them at risk for poor health outcomes.
Researchers found that COVID-19 data reporting is not standardized across the CDC and state, territory, and city health departments. Fewer than half of the 70 data sources included data on testing and all four key COVID-19 outcomes: cases, hospitalizations, recoveries, and deaths. They also found that key demographic indicators, such as age, race/ethnicity, sex, and geography, were inconsistently and poorly reported. Data on COVID-19 cases and deaths were disaggregated by demographic indicator, whereas data on COVID-19 testing and hospitalizations were not.
“As new COVID-19 hotspots arise, it is alarming that it is the lack of data standardization that is holding health officials back from effectively targeting their response,” said Thoai Ngo, director of the Poverty, Gender, and Youth Program at the Population Council and a lead investigator of this study. “But what’s even more concerning is that the U.S., which is the country with the highest number of infections and deaths in the world, is not equipped with the most basic data it needs to understand the vulnerabilities of different demographic sub-populations. We hope these findings can help address the gaps in data collection, reporting, and analyses.”
While all Americans are susceptible to COVID-19, the rates of infection and death among Black, Latino, and Indigenous Americans are higher than among Asian and White Americans. Research shows that key demographic indicators, including race/ethnicity, sex, income, occupation, education level, and geographic location, all have an impact on health indicators, including infection and death rates.
“If we want to begin to address the inequity of COVID-19 infections, hospitalizations, and deaths that exist within the US, we need data systems that allow us to report and analyze multiple factors or vulnerabilities,” said Charlotte Brasseux, a graduate student at Columbia University, an intern at the Population Council, and an investigator of this study. “More sophisticated and intersectional analyses of COVID-19 data will help us better understand the multitude of socio-demographic factors that may place certain subpopulations at increased risk for COVID-19 and help to drive a more effective response.”
Additional findings reveal that:
- Nearly all of the 70 data sources reported on cases (94%) and deaths (93%), while only 86% reported on testing, 76% reported on hospitalizations, and 57% reported on recoveries. Data on cases and deaths was commonly disaggregated by geography, age, sex, and/or race/ethnicity, but less commonly disaggregated by underlying health conditions, economic status, and/or education level.
- None of the 70 data sources recorded education level, and only two cities, New York City and Los Angeles, included economic status (i.e., poverty levels) as a demographic indicator, which restricts understanding of how COVID-19 risk profiles vary by economic status.
- Data on race/ethnicity was most commonly disaggregated for COVID-19 cases and deaths, and was far less commonly disaggregated for testing, hospitalizations, and recoveries. Reporting of race and ethnicity was not standardized—some data sources had race and ethnicity as separate indicators, while others combined them into one.
- The data sources used the terms “sex” and “gender” interchangeably to indicate “male” and “female,” and there was no reporting on gender identity or sexual orientation, preventing identification of cases and deaths among LGBTQ+ communities.
- Fewer than one third (31.4%) of the data sources reported any outcome by the intersection of two or more key demographic indicators (for example, by age and sex together).
The Population Council has been partnering with national health ministries, government agencies, and international non-governmental organizations in the U.S., sub-Saharan Africa, South Asia, and Latin America to share new social, behavioral, and biomedical data, evidence, and insights to inform the national and international response effort. More than 15 activities in 14 countries have been conducted to date.