Estimates of New York’s population have recently been on a roller coaster. The Census Bureau’s 2020 Annual Population Estimate for New York was 19,382,373, an increase of only 4,271 since 2010. In 2021, the Census Bureau released its 2020 Decennial Census of the nation’s population. The decennial Census showed that New York’s population was 20,201,249, an increase of 823,147. Following the publication of the Census data, I wrote a post, “A Caution about the Use of Census Population Estimates,” that highlighted the difference between the population estimates and the Census data and questioned the accuracy of the estimates.

On May 19th of this year, the Census Bureau released a study of the accuracy of the 2020 Census, “Census Coverage Estimates for People in the United States by State and Census Operations.” The analysis contained revised population data from a survey that corrected erroneous enumerations, “whole person imputations and omissions.” The new study produced a 2020 population estimate for New York State (19,506,326), which is much closer to the Annual Population Estimate than to the Census count. Based on the new study, the State’s population grew by only 128,224 – less than one percent. Why do the numbers differ?
The 2020 Census
During the 2020 Census, the COVID-19 pandemic adversely affected interviewers’ ability to complete their work. Because the pandemic hit in March and April, data collection was paused during the nationwide lockdown. The New York Times reported that, because of the pandemic, people were less likely to allow interviewers to speak to them in person. In addition, skepticism about the process may have arisen because the Trump administration attempted to change the Census to prevent undocumented aliens from being included in the data and to limit the data-collection period. As a result, data quality suffered somewhat compared to the 2010 Census.
The 2020 Census Post Enumeration Survey
After each Decennial Census, the Bureau conducts surveys to measure its accuracy. For those Census respondents who were interviewed, misreporting occurred in some cases. In others, recall errors were present. Because the Census cannot interview someone from every household in the country, a portion of the count used data from other sources, such as neighbors or government records. In addition, the data contained some duplicate counts and incorrect locations for persons. Some housing units were wrongly coded as occupied rather than vacant, and others were missed.
To assess the accuracy of the Census, the Bureau created two independent samples of about 150,000 people to identify data errors. The smaller scope of post-enumeration surveys permitted more intensive data analysis. Two independent samples were used to ensure sample representativeness. But the 2000 Post-Enumeration Survey data had to be reevaluated after errors were discovered that led to population overestimation. Since then, the Bureau has implemented processes to prevent a repeat of the same mistake.

Because the 2020 Post Enumeration Survey showed that New York’s population was significantly smaller than the uncorrected Census estimate, the state’s population growth from 2010 to 2020 was much smaller using the corrected data. The smaller population growth dropped New York’s rank from 7th to 34th. In percentage terms, New York’s 0.7% growth ranked 47th. Only Maine (0.3%), Hawaii (-0.3%), Rhode Island (-1.0%), and West Virginia (-4.7%) had slower growth.
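
The growth figures above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: it assumes the 2010 base is the value implied by the opening paragraph (19,382,373 minus 4,271), and the names `BASE_2010`, `ESTIMATES_2020`, and `growth` are invented for this example.

```python
# 2010 base implied by the text: the 2020 Annual Population Estimate
# minus its reported increase since 2010 (an assumption for this sketch).
BASE_2010 = 19_382_373 - 4_271  # 19,378,102

# The three 2020 figures for New York quoted in this post.
ESTIMATES_2020 = {
    "Annual Population Estimate": 19_382_373,
    "Decennial Census": 20_201_249,
    "Post Enumeration Survey": 19_506_326,
}


def growth(base, current):
    """Return (absolute change, percent change) from base to current."""
    change = current - base
    return change, 100 * change / base


for source, population in ESTIMATES_2020.items():
    change, pct = growth(BASE_2010, population)
    print(f"{source}: {change:+,} ({pct:+.2f}%)")
```

Under that assumption, the Post Enumeration Survey figure works out to growth of 128,224, or roughly 0.7%, while the Decennial Census figure implies growth of 823,147.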

The Post Enumeration Survey produced a set of population estimates that, in total, were similar to the census values. The Survey estimates were lower than Census estimates in the Northeast, the Midwest, and the coastal Western States. PES estimates were higher than census numbers in the South, the Plains states, and the interior West. The largest Census population overestimates were Hawaii (6.8%), Delaware (6%), Rhode Island (5.1%), the District of Columbia (4.6%), Nevada (4.4%), Minnesota (3.8%), and New York (3.4%). States with the largest Census underestimates were Arkansas (-5%), Tennessee (-4.8%), Montana (-4.4%), Mississippi (-4.1%), and Louisiana (-3.7%).
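
As a rough check on these percentages, the sketch below computes the gap between New York's Census count and its Survey estimate. The choice of denominator is an assumption: dividing by the Census count reproduces the 3.4% quoted here for New York, and the function name `net_coverage_error_pct` is invented for this example.

```python
def net_coverage_error_pct(census_count, survey_estimate):
    """Percent by which the census count exceeds (+) or falls short of (-)
    the survey estimate, expressed relative to the census count.

    Using the census count as the denominator is an assumption of this sketch.
    """
    return 100 * (census_count - survey_estimate) / census_count


# New York figures quoted in this post.
ny_error = net_coverage_error_pct(20_201_249, 19_506_326)
print(f"New York: {ny_error:+.1f}%")
```

A positive result indicates a Census overestimate and a negative result an underestimate, matching the sign convention used in the lists above.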
Sampling Error in the Post Enumeration Survey
Although the Post Enumeration Survey improved the quality of population data, it is sample-based, so its population estimates may differ from the actual values. Because of natural sampling variability, an estimate could differ from the actual population by an amount described by a confidence interval: the range around a sample estimate within which the true value is likely to fall. Typically, researchers use a 95% confidence level. (The Census Bureau uses a slightly more lenient standard – 90%.) Although the Bureau presents a single value for each state population estimate, the actual value could fall anywhere within the confidence interval.
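
To make the difference between the 90% and 95% standards concrete, here is a minimal sketch that builds both intervals from an estimate and its standard error. It assumes the sampling error is approximately normal, which may not match the Bureau's actual methodology, and the estimate and standard error are hypothetical numbers chosen only for illustration.

```python
from statistics import NormalDist


def confidence_interval(estimate, std_error, level=0.90):
    """Return (low, high) for a two-sided interval at the given level,
    assuming approximately normal sampling error."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.645 at 90%, 1.960 at 95%
    margin = z * std_error
    return estimate - margin, estimate + margin


# Hypothetical values, not taken from any Census Bureau table.
estimate = 1_000_000
std_error = 20_000

for level in (0.90, 0.95):
    low, high = confidence_interval(estimate, std_error, level)
    print(f"{level:.0%} interval: {low:,.0f} to {high:,.0f}")
```

The same estimate yields a narrower interval at 90% than at 95%, which is why the 90% standard is the more lenient of the two.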
The classic example given for this concept involves a coin flip. We know that a fair coin will come up heads and tails about equally often over time. But, as gamblers know, if we flip a coin four times, we don’t always get two heads and two tails. Sometimes three heads come up; other times, we might get lucky and get four heads, or we could get none. But the more times we flip the coin, the closer to the actual 50-50 split we are likely to get. The sample size largely determines the width of the range around the sample value within which the true value is likely to lie.
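
The coin-flip intuition can be put into numbers. The sketch below, offered as an illustration rather than anything from the Census Bureau, computes the exact chance of each outcome in four flips of a fair coin and then a normal-approximation 95% margin for the share of heads at several sample sizes. The function names are invented here, and the approximation is crude for very small samples.

```python
from math import comb, sqrt


def prob_heads(flips, heads):
    """Exact probability of a given number of heads from a fair coin."""
    return comb(flips, heads) / 2 ** flips


# Four flips: an even split is the most likely outcome, but far from certain.
for heads in range(5):
    print(f"{heads} heads in 4 flips: {prob_heads(4, heads):.4f}")


def margin_95(flips):
    """Approximate 95% margin of error for the share of heads
    (normal approximation; rough when the number of flips is small)."""
    return 1.96 * sqrt(0.25 / flips)


for flips in (4, 100, 10_000):
    print(f"{flips:>6} flips: 50% plus or minus {100 * margin_95(flips):.1f} points")
```

The margin shrinks with the square root of the number of flips, so a sample 100 times larger narrows the range by a factor of 10.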

Although the Post Enumeration Survey is large (150,000 respondents nationally), the number of participants in each state is much smaller. Consequently, the confidence intervals for the 2020 state populations are larger than for the nation. For larger states, the width of the confidence interval was generally two to five percent of the published estimate. For smaller states, however, the interval was much wider. In West Virginia, the width of the confidence interval was 16.8% of the estimate, equal to 1 in 6 of the state’s residents. Six states had confidence intervals that were as large as 10% of their reported populations. In 30 states, the width of the interval exceeded 5% of the published population.


In New York State, the Survey found that the population in 2020 was between 19,178,143 and 19,875,863 – a difference of 697,000, using a 95% confidence interval. With the state’s actual 2020 population falling within a wide confidence interval, the state might have lost as many as 220,000 residents between 2010 and 2020 or gained as many as 477,000 – a range of -1% to +2.6%. Neighboring states also had broad confidence intervals – in most cases, the ranges were hundreds of thousands of residents. In some other states, the range was much larger. Montana may have grown between 6.2% and 22.3%. West Virginia’s population might have declined by as much as 12.6% or increased by as much as 3.3%.
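
The New York interval can be checked with simple arithmetic. This sketch assumes a 2010 base of 19,378,102 (the 2020 Annual Population Estimate minus its reported 4,271 increase); the variable names are invented for the example. With that base, the percentage changes round to -1.0% and +2.6%, while the resident counts come out about 20,000 away from the rounded figures quoted above, which may rest on a slightly different base.

```python
# Figures quoted in this post for New York.
BASE_2010 = 19_378_102      # assumed base: 19,382,373 - 4,271
SURVEY_2020 = 19_506_326    # Post Enumeration Survey estimate
CENSUS_2020 = 20_201_249    # Decennial Census count
LOW, HIGH = 19_178_143, 19_875_863  # confidence interval bounds

# Width of the interval, in residents and as a share of the estimate.
width = HIGH - LOW
width_pct = 100 * width / SURVEY_2020

# Implied 2010-2020 change at each end of the interval.
change_low = LOW - BASE_2010
change_high = HIGH - BASE_2010

print(f"Interval width: {width:,} ({width_pct:.1f}% of the estimate)")
print(f"Change at low end:  {change_low:+,} ({100 * change_low / BASE_2010:+.1f}%)")
print(f"Change at high end: {change_high:+,} ({100 * change_high / BASE_2010:+.1f}%)")
print(f"Census count inside interval: {LOW <= CENSUS_2020 <= HIGH}")
```

The final line reports whether the Decennial Census count falls inside the Survey's interval; for these figures it lies above the upper bound.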

Researchers and politicians often focus on ranking state population changes, but the limitations of available census data make the exercise fruitless. The range of possible state populations within the confidence intervals is too wide to allow a precise analysis. With New York’s possible population change of -1% to 2.6%, the State’s rank could have been between 38th and 49th, assuming other state estimates were accurate. West Virginia’s rank could have been between 36th and 50th. Montana could have been between first and 24th.
Although the confidence intervals for state populations in the Post Enumeration Survey are relatively broad, the Survey shows that there is less than a 1-in-10 chance that the Census population values for New York and several other states were accurate. Using the 90% confidence interval, eight states had fewer residents than recorded in the 2020 census: Delaware, Hawaii, Massachusetts, Minnesota, New York, Ohio, Rhode Island, and Utah. In six states, an undercount was likely – Arkansas, Florida, Illinois, Mississippi, Tennessee, and Texas.
Conclusions
The 2020 Census Post Enumeration Survey demonstrated that the likelihood that New York’s population increased as much from 2010 as the Decennial Census reported was less than 10%. But, because the Survey included a relatively small sample of New Yorkers, it is impossible to know the state’s precise population change. The range around New York’s 2020 population estimate for 95% certainty was 3.6% of the published value – a confidence interval of 697,000. The difference between the high and low ends of the confidence interval resulted in 2010-2020 population change estimates ranging from a loss of 220,000 residents to a gain of 477,000. Smaller states had larger potential 2020 population estimate ranges and 2010-2020 population change confidence intervals.
In my earlier post, I argued that the Census Bureau should show standard errors and confidence intervals for its Annual Population Estimates. In the case of the Post Enumeration Survey, the Bureau made this information available, showing that potential sampling errors in the data are relatively large, particularly in less populous states. Given the relatively large confidence intervals in the Post Enumeration Survey and the absence of information about sampling errors in the Bureau’s Annual Population Estimates, we cannot precisely determine the actual state populations, how much the populations changed from year to year, or the ranking of states.
Webmentions
[…] so far off? John Bacheller, former chief economist for the state Empire State Development Corp., has explored the question in an informative deep dive at his “Policy by the Numbers” blo…. His penultimate graph sums it […]