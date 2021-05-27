FILE - Rows of homes are shown in suburban Salt Lake City on April 13, 2019. Utah is one of two Western states known for rugged landscapes and wide-open spaces that are bucking the trend of sluggish U.S. population growth. The boom there and in Idaho is accompanied by healthy economic expansion, but also by concern about strain on infrastructure and soaring housing prices. (AP Photo/Rick Bowmer, File)

ORLANDO, Fla. – First came the “noise” — small errors the U.S. Census Bureau decided to introduce into the 2020 census data to protect participants' privacy. Now the bureau is looking into “synthetic data,” manipulating the numbers widely used for economic and demographic research, to obscure the identities of people who provided information.

The moves have some researchers up in arms, worried that the statistical agency could sacrifice accuracy in its zeal to protect privacy.

Census Bureau statisticians disclosed at a virtual conference last week that over the next three years they will work toward developing a method to create “synthetic data” for files on individuals and homes that already are devoid of personalized information. These files, known as American Community Survey microdata, are used by researchers to create customized tables tailored to their research.

Census Bureau statisticians said more privacy protections are needed as technological innovations magnify the threat of people being identified through their survey answers, which are confidential. Computing power is now so vast that it can easily crunch third-party data sets that combine personal information from credit rating and social media companies, purchasing records, voting patterns and public documents, among other things.

“It’s a balancing act. The law requires us to do competing things. We need to release statistics on the nation to allow people to make useful decisions. But we also have to protect the privacy of our respondents,” said Rolando Rodriguez, a Census Bureau statistician, at the conference.

But critics say the proposal, coupled with an ongoing effort to add small inaccuracies to the 2020 census data in order to protect participants' privacy, undermines the Census Bureau's credibility as the go-to provider of precise data about the U.S. population.

University of Minnesota demographer Steven Ruggles said bluntly that synthetic data “will not be suitable for research.”

“The Census Bureau is inventing imaginary threats to confidentiality to sharply reduce public access to data,” Ruggles said. “I do not think this will stand, because society needs information to function.”
