Re-identification a real concern
Sven Bluemmel: Research shows almost any anonymised unit-level data can be re-identified
Risks of re-identification are very real, so we must approach the sharing of anonymised data carefully. But with tremendous benefits on offer it is worth the effort, according to the Victorian Information Commissioner, Sven Bluemmel.
Speaking at the InnovationAus.com Dataconomy 2018 event held at King & Wood Mallesons, Bluemmel outlined research showing that anonymised unit-level data - data that tracks individuals and is the most useful to researchers - can be re-identified.
“We used to think, and I used to think, until about 18 months ago, that you can have data like that with longitudinal information, and through some very careful techniques you could effectively de-identify it,” said Mr Bluemmel.
“But you probably can’t have it both ways. If you want to have that level of insight then someone who gets that data set with appropriate computing power and a little bit of external information will be able to re-identify.”
“We’ve done some work with researchers at Melbourne University who are very well credentialed in this field and they’ve been let loose on some public data sets. I don’t think they’ve failed yet.”
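To make the mechanism concrete, here is a minimal sketch of the kind of linkage Mr Bluemmel describes: a data set with names removed is joined to a small amount of outside information on shared attributes. This is not the Melbourne University researchers' method. The records, the field names and the `reidentify` helper are all invented for illustration.

```python
# Toy "de-identified" unit-level records: names are removed, but
# quasi-identifiers (birth year, postcode, sex) remain.
# Every value here is invented for illustration.
deidentified = [
    {"record_id": "r1", "birth_year": 1980, "postcode": "3000", "sex": "F", "diagnosis": "asthma"},
    {"record_id": "r2", "birth_year": 1980, "postcode": "3000", "sex": "M", "diagnosis": "diabetes"},
    {"record_id": "r3", "birth_year": 1975, "postcode": "3051", "sex": "F", "diagnosis": "migraine"},
    {"record_id": "r4", "birth_year": 1975, "postcode": "3051", "sex": "F", "diagnosis": "asthma"},
]

# "A little bit of external information": hypothetical facts an attacker
# might already know about two people.
external = [
    {"name": "Person A", "birth_year": 1980, "postcode": "3000", "sex": "F"},
    {"name": "Person B", "birth_year": 1975, "postcode": "3051", "sex": "F"},
]

QUASI_IDENTIFIERS = ("birth_year", "postcode", "sex")


def key(row):
    """Build the linkage key from the quasi-identifier fields."""
    return tuple(row[field] for field in QUASI_IDENTIFIERS)


def reidentify(deidentified_rows, external_rows):
    """Return {name: record} for each known person matching exactly one record."""
    by_key = {}
    for row in deidentified_rows:
        by_key.setdefault(key(row), []).append(row)

    matches = {}
    for person in external_rows:
        candidates = by_key.get(key(person), [])
        if len(candidates) == 1:  # a unique match means the record is re-identified
            matches[person["name"]] = candidates[0]
    return matches


matches = reidentify(deidentified, external)
for name, record in matches.items():
    print(name, "->", record["record_id"], record["diagnosis"])
```

In this toy data, Person A matches a single record and is re-identified, while Person B shares her attributes with another record and is not. Longitudinal data tends to make each record more distinctive over time, which is why unique matches become more likely.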
Mr Bluemmel said the issue cannot be ignored. Although many suggest that the public no longer cares about privacy, his view is that people really do care. They may simply express it in different ways.
One example is the common practice of giving an alternate birthdate to online services that people feel have no good reason to ask for it. Another is that we often care more about the privacy of data we shared in the past than about data we choose to share in any given moment.
Pointing to this week’s 70th anniversary of the Universal Declaration of Human Rights, Mr Bluemmel noted that this document has stood the test of time remarkably well and that privacy is covered as an enabling right - as something necessary to the development of personal identity.
“I think you can understand why Europe, given 20th Century history, would be particularly sensitive to privacy,” said Mr Bluemmel. “Imagine yourself being a member of a particular religion, particularly, you’ve gone about your business, and suddenly there’s a massive change of government and that government and its access to fantastically accurate and far ranging records of people and census and religion and beliefs suddenly wants to persecute you and they have the data to do that.”
While speaking strongly about the importance of privacy, Mr Bluemmel also noted that his office is unique in that it does not operate simply to protect privacy. It also administers access to public data, and he sees great benefits to society when data is used well.
“It was that sort of [unit-level] data that led to the establishment of the link between lack of folate during pregnancy and neural tube defects in newborns. A wonderful insight to get and has done so much good,” said Mr Bluemmel.
“You’re not going to get that sort of insight just by going to [an aggregate] data set that says 20 percent of women don’t get enough folate and 0.4 percent of newborns are born with neural tube defects. That won’t tell you that.”
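The difference between the two kinds of data can be sketched in a few lines. The counts below are invented so that the totals match the percentages in the quote (20 percent and 0.4 percent). They are not real epidemiological figures, and the field names are hypothetical.

```python
# Invented unit-level records: one row per pregnancy. The counts are chosen
# so the totals match the quoted 20 percent and 0.4 percent. Not real data.
records = (
    [{"low_folate": True, "ntd": True}] * 3
    + [{"low_folate": True, "ntd": False}] * 197
    + [{"low_folate": False, "ntd": True}] * 1
    + [{"low_folate": False, "ntd": False}] * 799
)


def rate(rows, field):
    """Share of rows in which the given field is True."""
    return sum(1 for row in rows if row[field]) / len(rows)


# All an aggregate data set can report: two separate headline rates.
aggregate = {
    "low_folate": rate(records, "low_folate"),
    "ntd": rate(records, "ntd"),
}

# What unit-level data allows: the same outcome compared between groups.
low = [row for row in records if row["low_folate"]]
adequate = [row for row in records if not row["low_folate"]]
ntd_rate_low = rate(low, "ntd")
ntd_rate_adequate = rate(adequate, "ntd")

print(f"aggregate: {aggregate['low_folate']:.1%} low folate, {aggregate['ntd']:.1%} NTD")
print(f"NTD rate with low folate:      {ntd_rate_low:.3%}")
print(f"NTD rate with adequate folate: {ntd_rate_adequate:.3%}")
```

The two headline rates would be identical if the affected births were spread evenly across both groups. Only the row-level link between exposure and outcome shows whether the groups differ.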
Mr Bluemmel said the Office of the Victorian Information Commissioner (OVIC) has concluded that access to data must be provided through controlled environments, to balance the value of research against the need for data protection.
“We do that in Victoria in a lab environment, through the Victorian Centre for Data Insights,” said Mr Bluemmel. “You can still bring in those researchers, but it can’t be as broad as hoped.”
The risk of losing public trust is real. Public support is needed not only to maintain faith in the use of data for public benefit, but also to ensure that the data itself remains unpolluted.
“If you lose the social license, then people will not back it and they might be more likely to give false data,” said Mr Bluemmel. “We saw that with census which was very unfortunate. People openly campaigning to say it is an invasion of privacy so give as much false data as you can. That’s terrible. You can never prevent that completely, but the more people trust the process and the motives the more likely they will support and give good data.”
While Mr Bluemmel worries about finding the right path forward, he feels strongly that it is well worth the effort to do so. It just requires care and intelligence to get there.
“All those things we’ve heard about effective regulatory environments, of trying to have privacy and security by design. I see no reason why these can’t be done.”
“Throwing up our hands to say it’s too hard, let’s not bother is a terrible option. Another terrible option is to say we have to give open slather and repeal annoying privacy laws.”
“This is not an easy thing. But we have had many things that haven’t been easy for us but we have not given up.”