Did you know researchers are reading and analyzing your tweets and Facebook posts in the name of science?
If so, how do you feel about it? If you feel unsettled, what would make you feel better?
What’s legal and what’s not in the age of big-data research? And even if it is legal, is it ethical?
These are some of the questions Casey Fiesler, an assistant professor at CU Boulder, will explore as part of a multicenter, $3 million National Science Foundation grant announced this month.
The four-year, six-institution PERVADE (Pervasive Data Ethics for Computational Research) project aims to develop guidance for researchers, policymakers and consumers around a burgeoning and at times controversial field so new it lacks widely accepted ethical standards.
“Thanks to the internet, we now have this vast amount of information about human behavior that can help us answer very important questions,” says Fiesler, noting researchers mine everything from tweets to Instagram photos to publicly shared health data and comments on news articles. “This is great for science, but we have to make sure that the ways we go about answering these questions are ethical and take into account the privacy and ownership concerns of the people creating the data.”
Several recent high-profile instances have raised ethical questions about big-data research:
In 2014, Facebook and Cornell University researchers published a study in which they manipulated the news feeds of Facebook users for one week, prioritizing positive content for some and negative content for others, to see if it changed the tone of the users’ posts. (It did.) The “emotional contagion study” sparked widespread debate about whether Facebook users should have been asked for consent.
In another case, Danish researchers raised privacy concerns when they shared a dataset containing sensitive information from 70,000 users of an online dating site in a web forum for social science researchers. And scientists sometimes quote social media posts verbatim in research papers on sensitive topics, making it possible for journalists or others reading the study to identify who posted them.
“Most people have no idea this is happening, and who might be reading their content,” Fiesler says. “They tend to vastly underestimate who can see it.”
While universities have institutional review boards that oversee the ethics of research conducted on humans, research on data created by humans rests in a gray area, she says.
The PERVADE team hopes to help fill the gap, first by assessing challenges surrounding the research and then offering empirically based educational tools to researchers and consumers.
“By empowering researchers with information about the norms and risks of big-data research, we can make sure that users of any digital platform are only involved in research in ways they don’t find surprising or unfair,” says co-investigator Katie Shilton, associate professor in the College of Information Studies at the University of Maryland.
The team also includes researchers from the University of California, Irvine; Princeton University; the University of Wisconsin-Milwaukee; and the Data and Society Research Institute.
Fiesler received more than $400,000 of the grant, which she will use to assess user knowledge and perceptions of big-data research and its legal and ethical implications.
“As technology changes, ethical norms have to constantly evolve to keep up,” she says. “Just because data is easy to get doesn’t mean we should do whatever we like with it.”