Data Analysis

From Weaponized Social
Revision as of 23:03, 13 July 2015 by Willow (talk | contribs)


Resources and examples

In 2012 there was a "get out the vote" app that allowed non-citizens to encourage their friends to vote. However, the database behind it was "completely insecure" and contained the names of hundreds of undocumented immigrants: a clear example of data collection doing harm.

Research on governance systems in software, e.g. upvoting and downvoting on Slashdot, or the effect of being downvoted on news websites. This kind of project doesn't necessarily look at harassment directly, but it is important to understanding it.

  • Machine learning - algorithms that use text analysis to look at posts and ask users to reconsider before posting (teen input was used to "draw the line")
  • Cynthia Dwork’s work on differential privacy - ways to anonymize data sets and analyze them safely
  • Stanford PACS every year does a data and civil society review
  • National Institute of Justice has violence against women resources
  • National Crime Victimization Survey
  • National Network to End Domestic Violence collects annual data about harassment, violence, stalking, etc.
    • Includes tech issues
    • Report called “A Glimpse from the Field”
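The "reconsider before posting" idea in the machine-learning bullet above can be illustrated with a toy sketch. The real systems train classifiers on labeled examples; this hypothetical version just scores a draft against a small, invented word list to decide whether to prompt the user.

```python
# Hypothetical sketch only: the actual systems discussed use trained
# ML text classifiers, not a fixed word list.
HURTFUL_WORDS = {"loser", "ugly", "stupid", "hate"}  # illustrative list

def should_reconsider(draft: str, threshold: int = 1) -> bool:
    """Return True if the draft contains enough flagged words to
    warrant an 'are you sure you want to post this?' prompt."""
    words = {w.strip(".,!?").lower() for w in draft.split()}
    return len(words & HURTFUL_WORDS) >= threshold

print(should_reconsider("you are such a loser"))    # True
print(should_reconsider("see you at the meeting"))  # False
```

A deployed system would replace the word list with a model, but the interaction design is the same: score the draft, and above a threshold, interrupt with a prompt rather than block the post.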

Question: we don’t collect data that can do harm, but how do you then do analysis?

  • There’s a difference between data collection and research. Research consent is given for a specific project; if you take data collected for one purpose and crunch it for another project, that use falls outside informed consent
  • Additionally complicated when data takes the form of narrative, qualitative data
  • Hard to know when you’ll do harm
  • Difference between privacy and right to know.
    • Ex: Eugenics data is hidden behind government regulations even though these people are already dead
  • More complicated: government-funded data must be made available to the public forever, even if respondents only give consent for the initial research

Tech companies are looking at the data of perpetrators and drawing patterns that can be used to prevent future harm; there is a strong business reason to do so.

There’s a lack of data about tech-related violence against women, and we need that data for funding, legislation change, etc. -- Take Back the Tech is making a database now

Zócalo Public Square - media platform for public discussion

WAM Report - analyzed how Twitter and other platforms respond to harassment

Question/Example: an estranged husband threatens a woman on Facebook. The woman reports it to Facebook and it’s removed, which means it can’t be used as evidence.

Another issue: women are being asked to fill out too much information, which puts an extra burden on the victim

  • Ex: New Mexico was asking for far too much intake data about domestic violence. Most of the information should have been part of a voluntary research study, but the government viewed it as “great data” regardless of victims’ rights

Companies are so wary of scrutiny of their moderation work that they don’t report on it

Government agencies want data in big databases, which 1) is insecure, 2) can be subpoenaed, and 3) violates client confidentiality

Problem: data can be used for positive AND negative ends. Data that exists publicly has the potential to be abused. Research should not stop, but there needs to be discussion and awareness about data collection

Trolls and bad actors also file fake reports to trash people’s names, etc.; not all data is valid

There are many data concerns regarding harassment (HIPAA, other rules, government requirements, etc.)

Question: what do you wish you had better research on?

  • Companies used to be very local. Today, tech companies are humongous; should they make their data public?


Extra notes

  • Take Back the Tech - map of violence
  • Domestic Violence Census (running for 9 years; the tenth is coming up; aggregate-only from the beginning)
  • Drafted the definition of “personally identifiable information” in the Violence Against Women Act; such information is disallowed from being shared
  • A foundation’s “get out the vote” app that asked undocumented people to ask their friends to vote created an insecure database of undocumented immigrants

Current Examples of Projects

  • Literature review of online harassment-related research

Upvoting and downvoting

  • Lampe - upvoting and downvoting
  • Justin Cheng - Disqus

Machine learning

  • Karthik Dinakar - machine learning

Mapping experiences of violence

  • Take Back the Tech - have been doing an analysis of information and wider studies; creating a repository

Social network analysis

  • Gilad Lotan’s work on Israel/Palestine
  • Facebook study on political bias
  • “Grab a hashtag” approaches
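The "grab a hashtag" style of social network analysis can be sketched as building a mention graph from a hashtag's tweets. The tweet data below is invented for illustration; real analyses pull it from a platform API.

```python
# Hedged sketch: count who mentions whom in a hashtag's tweets.
# All tweet data here is hypothetical.
from collections import Counter

tweets = [  # (author, text) pairs - invented sample
    ("alice", "@bob great point #topic"),
    ("bob", "@alice thanks! cc @carol #topic"),
    ("carol", "@alice interesting thread #topic"),
]

edges = Counter()  # directed (author -> mentioned account) edge weights
for author, text in tweets:
    for token in text.split():
        if token.startswith("@"):
            edges[(author, token.lstrip("@"))] += 1

# The most-mentioned accounts suggest who anchors the conversation.
mentions = Counter(target for (_, target) in edges.elements())
print(mentions.most_common())  # [('alice', 2), ('bob', 1), ('carol', 1)]
```

From the edge weights one can go on to compute degree centrality or feed the graph into a network library; the key step is turning raw posts into (source, target) pairs.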

Reports of harassment

  • WAM Report

Research for companies under non-disclosure

  • Cindy Southworth has asked companies to go back into their own archives when doing research. Companies have often been willing to do so.

Wider data ethics

  • Dwork - differential privacy (measuring the degree of effectiveness)
  • Stanford Center on Philanthropy and Civil Society (PACS) - ethics of data in civil society; bibliography
  • Latanya Sweeney - re-identification
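Dwork's differential privacy, mentioned above, is usually realized via the Laplace mechanism: add noise scaled to sensitivity/epsilon before releasing an aggregate, so no single person's record changes the output much. A minimal sketch, with illustrative parameter values not taken from the session:

```python
# Hedged sketch of the Laplace mechanism for differentially private
# release of a count. Parameter values are illustrative only.
import random

def laplace_noise(scale: float) -> float:
    # The difference of two i.i.d. exponential draws is Laplace-distributed.
    return random.expovariate(1 / scale) - random.expovariate(1 / scale)

def private_count(true_count: int, epsilon: float, sensitivity: float = 1.0) -> float:
    """Release a count with Laplace(sensitivity / epsilon) noise added.

    A counting query has sensitivity 1: one person changes it by at most 1.
    Smaller epsilon means more noise and stronger privacy.
    """
    return true_count + laplace_noise(sensitivity / epsilon)

random.seed(0)
print(private_count(128, epsilon=0.5))  # a noisy version of the true count
```

This is why anonymized-and-noised aggregates can be analyzed "safely" in the sense the session notes describe: the released number is useful in aggregate while blunting re-identification of any individual respondent.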

Violence against women

  • National Institute of Justice - research program on violence against women; funded the Urban Institute to look at teens, dating abuse, and tech
  • National Crime Victimization Survey (done by census workers) - has now added some tech questions; included spyware in the past, now includes social media, rolling out in 2016

Cindy Southworth: annual survey counting how many people they assisted, and what they assisted with, on a given day and over the year. Also a survey of local NGOs asking about all the tech they’ve seen: how often have you seen victims stalked by their ex on Facebook? Eavesdropping, surveillance. Report: “A Glimpse from the Field”

Ethics question: ethics versus research. You should never have research standing between a person and help. There is also a problem with asking people to participate in research after the fact: “now that we have all this data, we’re going to crunch it and do research.”

  • You can’t consent to research when you’re worried about trauma
  • Even when work goes through an IRB, arts-based work is sometimes let through
  • unintended consequences

Protecting victim confidentiality

  • Were able to get domestic violence programs protected under the Violence Against Women Act


Wall: companies in the past related to small numbers of users. Is it right that Facebook owns all the data?

So much of what is out there is US-based.